Add FPGA Optimized Register File Version
Created by: ganoam
Add a register file, optimized for synthesis on FPGAs supporting distributed RAM.
Principle:
The baseline implementation builds the register file from an array of flip-flops and uses large multiplexers for read and write accesses. FPGAs offer a more efficient way to store data: with distributed RAM, up to 64 bits fit in a single LUT (depending on the memory layout and the FPGA device). In addition, distributed RAM comes with integrated address decoders.
The register file instantiates one distributed RAM block per implemented synchronous write port, each with the parametrized number of asynchronous read ports. Read accesses are arbitrated according to which block was last written at the requested address. For this purpose, an additional array of NUM_WORDS registers keeps track of write accesses.
Since both FFs and multiplexers are expensive structures on FPGA technology, the achieved savings are considerable. This register file version is used for both the FPU and the general-purpose register files.
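The arbitration scheme described above can be illustrated with a small behavioral model. This is only a sketch in Python, not the RTL added by this PR: the class name, the method names, and the rule that the higher-indexed write port wins on a same-cycle write to the same address are assumptions made for illustration.

```python
class MultiPortRegFile:
    """Behavioral sketch: one single-write-port memory bank per write port.

    All names are hypothetical and do not mirror the RTL in this PR.
    """

    def __init__(self, num_words=32, num_write_ports=2):
        self.num_words = num_words
        # One bank per write port; each bank models one distributed RAM block.
        self.banks = [[0] * num_words for _ in range(num_write_ports)]
        # Tracker array with NUM_WORDS entries: for every word, the index of
        # the bank that was written most recently.
        self.last_written = [0] * num_words

    def clock(self, writes):
        """Apply one clock edge. `writes` maps write-port index -> (addr, data).

        Assumption: if several ports write the same address in one cycle,
        the highest port index wins.
        """
        for port, (addr, data) in sorted(writes.items()):
            self.banks[port][addr] = data
            self.last_written[addr] = port

    def read(self, addr):
        """Asynchronous read: pick the bank holding the newest value."""
        return self.banks[self.last_written[addr]][addr]


rf = MultiPortRegFile(num_words=32, num_write_ports=2)
rf.clock({0: (5, 0xAA)})   # port 0 writes word 5
rf.clock({1: (5, 0xBB)})   # port 1 overwrites word 5 one cycle later
print(hex(rf.read(5)))     # the read is steered to bank 1
```

Note that the stale value stays in bank 0 after the second write; only the tracker entry decides which bank a read returns, which is why no write multiplexing across banks is needed.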
Concrete savings (Xilinx Kintex-7, xc7k325tffg900-2):
               LUT       FF   LUTRAM
------------------------------------
baseline:    40499    22799        0
optimized:   36350    18806      440
------------------------------------
Diff:        -4149    -3993     +440
            -10.2%   -17.5%
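The Diff and percentage rows follow directly from the two utilization rows. A short Python sketch of that arithmetic, using only the numbers reported above (the helper name `savings` is hypothetical):

```python
# Resource counts copied from the utilization table above.
baseline = {"LUT": 40499, "FF": 22799, "LUTRAM": 0}
optimized = {"LUT": 36350, "FF": 18806, "LUTRAM": 440}


def savings(before, after):
    """Return {resource: (absolute diff, percent change or None)}.

    The percentage is None when the baseline count is zero, because a
    relative change is undefined in that case (LUTRAM here).
    """
    result = {}
    for name in before:
        diff = after[name] - before[name]
        pct = round(100.0 * diff / before[name], 1) if before[name] else None
        result[name] = (diff, pct)
    return result


report = savings(baseline, optimized)
for name, (diff, pct) in report.items():
    print(name, diff, pct)
```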
Signed-off-by: ganoam <gnoam@live.com>