一、综述
参考ug473.pdf。
常用Memory 资源:
在IP核中,Block memory(distributed memory为CLB中的资源):
通常选用Native,而不用AXI接口:
Block RAM可配置单端口RAM、伪双端口RAM、双端口RAM、单端口ROM、双端口ROM、FIFO。
各个模式调用时可承受的最高频率,参考pg058.pdf:
二、主要功能
学习资源: http://www.asic-world.com/examples/verilog/memories.html
A-RAM/ROM
主要功能:
每一个Block RAM都可配置为1个36Kb的BRAM或1个36Kb的FIFO;同时也可以将其配置为2个单独的18Kb的BRAM或1个18KbBRAM + 1个18Kb的BRAM。
为什么是18k而不是16k(2的整次幂)?因为每8bit一个校验位。2*8 + 2 =18bit。
结论:无论是单端口RAM、伪双端口RAM还是双端口RAM,他们都只有一块Memory。
Single-port RAM:
同步示例:
module ram_sp_sr_sw ( clk , // Clock Input address , // Address Input data , // Data bi-directional cs , // Chip Select we , // Write Enable/Read Enable oe // Output Enable ); parameter DATA_WIDTH = 8 ; parameter ADDR_WIDTH = 8 ; parameter RAM_DEPTH = 1 << ADDR_WIDTH;//--------------Input Ports----------------------- input clk ; input [ADDR_WIDTH-1:0] address ; input cs ; input we ; input oe ; //--------------Inout Ports----------------------- inout [DATA_WIDTH-1:0] data ;//--------------Internal variables---------------- reg [DATA_WIDTH-1:0] data_out ; reg [DATA_WIDTH-1:0] mem [0:RAM_DEPTH-1]; reg oe_r;//--------------Code Starts Here------------------ // Tri-State Buffer control // output : When we = 0, oe = 1, cs = 1 assign data = (cs && oe && !we) ? data_out : 8'bz; // Memory Write Block // Write Operation : When we = 1, cs = 1 always @ (posedge clk) begin : MEM_WRITEif ( cs && we ) beginmem[address] = data;end end// Memory Read Block // Read Operation : When we = 0, oe = 1, cs = 1 always @ (posedge clk) begin : MEM_READif (cs && !we && oe) begindata_out = mem[address];oe_r = 1;end else beginoe_r = 0;end endendmodule // End of Module ram_sp_sr_sw
异步(异步读、同步写)示例:
module ram_sp_ar_sw ( clk , // Clock Input address , // Address Input data , // Data bi-directional cs , // Chip Select we , // Write Enable/Read Enable oe // Output Enable ); parameter DATA_WIDTH = 8 ; parameter ADDR_WIDTH = 8 ; parameter RAM_DEPTH = 1 << ADDR_WIDTH;//--------------Input Ports----------------------- input clk ; input [ADDR_WIDTH-1:0] address ; input cs ; input we ; input oe ; //--------------Inout Ports----------------------- inout [DATA_WIDTH-1:0] data ;//--------------Internal variables---------------- reg [DATA_WIDTH-1:0] data_out ; reg [DATA_WIDTH-1:0] mem [0:RAM_DEPTH-1];//--------------Code Starts Here------------------ // Tri-State Buffer control // output : When we = 0, oe = 1, cs = 1 assign data = (cs && oe && !we) ? data_out : 8'bz; // Memory Write Block // Write Operation : When we = 1, cs = 1 always @ (posedge clk) begin : MEM_WRITEif ( cs && we ) beginmem[address] = data;end end// Memory Read Block // Read Operation : When we = 0, oe = 1, cs = 1 always @ (address or cs or we or oe) begin : MEM_READif (cs && !we && oe) begindata_out = mem[address];end endendmodule // End of Module ram_sp_ar_sw
对应电路:
可以看出2^8 = 256由4个64拼接拼接而成:
这里其实调用的是CLB中SliceM下的Distributed RAM资源:
B-FIFO
FIFO的能力:
FIFO的IP核使用,具体可参考:pg057.pdf。
FIFO可调用shift reg、distributed RAM、Block RAM、BulitIn FIFO,关于使用,xilinx论坛有相关说法:
Q:
From PG057 (Fifo generator) I understand FIFO's can be implemented in 4 ways, using :
- block RAM
- distributed RAM
- shift register
- built-in FIFO (using FIFO18 / FIFO36)
is there any simple document / app note / overview describing on what basis you typically decide between the 4 implementations. What are the main tradeoffs, advantages, ... of each underlying memory type used?
I can imagine a few, but not sure if these are correct and complete :
- block RAM is interesting for large, deep fifo's
- distributed RAM is interesting for smaller fifo's
- shift register is interesting for smaller fifo's, with short word width
- built-in FIFO allow for the fastest fifo's
but that's just intuition ... so any corrections or further insights are welcome here!
A:
Its based your application , requirement and available resources in your target FPGA. The most of the points you mention correct. I would recommend you to refer target FPGA resource guide LUTs have lowest access time, FIFO18/FIFO36 good timing performance but require effort in design migration, BRAM very good for scalable memory requirement . You can also check about URAM/ultraram available in ultrascale devices
三、IP核调用
A-simple dual RAM
参考IP_bram笔记。
这里涉及到位宽的计算,以6通道,160MHz采样率,12bit有效位AD举例,现在需要转化为:240MHz的FPGA工作时钟。
12bit*6*160/240 = 48bit
位宽由12*6 = 72bit转化为48bit,:
- Port-A为写数据,width:位宽12*6 = 72bit,depth = 160
- Port-B为读数据,width:48bit,depth = 240
但Port-B的width只能是:72bit/(2^n),n = 0, ±1, ±2, ...,因此通常都是二次转化:
Step1:72*160 = 48*240 < x * 240,x = 72*2^n >48,此处n = 0;x工作在240MHz。
Step2:x * M = 48 * N,M、N都是整数。M = 1,N = 2,完成转化。
勾选primitive output register:
可见该选项延迟了1拍。
B-FIFO
主要参考:
关于IP核参数设置,可参考FIFO generator笔记。
与BRAM同样的例子,很多时候数据时钟域转换用dual-port RAM而不用FIFO,说是前者含地址,存在时延变量取数方便,但改为FIFO实现其实也可以,后者含有计数功能,同样可以进行定位。
从读使能给出,到数据输出,经过6个周期,计算: