Contributor
244 Views
Registered: ‎07-07-2019

TDC on FPGA

Hello,

Some background:

For a physics experiment, our lab is building an FPGA-based DAQ system. We have several kits available, and the VC707 is one of them. A basic DAQ is already built, but we are trying to add some other features to it. Our experiment requires measuring the lifetime of certain particles. The setup already includes a conventional TDC, but to learn more about FPGA prototyping we started exploring whether we can implement a TDC on an FPGA (the VC707, for this discussion). We did a short literature review and found that the most common approach is the 'tapped delay line' concept, in which a chain of built-in carry elements is connected to flip-flops and the start signal propagates through the delay line of carry elements. The stop signal is the clock input to the flip-flops connected to the delay line, which sample the data. The result is a thermometer code indicating how many delay elements the signal has passed; if we know the propagation delay of each element, we can calculate the time by multiplying the two.

TimeDiff = prop. delay × N    ... eq. (1)

Here N is the number of carry elements the start signal has crossed. This is the block diagram of the design:

 

delay_line-1.jpg

 

To understand the HDL implementation process, we used ready-made code (we plan to implement our own ideas later; this is for learning purposes only). This code uses the same concept, with the addition of some input filters and a thermometer-to-binary decoder. I am posting the code below for reference. I searched the existing forum posts and found that many people have used this same approach, but we still have some unanswered questions.

Picture1.png

Questions:

We started with how to calculate the propagation delay of one carry element in equation (1). We ran a post-implementation simulation, hoping that it would at least give an estimate of the delay and help us understand the concept in more detail. What we found was surprising (cropped screenshot above). The yellow arrows marked 53 ps show the outputs of the carry elements (the unreg signal in the code). What surprised us is the pattern: all 4 output bits of each CARRY4 element change at the same instant. Is this possible, given that the internal circuitry of the CARRY4 element contains MUXes and XOR gates?

  1. Won't they add their own delay?
  2. Is our approach and thought process correct, and what can be done to find the delay of an individual CARRY4 element?
  3. Is the tool unable to resolve delays below 53 ps, or is something else going on?
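
One way to probe questions 2 and 3 in simulation is to sweep the trigger-to-clock offset and watch where the captured code changes. Below is a minimal testbench sketch, not part of the ready-made code. It assumes the fine_tdc entity posted at the end of this message is compiled into the work library with its default generics (STAGES = 64). The names fine_tdc_tb, CLK_PERIOD, STEP and count_ones are hypothetical, and the 5 ns clock period and 10 ps sweep step are arbitrary choices.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY fine_tdc_tb IS
END fine_tdc_tb;

ARCHITECTURE sim OF fine_tdc_tb IS
    CONSTANT STAGES     : INTEGER := 64;    -- matches the default generic of fine_tdc
    CONSTANT CLK_PERIOD : TIME    := 5 ns;  -- assumed value, not from the post
    CONSTANT STEP       : TIME    := 10 ps; -- assumed sweep step

    SIGNAL trigger : std_logic := '0';
    SIGNAL reset   : std_logic := '1';
    SIGNAL clock   : std_logic := '0';
    SIGNAL code    : std_logic_vector(STAGES-1 DOWNTO 0);
    SIGNAL done    : BOOLEAN   := FALSE;

    -- Count the '1' bits of the captured thermometer code.
    FUNCTION count_ones(v : std_logic_vector) RETURN INTEGER IS
        VARIABLE n : INTEGER := 0;
    BEGIN
        FOR k IN v'RANGE LOOP
            IF v(k) = '1' THEN
                n := n + 1;
            END IF;
        END LOOP;
        RETURN n;
    END FUNCTION;
BEGIN
    -- Free-running clock that stops once the sweep is finished.
    clk_gen: PROCESS
    BEGIN
        WHILE NOT done LOOP
            clock <= '0';
            WAIT FOR CLK_PERIOD / 2;
            clock <= '1';
            WAIT FOR CLK_PERIOD / 2;
        END LOOP;
        WAIT;
    END PROCESS;

    -- Default generics are used, so no GENERIC MAP is needed.
    dut: ENTITY work.fine_tdc
        PORT MAP (
            trigger        => trigger,
            reset          => reset,
            clock          => clock,
            latched_output => code);

    stimulus: PROCESS
    BEGIN
        WAIT FOR 4 * CLK_PERIOD;
        reset <= '0';
        FOR i IN 1 TO 100 LOOP
            WAIT UNTIL rising_edge(clock);
            -- Raise trigger (i * STEP) before the next rising clock edge.
            WAIT FOR CLK_PERIOD - i * STEP;
            trigger <= '1';
            WAIT UNTIL rising_edge(clock);  -- first register stage samples the chain
            WAIT UNTIL rising_edge(clock);  -- second stage drives latched_output
            WAIT FOR CLK_PERIOD / 4;        -- let the register outputs settle
            REPORT "offset " & TIME'image(i * STEP) & " -> "
                 & INTEGER'image(count_ones(code)) & " ones";
            trigger <= '0';
            WAIT UNTIL rising_edge(clock);  -- let the chain and both stages clear
            WAIT UNTIL rising_edge(clock);
        END LOOP;
        done <= TRUE;
        WAIT;
    END PROCESS;
END sim;
```

In a timing simulation, the offsets at which the reported count jumps give a rough picture of the step size the back-annotated model actually resolves. The reported offset is measured at the entity ports, so any input buffer and routing delay on trigger shifts the whole curve.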

 

Pardon our inexperience. Kindly suggest if you guys have any better ideas!

Cheers!

..........................................................................................................................................................................................................

Code:

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.math_real.ALL;

LIBRARY unisim;
USE unisim.vcomponents.ALL;

ENTITY fine_tdc IS
    GENERIC (
        STAGES : INTEGER := 64;
        Xoff   : INTEGER := 0;
        Yoff   : INTEGER := 0);
    PORT (
        trigger        : IN  std_logic; -- START signal input (triggers carry chain)
        reset          : IN  std_logic;
        clock          : IN  std_logic; -- STOP signal input (assumed to be clock synchronous)
        --unreg_out    : out std_logic_vector(STAGES-1 DOWNTO 0);
        latched_output : OUT std_logic_vector(STAGES-1 DOWNTO 0)); -- Carrychain output, to be converted to binary
END fine_tdc;

ARCHITECTURE behaviour OF fine_tdc IS

    -- To place the delay line in a particular spot (best for linearities and resolution), the LOC constraint is used.
    ATTRIBUTE LOC : string;
    ATTRIBUTE keep_hierarchy : string;
    ATTRIBUTE keep_hierarchy OF behaviour : ARCHITECTURE IS "true";

    SIGNAL unreg : std_logic_vector(STAGES-1 DOWNTO 0);
    SIGNAL reg   : std_logic_vector(STAGES-1 DOWNTO 0);

BEGIN

    -- Generation of the carry chain, starting at the specified X, Y coordinate.
    carry_delay_line: FOR i IN 0 TO STAGES/4-1 GENERATE

        first_carry4: IF i = 0 GENERATE

            ATTRIBUTE LOC OF delayblock : LABEL IS "SLICE_X"&INTEGER'image(Xoff)&"Y"&INTEGER'image(Yoff+i);

        BEGIN

            delayblock: CARRY4
                PORT MAP(
                    CO     => unreg(3 DOWNTO 0),
                    CI     => '0',
                    CYINIT => trigger,
                    DI     => "0000",
                    S      => "1111");
        END GENERATE;

        next_carry4: IF i > 0 GENERATE

            ATTRIBUTE LOC OF delayblock : LABEL IS "SLICE_X"&INTEGER'image(Xoff)&"Y"&INTEGER'image(Yoff+i);

        BEGIN

            delayblock: CARRY4
                PORT MAP(
                    CO     => unreg(4*(i+1)-1 DOWNTO 4*i),
                    CI     => unreg(4*i-1),
                    CYINIT => '0',
                    DI     => "0000",
                    S      => "1111");
        END GENERATE;
    END GENERATE;

    --unreg_out<=unreg;

    -- The output is latched two times for stability reasons.
    latch: FOR j IN 0 TO STAGES-1 GENERATE

        --ATTRIBUTE LOC OF FDR_1 : LABEL IS "SLICE_X"&INTEGER'image(Xoff)&"Y"&INTEGER'image(Yoff+integer(floor(real(j/4))));
        --ATTRIBUTE LOC OF FDR_2 : LABEL IS "SLICE_X"&INTEGER'image(Xoff+1)&"Y"&INTEGER'image(Yoff+integer(floor(real(j/4))));

    BEGIN

        FDR_1: FDR
            GENERIC MAP(
                INIT => '0')
            PORT MAP(
                C => clock,
                R => reset,
                D => unreg(j),
                Q => reg(j));
        FDR_2: FDR
            GENERIC MAP(
                INIT => '0')
            PORT MAP(
                C => clock,
                R => reset,
                D => reg(j),
                Q => latched_output(j));
    END GENERATE;

END behaviour;
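
The entity above stops at the raw thermometer code; the port comment notes that it still has to be converted to binary. Below is a minimal sketch of such a decoder, not the one from the ready-made code. It assumes a simple ones count is acceptable, which also tolerates isolated "bubbles" in the code better than searching for the first 1-to-0 transition. The entity name therm2bin, the generic BITS and the port names therm and binary are hypothetical, and it uses ieee.numeric_std rather than the std_logic_arith package used above.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY therm2bin IS  -- hypothetical name, not part of the posted design
    GENERIC (
        STAGES : INTEGER := 64;
        BITS   : INTEGER := 7);  -- must be wide enough for values 0 to STAGES
    PORT (
        clock  : IN  std_logic;
        therm  : IN  std_logic_vector(STAGES-1 DOWNTO 0);  -- e.g. latched_output
        binary : OUT std_logic_vector(BITS-1 DOWNTO 0));
END therm2bin;

ARCHITECTURE rtl OF therm2bin IS
BEGIN
    PROCESS (clock)
        VARIABLE ones : INTEGER RANGE 0 TO STAGES;
    BEGIN
        IF rising_edge(clock) THEN
            -- Count every '1' in the thermometer code.
            ones := 0;
            FOR k IN therm'RANGE LOOP
                IF therm(k) = '1' THEN
                    ones := ones + 1;
                END IF;
            END LOOP;
            binary <= std_logic_vector(to_unsigned(ones, BITS));
        END IF;
    END PROCESS;
END rtl;
```

A single-cycle count over 64 inputs may not meet timing at high clock rates; splitting the sum into pipelined partial counts would be the usual remedy.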

..........................................................................................................................................................................................................

1 Reply
Guide
225 Views
Registered: ‎01-23-2009

First, I don't think you will get complete answers to these questions. The implementation of TDC in FPGAs is almost exclusively an academic topic; the FPGA and the FPGA tools were never designed for these kinds of applications...

But there are two things that can explain what you are seeing in simulation.

The first is that the CARRY4 cell is not a ripple carry internally; it is a fast carry-lookahead implementation. As a result, I wouldn't expect to see any kind of linearity between the different outputs of the CARRY4 cell. The only "regular" pattern you can count on is the CI->CO path of a given CARRY4, which propagates from slice to slice vertically in the FPGA.

Second, the simulation model of the CARRY4 is just that: a simulation model. When using back-annotated timing, the values for the timing arcs in the simulation model are provided by the SDF export process, but even here, these are just estimates provided by databases within the tools. For the more "important" sources of delay (the sub-components of the slice, but more importantly the individual routing channels and switch matrices in the routing fabric), each of these is individually characterized, and hence the tools can come up with a safe number for the delays of aggregates of these; i.e. a single routed net may traverse many route segments and switch matrices, and the delay on the net is therefore annotated as the sum of all these individual delays. But these are still models. I can speculate that in the case of the CARRY4 module, the model simply ascribes the same delay to all outputs, probably because the variation between them isn't large enough to significantly affect "normal" static timing analysis through these cells.

What will they really be in the FPGA silicon? Who knows? But they probably won't be significantly different from what the model shows. I certainly would not expect them to be distributed linearly between some lower limit and 53ps, and they may not even be monotonic...

All this to say, I don't think you can get meaningful resolution below the CI->CO delay value - these 53ps (at [SLOW_MAX]) delays are probably the smallest increment you can count on.

Avrum
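
To put the reply's conclusion into numbers: if only the CI->CO step of each CARRY4 is treated as a usable increment, equation (1) from the question is applied per CARRY4 rather than per tap. Below is a rough worked example, assuming the 53 ps SLOW_MAX figure from the simulation and the default STAGES = 64 from the posted code (so 16 CARRY4 cells). The symbols are introduced here for illustration; the first cell is driven through CYINIT rather than CI, so its delay may differ, and real silicon delays will not match the model exactly.

```latex
\Delta t \approx N_{\mathrm{CARRY4}} \cdot \tau_{\mathrm{CI \to CO}},
\qquad \tau_{\mathrm{CI \to CO}} \approx 53\,\mathrm{ps}

\Delta t_{\max} \approx \frac{64}{4} \cdot 53\,\mathrm{ps}
              = 16 \cdot 53\,\mathrm{ps}
              = 848\,\mathrm{ps}
```

Under these assumptions the 64-tap chain covers a little under 1 ns in steps of roughly 53 ps, so longer intervals would need a longer chain or a coarse counter alongside it.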
