joe306
Scholar
Scholar
661 Views
Registered: ‎12-07-2018

How to Constrain a Clock Created by a State Machine?


Hello, I was given a project and I need to do some Timing Closure. I need some guidance on how to create a constraint for an output clock that is generated by a state machine.

always @(posedge sclk_x4) begin

    if (byte_cnt == 'd6) begin
        full_frame_reg <= test_array;
    end

    if (rst) begin
        sclk_reg      <= 'd1;
        sclk_up_cnt   <= 'd0;
        sclk_dn_cnt   <= 'd0;
        clk_state     <= 'd0;
        frame_cnt     <= 'd0;
        sclk_osc_flag <= 'd0;
        byte_cnt      <= 4'd0;
        sclk_bit_cnt  <= 'd0;
        frame_dly_cnt <= 16'd0;
        sclk_pace_cnt <= 'd0;
        sclk_pace     <= 'd0;
    end
    else begin

        case (clk_state)

            BYTE_DELAY: begin
                sclk_up_cnt <= sclk_up_cnt + 1'b1;

                if (sclk_up_cnt >= 'd39 && byte_cnt < 'd6) begin
                    clk_state     <= CLOCKING; // next state
                    sclk_up_cnt   <= 'd0;
                    sclk_osc_flag <= 'd0;
                    sclk_reg      <= 'd0;
                end
                else if (sclk_up_cnt >= 'd39 && byte_cnt >= 'd6) begin
                    clk_state   <= FRAME_DELAY; // next state
                    sclk_up_cnt <= 'd0;
                    sclk_reg    <= 'd1;
                    byte_cnt    <= 'd0;
                end
            end

            CLOCKING: begin
                if (~sclk_osc_flag) begin
                    sclk_dn_cnt <= sclk_dn_cnt + 'd1;
                    sclk_up_cnt <= 'd0;
                end
                else begin
                    sclk_dn_cnt <= 'd0;
                    sclk_up_cnt <= sclk_up_cnt + 'd1;
                end

                if (sclk_dn_cnt >= 'd9) begin
                    sclk_reg      <= 1'b1;
                    sclk_osc_flag <= 1'b1;
                    sclk_bit_cnt  <= sclk_bit_cnt + 'd1;
                end

                if (sclk_up_cnt >= 'd9) begin
                    sclk_reg      <= 1'b0;
                    sclk_osc_flag <= 1'b0;
                end

                if (sclk_bit_cnt >= 'd15) begin
                    clk_state    <= BYTE_DELAY; // next state
                    sclk_bit_cnt <= 'd0;
                    byte_cnt     <= byte_cnt + 'd1;
                end
            end

            FRAME_DELAY: begin
                frame_dly_cnt <= frame_dly_cnt + 'd1;

                if (frame_dly_cnt >= 'd1399) begin
                    clk_state     <= BYTE_DELAY; // next state
                    frame_dly_cnt <= 'd0;
                end
            end

        endcase

    end
end

assign sclk = sclk_reg;

 

Here is the RTL View of the pin that I need to constrain.

FDSE.jpg

I was reading another post

https://forums.xilinx.com/t5/Timing-Analysis/Timing-constraints-for-a-gated-clock-output/td-p/872359

It was recommended to put an ODDR before the OBUF but I'm not sure if I need to do that.

Can someone help me please.

Thank you

Joe

 

1 Solution

Accepted Solutions
avrumw
Expert
Expert
586 Views
Registered: ‎01-23-2009

An ODDR is used for sending a "true" clock out of the FPGA.

When we have a clock in an FPGA, it generally comes in on a clock-capable pin, goes through some combination of dedicated clock logic (possibly including one or more MMCMs/PLLs, and maybe some intermediate clock buffers) and finally ends up on a clock buffer - a BUFG/BUFH/BUFR. These clock buffers drive dedicated clock networks - these are networks of wires that are fixed routes to distribute the clock from the clock buffer to the clock pins of all clocked elements in a given clock region or regions (depending on the kind of clock buffer). It reaches all these clock pins with constrained and fixed clock skew - that is the purpose of the dedicated clock networks.

When you want to forward a clock out of the FPGA it needs to go through an OBUF - the I pin of an OBUF is not a clock pin of a clocked cell, and therefore cannot be directly reached by the dedicated clock network. If you connect the I pin of an OBUF to this clock net in your design, at the time of place and route the tools will need to find a way to get the signal from the dedicated clock network to the I pin of the OBUF; thus it has to exit the dedicated clock network. So somewhere in the FPGA, a branch of the dedicated clock network will be driven off the dedicated clock network and into "general fabric routing" - from there it can make its way to the I pin of the OBUF. This net segment (from where it leaves the dedicated clock network to the I pin of the OBUF) will be routed however the tools can route it - it may have different routing (and hence different delay) each time you modify the design and re-run place and route. Furthermore, this net is going through general routing in the fabric, which is much slower than the dedicated clock network and highly process/voltage/temperature dependent, since it is going through multiple switch matrixes (which is how the programmable routing is done). It will also pick up more jitter as it goes through the general routing due to switching noise from adjacent signals.

All this is generally undesirable for a clock. Aside from the quality issue (jitter) the unpredictable delay of this net will directly affect the timing of any interfaces that use this clock as a reference.

So, to fix this, we realize that an ODDR is a clocked cell - therefore the dedicated clock network already has a branch going to this cell. The ODDR is part of the Input/Output Block (IOB) so is already physically adjacent to the OBUF (with a dedicated connection). So if we get the ODDR to "mirror" the clock, then this clock will be far better than one that goes through fabric routing. 

The ODDR is an output double data rate register - on the rising edge of the clock it puts out D1 and on the falling edge of the clock it puts out D2; thus if we connect D1 to 1'b1 and D2 to 1'b0, on the rising edge of the clock it puts out a 1 and on the falling edge of the clock it puts out a 0 - this is a mirror of the clock. This is why we use an ODDR for forwarding a "true" clock out of the FPGA.

Now for your SPI clock that comes from a state machine. This is not on a dedicated clock network - it is already in the fabric. Furthermore, unlike the "true" clock, it is not making two transitions per clock cycle (relative to the clock that is generating it); the clock I described above is - it is making a 0->1 and a 1->0 transition on every clock cycle. Your SPI clock is high for several (base) clock cycles and low for several (base) clock cycles. So we don't need an ODDR.

That being said, if you want the clock to have predictable timing, you do want it to come from a flip-flop that is close to the OBUF. But you don't need an ODDR - you just need an SDR flip-flop. Within the IOB there are also IOB registers - for the output there is an SDR IOB flip-flop. This can be inferred from any flip-flop description in your RTL code provided the output of the flip-flop goes directly to and only to the I pin of the OBUF (or through an ODELAY if one exists). Thus this flip-flop cannot be part of your state machine since your state machine needs feedback. But it can be an "extra" flip-flop after the state machine, or it can be a flip-flop whose input is generated from the "next" logic of your state machine (being careful to make sure that the tool won't somehow merge this flip-flop with the flip-flop of your state machine - a DONT_TOUCH attribute may be required). 

If this flip-flop meets these requirements then you can force it into the IOB by setting the IOB property on the output port:

set_property IOB TRUE [get_ports <name of port>]

Finally, trying to use an ODDR to forward this "clock" would be "bad". The "clock" (which is really just a periodic signal in your fabric) is not on a clock network. To use an ODDR for mirroring, this "clock" would have to be on the C pin of the ODDR. But the C pin of the ODDR is normally driven by a dedicated clock signal... In this case you would be taking a fabric signal and using it as a clock - this would be a locally routed clock (which is discouraged).

Avrum

13 Replies
maps-mpls
Mentor
Mentor
649 Views
Registered: ‎06-20-2017

Are you using the FSM-created clock anywhere in the FPGA, other than to get it off chip? If not, I wouldn't worry about a constraint on it, unless you have some other interface clocked by it to the outside world.

*** Destination: Rapid design and development cycles *** Unappreciated answers get deleted, unappreciative OPs get put on ignored list ***
joe306
Scholar
Scholar
644 Views
Registered: ‎12-07-2018

Hello, thank you for responding to my post. So what constraint should I use for the AZ_CLK pin? This signal is used as a SPI clock.

Last, what is the purpose of ODDR? Should I use one here?

Joe

maps-mpls
Mentor
Mentor
640 Views
Registered: ‎06-20-2017

ODDRs can be used for clock forwarding. I have found that they aren't always beneficial on UltraScale (US) or UltraScale+ (US+), but on 7-series they introduced a smaller uncertainty due to process (P), voltage (V), and temperature (T) variations. Depending on your frequency, you may not need it. But it is there whether you use it or not, so it won't hurt. Look through the language templates for ODDR, and instantiate it per your device family and your favorite language, tying one of the inputs of the ODDR high and the other low.

As for constraining, open a synthesized design and go through the constraints wizard in the flow navigator--but be sure to back up any existing constraints first. The constraints wizard will recognize the ODDR as a clock forwarding structure, and offer to create a constraint for you. 

After you create the forwarded clock, you can reference it in your set_output_delay constraints for your SPI interface.
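For reference, a forwarded-clock constraint for a "true" clock mirrored by an ODDR might look roughly like the sketch below. The instance name clk_fwd_oddr, the port names CLK_OUT and DATA_OUT, and the delay values are all placeholders I made up for illustration; substitute your own names and the setup/hold numbers from the receiving device's datasheet.

```tcl
# Sketch only: clk_fwd_oddr, CLK_OUT and DATA_OUT are hypothetical names.
# The generated clock is attached to the output port and sourced from the
# C (clock) pin of the ODDR that mirrors the clock.
create_generated_clock -name clk_fwd \
    -source [get_pins clk_fwd_oddr/C] \
    -divide_by 1 \
    [get_ports CLK_OUT]

# Output delays on the data port are then expressed relative to clk_fwd.
# 2.0 and -1.0 are placeholder values (external setup and negated hold).
set_output_delay -clock [get_clocks clk_fwd] -max  2.0 [get_ports DATA_OUT]
set_output_delay -clock [get_clocks clk_fwd] -min -1.0 [get_ports DATA_OUT]
```

The constraints wizard would be expected to produce something of this shape when it recognizes the ODDR structure.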

 

 

maps-mpls
Mentor
Mentor
563 Views
Registered: ‎06-20-2017

@avrumw Great point. It slipped my mind that the SDR flip-flop can be in the IO block. I was focused on killing two birds with one stone: knowing the constraints wizard would pick up the ODDR as a forwarded clock, and helping @joe306 get the syntax for a forwarded clock correct with less effort describing how to do that on my part.

 

Regarding dedicated clock networks, and US/US+, it seems that US/US+ clocks go through some version of interconnect tiles, and that the benefits of an ODDR clock forwarding are not as good or consistent.  I believe I've seen situations where an ODDR on a US+ hurt me more in uncertainty than not having it.  I never saw that on 7-series.

joe306
Scholar
Scholar
501 Views
Registered: ‎12-07-2018

Thank you very much for the detailed response.

joe306
Scholar
Scholar
501 Views
Registered: ‎12-07-2018

Thank you very much for your responses. I have so much to learn.

joe306
Scholar
Scholar
462 Views
Registered: ‎12-07-2018

Hello, a quick question: what constraint should I use for the DAC_SCK "clock" pin?

Thank you

avrumw
Expert
Expert
443 Views
Registered: ‎01-23-2009

If the DAC_SCK pin is being used for constraining the clock of the DAC, then it should have a create_generated_clock command on it.

Assuming the DAC_CLK is generated by the IOB flop, you have some logic behind it that is determining how many clock cycles the D is high vs. how many cycles the D is low. So, while it is this logic that determines the "frequency" of this clock, it is still the IOB flop that is "generating" the clock. So if your logic is generating a clock that is divided by 20; then this would be

create_generated_clock -name DAC_CLK -divide_by 20 -source [get_pins <the C pin of the IOB flip-flop>] [get_ports DAC_CLK]

Now you would have a clock named DAC_CLK which can be used for set_input_delay and set_output_delay commands to constrain the rest of this interface.

But - this interface is slow (from the point of view of your base clock) - you can launch your data on any one of the 20 base clocks in your DAC_CLK period in order to meet the setup/hold requirement of the DAC, and 20 base clocks to capture the return data. In general, for interfaces that are this slow you do not use "single clock cycle timing" - you don't launch the data on the base clock edge before it is needed by the DAC and capture the return data on the base clock edge after. This leaves you with two choices for how to constrain it.

One is that you do it "correctly" - you specify the set_output_delay based on the DAC_CLK setup/hold requirements and the set_input_delay based on the min/max clock to output delay of the DAC (assuming it has outputs). But then you need to add all kinds of set_multicycle_path commands to tell the tool that it doesn't have only one clock cycle to satisfy these requirements, but however many you have given it. 

Let's number the base clocks starting at cycle 0. The rising edges of the DAC_CLK coincide with the base clock rising edges at 0, 20, 40, 60... So if, for example, you decide to launch the data on clock edge 5, this gives you a 15 cycle multicycle path for the FPGA -> DAC data setup and a -5 clock cycle requirement on the hold. I would have to think for a while on exactly how to express this.

So if you did this (use proper techniques for the set_input_delay and set_output_delay and have the correct set_multicycle_path commands) then your interface would be completely constrained.
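As a rough, untested sketch of what the "correct" approach could look like: the clock name base_clk, the port name DAC_MOSI, and the 5 ns setup/hold numbers below are assumptions for illustration, not values from this thread. The multicycle pair follows the usual fast-to-slow pattern (N for setup with -start, N-1 for hold), which leaves the hold check at the capture edge rather than relaxing it to the -5 cycles discussed above, so treat it as a conservative starting point and check the resulting edges in report_timing.

```tcl
# Sketch only: base_clk and DAC_MOSI are hypothetical names, and the
# 5.0 ns setup / 5.0 ns hold figures stand in for real datasheet values.

# External requirements of the DAC, relative to the generated DAC_CLK.
set_output_delay -clock [get_clocks DAC_CLK] -max  5.0 [get_ports DAC_MOSI]
set_output_delay -clock [get_clocks DAC_CLK] -min -5.0 [get_ports DAC_MOSI]

# Data is launched on base clock edge 5 and captured by DAC_CLK at edge 20,
# so the setup path is given 15 base clock cycles instead of 1.
set_multicycle_path 15 -setup -start \
    -from [get_clocks base_clk] -to [get_clocks DAC_CLK]

# N-1 for hold moves the hold check back to the capture edge itself.
set_multicycle_path 14 -hold -start \
    -from [get_clocks base_clk] -to [get_clocks DAC_CLK]
```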

But there is a simpler solution. You know that you probably have tons of margin on this interface - the 15 clocks of setup is way more than enough, the 5 clocks of hold is way more than enough, and even on the return path (if there is any), you can sample (say) on the 20th clock (which is 20 clocks after the rising edge of the DAC_CLK), which gives tons of time for the DAC_CLK to propagate to the DAC, generate return data and get captured back. If the margin on all of these is more than one base clock, then you don't need to do anything. The DAC_CLK, the output data to the DAC, and the input data from the DAC (if there is any - if this is a DAC there probably isn't) are all launched/captured in IOB flip-flops. The maximum skew on these flip-flops is small (a couple of hundred picoseconds) - way smaller than the margins you have on an interface like this. So you can simply ignore all this... Do the timing analysis manually based on the number of base clocks between events (subtracting a "reasonable" number for skew - again, 400ps is probably enough, but you can guardband it by a lot and say 1ns or even 5ns) to prove that this will work. Then force all your flip-flops into the IOB, put "dummy" set_output_delay commands on them that are "large enough to pass, but not larger than one clock period", and call this done.
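The manual analysis described above is just arithmetic, and can be written down as a small Tcl script (runnable in tclsh or the Vivado Tcl console). Every number here is an assumption for illustration: the thread does not give the base clock period or the DAC's setup/hold requirements, so plug in your own.

```tcl
# All values below are assumed for illustration, not taken from this thread.
set base_period_ns 10.0  ;# assumed base clock period
set setup_cycles   15    ;# base clocks from data launch to the DAC_CLK rising edge
set hold_cycles    5     ;# base clocks from the DAC_CLK rising edge to the next data change
set skew_guard_ns  1.0   ;# guardband for skew between IOB flip-flops
set dac_tsu_ns     5.0   ;# hypothetical DAC setup requirement
set dac_th_ns      5.0   ;# hypothetical DAC hold requirement

# Margin = time available - skew guardband - device requirement.
set setup_margin_ns [expr {$setup_cycles * $base_period_ns - $skew_guard_ns - $dac_tsu_ns}]
set hold_margin_ns  [expr {$hold_cycles  * $base_period_ns - $skew_guard_ns - $dac_th_ns}]

puts "setup margin: $setup_margin_ns ns"
puts "hold margin:  $hold_margin_ns ns"

# Rule of thumb from the post above: each margin should exceed one base clock.
if {$setup_margin_ns < $base_period_ns || $hold_margin_ns < $base_period_ns} {
    puts "Margin is below one base clock period; constrain the interface properly."
} else {
    puts "Margins exceed one base clock period; the simple approach looks reasonable."
}
```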

Avrum

joe306
Scholar
Scholar
429 Views
Registered: ‎12-07-2018

Thank you very much. I just realized I should have said AZ_CLK and not DAC_CLK. DAC_CLK is another clock in my design.

Too many clocks.

I will try my best.

Thank you

joe306
Scholar
Scholar
395 Views
Registered: ‎12-07-2018

Hello, I decided to take your suggestion and put an extra register outside of the state machine.

Here is the schematic after implementation:

ExtraReg.jpg

Next I will put it in an IOB:

set_property IOB TRUE [get_ports AZ_CLK]

By the way, I'm using a Zynq Ultrascale+ MPSOC device.

Now I can create a "create_generated_clock" on this pin:

Master Source is FDRE C and Source is FDRE Q.

Small steps..

 

Thank you

 

 

avrumw
Expert
Expert
363 Views
Registered: ‎01-23-2009

Careful with the nomenclature:

create_generated_clock -name <clk_name> -source <source_object> [-master_clock <clock>] <relationship> <object>

The -source must be a pin or net upstream from the clock modifying cell - in your case a cell or a pin that is upstream of the flip-flop generating the clock. It is acceptable (I think) for this to be anything on this path - from (say) the output pin of the MMCM that generates the base clock to the clock pin of the flip-flop. It is conventional to use the closest accessible point, so if possible the C pin of the flip-flop.

The -master_clock is an optional parameter. It is (usually) only used if there is a situation where the -source object has more than one clock on it (which is possible). In this case, you can restrict the create_generated_clock to use only one of the clocks on this pin (otherwise you will get a different clock for each clock that exists on the source).

The <relationship> is one or more of the set of relationship definitions:

  • -divide_by <val>
  • -multiply_by <val>
  • -invert
  • -edges {a b c}
  • -edge_shift {a b c}
  • (and probably one or two more I am forgetting)

Finally the <object> is the attachment point of the generated clock. It must be downstream from the clock generating cell - so anywhere from the Q pin of the cell down. While it is fine to put it on the Q, it is more common to put it on the output port in cases where the generated clock is going out of the FPGA.

So 

create_generated_clock -name AZ_CLK -source [get_pins hier_Motor_IF/myip_encdr_SPI_AZ/inst/myip_encdr_SPI_v1_0_S00_AXI_inst/encdr_FPGA/SPI/sclk_delay_reg/C] -divide_by 20 [get_ports AZ_CLK]
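To illustrate the other options in the syntax above, here is a sketch of the same clock written with -edges, plus a commented-out variant using -master_clock. The clock name base_clk is a placeholder (the thread never names the base clock), and only one of the two commands should be used. With -edges, the source clock edges are numbered from 1 (first rising edge), so for a 50% duty cycle divide-by-20 the generated clock rises at edge 1, falls at edge 21, and rises again at edge 41.

```tcl
# Keep the long pin path in a variable for readability.
set az_clk_src [get_pins hier_Motor_IF/myip_encdr_SPI_AZ/inst/myip_encdr_SPI_v1_0_S00_AXI_inst/encdr_FPGA/SPI/sclk_delay_reg/C]

# Same divide-by-20 relationship, expressed as source clock edges.
create_generated_clock -name AZ_CLK \
    -source $az_clk_src \
    -edges {1 21 41} \
    [get_ports AZ_CLK]

# Alternative: only needed if more than one clock reaches the source pin.
# base_clk is a hypothetical clock name.
# create_generated_clock -name AZ_CLK \
#     -source $az_clk_src \
#     -master_clock [get_clocks base_clk] \
#     -divide_by 20 \
#     [get_ports AZ_CLK]
```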

Avrum

joe306
Scholar
Scholar
303 Views
Registered: ‎12-07-2018

Got it! Thank you very much for all your help.

Joe
