08-05-2019 02:23 AM
Hi, this is my first question in this forum, so please forgive my mistakes.
I use a Spartan-6 XC6SLX100T.
I am designing a data acquisition system which has 14 serial channels, 4 CAN bus channels, 2 ADCs, 4 synchronous serial channels and 1 Ethernet interface. Briefly, my design takes instructions from a PC via the Ethernet interface, starts collecting data from all the channels, and writes it to block RAMs. When a message finishes, a memory management component reads it from the block RAMs, builds an Ethernet packet from it by adding a header and checksum, and sends the message back to the PC. It was working perfectly so far. To reduce Ethernet bandwidth congestion I had to make a simple change in the Ethernet block: I just turn the Ethernet transmitter on and off so that it does not keep sending the same messages when there is no new data from the channels, while using flags so that no data is missed.
Anyway, after I made this change, serial channel 2 stopped collecting data, but all other channels work perfectly like before. I used ChipScope and found the problem: serial channel 2 (RS422) receives data but does not write it to its own block RAM, yet after its process it still generates the Write_Memory_Done flag, which I defined earlier to control the memory units. So it doesn't write to the memory but says "I did write to the memory", and I receive only zeros in the Ethernet packet. This design has been built with various channel counts and IPs, but all components and processes are the same. If I increase the number of channels, serial channel 2 and serial channel 6 stop working in the same way. Also, if I remove some of the channels (some of the CAN bus, serial or ADC channels), the rest of the channels work perfectly. There is no congestion in my FPGA: I use at most 70 percent of the LUTs, I guess, and at most around 30 block RAMs are used in my design. I couldn't figure out why this happens or how to fix it. If you have had the same problem and fixed it before, or just want to give me some advice, I will be very glad. Thanks for helping.
12-03-2019 06:05 AM
08-15-2019 11:26 PM
Hi @aziz93,
Is this issue happening every time, or is there a pattern to it, such as occurring on odd or even numbered attempts, or only once in a blue moon?
09-05-2019 01:26 AM
Hi, sorry for the late reply and thank you for the response.
=> This issue happens every time.
=> Yes, the host device is a PC and the OS is Windows 10 Pro.
=> But I don't think the problem is related to the OS or the USB-RS422 converter, because I also tried the self test on my card (RX and TX pins connected to each other with a loopback connector; the card sends test messages from the TXs, receives them on the RXs and, once it has collected the data from the RXs, sends it back to the PC via Ethernet) and again only the same channel is not working. I also tried different cards and code with the same port; the hardware works without any problem.
=> The problem looks related to the block RAM. The block RAM writing process runs properly and raises the flag which indicates that writing has finished, yet I only get zeros instead of data. Somehow the same block RAM writing process has no effect on channel 2, while it works for the rest of the channels.
=> When I observe the states and data in ChipScope, I can see that the data is received perfectly by the RX module. I examined all bytes and they all matched what I sent as the test message in self-test mode. After the data is received, the block RAM writing process has no effect on the actual block RAM component, although the same process works fine for the other channels.
09-05-2019 02:07 AM
This has the feeling of a sw / hw interaction problem.
Something like a channel not being read fast enough, so the next time it tries to receive, it's blocked,
or state machines in the SW and HW getting out of step.
09-06-2019 01:24 AM
09-06-2019 04:09 AM
Sorry, it's your code.
But in general, it sounds like you have a S/W process and a H/W process talking to each other.
Both are going to take time to respond to an action, both have some form of state machine, and both could have lock states.
As you change the code on either side, you are changing the performance, and as such the timing interaction between the two sides.
An example: you have a bucket, one person filling it with a big spade, one emptying it with a spade half the size.
But the emptying is running twice as fast as the filling.
Once there is half a bucket, all works well: the filler and the emptier move the same volume per minute.
You can vary the size of the bucket, or the speed at which the filling / emptying is done, and all is well.
But if one side takes a break, then works harder to catch up, you could be OK, or the bucket could overflow / empty. If the bucket is big enough you would not notice.
Now take that system: unlike the original system, any change could make it unstable.
Shrink the bucket, and you could overflow / underflow.
Speed up the filler and emptier, and the bucket could overflow / underflow.
It has many names, but I learnt it as queue theory.
I could be wrong, and it's something else, but based upon what I have seen, it's all about system design, I'm afraid.
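To make the bucket picture concrete, here is a rough Python sketch of a filler and an emptier sharing a bounded queue. Everything in it (the function name simulate, the rates, the capacities and the pause window) is made up for illustration and has nothing to do with the actual design; it only shows that the same pause can be harmless with a big bucket and lose data with a small one.

```python
from collections import deque


def simulate(capacity, fill_per_tick, drain_per_tick, ticks, drain_pause=()):
    """Run a fixed-rate filler and emptier against a bounded queue.

    Returns (overflowed, underflowed): items dropped because the queue was
    full, and emptier requests that found the queue empty.
    All parameters are hypothetical, chosen only to illustrate the analogy.
    """
    queue = deque()
    overflowed = underflowed = 0
    for tick in range(ticks):
        # Filler: push fill_per_tick items, dropping whatever does not fit.
        for _ in range(fill_per_tick):
            if len(queue) < capacity:
                queue.append(tick)
            else:
                overflowed += 1
        # Emptier: does nothing during the pause window, otherwise pops items.
        if tick in drain_pause:
            continue
        for _ in range(drain_per_tick):
            if queue:
                queue.popleft()
            else:
                underflowed += 1
    return overflowed, underflowed


if __name__ == "__main__":
    pause = range(10, 20)  # the emptier takes a break for 10 ticks
    print("big bucket:  ", simulate(64, 2, 2, 100, drain_pause=pause))
    print("small bucket:", simulate(8, 2, 2, 100, drain_pause=pause))
```

With matched rates and no pause nothing is lost. With the pause, the large queue absorbs the backlog while the small one starts dropping items, which is the kind of behaviour a timing change on either side could expose.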
09-06-2019 05:42 AM
So, to recap...
You have made a change to the ethernet side and this has somehow caused 1 of the 12 channels of RS422 to break. Correct?
I assume there is no flow control from your ethernet code back to the RS422 code? i.e. could your ethernet change be halting ch2? If this is possible, then I would look in that area.
If there is no flow control, so that your RS422 is basically completely decoupled from the ethernet change you made, then I would suggest you may have a timing issue with unconstrained paths/clocks, clock crossing issues, etc. that has always been lurking and just happened to manifest itself after this change/build.
09-10-2019 04:55 AM
09-10-2019 05:06 AM
09-10-2019 05:26 AM
Suggest you draw out the state diagram.
09-10-2019 06:06 AM
I was not saying you need flow control, just that if the ethernet did have it back to the RS422 module, it would be an area to look at for the problem.
You have not mentioned timing in your response. Have you written timing constraints for this design?
It could be useful to share with us a chipscope trace showing the BRAM instance signals (including the reset) for write and read showing correct data being written, and zeros read.
My only other observation is that if you have managed to write/read a buffer successfully and then find the next time round it contains 0s, it suggests one of two things. Either your read address has gone to an area of BRAM memory outside your 0-199 range (if the write had somehow failed, your read would just return the previous data), or you are now holding the BRAM in reset, which would also return 0s.
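The reasoning above can be checked against a toy behavioural model. This is only a sketch in Python, not a model of the real Spartan-6 primitive: the class name BlockRamModel, the depth of 1024 and the test value 0xA5 are all assumptions, and it assumes that reset forces the read output to 0 without clearing the stored contents.

```python
class BlockRamModel:
    """Toy synchronous RAM; not cycle-accurate for any real primitive."""

    def __init__(self, depth):
        self.mem = [0] * depth  # assumption: contents initialise to zero
        self.dout = 0           # registered read output

    def clock(self, addr, we=False, din=0, rst=False):
        """One clock edge: optional write, then update the read output."""
        if we:
            self.mem[addr] = din
        # Assumption: reset zeroes the output only, the memory keeps its data.
        self.dout = 0 if rst else self.mem[addr]
        return self.dout


if __name__ == "__main__":
    ram = BlockRamModel(depth=1024)      # hypothetical depth
    for addr in range(200):              # first message fills the 0-199 buffer
        ram.clock(addr, we=True, din=0xA5)

    # Second message: the write "fails" because we is never asserted.
    stale = ram.clock(0, we=False, din=0x3C)
    print("failed write reads back:", hex(stale))       # old data, not zero

    print("read outside 0-199:", ram.clock(200))        # never written
    print("read while in reset:", ram.clock(0, rst=True))
```

In this model a failed write leaves the previous message visible, so reading all zeros points at the read address or the reset rather than at the write itself.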