The HLS synthesizer builds one ROM read circuit per loop iteration (e.g. 32 in this code). When SPREAD is set to 1024, synthesis creates 128 array partitions of the ROM table, and Vivado then fails to complete the implementation.
I would like to keep the inter-module stream 1024 bits wide but perform the lookup in two or four batches. Hence using this code:
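A minimal sketch of that two-batch structure, written as plain C++ so it compiles outside the HLS tool. The names (`Word`, `make_rom`, `lookup_two_batches`), the 256-entry table and its contents are hypothetical placeholders, and a byte array stands in for the 1024-bit stream word:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed sizes: a 1024-bit word carries 128 bytes, processed in 2 batches of 64.
constexpr std::size_t WORD_BYTES  = 128;
constexpr std::size_t BATCHES     = 2;
constexpr std::size_t BATCH_BYTES = WORD_BYTES / BATCHES;

// Stand-in for the 1024-bit stream word (hypothetical type).
using Word = std::array<std::uint8_t, WORD_BYTES>;

// Placeholder ROM contents; the real lookup table is not part of this sketch.
std::array<std::uint8_t, 256> make_rom() {
    std::array<std::uint8_t, 256> rom{};
    for (std::size_t i = 0; i < rom.size(); ++i) {
        rom[i] = static_cast<std::uint8_t>(i ^ 0x5A);
    }
    return rom;
}

// Looks up every byte of the word, split into two separate 64-trip loops.
// The intent is that only 64 ROM readers are needed at a time.
Word lookup_two_batches(const Word& in, const std::array<std::uint8_t, 256>& rom) {
    Word out{};
    // First batch: bytes 0..63.
    for (std::size_t i = 0; i < BATCH_BYTES; ++i) {
        out[i] = rom[in[i]];
    }
    // Second batch: bytes 64..127.
    for (std::size_t i = 0; i < BATCH_BYTES; ++i) {
        out[BATCH_BYTES + i] = rom[in[BATCH_BYTES + i]];
    }
    return out;
}
```

Functionally the two loops are equivalent to one 128-trip loop, which is presumably what allows the tool to treat them as a single loop.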
Surprisingly, the HLS synthesizer merges the two loops into one and still builds 128 ROM partitions. I tried adding different kinds of code between the two loops and changing how the loops are written, but HLS is clever enough to optimize them anyway and treats them as one single loop of 128 trips.
Is it possible to disable the loop merge optimization, so that the HLS outputs two loops processing 64 bytes each?