Parallel Processing in Xilinx SDK using HLS IP core
I am working on string matching algorithms. I created an IP core in HLS and exported it to Vivado to build a block design targeting a Zynq UltraScale+ device. The IP is connected through the HP ports so that its interfaces can access the DDR memory. The block design currently contains the Zynq UltraScale+ PS, the HLS-generated IP (m_axi interfaces for DDR access, plus clock and reset interfaces), and an AXI SmartConnect.

Since there are 4 high-performance (HP) ports, I can instantiate 4 copies of the IP to reduce the execution time. However, what if I have to connect all 4 copies through a single HP port because the other 3 HP ports are used elsewhere? In that configuration the execution time does decrease, but only by a very small amount.

Also, if all 4 copies of the same IP are connected to a single HP port, how do I ensure from the SDK code that all 4 copies start processing at the same time? Is there a specific way to do this in software, or anything that takes care of it automatically?

I am programming in C and using the IP driver functions that are exported to the SDK in the hardware platform folder of the design. This is a bare-metal application. Any tutorial or other help would be highly appreciated.
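
For concreteness, here is a minimal sketch of the start sequence being asked about. It assumes the generated driver follows the usual HLS pattern of `X<Ip>_Set_<arg>`, `X<Ip>_Start`, `X<Ip>_IsIdle` and `X<Ip>_IsDone`. The `Core` type, the `core_*` functions and the `buf_addr` argument are hypothetical stand-ins that mock those calls so the example compiles on a host without the Xilinx headers; they are not the real driver API. The idea is to program every copy first and then issue the start writes back to back.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CORES 4

/* Bits of the HLS ap_ctrl_hs control register (offset 0x00). */
#define AP_START 0x1u
#define AP_DONE  0x2u
#define AP_IDLE  0x4u

/* Hypothetical stand-in for one generated driver instance. */
typedef struct {
    volatile uint32_t ctrl; /* mocked control register */
    uint64_t buf_addr;      /* mocked DDR buffer argument (assumed name) */
    int starts;             /* mock only: counts start requests */
} Core;

/* Stubs mirroring the X<Ip>_Set_<arg> / IsIdle / IsDone / Start pattern. */
static void core_set_buf(Core *c, uint64_t addr) { c->buf_addr = addr; }
static int core_is_idle(const Core *c) { return (c->ctrl & AP_IDLE) != 0; }
static int core_is_done(const Core *c) { return (c->ctrl & AP_DONE) != 0; }

static void core_start(Core *c)
{
    /* Real hardware raises ap_done later; the mock completes immediately. */
    c->starts++;
    c->ctrl = AP_DONE | AP_IDLE;
}

/* Program all cores first, then issue the start writes back to back.
 * Returns 0 on success, -1 if any core is still busy. */
static int start_all(Core cores[], const uint64_t bufs[], int n)
{
    assert(n > 0 && n <= NUM_CORES);
    for (int i = 0; i < n; i++) {
        if (!core_is_idle(&cores[i]))
            return -1;
        core_set_buf(&cores[i], bufs[i]);
    }
    for (int i = 0; i < n; i++)
        core_start(&cores[i]); /* nothing but register writes between starts */
    return 0;
}

/* Poll each core in turn until all of them report done. */
static void wait_all(Core cores[], int n)
{
    for (int i = 0; i < n; i++)
        while (!core_is_done(&cores[i])) {
            /* busy-wait */
        }
}
```

On the real platform each stub would map to the corresponding generated driver call after the instances have been initialized. Because the CPU issues the start writes one after another, the copies would begin a few bus transactions apart rather than in exactly the same clock cycle.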