I built a multi-core MicroBlaze design in EDK 11.1 and ran parallelized matrix multiplication code on each of the cores.
Everything runs correctly. However, when I profile the code to compare the performance of one design against another, gmon.out gives the same result (0 seconds) no matter how large the matrix is.
Please note that I tried the same code on a 2-core design and the profiling worked perfectly.
Does anyone know why I am getting this result?
I also have another question: how can I tell where the results of the multiplication reside in memory (DDR2 or BRAM)?
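To make the memory question concrete, here is a minimal sketch of the kind of check I have in mind: take the address of the result matrix and compare it against the address ranges of the design. The ranges and the names `region_of` and `in_range` below are hypothetical placeholders, not values from my system; the real ranges would have to come from the design's address map (for example the generated xparameters.h or the linker script).

```c
#include <stdint.h>

/* Hypothetical address ranges, for illustration only.
   Replace them with the values from your own design's address map. */
#define BRAM_BASE 0x00000000u
#define BRAM_HIGH 0x0000FFFFu
#define DDR2_BASE 0x90000000u
#define DDR2_HIGH 0x9FFFFFFFu

/* Returns 1 if addr lies in the inclusive range [base, high]. */
static int in_range(uintptr_t addr, uintptr_t base, uintptr_t high)
{
    return addr >= base && addr <= high;
}

/* Names the memory region that contains addr, based on the ranges above. */
static const char *region_of(uintptr_t addr)
{
    if (in_range(addr, BRAM_BASE, BRAM_HIGH)) return "BRAM";
    if (in_range(addr, DDR2_BASE, DDR2_HIGH)) return "DDR2";
    return "unknown";
}

/* Usage idea: if the result matrix is an array named c, then
   region_of((uintptr_t)c) tells which range its first element falls in. */
```

If this approach is valid, printing `region_of((uintptr_t)c)` for the result array on each core should show whether the linker placed it in BRAM or DDR2.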