10-08-2009 09:56 AM
How many cycles does a MicroBlaze need for double-precision floating-point operations (add, multiply, divide, compare, etc.)?
Since these will be implemented in software, I expect the performance will be poor. I just want to know latency/throughput of these operations in that situation.
Is there a manual where these costs are documented?
Thanks.
10-08-2009 04:35 PM - edited 10-08-2009 04:43 PM
As you said, MicroBlaze has no native hardware support for double precision. Double-precision operations are emulated in software: the compiler replaces them with calls to library routines. When the FPU is enabled (C_USE_FPU > 0), the hardware supports single-precision floating-point operations only.
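One practical consequence, as a minimal sketch (the function names are made up for this post): in C an unsuffixed literal such as 2.5 has type double, so a float expression can end up going through the double-precision path without you noticing, even with the FPU enabled.

```c
/* Hypothetical helper names, for illustration only. */

/* The unsuffixed literal 2.5 has type double, so x is promoted to double,
   the multiply is done in double precision (a library call when doubles
   are emulated in software), and the result is converted back to float. */
float scale_promoted(float x)
{
    return x * 2.5;
}

/* The f suffix keeps the whole expression in single precision, which an
   enabled FPU can execute directly. */
float scale_single(float x)
{
    return x * 2.5f;
}
```

Both functions return the same value for exactly representable inputs; what may differ is the generated code. With a GCC-based toolchain I would expect the first one to call soft-float routines such as __extendsfdf2, __muldf3 and __truncdfsf2, but check the disassembly for your compiler version and optimization level.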
The number of cycles each instruction takes is documented in the MicroBlaze User Guide,
e.g. %XILINX_EDK%\hw\XilinxProcessorIPLib\pcores\microblaze_v7_20_c\doc\microblaze.pdf
See fadd, fmul, etc. in the MicroBlaze Instruction Set Architecture section for the latency of the hardware implementation. The latency also depends on whether the 3-stage or 5-stage pipeline is selected (C_AREA_OPTIMIZED).
You may also want to search this manual for occurrences of "floating" and "FPU".
I suspect the actual cost of the software emulation depends on the compiler (mb-gcc) version, its configuration, and the specific code.
You could also try both methods (SW emulated or HW supported [single precision only]) for a few simple cases and then profile the code or disassemble the output to get a better idea of the actual implementation.
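A rough sketch of such a test follows. Everything in it is an assumption for illustration: read_ticks() is a hypothetical hook that uses the standard clock() here so the code builds anywhere, and on a real MicroBlaze system you would replace its body with a read of a free-running hardware timer. N_ITER and the function names are also made up.

```c
#include <stdint.h>
#include <time.h>

#define N_ITER 100000u  /* arbitrary iteration count */

/* Hypothetical timestamp hook. clock() is only a portable stand-in;
   on the target, read a free-running hardware timer counter instead. */
static uint32_t read_ticks(void)
{
    return (uint32_t)clock();
}

/* volatile operands keep the compiler from folding the arithmetic away */
static volatile double d_a = 1.000001, d_b = 0.999999, d_sink;
static volatile float  f_a = 1.000001f, f_b = 0.999999f, f_sink;

/* Ticks for N_ITER double-precision multiplies (software emulated). */
uint32_t time_double_mul(void)
{
    uint32_t start = read_ticks();
    for (uint32_t i = 0; i < N_ITER; i++) {
        d_sink = d_a * d_b;
    }
    return read_ticks() - start;
}

/* Ticks for N_ITER single-precision multiplies (hardware if the FPU is
   enabled and the compiler is told to use it). */
uint32_t time_float_mul(void)
{
    uint32_t start = read_ticks();
    for (uint32_t i = 0; i < N_ITER; i++) {
        f_sink = f_a * f_b;
    }
    return read_ticks() - start;
}

/* Baseline: same loads, store and loop, but no multiply. Subtract this
   from time_double_mul() to estimate the cost of the operation itself. */
uint32_t time_double_baseline(void)
{
    uint32_t start = read_ticks();
    for (uint32_t i = 0; i < N_ITER; i++) {
        double x = d_a;
        double y = d_b;
        (void)y;
        d_sink = x;
    }
    return read_ticks() - start;
}
```

Dividing (time_double_mul() - time_double_baseline()) by N_ITER gives an approximate per-operation cost in timer ticks. The same pattern works for add, divide and compare. Treat the numbers as estimates, since loop overhead, memory latency and caches all influence them.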
bt