Anonymous
Not applicable
11,647 Views

Can you develop a NanoBlaze?

PicoBlaze is very efficient.  I use it a lot.  When a system gets more complex, I often need 2 or even 3 PicoBlaze processors in my design.  I wonder if Ken could start a NanoBlaze, with about 4 times the scratchpad memory and instruction space.
5 Replies
chapman
Xilinx Employee
11,570 Views
Registered: 09-05-2007

I guess most people tend to think of a 'NanoBlaze' as being a 16-bit machine of some kind, but I think you are just looking to expand the features of the existing 8-bit machine, which is something I rather agree with. Have you read the following threads in this forum?

Contribution: KCPSM3 with extended scratchpad (VHDL)

PicoBlaze FAQ – Programs >1024 instructions

Regards,

Ken
Ken Chapman
Principal Engineer, Xilinx UK
Anonymous
Not applicable
11,384 Views

Thanks for the reply. I read the threads you listed and found them very helpful.
ganaylor
Observer
10,820 Views
Registered: 03-28-2008

Yes, NanoBlaze means 16-bit to me, as that covers the sort of dynamic range used by typical signals. Sure, PicoBlaze can do 16-bit arithmetic etc., but it needs more program lines. It would be nice to have a processor that could implement 16-bit transactions in a single step for designing efficient FIRs etc. I often find myself concatenating 2 output ports to make 16 bits for the real signals being treated - particularly when feeding the multiplier (do you really want to use a hardware multiplier for 8-bit multiplications!?). In the end it's a compromise between speed and resource. PicoBlaze will do 16-bit stuff, but it takes more program lines and runs slower. Surely there is a valid range of applications crying out for a 16-bit processor which is compact like PicoBlaze and doesn't want the overhead of MicroBlaze..... or is it just me?
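
As a rough illustration of that port-concatenation pattern, the glue logic on the FPGA side might look something like the sketch below. It is only a sketch: the port IDs (0x02..0x05) and all signal names are invented for the example, and only the port_id / out_port / write_strobe handshake follows the usual PicoBlaze I/O style.

-- Sketch only: capture two pairs of 8-bit PicoBlaze output ports into two
-- 16-bit operands and feed a multiplier. Port IDs and signal names are
-- arbitrary choices for this example.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity pico_mult16 is
  port ( clk          : in  std_logic;
         port_id      : in  std_logic_vector(7 downto 0);
         out_port     : in  std_logic_vector(7 downto 0);
         write_strobe : in  std_logic;
         product      : out std_logic_vector(31 downto 0) );
end entity pico_mult16;

architecture rtl of pico_mult16 is
  signal op_a, op_b : std_logic_vector(15 downto 0) := (others => '0');
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if write_strobe = '1' then
        case port_id is
          when x"02"  => op_a(7 downto 0)  <= out_port;  -- operand A, low byte
          when x"03"  => op_a(15 downto 8) <= out_port;  -- operand A, high byte
          when x"04"  => op_b(7 downto 0)  <= out_port;  -- operand B, low byte
          when x"05"  => op_b(15 downto 8) <= out_port;  -- operand B, high byte
          when others => null;
        end case;
      end if;
      -- 16x16 signed multiply; synthesis will normally map this onto a DSP block
      product <= std_logic_vector(signed(op_a) * signed(op_b));
    end if;
  end process;
end architecture rtl;

On the PicoBlaze side this costs four OUTPUT instructions per multiply, one per byte, which is exactly the extra program-line overhead being complained about.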

 

I love PicoBlaze; I think I could two-time a bit and love a NanoBlaze as well though.......

 

 

whelm
Explorer
5,673 Views
Registered: 05-15-2014

I know this is a very old thread, but the topic still seems relevant.  I've been developing embedded applications for over 30 years (many of which include an FPGA in the hardware) and still strongly believe that 16 bits is the "sweet spot" for many such applications.  Done well, it scales from simple control processors to projects that include an Ethernet protocol stack.  MicroBlaze is an awesome architecture, but it pains me to use a bunch of 32-bit registers to hold boolean values!  At least it supports byte-level memory access.  Most control data can easily fit into 16-bit integer arithmetic, and any communication protocol, including ASCII, is likely to be byte oriented.  The need for 32-bit values is generally quite limited.

 

That being said, I don't think the 16-bit PicoBlaze path is the ideal one.  A 16-bit MicroBlaze would be much better.  MicroBlaze is a fairly efficient target for C programming, which enhances productivity a lot.  16-bit processors are also quite well suited to the C language in general.

 

The previous posts allude to another area ripe for development - DSP.  To my thinking, a 16-bit DSP architecture is the biggest hole in the Xilinx portfolio right now: something that can take one or more BRAMs and MACs and do things like FIR filters at near the limits of those hardware blocks.  If one is doing, say, an SDR where a single task (FIR filter, mixer, etc.) has to run repeatedly at near the hardware frequency, it is no big deal to implement it directly in HDL.  However, that often needs to be followed by other stuff running at significantly decimated sample rates, where five or ten FIRs could run in a single BRAM/MAC block because of their relatively low speed requirements.  That is messier to do in HDL (than, say, on a DSP processor).  Being able to write a few lines of C (or even high-level assembly) to sequentially automate those tasks would greatly simplify efficient resource utilization for such items.

I find the need acute enough that I may explore it further, but I'm neither a processor expert nor a compiler expert.  Customizing the tools to generate code and integrate it into the design flow is undoubtedly significantly harder than creating the IP core.  And then there is the testing necessary to assure that it is bug-free and reliable enough to depend on for real-world applications (especially safety-critical ones).
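
One way to picture that shared BRAM/MAC arrangement is a single multiply-accumulate unit that is simply sequenced over the taps of whichever slow FIR channel is currently scheduled. The sketch below is only a hedged illustration of that idea: the channel scheduling, memory addressing and coefficient storage are merely hinted at, and all names are made up rather than taken from any Xilinx reference design.

-- Sketch only: one shared multiply-accumulate resource, sequenced over the
-- taps of the FIR channel currently being serviced. Coefficients and samples
-- are assumed to arrive from block RAMs addressed by an external tap counter.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity shared_mac is
  port ( clk       : in  std_logic;
         clear_acc : in  std_logic;              -- start of a new output sample
         tap_valid : in  std_logic;              -- coeff/sample pair is valid
         coeff     : in  signed(17 downto 0);    -- from coefficient BRAM
         sample    : in  signed(17 downto 0);    -- from sample BRAM
         result    : out signed(47 downto 0) );
end entity shared_mac;

architecture rtl of shared_mac is
  signal acc : signed(47 downto 0) := (others => '0');
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if clear_acc = '1' then
        acc <= (others => '0');                              -- new output sample
      elsif tap_valid = '1' then
        acc <= acc + resize(coeff * sample, acc'length);     -- one tap per clock
      end if;
    end if;
  end process;
  result <= acc;
end architecture rtl;

Whether several channels fit in one BRAM/MAC pair then comes down to little more than the clock rate divided by the total number of taps required per output sample across those channels.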

 

Wilton Helm

Embedded System Resources

 

chapman
Xilinx Employee
5,658 Views
Registered: 09-05-2007

Dear Wilton,

 

Since you have woken up this thread from 2008, it is probably best if I start my response by making sure that other readers are aware of the KCPSM6 variant of PicoBlaze that was first released in 2010. Although still very much an 8-bit processor with the same overall look and feel, it delivers somewhat more in an even smaller footprint. In fact, it could be argued that KCPSM6 provides the highest density of processing in an FPGA! Using multiple PicoBlaze processors can easily be the better solution, even if it takes a little more thought and effort to implement than using one larger processor.
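
To make the multi-processor idea concrete, the sketch below simply instantiates two KCPSM6 cores side by side. It is a skeleton under stated assumptions: the program memories, the I/O decoding and any handshake between the two processors are left out, and the generic and port names are intended to match the kcpsm6.vhd template supplied with the KCPSM6 package, so check them against your own copy of that file.

-- Sketch only: two KCPSM6 processors in one design, each of which would have
-- its own program memory and I/O decoding (both omitted here for brevity).
library ieee;
use ieee.std_logic_1164.all;

entity dual_picoblaze is
  port ( clk : in std_logic );
end entity dual_picoblaze;

architecture rtl of dual_picoblaze is

  -- Component declaration as per the kcpsm6.vhd template (verify locally)
  component kcpsm6
    generic ( hwbuild                 : std_logic_vector(7 downto 0)  := X"00";
              interrupt_vector        : std_logic_vector(11 downto 0) := X"3FF";
              scratch_pad_memory_size : integer := 64 );
    port ( address        : out std_logic_vector(11 downto 0);
           instruction    : in  std_logic_vector(17 downto 0);
           bram_enable    : out std_logic;
           in_port        : in  std_logic_vector(7 downto 0);
           out_port       : out std_logic_vector(7 downto 0);
           port_id        : out std_logic_vector(7 downto 0);
           write_strobe   : out std_logic;
           k_write_strobe : out std_logic;
           read_strobe    : out std_logic;
           interrupt      : in  std_logic;
           interrupt_ack  : out std_logic;
           sleep          : in  std_logic;
           reset          : in  std_logic;
           clk            : in  std_logic );
  end component;

  -- One set of interconnect per processor (only partially used in this sketch)
  signal address_a, address_b               : std_logic_vector(11 downto 0);
  signal instruction_a, instruction_b       : std_logic_vector(17 downto 0);
  signal bram_enable_a, bram_enable_b       : std_logic;
  signal in_port_a, in_port_b               : std_logic_vector(7 downto 0) := (others => '0');
  signal out_port_a, out_port_b             : std_logic_vector(7 downto 0);
  signal port_id_a, port_id_b               : std_logic_vector(7 downto 0);
  signal write_strobe_a, write_strobe_b     : std_logic;
  signal k_write_strobe_a, k_write_strobe_b : std_logic;
  signal read_strobe_a, read_strobe_b       : std_logic;
  signal interrupt_ack_a, interrupt_ack_b   : std_logic;

begin

  processor_a: kcpsm6
    generic map ( scratch_pad_memory_size => 64 )
    port map ( address => address_a, instruction => instruction_a,
               bram_enable => bram_enable_a, in_port => in_port_a,
               out_port => out_port_a, port_id => port_id_a,
               write_strobe => write_strobe_a, k_write_strobe => k_write_strobe_a,
               read_strobe => read_strobe_a, interrupt => '0',
               interrupt_ack => interrupt_ack_a, sleep => '0',
               reset => '0', clk => clk );

  processor_b: kcpsm6
    generic map ( scratch_pad_memory_size => 64 )
    port map ( address => address_b, instruction => instruction_b,
               bram_enable => bram_enable_b, in_port => in_port_b,
               out_port => out_port_b, port_id => port_id_b,
               write_strobe => write_strobe_b, k_write_strobe => k_write_strobe_b,
               read_strobe => read_strobe_b, interrupt => '0',
               interrupt_ack => interrupt_ack_b, sleep => '0',
               reset => '0', clk => clk );

  -- The assembler-generated program memories would drive instruction_a and
  -- instruction_b from address_a and address_b here; the constants below are
  -- placeholders only, so the sketch stands alone.
  instruction_a <= (others => '0');
  instruction_b <= (others => '0');

end architecture rtl;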

 

I don’t disagree with your observations. Like you, I do still think there is a place for something between an 8-bit PicoBlaze and a 32-bit MicroBlaze. What isn’t clear is what that something is. In truth, there should probably be a set of things to fill that space.

 

16-bit microcontroller - Like PicoBlaze, this would be a limited and compact entity where the reason for 16 bits would mainly be the convenience of implementing operations of more than 8 bits in one go. For example, it would be a convenient fit when working with the XADC, which has 16-bit registers and analogue samples. I don't see performance as being the reason. Yes, 16-bit would be faster than 8-bit when handling 16-bit values, but performance is rarely cited as an issue by PicoBlaze users. After all, we have all that programmable logic to implement peripherals when we need to do something faster.
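
Purely as a sketch of that convenience argument (the port IDs and signal names are made up for the example), a 16-bit XADC measurement is typically presented to an 8-bit PicoBlaze as two input ports, so the program needs two INPUT instructions per sample and ADD/ADDCY style sequences for any arithmetic on it:

-- Sketch only: expose a 16-bit XADC measurement to PicoBlaze as two 8-bit
-- input ports. Port IDs (0x00/0x01) and signal names are arbitrary; the
-- registered multiplexer is the usual style for PicoBlaze input ports.
library ieee;
use ieee.std_logic_1164.all;

entity xadc_input_ports is
  port ( clk       : in  std_logic;
         xadc_data : in  std_logic_vector(15 downto 0);  -- e.g. a captured DRP read
         port_id   : in  std_logic_vector(7 downto 0);
         in_port   : out std_logic_vector(7 downto 0) );
end entity xadc_input_ports;

architecture rtl of xadc_input_ports is
begin
  process(clk)
  begin
    if rising_edge(clk) then
      case port_id is
        when x"00"  => in_port <= xadc_data(7 downto 0);   -- low byte
        when x"01"  => in_port <= xadc_data(15 downto 8);  -- high byte
        when others => in_port <= (others => '0');
      end case;
    end if;
  end process;
end architecture rtl;

A 16-bit machine would read the same sample in a single instruction, which is exactly the convenience point rather than a performance one.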

 

16-bit processor – Like MicroBlaze, this is what I consider to be a 'data processor', which is associated with significantly more memory. Have you looked at MicroBlaze MCS? Although it is based on the same 32-bit processor, it presents a compact controller system. It's easy to say that a 16-bit processor would be smaller, but it would also be slower, and it is a case of accepting that trade-off. Somewhat less obvious is that you would still expect to connect that 16-bit processor to peripherals and program it in a similar way (e.g. writing C). Unless there is also a set of 16-bit peripherals (or peripherals designed to interface with a 16-bit processor), the reduction in size of the processor alone can become insignificant. Likewise, unless the compiler and the code presented to it are truly targeting a 16-bit processor with 16-bit values, you just end up with bigger and slower code. I guess I'm saying that a 16-bit processor makes sense, but the complete solution is a lot of work.

 

DSP processor – Now this is one I've looked at more times than I can remember. You are exactly right when you say that the repeated stuff is best assigned to pure hardware but the 'messy' stuff is better handled by a processor. With that in mind, I have to say that my solution has been to use PicoBlaze as a controller of a 'DSP peripheral', and this has always proved to be a more adaptable scheme than a defined 'DSP processor'. As soon as you try to define a general-purpose DSP processor you end up with something rather big, because there is such a huge variety of algorithms and applications that you could cover. Just a quick look at the DSP48E block is a good indication of what we could be trying to construct our processor around, and even then we haven't covered some audio signal processing due to the bit widths.

As with pure hardware implementations of DSP algorithms, it is often the storage of samples and coefficients, as well as how to present them to the processing units, that dominates the design. Defining and setting up memories of suitable data widths, providing them with loadable address counters and connecting them to a processing block implementing the required calculations at the required range and resolution is relatively straightforward. PicoBlaze can then be used to implement the 'messy' stuff of the algorithm by setting pointers to memory and defining the calculations, which then take place in bursts. Occasionally it may be necessary to pull samples into PicoBlaze to analyse or manipulate them, but generally the data stays outside.
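
As a hedged sketch of that control arrangement (the port IDs and register names are invented for the example), the PicoBlaze-facing side of such a 'DSP peripheral' can be little more than a small register bank plus a start pulse:

-- Sketch only: PicoBlaze writes a memory pointer, a burst length and a 'go'
-- command; the burst of calculations then runs entirely in hardware and the
-- sample data never passes through the processor.
library ieee;
use ieee.std_logic_1164.all;

entity dsp_ctrl_regs is
  port ( clk          : in  std_logic;
         port_id      : in  std_logic_vector(7 downto 0);
         out_port     : in  std_logic_vector(7 downto 0);
         write_strobe : in  std_logic;
         start_addr   : out std_logic_vector(7 downto 0);  -- pointer into sample memory
         burst_len    : out std_logic_vector(7 downto 0);  -- number of taps/samples
         go           : out std_logic );                   -- single-cycle start pulse
end entity dsp_ctrl_regs;

architecture rtl of dsp_ctrl_regs is
begin
  process(clk)
  begin
    if rising_edge(clk) then
      go <= '0';                                           -- default: no start pulse
      if write_strobe = '1' then
        case port_id is
          when x"10"  => start_addr <= out_port;           -- set memory pointer
          when x"11"  => burst_len  <= out_port;           -- set burst length
          when x"12"  => go         <= '1';                -- kick off the burst
          when others => null;
        end case;
      end if;
    end if;
  end process;
end architecture rtl;

The burst engine itself (address counters, BRAMs and the multiply-accumulate block) would then run without further PicoBlaze involvement until it raises a status flag that the program can poll through an input port.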

 

Zynq – It would be remiss of me not to give Zynq a mention: a dual-core ARM Cortex-A9 processing unit with peripherals and a high-bandwidth connection to programmable logic. Why would you ever want anything else? Of course we know why! However, it is a good reminder that it isn't just about what form a processor takes; it's about having the right one in the right place at the right time. At least that's why I think soft processors still have a part to play even when using a Zynq device.

 

Ken Chapman
Principal Engineer, Xilinx UK