Convolutional encoder data rate

I need to use two cores, the Convolutional Encoder v9.0 and the Viterbi Decoder, at a data rate above 1 Gb/s. The data moves through the rest of the logic on a 32-bit-wide bus, so a comfortable 100 MHz clock is enough there.

One thing I noticed is that the encoder takes an AXI stream (uint8) but only 1 bit of it carries data, so the bit rate equals the clock rate. That caps me at 600 MHz, and I'm afraid I will have other problems running blocks above ~250 MHz.
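To put numbers on this, here is a quick back-of-the-envelope calculation in Python. It assumes exactly 1 Gb/s as the target (my real requirement is a bit above that) and one data bit per clock per encoder; the function name is just something I made up for the example.

```python
def lanes_needed(target_bps, clock_hz, bits_per_clock=1):
    """Number of parallel 1-bit encoders needed to reach target_bps."""
    per_lane_bps = clock_hz * bits_per_clock
    return -(-target_bps // per_lane_bps)  # integer ceiling division

TARGET_BPS = 1_000_000_000  # assumed target: 1 Gb/s

# 32-bit bus at 100 MHz, as in the rest of my design
bus_capacity_bps = 32 * 100_000_000

for clock_hz in (600_000_000, 250_000_000, 125_000_000):
    n = lanes_needed(TARGET_BPS, clock_hz)
    print(f"{clock_hz // 1_000_000} MHz -> {n} encoder(s)")
```

So a single encoder cannot reach the target at any of these clock rates, and the number of parallel encoders depends on how fast I dare to clock them.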

The alternative I thought of is to split the data (bytes) across 8 parallel encoders (125 Mb/s x 8 = 1 Gb/s) and put the results back together into bytes at 125 MHz. It just looks clumsy... is that the way to work with that encoder?
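To make the 8-encoder idea concrete, here is a rough Python model of what I mean. It is only a sketch and not the core's actual behaviour: the K=7, rate 1/2 code with generators 171/133 (octal), the shift-register bit ordering, and the "bit i of each byte goes to encoder i" mapping are all my assumptions for illustration.

```python
K = 7                   # assumed constraint length
G = (0o171, 0o133)      # assumed generator polynomials (rate 1/2)

def conv_encode(bits, state=0):
    """Bit-serial encoder model: one input bit in, two coded bits out."""
    out = []
    for b in bits:
        # Shift the new bit into the K-bit register (newest bit at the LSB).
        state = ((state << 1) | b) & ((1 << K) - 1)
        for g in G:
            # Each output bit is the parity of the register taps selected by g.
            out.append(bin(state & g).count("1") & 1)
    return out

def bytes_to_lanes(data, lanes=8):
    """Lane i receives bit i (LSB first) of every byte."""
    return [[(byte >> i) & 1 for byte in data] for i in range(lanes)]

def encode_parallel(data):
    """Run one independent encoder per lane, each at 1/8 of the bit rate."""
    return [conv_encode(lane) for lane in bytes_to_lanes(data)]

data = bytes([0xA5, 0x3C, 0xFF, 0x00])
lanes_out = encode_parallel(data)

# Reference: the same bits pushed through a single serial encoder.
serial_bits = [(byte >> i) & 1 for byte in data for i in range(8)]
serial_out = conv_encode(serial_bits)

print(len(lanes_out), "lanes,", len(lanes_out[0]), "coded bits per lane")
```

One thing this model makes obvious: each lane is its own independent code stream, so in general the combined output is not the same sequence a single serial encoder would produce. The receiving side would therefore need the matching 8-way split in front of its decoders.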

And a similar question for the Viterbi decoder: will I have to run a number of them in parallel? I noticed there is an option for multichannel decoding, but I'm not sure if it fits this case.


