It is well known that wide FXP multiplications (i.e. wider than 18x25 bit for DSP48E1 units) have to be spread across several DSP slices. As a result, the cascaded calculation takes longer and therefore needs either pipelining or a slower SCTL clock.
I ran the following tests:
A single SCTL containing two FXP controls, one multiplication (set to truncate, wrap and coerce to a 48 bit output width), and a single FXP indicator with the same 48 bit width.
I set the bit widths of the two operands going into the multiplication to various sizes and compiled each variant. The compilation target is the sbRIO-9607, which has DSP48E1 units with 18x25 multipliers. However, the compilation results report numbers of DSP slices that do not match my expectations (see below). Is there a list somewhere of the largest possible multiplication for a given number of DSP slices? More generally, how can the unexpected results below be explained?
18x25 bit -> 1 DSP (fine)
20x25 bit -> 1 DSP (should be 2)
25x25 bit -> 2 DSP (fine)
25x35 bit -> 2 DSP (fine)
17x40 bit -> 2 DSP (fine)
18x40 bit -> 2 DSP (fine)
25x36 bit -> 3 DSP (should be 2)
18x44 bit -> 3 DSP (should be 2)
17x47 bit -> 3 DSP (should be 2)
18x47 bit -> 3 DSP (should be 2)
18x48 bit -> 3 DSP (should be 2)
18x49 bit -> 3 DSP (should be 2)
17x50 bit -> 3 DSP (should be 2)
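
For reference, the "should be" figures above are consistent with a simple tiling estimate: split one operand into 25 bit pieces and the other into 18 bit pieces, count the resulting partial products, and take the better of the two orientations. The Python sketch below only reproduces that estimate and compares it with the compile results listed above. The function names and the formula are my own assumptions for illustration; this is not how the compiler actually maps multipliers onto DSP48E1 slices.

```python
# Naive DSP-count estimate versus the slice counts reported by the compiler.
# Assumption: every slice is treated as a plain 25x18 multiplier and partial
# products are simply counted. Real mapping may differ (sign handling,
# cascade shifts, use of LUTs), which is exactly what the question is about.

def naive_dsp_estimate(w1, w2, port_a=25, port_b=18):
    """Return the minimum number of 25x18 tiles needed to cover a w1 x w2 multiply."""
    def tiles(x, y):
        # Integer ceiling division: -(-x // p) == ceil(x / p)
        return (-(-x // port_a)) * (-(-y // port_b))
    # Try both assignments of the operands to the two multiplier ports.
    return min(tiles(w1, w2), tiles(w2, w1))

# (operand width 1, operand width 2, DSP slices reported by the compile)
REPORTED = [
    (18, 25, 1), (20, 25, 1), (25, 25, 2), (25, 35, 2),
    (17, 40, 2), (18, 40, 2), (25, 36, 3), (18, 44, 3),
    (17, 47, 3), (18, 47, 3), (18, 48, 3), (18, 49, 3),
    (17, 50, 3),
]

def mismatches():
    """Return the rows where the naive estimate differs from the compile result."""
    return [(a, b, used, naive_dsp_estimate(a, b))
            for a, b, used in REPORTED
            if naive_dsp_estimate(a, b) != used]

if __name__ == "__main__":
    for a, b, used, expected in mismatches():
        print(f"{a}x{b} bit: compiled {used} DSP, naive estimate {expected}")
```

Running it lists exactly the rows marked "should be" above, so the open question is why the compiler deviates from this simple count in both directions.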