How about posting some of your code so that we can have a look? Bear in mind that the FPGA target you are using does not have many resources.
What width is the data going into your BRAMs? Virtex 5 BRAMs support either 18-bit or 36-bit widths. Wider data requires multiple BRAMs in parallel to get the data in and out, even if the memory is shallow. In addition, each BRAM can hold at most 32768 bits (not bytes), so multiplying your width by your depth tells you how many BRAMs you will need.
Examples:
32 bit width x 65536 depth will use 64 BRAMs (1 x 32 x 65536 = 2,097,152 bits = 2 Mbit; 2 Mbit / 32 Kbit = 64 BRAMs).
8 bit width x 65536 depth will use 8 BRAMs (because the bit width is below 18, each BRAM can be used as two independent 18-bit BRAMs: 0.5 x 8 x 65536 = 262,144 bits = 256 Kbit; 256 Kbit / 32 Kbit = 8 BRAMs).
40 bit width x 65536 depth will use 160 BRAMs (as the bit width is above 36, TWO BRAMs must be used in parallel to handle the width: 2 x 40 x 65536 = 5,242,880 bits = 5 Mbit; 5 Mbit / 32 Kbit = 160 BRAMs).
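The rule of thumb behind these three examples can be written as a small calculator. This is only a sketch of the estimate described in this post, not a model of what the compiler will actually report. The function name `brams_needed` is made up for illustration, and two details are assumptions: that a width of exactly 18 counts as "narrow", and that widths above 72 bits scale by the number of 36-bit lanes needed.

```python
import math

BRAM_BITS = 32768  # data bits per BRAM, as stated in the post


def brams_needed(width_bits: int, depth: int) -> int:
    """Estimate BRAM usage using the rule of thumb from the examples above."""
    if width_bits <= 18:
        # Narrow data: each BRAM is treated as two independent 18-bit BRAMs.
        # Assumption: a width of exactly 18 falls in this case.
        factor = 0.5
    else:
        # One BRAM for widths up to 36, two in parallel for 37..72.
        # Assumption: wider data keeps scaling in 36-bit lanes.
        factor = float(math.ceil(width_bits / 36))
    total_bits = factor * width_bits * depth
    return math.ceil(total_bits / BRAM_BITS)


if __name__ == "__main__":
    for width in (32, 8, 40):
        print(width, "bit x 65536 ->", brams_needed(width, 65536), "BRAMs")
```

Run as a script, this prints 64, 8 and 160 BRAMs for the 32, 8 and 40 bit cases, matching the figures above.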
Given that your FPGA has only 32 36-bit BRAMs (or 64 18-bit ones), you need to make sure your data will fit. At a depth of 65536, the maximum bit width this target can support is 16 (32 BRAMs x 32768 bits / 65536 = 16). In practice the figure will be smaller, because some of the built-in functionality (DMA transfers and so on) already uses one or two BRAMs.
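To see where the limit of 16 comes from, and how much it drops once a couple of BRAMs are taken by built-in functionality, here is a minimal sketch based on raw capacity only. The function name `max_width_for_depth` and the `reserved` parameter are hypothetical, and the exact number of BRAMs used by DMA and similar features on your target is an assumption you would need to check in your compile report.

```python
BRAM_BITS = 32768   # data bits per BRAM, as stated in the post
BRAM_COUNT = 32     # 36-bit BRAMs available on this target, per the post


def max_width_for_depth(depth: int, bram_count: int = BRAM_COUNT, reserved: int = 0) -> int:
    """Largest data width (in bits) that fits at the given depth by raw capacity.

    `reserved` is the number of BRAMs assumed to be already consumed by
    built-in functionality such as DMA (an illustrative parameter).
    """
    usable_bits = (bram_count - reserved) * BRAM_BITS
    return usable_bits // depth


if __name__ == "__main__":
    print(max_width_for_depth(65536))              # 16
    print(max_width_for_depth(65536, reserved=2))  # 15
```

So with two BRAMs already in use, a 16-bit wide memory of depth 65536 would no longer fit by this estimate.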