I think I read somewhere that USB can have significant latency on each call, possibly on the order of 1 ms. So even though it can transmit data quite quickly, the turnaround times can cause problems. Jeff has shown you how to separate the first byte without a separate USB call. You may need to experiment with the number of bytes read per call to find a value that works.
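To get a feel for why the read size matters, here is a rough sketch of the overhead from per-call latency alone. The 1 ms figure is only the number I recalled above, not something I measured, and the function name and read sizes are made up for illustration.

```python
import math

def call_overhead_ms(total_bytes, bytes_per_read, latency_ms=1.0):
    """Time lost to per-call latency only, ignoring time on the wire.

    latency_ms=1.0 is an assumed value, not a measurement.
    """
    calls = math.ceil(total_bytes / bytes_per_read)
    return calls * latency_ms

# Compare a few read sizes for a 4096-byte transfer.
for size in (1, 64, 512, 4096):
    print(size, "bytes per read:", call_overhead_ms(4096, size), "ms of call overhead")
```

If the per-call cost really is near 1 ms, reading one byte at a time would cost seconds in overhead, while a few large reads cost almost nothing. The real value on your system may be quite different, so treat this as a guide for what to try.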
Buffers set by the OS on the USB side, as well as any serial buffers, hardware or software, could affect performance. It may not be possible to control the sizes of those buffers, or even to find out easily what they are.
4096 bytes at 3096774 baud takes ~13 ms, assuming 10 bits per byte on the wire (8 data bits plus start and stop bits).
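The transfer time is easy to check with a couple of lines. This sketch assumes 8N1 framing, so 10 bits on the wire per byte; the function name is just for illustration.

```python
def wire_time_ms(num_bytes, baud, bits_per_byte=10):
    """Time on the wire in milliseconds.

    bits_per_byte=10 assumes 8N1 framing (1 start + 8 data + 1 stop).
    """
    return num_bytes * bits_per_byte / baud * 1000.0

print(round(wire_time_ms(4096, 3096774), 1))                   # 13.2
print(round(wire_time_ms(4096, 3096774, bits_per_byte=8), 1))  # 10.6
```

So a 4096-byte block takes about 13 ms at that rate, which is long compared with a 1 ms per-call latency. The latency matters most when each call moves only a few bytes.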
Lynn