I have an application that is trying to receive UDP packets at high data rates. Packets are 8220 bytes (data payload) and the MTU is set to jumbo frames to support this. My data rate is around 30 MBytes/s. This is time-domain voltage data that I want to use for a spectral display. I've already optimized the code with queue buffers to do the DSP in parallel, and I only write to the queue buffer when the DSP is ready (I read here that the act of writing the data to a queue can slow the UDP read down enough to matter). From the link just mentioned, I've changed the setsockopt receive buffer size to be fairly large (82.2 MBytes), but that hasn't really helped. Unfortunately, nothing I have tried so far has helped much with dropping a large number of packets. Normally I wouldn't mind this since the data is just for display purposes, but dropping too many packets causes large phase discontinuities that, when present in the SMT Block Zoom FFT function, cause a ton of spectral leakage in the spectrum display.
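For anyone following along outside LabVIEW, here is roughly what that receive-buffer request looks like in plain Python sockets. The 8220-byte payload and the 10000-packet target (~82 MB) are the numbers from my setup above; note the OS may silently clamp the request, so it is worth reading the value back:

```python
import socket

# Numbers from the post above: 8220-byte payloads, 10000-packet buffer.
PACKET_SIZE = 8220
NUM_PACKETS = 10000

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Request a large OS receive buffer. The kernel may clamp the request
# (on Linux, to net.core.rmem_max unless that sysctl is raised).
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF,
                PACKET_SIZE * NUM_PACKETS)

# Read back what was actually granted; it is often smaller than
# requested (Linux also reports a doubled bookkeeping value here).
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(granted)
sock.close()
```

If the granted value is far below the request, raising the OS-level limit is a separate step from the setsockopt call itself.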
I'm curious how GigE Vision cameras can claim around 120-125 MBytes/s data rates using UDP, but when I use the default UDP read function in LabVIEW it starts dropping packets around 3 MBytes/s, let alone my goal of 30 MBytes/s.
I've attached a snippet of my UDP read while loop for reference. I decided to try a pipelined approach on a whim just to see if that helped, but I don't think it did much. It can be removed and the write to queue placed back in serial for testing. The ready-for-data flag in the snippet comes from my DSP code loop when it has finished processing the previous data grab and is ready for more. This way, I'm only writing to the queue when ready and otherwise dumping the UDP read data on the floor.
If anyone has advice that would be GREATLY appreciated. I'm down to the wire on this and running out of options....
In the past I have found that you may need to simplify your read task to the absolute minimum and let other parallel tasks handle the processing. You may want to separate out the other processing so that the read loop contains ONLY the UDP read and an immediate queue of the data. Don't have any other logic in the loop except the test to see if you should exit. Have a parallel task actually process the data and take care of any parsing, testing, or other work prior to passing it further up the chain for more refined processing.
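A text-language sketch of that structure (Python here, standing in for the two LabVIEW loops; the loopback self-test at the bottom and its packet count are made up for illustration). The read loop does nothing but receive and enqueue; a separate worker does all the processing:

```python
import queue
import socket
import threading
import time

PACKET_SIZE = 8220                    # payload size from the post
stop = threading.Event()
data_q = queue.Queue(maxsize=10000)   # bounded queue, depth from the post

def reader(sock):
    # The read loop does ONLY: receive, enqueue, check for shutdown.
    while not stop.is_set():
        try:
            pkt = sock.recv(PACKET_SIZE)
        except socket.timeout:
            continue
        try:
            data_q.put_nowait(pkt)    # never block the reader on the consumer
        except queue.Full:
            pass                      # deliberately drop instead of stalling

def worker(results):
    # All parsing/DSP lives here, running in parallel with the reader.
    while not stop.is_set() or not data_q.empty():
        try:
            pkt = data_q.get(timeout=0.1)
        except queue.Empty:
            continue
        results.append(len(pkt))      # stand-in for real processing

# Loopback self-test: stream 100 packets to ourselves.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 20)
rx.bind(("127.0.0.1", 0))             # port 0 = let the OS pick one
addr = rx.getsockname()
rx.settimeout(0.1)

results = []
threads = [threading.Thread(target=reader, args=(rx,)),
           threading.Thread(target=worker, args=(results,))]
for t in threads:
    t.start()

tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for _ in range(100):
    tx.sendto(b"\x00" * PACKET_SIZE, addr)
time.sleep(0.5)
stop.set()
for t in threads:
    t.join()
rx.close()
tx.close()
print(len(results))
```

The key design point is `put_nowait`: the reader never waits on the consumer, so a slow DSP stage costs you queued packets rather than stalling the socket read itself.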
One other question though. Have you used a LAN analyzer such as Wireshark to verify that all of the data is actually on the wire? It should be but it is worth checking just in case. You would hate to chase down a "bug" in your code that doesn't actually exist.
I have tried the absolute minimum with the ready-for-data FG that's in the attached snippet. This reduced the number of queue writes that I'm doing, since even the act of writing to the queue causes a substantial slowdown. All my processing is done after the queue write in multiple parallel steps (unflatten, buffering samples, FFTs, etc.). It's possible that the shift register in the pipelined approach slows things down more, but this is just a memcopy (kind of), so I wouldn't think it harms things. Like I mentioned, I've tried the UDP-read-then-write-to-queue-when-ready approach in serial as well as the attached pipelined approach. I think the performance difference between the two was negligible.
I haven't tried Wireshark yet, but as a first step (and rough estimate) I used Windows 7's Resource Monitor utility, and when the port is connected the monitor shows around 30 MBytes/s going to my application. It's just that the app can't keep up, so the OS is throwing packets on the floor. I was hoping to find a way to flush any old data out of the OS UDP socket buffer, but I haven't found anything online to indicate that this is possible. My app will never need to grab more than 10000 packets, and since I set the socket buffer to 10000 packets and the queue depth to 10000 elements, I would hope to be able to get all the continuous data I need in each data grab. Unfortunately, the socket buffer just continually fills up unless my app can read data faster than it is coming in. Once the socket buffer is full (which only takes about 2 seconds) there is substantial packet loss from that point on. I've tried resetting the socket buffer size to a small then a large value, opening/closing the socket, and a few other things, but nothing seems to flush the OS socket buffer correctly.
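As far as I know there really is no portable "flush" call for a UDP socket; the usual workaround is to read non-blocking until the buffer is empty and discard everything. A sketch of that idea in plain Python sockets (not the LabVIEW primitive), with a small loopback self-test:

```python
import socket
import time

def drain_socket(sock, max_bytes=8220):
    """Discard everything currently queued in the OS receive buffer.

    There is no portable flush for UDP sockets, so read non-blocking
    until the kernel reports the buffer empty. Returns the number of
    datagrams thrown away.
    """
    sock.setblocking(False)
    drained = 0
    while True:
        try:
            sock.recv(max_bytes)      # read and discard one datagram
            drained += 1
        except BlockingIOError:
            break                     # buffer is now empty
    sock.setblocking(True)
    return drained

# Loopback self-test: queue up 5 stale packets, then drain them.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for _ in range(5):
    tx.sendto(b"stale", rx.getsockname())
time.sleep(0.3)                       # let the packets land in the buffer

drained = drain_socket(rx)
print(drained)
rx.close()
tx.close()
```

Called right before arming a new data grab, this throws away whatever backlog accumulated while the app wasn't reading.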
Part of the reason you are losing data is that your shift register only contains the last packet you read. I would simplify your read loop even further. You could be running into issues calling your FGs if other parts of your code are accessing them: only one piece of code can access a functional global at a time, so each FG call in the read loop can stall the loop while it waits for access. If you strip the loop down to nothing more than the read and the queue write, as I suggested, you may find that you can read the data fast enough. You did take into account things like the maximum queue size to avoid unnecessary memory allocations, which is good. I would also consider using a notifier for the stop condition: at the end of the loop, check the notifier using a timeout of 0.
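The "check the notifier with a 0 timeout" pattern, sketched in Python with a `threading.Event` standing in for the LabVIEW notifier (`wait(0)` returns immediately, like a notifier read with a 0 ms timeout, so the read loop never blocks on the stop check):

```python
import threading

stop = threading.Event()   # stands in for the LabVIEW stop notifier

iterations = 0
# The loop body would be: UDP read -> enqueue. At the end of each
# iteration, poll the notifier with zero wait; wait(0) never blocks.
while not stop.wait(0):
    iterations += 1
    if iterations == 3:
        stop.set()         # stand-in for the UI task signalling shutdown
print(iterations)
```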
This is basically what I am suggesting.
Note: This data queue may send the data to an intermediary task for further processing.
I've implemented your suggestion and slimmed down the UDP read loop. I committed a complete code blunder by not thinking about how the FGs could have been slowing down the while loop. Duh! You are correct that other pieces of code are accessing the FGs, but to make matters even worse, my FGs are also being accessed by many instances of the top-level application and in multiple places (that's the reason for the clone# input: each FG stores its values in an array index based on the clone# from the VI Clone Name property coming from the top-level VI).
With the modifications shown in the attached snippet (I'd inline the snippet, but my proxy blocks it for some reason when I try to insert it), I definitely was able to keep up. The DataPacking/DSP loop running elsewhere flushes the DataTmp queue before setting the Ready For Data flag, to ensure that no old data is present. I verified the data rate of the read using Win7's Resource Monitor, which showed this app receiving about 30 MBytes/s on the port I was connected to. I wasn't achieving the full data rate before, so that was great. Thanks!
Unfortunately, I'm hitting another issue that I'm not quite sure how to explain, because I don't really understand what is happening. Maybe you'll have some ideas. After leaving the app up and letting it run for a while, I came back later and noticed that the spectrum display started to look bad again. When I checked Resource Monitor, my incoming data rate had dropped to 3~4 MBytes/s!!! That's a huge drop from the 30 MBytes/s I had been getting for more than 5 minutes. I don't suspect my application, since restarting it didn't seem to help at all. The only thing that worked was restarting Windows. Have you or anyone else ever run across something like this, where all of a sudden the UDP stream is throttled back?
I would look to see how much memory is being used by the PC. Since you are using very large buffers for the UDP communication, you could be reaching a point where Windows is swapping tasks; if that occurs you will see a decline in performance. Make sure that your application doesn't have a memory leak and isn't using too much memory. When you say that you restarted your application, were you running in the development environment and simply restarted the VI, or is it a built application that you exited and restarted? If the former, LabVIEW may still have held onto memory, which may explain why the reboot worked. Based on what you describe, I think you are running into some sort of memory leak.
Not much memory is being used by my application, around 100~150 MBytes. Each of my buffers could be around 82 MBytes apiece, so I think they are all getting flushed and/or emptied correctly rather than filling up and swapping to the pagefile. The memory usage of the application stays steady for quite some time as well. This is always in the executable environment, since I can't run this application in my development environment (the provider of the UDP data is on a different network without LabVIEW installed).
Another thing I noticed later is that I can get back to a good-looking spectrum and higher throughput in the Windows Resource Monitor when I disable/enable the 10 GbE port that the data is coming in on. Once again, it lasts for a while and then throughput starts to drop. I'm going to try installing the latest NIC drivers for the card and see if that helps.
Latest drivers have been installed for the Intel X520-2 10 GbE NIC that we are using but that has not helped the situation.
I've also used Wireshark to do some testing, but I can't confirm that I'm using it correctly, so I can't yet treat it as a reliable way to confirm packet loss outside of my LabVIEW reader.
I also created a one-shot UDP reader that I set up to run continuously (over and over again), and it also starts dropping packets (or the OS does... I can't tell where packets are really being lost now). I've attached this application for reference; anyone can use it for UDP multicast read testing just by changing the ports that I have defined.
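For cross-checking outside LabVIEW, a minimal multicast reader can also be sketched in a few lines of Python; the group address and port below are made up, so substitute whatever your data source actually publishes on:

```python
import socket
import struct

# Hypothetical group/port; substitute your real multicast settings.
GROUP = "239.1.2.3"
PORT = 50000

def open_multicast_reader(group, port, timeout=1.0):
    """Open a UDP socket and join a multicast group on the default NIC."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Membership request: 4-byte group address + 4-byte interface
    # address (0.0.0.0 = let the OS pick the interface).
    mreq = struct.pack("4s4s", socket.inet_aton(group),
                       socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    sock.settimeout(timeout)
    return sock

try:
    reader = open_multicast_reader(GROUP, PORT)
    # A one-shot read would then be: data = reader.recv(8220)
    reader.close()
    print("joined", GROUP)
except OSError as exc:
    # Joining can fail on hosts without a multicast-capable interface.
    print("join failed:", exc)
```

Reading from a tool like this in parallel with the LabVIEW app is one way to separate "the OS is dropping packets" from "my reader is too slow."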
No one has asked about your network hardware so far: what are you running (NIC, switch, cable grade, etc.)? From my own experience with gigabit networking, data transfer rates can fluctuate from 3 MB/s to 100 MB/s. Also, are there any other network services running that may affect your connection?