We have an analog-to-digital converter (ADC) card from Alazartech (ATS 9373) that can stream at 4 GS/s directly to memory. It thus produces a data stream of either 4 GByte/s (8 bits per sample) or ~6.8 GByte/s (12 bits per sample).
We are using this ADC to acquire data from a laser scanning fluorescence microscope, so a lot of the samples do not contain physical information and can be set to zero. The data can therefore be heavily compressed. I tested, for example, LabVIEW's IMAQ array-to-PNG conversion and achieved a compression factor of 100-1000. However, it is currently about a factor of 40 too slow to be usable in real time. A factor of 40 also rules out simply parallelizing across our 20-core CPU.
Right now I'm thinking about a) parallelizing on a GPU using CUDA, b) trying a C-coded DLL using libPNG to test whether PNG creation can be sped up, c) testing gzip in the hope that it runs faster than LabVIEW's "add file to zip" function.
My question is: Does anyone have experience with this? Is there a fast, real-time compression algorithm in LabVIEW capable of handling 4 GByte/s?
Preferably I would want PNGs, as this directly permits viewing the raw data as the images obtained with the microscope.
Thank you for your help!
I very much doubt that writing in either PNG or ZIP format is achievable at your rates, but there are a number of high-speed compression libraries that you might want to look at:
B3D (GPU-based, rated to around 1GB/s) was developed for light-sheet microscopy:
Zstd was developed for Facebook but is in much wider use:
LZ4 (>500MB/s/core) is optimized for speed over compression ratio:
All of these are substantially faster than the zlib algorithm used in gzip or PNG files, but would require wrapping the library. Having said that, your data rates may exceed even what these are able to do, and if you can stream directly to the GPU (it looks as though Alazartech has a library for that) then a custom algorithm tailored to your particular data characteristics might be the better option. If you do truly have long streams of zeroes then perhaps a simple RLE is sufficient, or a form of Huffman coding with a precomputed dictionary.
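To make the RLE idea concrete, here is a minimal sketch in Python of a zero-run encoder. The byte layout (a 0x00 marker followed by a one-byte run length) and the function names are assumptions made up for this illustration, not an existing format, and pure Python is nowhere near the required speed; it only shows how little logic such a scheme needs.

```python
def rle_zero_encode(data: bytes) -> bytes:
    """Replace each run of zero bytes with the pair (0x00, run_length).

    Runs longer than 255 are split into several pairs; non-zero bytes
    are copied unchanged.
    """
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        if data[i] == 0:
            run = 1
            while i + run < n and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes((0, run))
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def rle_zero_decode(encoded: bytes) -> bytes:
    """Inverse of rle_zero_encode."""
    out = bytearray()
    i, n = 0, len(encoded)
    while i < n:
        if encoded[i] == 0:
            out += bytes(encoded[i + 1])  # bytes(k) yields k zero bytes
            i += 2
        else:
            out.append(encoded[i])
            i += 1
    return bytes(out)


# 1000 zeros, two signal samples, three more zeros
sample = bytes(1000) + b"\x05\x07" + bytes(3)
packed = rle_zero_encode(sample)
```

Because a zero byte in the input is always written as a marker pair, the worst case (isolated zeros between signal samples) doubles those bytes, so this only pays off when the zero runs really are long.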
Personally, I've used the Zstd library from LabVIEW but only as an offline compression. My data rates are around 500MB/s, so I write uncompressed, and then compress later. It may be possible to compress in real-time but I couldn't get a multi-core setup working - that may be possible, I just haven't needed to do so.
Thank you for the very useful hints! Do you know of any VIs/wrappers for any of these compression algorithms? You mentioned that you used zstd in LabVIEW; do you know if there is a VI out there?
I believe that LZ4 might get me there, but I was hoping for a PNG or perhaps BMP/TIFF image-type format with zip compression which I would be able to easily view using ImageJ or the like.
"but I was hoping for a PNG or perhaps BMP/TIFF image-type format with zip compression which I would be able to easily view using ImageJ or the like."
I guess this is a misconception: using the same compression algorithm as is used for PNG will not turn your data stream into an image viewable by any image viewer software! (BMP and TIFF use/support different compression algorithms…)
Sorry for asking again, but I'm afraid I haven't yet understood where the misconception lies.
Right now, I'm already using PNG compression and can already open the "raw data" as an image file.
To be more exact, I cut the 1D array of U8 samples into small portions that each correspond to an image frame scanned by my laser scanning microscope, reshape them into a 2D array (typically 1200x800 samples) and save and compress them using IMAQ write PNG image. I then append this file to a growing zip file. I obtain 4000 of these images per second, so I store them in a single zip file in order not to overwhelm the file system with too many individual files.
So I kind of already have what I want, except that it is too slow (by a factor of 40) to handle the data stream. A factor of 40 seems reachable with more optimized compression algorithms: I'm thinking of modern smartphones recording 4K, 60 fps RGB video, which also works out to a data stream of ~1.6 GByte/s that is compressed in real time and stored in smartphone memory. However, I would need a lossless compression algorithm, since I want to keep the data as raw as possible (only thresholding the fluorescence, thus replacing detector noise with zeros).
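To illustrate what I mean by thresholding, here is a rough Python sketch on synthetic data. The noise range, the signal range, the threshold of 10 and the use of zlib at level 1 are all assumptions chosen for the example; they are not my real detector values or the codec I actually use.

```python
import random
import zlib


def threshold(samples: bytes, level: int) -> bytes:
    """Set every U8 sample below `level` to zero and keep the rest unchanged."""
    table = bytes(v if v >= level else 0 for v in range(256))
    return samples.translate(table)


rng = random.Random(0)  # fixed seed so the example is repeatable

# Synthetic stream: low-amplitude detector noise with one burst of signal.
noise = bytes(rng.randrange(8) for _ in range(100_000))
signal = bytes(rng.randrange(50, 256) for _ in range(10_000))
frame = noise[:45_000] + signal + noise[45_000:]

clean = threshold(frame, 10)

# Compare how well the raw and the thresholded stream compress.
raw_size = len(zlib.compress(frame, 1))
clean_size = len(zlib.compress(clean, 1))
print(f"raw: {raw_size} bytes, thresholded: {clean_size} bytes")
```

The signal samples pass through untouched, so the only information lost is the noise below the threshold; everything after that step is lossless.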
Perhaps you can help me understand the misconception. Thank you for your help!
I've only written a simple wrapper around the ZStandard functions that I needed, which were the in-memory compression and decompression. Here's the library, with the Windows dlls for Zstd 1.4.2. Works on both 32 and 64 bit LabVIEW, and I've back-saved to 2012. Nothing flash, but you're welcome to use or modify it as you like.
Sure, if you save as PNG, you can read that as an image file, but if you simply save the compressed image data without the header as well, then I think Gerd's point is that you can't. Similarly, TIFF will support the Deflate compression, essentially the same compression scheme as gzip or zip, but needs the other header information to be readable as a TIFF file. (It's true you could write a plugin for ImageJ to read whatever file you wanted, but that's a bit more work!)
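To show how much (or how little) sits between a Deflate-compressed buffer and a file that image viewers accept, here is a minimal Python sketch that wraps an 8-bit grayscale buffer in the PNG container: the signature, an IHDR chunk, one IDAT chunk holding the zlib stream, and IEND. The function names are invented for this example, and it always uses filter type 0 on every row, which real encoders such as libpng do not necessarily do.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, payload: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def gray8_png(pixels: bytes, width: int, height: int, level: int = 1) -> bytes:
    """Wrap a row-major U8 buffer as an 8-bit grayscale PNG (hypothetical helper)."""
    if len(pixels) != width * height:
        raise ValueError("pixel buffer does not match width * height")
    # Every scanline is prefixed with a filter-type byte; 0 means "no filter".
    rows = b"".join(
        b"\x00" + pixels[y * width:(y + 1) * width] for y in range(height)
    )
    # IHDR: width, height, bit depth 8, colour type 0 (grayscale),
    # compression 0, filter method 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(rows, level))
        + _chunk(b"IEND", b"")
    )


png = gray8_png(bytes(12), width=4, height=3)
```

The compressed payload is still plain zlib, so the container itself costs very little; the time goes into the Deflate step, which is why a faster non-Deflate codec cannot produce a standard PNG.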
The main difference with smartphone video is that these use lossy codecs, H.264 or H.265 mostly. Real-time lossless compression is completely different.
Couple of other things that come to mind. I understand why you're not wanting to create many small image files, but by creating a PNG, then storing that in a ZIP, you're double-saving the file. Plus, unless the ZIP is set to store-only, it will be trying to compress the PNG file again - not a good idea! Even then, the ZIP format is slow to write to, as it will be keeping a header and directory of pointers to each individual image file.
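For reference, this is roughly what "store-only" means, sketched with Python's standard zipfile module rather than the LabVIEW zip VIs (whose options I haven't checked). The entry names and payloads are placeholders.

```python
import io
import zipfile


def append_stored(archive: zipfile.ZipFile, name: str, blob: bytes) -> None:
    """Add an already-compressed blob to the archive without recompressing it."""
    archive.writestr(name, blob, compress_type=zipfile.ZIP_STORED)


buf = io.BytesIO()  # stands in for the growing zip file on disk
with zipfile.ZipFile(buf, mode="w", compression=zipfile.ZIP_STORED) as zf:
    for i in range(3):
        # Placeholder payloads; in practice these would be the PNG bytes.
        append_stored(zf, f"frame_{i:06d}.png", b"already-compressed-bytes-%d" % i)

with zipfile.ZipFile(buf) as zf:
    infos = zf.infolist()
```

With ZIP_STORED the archive only adds its per-entry headers and the central directory, so each entry's stored size equals its original size.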
Even if you kept the current workflow, you'd be better to use the IMAQ Write String VI which converts the image to a String (essentially the PNG file in memory). Then a fast way to store those would be to write to a raw file, and keep a separate list of "lengths" of each PNG-compressed image, something like this:
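As a text sketch of that raw-file-plus-lengths idea (in Python, not LabVIEW), the writer appends each compressed blob to one data file and its length to a second file, and the reader turns the lengths into offsets. The function names and the choice of little-endian 64-bit lengths are assumptions for this example.

```python
import io
import struct


def write_blobs(data_file, index_file, blobs) -> None:
    """Append each blob to the data file and its length to the index file."""
    for blob in blobs:
        data_file.write(blob)
        index_file.write(struct.pack("<Q", len(blob)))


def read_blob(data_file, index_file, n: int) -> bytes:
    """Return blob number n by summing the lengths of the blobs before it."""
    index_file.seek(0)
    raw = index_file.read()
    lengths = struct.unpack(f"<{len(raw) // 8}Q", raw)
    data_file.seek(sum(lengths[:n]))
    return data_file.read(lengths[n])


# In-memory files stand in for the two files on disk.
data_file, index_file = io.BytesIO(), io.BytesIO()
frames = [b"first-frame", b"2nd", b"third-frame-is-longer"]
write_blobs(data_file, index_file, frames)
second = read_blob(data_file, index_file, 1)
```

Both files are written strictly sequentially, which is about as cheap as disk writes get, and any single frame can still be located later from the index alone.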
You can also increase file-writing speed by disabling file buffering when opening the file, however you need to be careful to write in multiples of the disk sector size, so that can be tricky to keep track of - probably easiest to pad each file to a multiple of the sector size (often 512 bytes) and trim based on the length when reading the file back. This may not be needed if you can compress sufficiently.
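The padding arithmetic itself is small. A minimal Python sketch, assuming a 512-byte sector (check the actual value for your disk) and hypothetical helper names; it shows only the pad and trim steps, not the unbuffered file open, which is platform-specific.

```python
SECTOR_SIZE = 512  # assumption: query the real sector size of the target disk


def pad_to_sector(blob: bytes, sector: int = SECTOR_SIZE) -> bytes:
    """Zero-pad a blob so its length is a multiple of the sector size."""
    remainder = len(blob) % sector
    if remainder == 0:
        return blob
    return blob + bytes(sector - remainder)


def trim(padded: bytes, length: int) -> bytes:
    """Recover the original blob using the length stored alongside it."""
    return padded[:length]


blob = b"x" * 700
padded = pad_to_sector(blob)
```

The original length has to be stored somewhere (for example in the separate list of lengths), since trailing zero padding cannot be told apart from data that happens to end in zeros.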
Having said all that, I actually think you should avoid IMAQ Images altogether (IMAQ Array to Image takes up time as well), and directly compress your 1D U8 array in-memory with either zstd or lz4 or similar, and stream that to disk. Even if the compression itself is not multi-core, as long as the code is thread-aware you could run 8 or 16 in parallel (Parallel For Loop) and then write all the results to disk in one go. Do post some of your code, as it's easier to comment when seeing the actual data flow.
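In text form, the chunk-and-compress-in-parallel pattern looks roughly like this Python sketch. It uses zlib from the standard library purely as a stand-in for zstd or lz4, and the chunk size, worker count and function names are assumptions; as far as I know CPython's zlib releases the interpreter lock while compressing, so the threads can overlap.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split(data: bytes, chunk_size: int):
    """Cut the 1D sample buffer into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_parallel(data: bytes, chunk_size: int = 1 << 20,
                      workers: int = 8, level: int = 1):
    """Compress each chunk independently; results come back in input order."""
    chunks = split(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda chunk: zlib.compress(chunk, level), chunks))


def decompress_all(blobs) -> bytes:
    """Decompress the chunks and concatenate them back into one buffer."""
    return b"".join(zlib.decompress(blob) for blob in blobs)


# Mostly zeros with a little signal at the end, as a stand-in for real samples.
stream = bytes(3_000_000) + b"abc" * 1000
blobs = compress_parallel(stream, chunk_size=1 << 20, workers=4)
```

Since every chunk is compressed on its own, the ratio is slightly worse than compressing the whole buffer at once, but any chunk can be decompressed independently and the compressed results can be written to disk in a single pass.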
Thank you! The first test using your Zstd VIs was successful. I was able to stream at 4 GS/s with U8 samples (a data stream of 4 GByte/s) using 3 parallel instances. However, I didn't have any fluorescence signal, i.e. it was just a stream of zeros. I have to test it with signal again to see whether it can really keep up, but this is already a great step forward, thanks Greg! I will let you know if it works with signal as well.
I wanted to report back, since we have tested streaming with fluorescence signal in the meantime. Unfortunately, once a substantial part of the stream contains fluorescence information (i.e. a fill factor of about 1/10), the compression speed no longer keeps up with the streaming rate. The compression ratio is still great (a factor of 200), probably because many of the samples contain no signal and are set to zero, which makes compression really efficient. The required write speed drops dramatically as well, so we can even stream to an external HDD; an M.2 PCIe SSD is no longer needed.
Overall, however, the speed doesn't hold up for live, real-time streaming. We tested ZStandard and lz4 and, strangely, lz4 was worse in both speed and compression factor. This contradicts the documentation (lz4 should be faster but give a lower compression factor). We're looking into it and will report back once we have a more conclusive result, especially if we manage to get streaming lossless compression to work for fluorescence.
😞 I hope you can achieve your final goal! I just don't know how to get you there.