06-16-2010 03:16 PM
Hi Wired again,
my answer collided in flight with yours. I understand that you're doing pure streaming as I was, and not Read-Modify-Write as Joe D. was suggesting, and that your problem is really in what I named as c). Well - my suggestion is to try as much as you can to avoid any unnecessary operation, even at the expense of efficiency - perhaps just allocating to each string its maximal length is an option? Details really depend on what you are trying to do, but my guess is that any processing (especially of many small chunks, and especially if there are hidden stride and array index computations involved) can severely get in the way before a block DMA transfer can happen at once.
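The "allocate each string its maximal length" idea can be sketched outside LabVIEW as well - a minimal Python illustration (the chunk sizes and file name are made up for the demo): preallocate one fixed, maximal-size buffer, fill slots of it in place instead of building many small strings, and issue a single block write.

```python
import os
import tempfile

CHUNK = 64 * 1024        # assumed size of one acquisition chunk
N_CHUNKS = 16            # assumed number of chunks per block write

# One fixed, maximal-length buffer; no per-chunk allocation happens later.
buf = bytearray(CHUNK * N_CHUNKS)
view = memoryview(buf)   # zero-copy slicing into the same memory

for i in range(N_CHUNKS):
    # Simulated acquisition: fill the i-th slot in place.
    view[i * CHUNK:(i + 1) * CHUNK] = bytes([i % 256]) * CHUNK

# One large block write instead of many small ones.
path = os.path.join(tempfile.gettempdir(), "stream_demo.bin")
with open(path, "wb") as f:
    f.write(buf)

size = os.path.getsize(path)
os.remove(path)
```

The point is not the Python itself but the pattern: all allocation happens once, up front, and the hot path does only in-place fills and a single large write.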
Enrico
06-16-2010 03:17 PM
A Kudos for your response, good sir!
I figured I was up against a lot of hardware-related challenges. I might just try an x8 controller instead of the x4 that I have now (RocketRaid 2314), since I found an inexpensive one. But I do think my biggest problem at the moment is the memory copies required to prepare the data for writing to disk. I might also try swapping the PXIe MXI card and the RAID controller card to see if there's any performance difference. Thank you again for the detailed response. 800+ MB/s? ...You da man!
06-16-2010 03:30 PM
In all honesty, I have to say that I'm only sure of the bottom line: my system was (after a LOT of effort and frustrating attempts) up to its target - capable of streaming video at 625MB/s to the first half of an 8x RAID0 array of magnetic hard disks with a total capacity of 5.6TB. I may have seen 800+ MB/s with just dummy writes, on XP64 as the OS rather than XP32, and with a different striping scheme than the one used in practice (because, you know, XP32 can't handle volumes larger than 2TB, but LV does not support Vision on XP64, and...) Less than a real 800MB/s, but alas, that's the ballpark.
Enrico
06-16-2010 03:35 PM
06-16-2010 03:53 PM
I'm really just quoting from memory here. First of all, the decision to use XP32 rather than XP64 or Vista dates to the time I was developing that project, with LV8.6.1 - the situation may have evolved. I got hold of LV2009 when I was already too far along to reconsider. XP64 had the definite advantage that I could configure my whole array as a single volume at the controller firmware level, while with XP32 I was forced to split it into four 1.4TB subvolumes, logically striped at OS level. AFAIR, my tests showed an appreciable performance decrease for that, but still within the 625+ target (maybe 660-670 for dumb writes?). I can't really say whether that was because of 32 vs 64, or because of the logical striping, or something else. OTOH, LV8.6.1 was running as an emulated 32bit application on XP64, while the framegrabber drivers were native 64bit - a complete mess. On second thought, Vision was probably functional on XP64, but the problem was there. And XP was the only choice for Vision; if I could, I would have gone Linux...
Enrico
06-17-2010 07:47 AM
Please excuse me for jumping into the middle of this conversation but you mentioned disks and performance.
One topic not addressed above is the disk format and what is involved in actually writing a file. I'll spare you the dirty details, but "pre-writing" a file can make a very big difference in performance. What I have done in past apps is guesstimate that the app will write twice as much data as the spec requires, and write the data file before the application starts. This moves all of the work of allocating the disk sectors and updating bitmaps, directories and caches to a time prior to actually running the test. When the test is over, I truncate the files to the actual length.
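The pre-write-then-truncate recipe can be sketched roughly as follows - a hedged Python illustration, with made-up sizes and file name (a real app would use the platform's preallocation call, e.g. `posix_fallocate` on Linux, instead of writing zeros):

```python
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "prealloc_demo.bin")
EXPECTED = 4 * 1024 * 1024   # guesstimate: ~2x what the spec requires
ACTUAL = 3 * 1024 * 1024     # what the run actually produced

# Pre-write: force the filesystem to allocate sectors and update its
# metadata now, before the time-critical part starts.  Writing real
# zeros (rather than just truncate(), which may leave a sparse file)
# is what actually forces the allocation.
with open(path, "wb") as f:
    f.write(b"\x00" * EXPECTED)

# ... the acquisition then overwrites the already-allocated file ...
with open(path, "r+b") as f:
    f.write(os.urandom(ACTUAL))

# When the test is over, truncate to the actual data length.
with open(path, "r+b") as f:
    f.truncate(ACTUAL)

size = os.path.getsize(path)
os.remove(path)
```

During the run, every write lands in already-allocated sectors, so only the data transfer itself remains on the critical path.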
Now onto performance...
I would consider using an AE that inits a pair of buffers, one active and a spare, both sized to handle the sector-sized data. The AE should fill up one buffer using in-place operations, and when it's full, queue that data to the file writer and start building in the second buffer.
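Translated out of LabVIEW terms, the two-buffer AE idea is a classic double-buffered producer/consumer. A minimal Python sketch under assumed toy sizes (in LabVIEW this would be an Action Engine plus a queue; here threads and `queue.Queue` stand in):

```python
import queue
import threading

BUF_SIZE = 8     # stand-in for a sector-sized buffer
N_FILLS = 4      # number of full buffers the producer builds

full_q = queue.Queue()   # filled buffers headed for the file writer
free_q = queue.Queue()   # emptied buffers coming back for reuse

# The pair of buffers: one active, one spare.
for _ in range(2):
    free_q.put(bytearray(BUF_SIZE))

written = []

def file_writer():
    # Drains full buffers (recording a copy stands in for file.write),
    # then recycles the buffer to the free queue.
    for _ in range(N_FILLS):
        buf = full_q.get()
        written.append(bytes(buf))
        free_q.put(buf)

t = threading.Thread(target=file_writer)
t.start()

# Producer: fill the current buffer in place, queue it, switch to the spare.
for n in range(N_FILLS):
    buf = free_q.get()                       # grab a free buffer
    for i in range(BUF_SIZE):
        buf[i] = (n * BUF_SIZE + i) % 256    # in-place fill
    full_q.put(buf)                          # hand it to the writer

t.join()
```

Because only the two preallocated buffers ever circulate, the producer never allocates on the hot path, and it can keep filling the spare while the writer flushes the active one.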
TDMS has had some interesting numbers associated with it. Maybe not good for your final file format but may be useful as temp files.
Have fun and please keep us updated!
Ben
06-17-2010 11:03 AM
In fact, in my application I took care to use the RAID array exclusively for streaming and not for storage, and generally I wrote one file at a time, which was later deleted. That should at least avoid disk fragmentation and the associated overhead. I don't remember observing serious degradation in performance during development when a few files were hanging around on that volume, though. YMMV, of course.
Besides, multiple buffers are the standard technique here. I used a ringbuffer with room for a few hundred images (i.e. the elementary chunk of data pulled at once out of the framegrabber), a grabbing thread in parallel with the writing thread, and took care to queue only the pointers to the ring elements, not the elements themselves. All this looks to me like standard practice, and is probably described as such in the tutorial references mentioned above.
Grabbing in my case was intrinsically in place (DMA to allocated buffer), and dumping to disk as well.
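The ring-of-preallocated-slots with index-only queues can be sketched like this - a toy Python version with made-up slot counts (in the real system the "DMA" fill and the disk write happen in place on the same memory, and a few hundred slots are used rather than four):

```python
import queue
import threading

SLOTS = 4    # ring slots (a few hundred in the real app)
IMG = 16     # bytes per "image" in this toy
N = 10       # images to stream

ring = [bytearray(IMG) for _ in range(SLOTS)]   # preallocated ring
free_q = queue.Queue()   # indices of free slots
full_q = queue.Queue()   # indices of slots ready to be written out
for i in range(SLOTS):
    free_q.put(i)        # only slot indices travel through the queues

out = []

def disk_writer():
    for _ in range(N):
        i = full_q.get()            # receive an index, not the data
        out.append(bytes(ring[i]))  # stands in for the disk write
        free_q.put(i)               # slot may be reused now

t = threading.Thread(target=disk_writer)
t.start()

# Grabbing thread: "DMA" each image straight into a free ring slot.
for n in range(N):
    i = free_q.get()
    ring[i][:] = bytes([n % 256]) * IMG
    full_q.put(i)

t.join()
```

Ownership of each slot transfers through the queues, so grabber and writer never touch the same slot at the same time, and no image data is ever copied into a queue.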
Enrico
06-17-2010 11:16 AM
Enrico Segre wrote:In fact, in my application I took care to use the RAID array exclusively for streaming and not for storage, and generally I wrote one file at a time, which was later deleted. That should at least avoid disk fragmentation and the associated overhead....
Enrico
To write a file the OS has to
1) Recognize that the new data extends the existing file size.
2) Locate enough sectors on disk to hold the additional data (hopefully the bitmap is cached).
3) Mark those sectors as allocated (seek heads to the bitmap and re-write the flags).
4) Update the index file to show where those blocks of the file are (seek heads to write the file).
5) Update the directory entry so we can find all of the sectors when we open the file next time (still more seeking).
6) Seek to where the data is to be written and start writing, hoping we don't have to switch cylinders in the middle.
So by pre-writing the file, you don't have to do steps 1-5 at run time, only step 6.
Fragmentation only enters the game in step 6, and if your files are pre-written and the disk is then defragmented, THEN you can start looking at the published disk specs.
Ben
06-17-2010 01:03 PM
I said "at least"...
I'm not sure what the impact of the other operations is, especially if they are done in memory, if seeks are cheap because the allocation table is empty, and if the index is dumped to disk at a slower pace - or to what extent the scheduling done by a good controller may help. My feeling was that they were not penalizing, but I don't remember having made specific tests.
Enrico