LabVIEW

SSD drive corruption with LabVIEW Real-Time?

Hi All, sorry in advance for the cross-post.

 

After a few months of running LabVIEW Real-Time doing some vision analysis and saving fails/passes to an SSD, we're seeing what we suspect might be drive corruption. We don't have any hard clues, but we've noticed that if we clone the drive to a new drive and boot off that one, the problems magically go away.

 

Does this sound familiar to anyone and if so, would you have any recommendations as to where to start digging?

 

(And if the issue is a file system issue, are there any log files that we can check into?)

 

Thanks in advance,
Dan

Message 1 of 15

Vision analysis sounds like heavy disk I/O, with lots of disk writes. Each SSD "sector" can only take a finite number of writes. I guess that by cloning to a new drive you give yourself some breathing room, but eventually you will run into the same issue. I think this is a more likely scenario than LV corrupting your drive.

 

Have you looked at your SSD's S.M.A.R.T. drive health report?

Bill
CLD
(Mid-Level minion.)
My support system ensures that I don't look totally incompetent.
Proud to say that I've progressed beyond knowing just enough to be dangerous. I now know enough to know that I have no clue about anything at all.
Humble author of the CLAD Nugget.
Message 2 of 15

SSDs have a limited number of write (program/erase) cycles on a single cell before it goes bad. In theory a drive should manage this internally and quietly move data from worn cells to healthy ones in the background, but if you've been saving massive amounts of data or buying bargain-basement SSDs, the drive might not have many write cycles available or might be doing a bad job of managing itself.
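To get a feel for how quickly a write-heavy application can use up that budget, here is a rough back-of-the-envelope sketch. Every number in it is a made-up placeholder (capacity, rated cycles, daily write volume, write amplification), not a figure for any drive in this thread, and it assumes ideal wear leveling across the whole drive.

```python
def estimated_lifetime_years(capacity_gb, pe_cycles, daily_writes_gb,
                             write_amplification=1.0):
    """Rough years until the rated program/erase budget is used up.

    Assumes perfect wear leveling; real drives will differ.
    """
    if daily_writes_gb <= 0 or write_amplification <= 0:
        raise ValueError("daily_writes_gb and write_amplification must be > 0")
    # Total data the host can write before every cell hits its rated cycles.
    total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_host_writes_gb / daily_writes_gb / 365.0


# Hypothetical inputs: 120 GB drive, 3000 rated cycles,
# 200 GB of images written per day, write amplification of 3.
years = estimated_lifetime_years(
    capacity_gb=120, pe_cycles=3000, daily_writes_gb=200,
    write_amplification=3.0,
)
print(f"Estimated endurance: {years:.1f} years")
```

The takeaway is only the shape of the calculation: halving the daily write volume or the write amplification doubles the estimate, which is why logging strategy and TRIM support can matter as much as the drive itself.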

 

Do you still have some of these older drives lying around?  You can try one of many "drive health checkers" on them to see if they report problems.

 

Whatever the cause, it seems unlikely that LabVIEW specifically would be the root cause, but it's hard to know for sure.

Message 3 of 15

Thanks for the tip. I don't believe we've tried a SMART tool, but that fits with one of our working theories: TRIM isn't being actively used on the drive, and the SSD eventually becomes "unhealthy."

 

(Speaking of which, does anyone know if NIOS has a TRIM utility for SSDs?)

 

I should also mention that the drive isn't completely corrupted; it just seems to cause system errors, like random crashes or bad behavior that has been very difficult to reproduce. Are there any file system logs that NIOS keeps around that might help show whether this is the culprit?

Message 4 of 15

Which controller model? If it is an NI Linux-based one, it's pretty standard Linux, so you can use many of the tools that are available for Linux. You might just have to install them first with opkg.

Rolf Kalbermatter
My Blog
Message 5 of 15

These systems of ours go back to the days when NI recommended RTPCs, so it's a pretty custom system. But the drives are Swissbit SSDs and fairly new.

 

Link: https://www.digikey.com/en/products/detail/swissbit/SFSA120GQ1AA4TO-I-LB-226-STD/9920507

 

From a Linux shell are there any "SMART"-like packages to display drive health or whatnot that you'd recommend for this sort of thing?

Message 6 of 15

First hit on Google "linux s.m.a.r.t status":

Using smartctl to get SMART status information on your hard drives 
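For anyone who pulls the drive and attaches it to a Linux machine, a minimal sketch of reading the SMART data from a script might look like the following. It assumes smartmontools 7.0 or newer (for the `--json` option) and sufficient privileges to query the device. The device path in the usage comment is a placeholder, and the helper names `read_smart` and `summarize` are invented for this example.

```python
import json
import subprocess


def read_smart(device):
    """Run smartctl on `device` and return its JSON report as a dict."""
    # -H: overall health verdict, -A: vendor attribute table,
    # --json: machine-readable output (smartmontools 7.0+).
    # smartctl's exit status is a bitmask, so a nonzero code is not
    # treated as a hard failure here (check=False).
    result = subprocess.run(
        ["smartctl", "-H", "-A", "--json", device],
        capture_output=True, text=True, check=False,
    )
    return json.loads(result.stdout)


def summarize(report):
    """Return (overall_passed, {attribute_name: raw_value}) from a report."""
    passed = report.get("smart_status", {}).get("passed")
    rows = report.get("ata_smart_attributes", {}).get("table", [])
    attrs = {row["name"]: row["raw"]["value"] for row in rows}
    return passed, attrs


# Example usage (placeholder device path, typically needs root):
# passed, attrs = summarize(read_smart("/dev/sdX"))
# print("SMART overall health passed:", passed)
# for name, raw in sorted(attrs.items()):
#     print(f"{name}: {raw}")
```

Which attributes matter (wear indicators, reallocated blocks, and so on) is vendor-specific, so the attribute names reported by a given drive should be checked against its datasheet rather than assumed.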

Bill
Message 7 of 15

A custom RTPC? Then it is almost certainly Pharlap, and you are pretty much out of luck. Whatever is contained in a Pharlap installation is pretty much all that you can get. There are no extra utilities to install.

 

In earlier days you might have been able to buy the Pharlap Development System to get access to extra tools, but that was a very expensive investment and required you to get your hands dirty with low-level programming interfaces. With Pharlap having been definitively discontinued several years ago (~2013), that's not even an option anymore.

 

That its disk drivers might not be best suited for SSD use is probably a safe assumption.

 

The next version of Windows 10 will contain built-in SSD performance reports, probably based on accessing the SMART interface. Under Linux there are all kinds of tools and utilities to access various aspects of a SMART drive. I have used CrystalDiskInfo under Windows for this in the past.

 

Swissbit doesn't mean anything to me. Despite its Swiss-sounding name, it may not be the quality you expect. It mostly depends on the flash chips used, and there aren't too many manufacturers of these. The various drive manufacturers simply package them in some way and add some controller logic. Part of that controller logic can make a difference, as it implements things like TRIM. But the cell quality in the chips is ultimately responsible for the number of write cycles a cell will survive, and this quality varies wildly. Even Samsung, one of the better suppliers in the market, has several classes of SSD chips that vary a lot in life expectancy and price.

Rolf Kalbermatter
Message 8 of 15

@rolfk wrote:

A custom RTPC? Then it is almost certainly Pharlap, and you are pretty much out of luck. Whatever is contained in a Pharlap installation is pretty much all that you can get. There are no extra utilities to install.

 


Could OP pull the drive and analyze it in another machine? It sounds like they already have old ones removed from the system.

Message 9 of 15
Solution
Accepted by topic author dmccarty

@BertMcMahan wrote:

Could OP pull the drive and analyze it in another machine? It sounds like they already have old ones removed from the system.


Yes, that is why I mentioned CrystalDiskInfo under Windows. And Bill mentioned smartctl under Unix/Linux.

 

But it won't change the fact that Pharlap OS isn't really an ideal choice for SSDs. Even Windows needed quite some time to support them properly and, as a recent patch showed, still managed to mess that up in one of the latest releases.

 

By the time SSDs became usable in terms of both reliability and price, Pharlap OS had already been announced to be in maintenance mode. From that point on you could only get minor updates, and only if you had a valid, ongoing Software Development license contract for it (which NI had). Even documentation was only available under these conditions. But IntervalZero won't support Pharlap ETS forever. That is likely the reason why NI plans to discontinue support for it after LabVIEW 2020.

Rolf Kalbermatter
Message 10 of 15