LabVIEW Embedded

Error 6 writing to USB-stick on cRIO-9068

Hi Brad.

 

 

Thanks for helping with this issue, and confirming that FAT32 does not handle power cycles very well.

 

Is it possible to investigate what VxWorks does differently, so that we have an explanation for why we haven't seen this issue on those controllers?

 

That leaves us with some questions for how to proceed with the 9068 system:

1) EXT3 is not really an option, since the data on the USB stick must be easily accessible to the users of this product, meaning Windows plug and play.

 

So we need to somehow monitor when the power drops out, so that we can continue working with FAT32 (NTFS is not supported on our RIOs).

2) On most RIOs we have two power sources. So one solution could be to use a capacitor and connect that to the second input. That way we can then monitor which power source is active, and react (close the file), if we have a power loss. However, even though the 9068 does have two power source inputs, it doesn't seem like this feature is supported...??

 

3) We could then add an extra module to our system that measures the voltage on power source input 1. If it reads 0 V, we are running on the backup capacitor and should therefore close the file. However, this adds extra complexity and cost to the system.

 

Worst case, Bjarke will be forced to stick with the cRIO-9014 and VxWorks, so I hope we can come up with a solution.

 

Thanks, your help is much appreciated.

 

Best Regards

Alex E. Munkhaus
Certified LabVIEW Developer (CLD)
System Engineer
Message 21 of 33

One option to try would be to change how the USB device is mounted by adding the "sync" option. Depending on your write throughput requirements, this may be a non-starter, since the option reduces write throughput. To test this, issue the following command from the console (as admin):


mount -o remount,sync /media/sda1 
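
To confirm that the remount took effect, the mount table can be inspected. The following is a minimal Python sketch, assuming the stick is mounted at /media/sda1 as in the command above and that the standard Linux /proc/mounts table is readable on the target; the function names are made up for illustration.

```python
def mount_options(mounts_text, mount_point):
    """Return the option set for mount_point, or None if it is not listed.

    mounts_text uses the /proc/mounts layout:
    device mount_point fstype options dump pass
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep the last matching entry; later mounts shadow earlier ones.
            found = set(fields[3].split(","))
    return found


def is_sync_mounted(mount_point, mounts_path="/proc/mounts"):
    """True if mount_point is currently mounted with the 'sync' option."""
    with open(mounts_path) as f:
        options = mount_options(f.read(), mount_point)
    return options is not None and "sync" in options
```

Calling `is_sync_mounted("/media/sda1")` after the remount should then report whether the option is active.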

Message 22 of 33

Hi Brad.

 

Thanks for your help with this issue.

I have escalated the problem to the AEs in the US; however, they referred me back to you. Can you help bring some attention to this?

 

Can you help answer our question?
1) The customer has been running the exact same application on a cRIO-9014, which runs VxWorks. They have never seen the error on this system. According to Brad, Linux buffers writes pretty extensively. This means that it offers higher performance at the expense of being more prone to issues like this. He does not know how the USB subsystem works on VxWorks. So this leads to the question:

Is it possible to investigate what VxWorks does differently, so that we have an explanation for why we haven't seen this issue on those controllers?

 

Thanks again.

 

Best Regards

Alex E. Munkhaus
Certified LabVIEW Developer (CLD)
System Engineer
Message 23 of 33

The system is intended as a continuous logger, logging over several weeks or even months. The system is battery powered (for the most part) and monitors physical phenomena that take a long time to manifest. For this reason we only need to measure for several minutes up to an hour every day, then power down to conserve power. Our customer requires a lot of versatility in the ability to turn off the power. Though my test is extreme, power cycling every hour for a month is not out of the question.

Due to the rather large amount of data we end up with, we cannot use the storage space of the cRIO. For that reason, and for ease of accessing the data, we have chosen to store all data received on the above-mentioned USB stick.

I use the FPGA and measure at 20 Hz on 8 channels. Needless to say, we cannot live with data getting corrupted after a long monitoring period. 😞

Bjarke Dahl-Madsen - CLA,CLED
Message 24 of 33

@BjarkeDM wrote:

...

For this reason we only need to measure for several minutes up to an hour every day, then power down to conserve power. Our customer requires a lot of versatility in the ability to turn off the power.
...


Could you go into a bit more detail as to *how* this power-down happens? What type of device is controlling when the cRIO (and the app) is running?

Message 25 of 33

We have a hardware timer that toggles a relay that cuts the power to the cRIO.

This device is currently set to 5 minutes "on" 2 minutes "off".

Bjarke Dahl-Madsen - CLA,CLED
Message 26 of 33

I think the only reasonable option without change to the setup and minimal change to the app would be to keep time in the app itself and, when getting close to the power-pull time, run a sync on the file and close the handle and wait for power to be pulled. If you needed some configurability in the timing of the shutdown, you could provide a file on the drive that contains the current power cycle runtime to know when the sync-and-wait operation should happen.
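
The sync-and-wait idea can be sketched as follows. This is a minimal Python illustration rather than the LabVIEW implementation; `runtime_s` (the expected on-time per power cycle, possibly read from a file on the drive) and `margin_s` (how early to stop) are assumed parameters, and the class name is made up.

```python
import os
import time


class DeadlineLogger:
    """Appends records until shortly before an expected power cut,
    then syncs and closes the file and refuses further writes."""

    def __init__(self, path, runtime_s, margin_s, clock=time.monotonic):
        self._clock = clock
        # Stop writing margin_s seconds before power is expected to drop.
        self._deadline = clock() + runtime_s - margin_s
        self._file = open(path, "ab")

    @property
    def closed(self):
        return self._file is None

    def write(self, record):
        """Write one record; return False once the file has been closed."""
        if self._file is None:
            return False
        if self._clock() >= self._deadline:
            self.close()
            return False
        self._file.write(record)
        return True

    def close(self):
        """Flush buffers, ask the OS to push data to the device, close."""
        if self._file is not None:
            self._file.flush()
            os.fsync(self._file.fileno())
            self._file.close()
            self._file = None
```

With the 5 minutes on, 2 minutes off timer, something like `DeadlineLogger(path, runtime_s=300, margin_s=30)` would stop writing 30 seconds before the relay opens. The boot time counts against the on-time, so the margin would need to account for it.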

 

I am still pretty surprised that there were no issues on the 9014, but then again I have not spent much time with one of those.

Message 27 of 33

Hi Brad.

 

We cannot be sure that the intervals will be 5 min on, 2 min off in the final application. Power losses can happen at any time.

 

So as I see it, we have a couple of options:

1) Use a file system that supports journaling, like EXT3 or NTFS.

2) Or handle the power losses: monitor which of the two power inputs is in use and safely shut down the code if the backup is in use.
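
Option 2 could look roughly like the Python sketch below. It assumes some way of reading the primary supply voltage exists; `read_primary_voltage` is a hypothetical callable standing in for whatever module or input provides that measurement, and the 1 V threshold is an arbitrary placeholder.

```python
import os


def log_until_power_loss(path, samples, read_primary_voltage, threshold_v=1.0):
    """Append samples to path until the primary supply drops below threshold_v.

    read_primary_voltage is a hypothetical callable returning volts.
    Returns the number of samples written before logging stopped.
    """
    written = 0
    with open(path, "ab") as f:
        for sample in samples:
            if read_primary_voltage() < threshold_v:
                # Running on the backup source: stop writing immediately.
                break
            f.write(sample)
            written += 1
        # Push everything to the device before the handle is closed.
        f.flush()
        os.fsync(f.fileno())
    return written
```

The backup capacitor has to hold the controller up long enough for the final flush and close to complete, which would need to be verified on the real hardware.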

 

I think option 2 is the best way to go for the customer. However, we would still like to know why we haven't seen this error on VxWorks. I made a simple example that I'm going to run over the weekend to see if I can reproduce the error.

9014: Controls the power to the 9068 and power cycles it every 30 s. So the cycle will be:

30 s on, 30 s off, 30 s on, 30 s off, ...

9068: Writes to two files at the same time (different from what the customer is doing, to stress test). The loop rate is 10 Hz. At every iteration, I append more data points to the existing files. It takes approx. 25 s to boot the system and start running the RTexe.

Windows: Monitors for any errors that occur and logs them to a file.
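
For readers without the LabVIEW code, the writer loop on the 9068 might be approximated by the Python sketch below. It is an illustration only: whether the real test reopens the files on every iteration is not stated, so that part is an assumption, and the function name and error tuple are made up.

```python
import time


def stress_append(paths, iterations, rate_hz=10.0, sleep=time.sleep):
    """Append one data point per iteration to every file in paths.

    Returns a list of (iteration, path, errno) tuples for failed writes,
    so a supervising process can log them.
    """
    errors = []
    for i in range(iterations):
        for path in paths:
            try:
                # Assumption: open in append mode each time, so the files
                # grow across power cycles instead of being overwritten.
                with open(path, "a") as f:
                    f.write(f"{i}\n")
            except OSError as exc:
                errors.append((i, path, exc.errno))
        sleep(1.0 / rate_hz)
    return errors
```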

 

Please share with us if you have any news or ideas.

 

Best Regards

Alex E. Munkhaus
Certified LabVIEW Developer (CLD)
System Engineer
Message 28 of 33

Alex,

 

I would also recommend trying with the storage device mounted with the sync option to see whether or not that alleviates the issues and is tenable given the reduction in performance.

Message 29 of 33

Bjarke,

 

Is there any way you can send a message to the cRIO to let it know it has 10 seconds to finish up before the shutdown? Basically, tell the cRIO to enter a safe state.

My feeling is that the USB drive is left marked as in use, so the system gets confused when you restart and reinitialize the device, because the device was in the middle of a write. Since another reboot fixes this, it seems it isn't a permanent error. I believe this has to do with leaving a file reference open when the power is pulled from the cRIO: Linux assumes the disk was still in use. You can typically reproduce this on Windows by yanking a thumb drive without ejecting it; the next time you insert the device, you get a prompt saying the device was not ejected properly and asking whether you want to use it.

 

Can you upload a snippet of your file I/O code as it stands now, or at least describe it, so we have an idea of how the code behaves between the two threads that are doing file I/O?

Can you dedicate a consumer thread to handle file I/O?

Are you keeping the file reference open, and is it possible for the reference to still be open when you pull the power from the cRIO?

As Brad recommended, why can't you keep a timer for 1 minute and then tell the cRIO to shut down, so that the cRIO is in a safe state (no file I/O, all hardware I/O stopped)?

Since this is a planned power outage, implementing the timer should prevent this from happening most of the time. For an unplanned outage, you could add a check to your initialize case that makes sure the USB is accessible and, if not, has the cRIO reboot itself.
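
The initialize-time check could be sketched like this in Python. The probe file name and the `on_failure` callback (which would trigger the reboot on a real target) are hypothetical; how the cRIO is actually restarted from the application is not covered here.

```python
import os


def storage_is_writable(mount_point, probe_name=".write_probe"):
    """Try to create, sync, and delete a small probe file on the drive."""
    probe = os.path.join(mount_point, probe_name)
    try:
        with open(probe, "wb") as f:
            f.write(b"ok")
            f.flush()
            os.fsync(f.fileno())
        os.remove(probe)
        return True
    except OSError:
        return False


def initialize(mount_point, on_failure):
    """Run the storage check; call on_failure (e.g. a reboot request) if it fails."""
    if storage_is_writable(mount_point):
        return True
    on_failure()
    return False
```

A write probe is used instead of a plain existence check because a drive left in a bad state may still be visible while rejecting writes.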

I do believe that it is possible to get a similar error in VxWorks; maybe the OS is just better at recovering the device. I'd speculate that you could unmount and remount the USB device and see if this resolves the issue. I believe Brad gave you steps for that, but I can't recall the final outcome of that process.

 

Can you please provide some more information as to why one of the above scenarios won't work for your application?

Even if the 9014 did not suffer from this issue, given that this code can exhibit it, I think this calls for additional work to handle this case and make your application more robust.

 

US

Kyle Hartley
Software Engineer
NovaCentrix
Message 30 of 33