Thank you Sam for your prompt reply. I would like to raise a service request with your applications engineers to assist in the final decision. However, at the moment I’m exploring options, and I think this sort of decision-making process about which development tool to use is happening many times over around the world, so hopefully our dialog will help others too. It’s a time-consuming process to determine which tools to use to get a job done. The right tool can mean so much for the product under development and for my career!
Firstly, I was concerned to hear that “at this point we are still supporting LabVIEW Embedded for ARM”. This appears to suggest that National Instruments may abandon the product in the future. Is this the case? This would be a pity since it’s a great concept that needs quite minimal additional development to get it to be much more widely acceptable (mainly moving to the newer more powerful ARM processors). I like programming in LabVIEW and this solution looks the best so far – particularly if it could target an ARM Cortex M4 (or better).
Here is some info about our application. It is an in-house test tool with the following features:
1) Handheld device approximately 190 mm x 120 mm x 50 mm
2) 8 x analog inputs – 16 bits at 100 Hz bandwidth
3) 8 x I2C inputs
4) 8 x analog outputs (the 8 analog inputs or I2C inputs, scaled, filtered and sent back out)
5) Internal selectable 2nd order filtering (5, 10, 20, 50 and 100 Hz)
6) Display of selected waveforms on an LCD display
7) Streaming of data on Ethernet or RS232
8) Storage of data on a USB key
9) Some on board computation of the signals
A fairly generic design that would be similarly replicated the world over.
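As an illustration of item 5, here is a minimal sketch of a selectable 2nd-order low-pass filter. It assumes a Butterworth response (Q = 1/sqrt(2)), the widely used "Audio EQ Cookbook" biquad formulas, and a 200 Hz filter rate; the function names `lowpass_biquad` and `run_biquad` are hypothetical and none of this comes from LabVIEW Embedded itself.

```python
import math


def lowpass_biquad(cutoff_hz, sample_rate_hz, q=1 / math.sqrt(2)):
    """Return normalized (b0, b1, b2, a1, a2) for a 2nd-order low-pass.

    Assumption: cookbook-style biquad; q = 1/sqrt(2) gives a Butterworth response.
    """
    if not 0 < cutoff_hz < sample_rate_hz / 2:
        raise ValueError("cutoff must lie below the Nyquist frequency")
    w0 = 2 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b0 = (1 - cos_w0) / 2 / a0
    b1 = (1 - cos_w0) / a0
    b2 = b0
    a1 = -2 * cos_w0 / a0
    a2 = (1 - alpha) / a0
    return b0, b1, b2, a1, a2


def run_biquad(coeffs, samples):
    """Apply the filter (direct form I) to a list of samples."""
    b0, b1, b2, a1, a2 = coeffs
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out


# Hypothetical selection table for a 200 Hz filter rate.
SAMPLE_RATE_HZ = 200
FILTERS = {fc: lowpass_biquad(fc, SAMPLE_RATE_HZ) for fc in (5, 10, 20, 50)}
```

Note that at a 200 Hz filter rate the 100 Hz setting sits exactly at the Nyquist frequency, so this particular formula cannot represent it; in a sketch like this that setting would have to be a bypass or run at a higher rate.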
There will be somewhere between 200 to 700 built for in-house use. There may be potential to market the device, but this would be a bonus. The target price is about $600, for which the budget for the processor, Ethernet, display and A/D and D/A components would be about $200. The remaining $400 is for the case, power supply, connectors, PCB, assembly labour, etc. It would also be nice if the device was battery powered, but this is an optional requirement.
In order to reduce the requirements on the analog anti-aliasing filters, the plan is to oversample (say 1 kHz), digitally filter (with aggressive filters) and decimate to get the same effect as complicated analog filtering and sampling. These signals would then be passed on at 200 Hz to the 2nd order filters mentioned above.
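The oversample, filter and decimate plan can be sketched roughly as follows. This is only an illustration under stated assumptions: a Hamming-windowed sinc FIR, 101 taps and an 80 Hz cutoff are arbitrary choices made for the example, and `design_lowpass_fir` and `decimate` are hypothetical names. Only the 1 kHz input rate and the 200 Hz output rate come from the description above.

```python
import math


def design_lowpass_fir(num_taps, cutoff_hz, sample_rate_hz):
    """Hamming-windowed sinc low-pass, normalized to unity gain at DC."""
    fc = cutoff_hz / sample_rate_hz  # normalized cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def decimate(samples, taps, factor):
    """Filter with the FIR taps and keep only every `factor`-th output.

    Outputs are computed only at the decimated instants, so the filter
    costs len(taps) multiply-accumulates per output sample, not per input.
    """
    out = []
    for i in range(len(taps) - 1, len(samples), factor):
        acc = 0.0
        for j, t in enumerate(taps):
            acc += t * samples[i - j]
        out.append(acc)
    return out


# Assumed example parameters: 1 kHz in, 200 Hz out, cutoff below 100 Hz.
TAPS = design_lowpass_fir(num_taps=101, cutoff_hz=80, sample_rate_hz=1000)
```

For example, `decimate(raw_1khz, TAPS, 5)` would turn a list of 1 kHz samples into a 200 Hz stream, with content above the new 100 Hz Nyquist limit strongly attenuated before the rate is reduced.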
I have the LabVIEW Embedded for ARM (LEA) evaluation kit and although it can perform the required functionality, I am quickly running out of CPU power. If I put in two loops the performance goes down. If I put in a timed loop the performance goes down. I know that if LEA could target one of the more powerful ARM processors I would have no trouble, but with an ARM7 or Cortex-M3 it looks like there will be too many constraints. The option of porting LEA to an ARM9 or Cortex-M4 (able to run RTX) would be viable if we could pay an NI engineer for 1-2 weeks of their time to do this for us. As a bonus for NI and the community, there would then exist a new Tier 1 product! In summary, LEA will probably only work for us if NI or a 3rd party introduces a more powerful out-of-the-box (Tier 1) option.
The LabVIEW C Generator is worth looking into, however I want to concentrate on developing our test tool rather than writing low level drivers. I get the impression that I may need to spend months writing or finding/testing the required USB, I2C and Ethernet drivers both at the board level (if the required BSP doesn’t exist) and the LabVIEW level. That is the beauty of LabVIEW Embedded for ARM – I get these and many more for future enhancements out-of-the-box. In summary, the LabVIEW C Generator option will probably only work if I can get (or pay for) the various drivers ready written, or if it isn't too time-consuming for me to write my own drivers, so that I can concentrate on the coding specific to the test tool.
Finally, we are already familiar with cRIO, which we use for one of our other test tools. The other test tool is larger, more expensive, more demanding and lower volume, so the cRIO is well suited for it. The single board cRIO would blow the $200 allocated budget, would be too large, does not have all the nice peripherals found in a microcontroller and, if I’m not mistaken, we can’t just take the cRIO circuit diagram and use it in our design. Also, the optional requirement of battery power would probably be ruled out. So the cRIO is suited to some of our applications but, unless I’m mistaken, not to this one. (There may be an opportunity here for NI to provide a service where customers provide their analog circuitry and it is included in a custom board – now that would be nice.) In summary, cRIO will probably only work if a very small board option were available: a daughter board of about 80 mm x 80 mm that could be plugged into our board, at a cost of about $150 (to allow another $50 for A/D, D/A and display).
I’m hoping we can continue our dialog, both for my own benefit and for the many other people who are going, or will go, through the same thought process.
When I said 'at this point we are still supporting LabVIEW Embedded for ARM', I worded it that way because we currently do not have a definite roadmap for the future of the module. When I say supported, I basically mean fixing bugs, adding features and releasing a new version of the module with each new version of LabVIEW. Since we do not have a defined roadmap, I don't want to make any promises that we cannot keep.
This is what leads to the following recommendation from my last post:
Since the future of the product is uncertain at this point, we typically offer the following guidelines:
If the current features of the module meet your needs and you are working on a relatively short-term project (~1 year), we recommend using LabVIEW Embedded for ARM.
If you are working on a long-term, multi-year project, or the current features of the toolkit do not meet your needs, we can suggest alternatives (hardware and/or software).
We are looking into using LabVIEW Embedded for ARM for an upcoming project (a hand held test tool). I have been putting the LM3S8962 evaluation board through its paces.
Firstly, let me say that LabVIEW Embedded is a pleasure to use. An application can be put together quickly using the various device drivers. Very nice. However, in the area of performance, I have some concerns.
In order to put together a business case to invest in the development system and commit to using it for a critical design, I need to undertake a risk assessment.
1) The number of Tier 1 devices has been reducing and the remaining 2 boards are quite old. Will Tier 1 support for a modern more capable board be introduced?
2) You mention that there is no defined roadmap, which is a concern to me. When will a roadmap be done and will LabVIEW Embedded continue to be developed?
3) It appears that, practically speaking, only one loop can be used (and not a timed loop). Even with the most time-optimized build specification settings that are then permitted, adding a second loop or a timed loop results in a 75% reduction in maximum loop rate. That is, if a single loop can run at 1000 Hz (adequate) and a second loop is then introduced or a timed loop is used, the mandatory build specification settings bring the maximum loop rate down to about 250 Hz, which is inadequate for our application. Is this consistent with NI's observations?
4) For the provided TCP/IP example, I could only get a maximum loop rate of 34 Hz, and this is with nothing else running on the ARM processor. Is this consistent with NI's observations?
The current two Tier 1 development boards have CPU performance of about 60 DMIPS. The more recent ARM processors have a 2000 DMIPS performance (33x more!). Given a RT operating system constant load, I estimate that this will result in a 132x performance increase! This would then give me the ease of use and performance requirements for my project.
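The post does not show how the 132x figure is derived. One possible reading, which is an assumption and not the author's stated method, is that a fixed overhead consumes about 75% of the 60 DMIPS part (matching the loop-rate drop reported above) and stays the same size on the faster part. The sketch below works through that arithmetic; `raw_speedup` and `usable_speedup` are hypothetical helper names.

```python
def raw_speedup(old_dmips, new_dmips):
    """Plain ratio of the two CPU ratings."""
    return new_dmips / old_dmips


def usable_speedup(old_dmips, new_dmips, overhead_dmips):
    """Ratio of what is left for the application after a fixed overhead."""
    return (new_dmips - overhead_dmips) / (old_dmips - overhead_dmips)


raw = raw_speedup(60, 2000)                   # about 33.3
overhead = 0.75 * 60                          # assumed: 45 DMIPS of fixed load
usable = usable_speedup(60, 2000, overhead)   # about 130.3
print(f"raw: {raw:.1f}x, usable under the fixed-overhead assumption: {usable:.1f}x")
```

Under that assumption the usable gain comes out near 130x, close to the simpler product 33 x 4 = 132. With a different overhead the figure changes considerably, so it is best treated as a rough estimate.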
As you can appreciate, I cannot commit to this product until NI gets its roadmap in order and informs its customers.
I certainly hope that NI will continue to develop this product!
As I mentioned our goal with LabVIEW Embedded for ARM was to target low end microcontrollers. Since there are hundreds of different ARM targets we cannot provide tier one support for all of them. The tier one targets are meant to be used for evaluation and as examples when porting to your desired target.
Since we are currently working to define the roadmap for the future of LabVIEW Embedded for ARM, I cannot provide features, dates, etc., because these are not currently fixed.
Regarding the loop rate issue you mentioned, it is hard to answer this general question since it will depend on your hardware and especially on what you are doing in your loop(s). Keep in mind that you are working with a relatively low-power, single-core architecture. When you run parallel loops they run in separate threads but share a single CPU. There will, of course, be some overhead when switching between loops, but this should be minimal. If you are not able to achieve the loop rates you desire, you may have to consider rearchitecting your code. For example, move low-priority sections of code into a slower loop (one that only executes every second rather than at 1 kHz).
Unfortunately I don't have benchmark data for TCP/IP on an ARM target. Keep in mind that TCP/IP is non-deterministic. TCP/IP communication should not be done in a timed loop, so I'll assume you are using a while loop. Again, it's hard to comment on general observations such as these because the results depend greatly on your hardware, code, network, etc. For example (and I'm completely making this up), if you send a single 1-byte packet each loop iteration, perhaps you get a loop rate of 35 Hz. However, if you send a 1500-byte packet (about the largest that fits in a standard Ethernet frame) each loop iteration, you may still get 35 Hz. So now the question is: in your application, do you need small packets with low latency or just high bandwidth? Again, this is just a general example of why it's hard to comment on such benchmarks. The best thing you can do is code things up the way you need them to work and then try to optimize from there if you need to.
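To make the latency versus bandwidth point concrete, here is a small arithmetic sketch. The 35 Hz loop rate and the packet sizes are the made-up figures from the example above. The 3200 bytes/s requirement is an assumption for illustration only: 8 channels of 16-bit samples streamed at 200 Hz, with protocol overhead ignored. The function names are hypothetical.

```python
import math


def payload_throughput(loop_rate_hz, payload_bytes):
    """Payload bytes per second when one packet is sent per loop iteration."""
    return loop_rate_hz * payload_bytes


def payload_needed(required_bytes_per_s, loop_rate_hz):
    """Smallest per-packet payload that sustains the required throughput."""
    return math.ceil(required_bytes_per_s / loop_rate_hz)


small = payload_throughput(35, 1)       # 35 bytes/s
large = payload_throughput(35, 1500)    # 52500 bytes/s

# Assumed streaming load: 8 channels x 2 bytes x 200 samples/s.
required = 8 * 2 * 200                  # 3200 bytes/s
per_packet = payload_needed(required, 35)
print(small, large, required, per_packet)
```

Read this way, a loop rate in the mid-30s of Hz says little on its own: if samples can be batched, roughly 92 bytes per packet would carry the assumed stream, whereas a design that needs every sample delivered with low latency would be limited by the loop rate itself.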
Thank you for your reply. Let me make some clarifications.
1) Will Tier 1 support for a modern more capable board be introduced?
Please do not attempt to provide Tier 1 support for the hundreds of different ARM targets. This would be a waste of NI’s resources. (And we, the LabVIEW community, want and need NI to do well.) This is why I asked for “a modern more capable board” (singular).
What is needed is a good, modern, powerful Tier 1 development board. This one board should have the same capability as the current Tier 1 offering of the LM3S8962 Evaluation Board plus a USB-OTG and at least 240 DMIPS of CPU power, which would make it four times faster than the LM3S8962. Of course, 1000 DMIPS (or more) would be great. If you like, I can make a suggestion for a suitable development board.
Developing this Tier 1 board by someone from NI with access to the current source code and familiarity with the previous development would take only about 2 months of effort. I can guarantee you would get this investment paid back in increased LabVIEW Embedded for ARM sales.
2) When will a roadmap be done and will LabVIEW Embedded continue to be developed?
I understand there are marketing reasons for not sharing this information. However, please keep in mind that National Instruments customers are different.
The customers of just about any other product on this planet can go elsewhere if they are not satisfied. LabVIEW programmers cannot go anywhere else, short of beefing up their C skills or similar. So, we are dependent on National Instruments. This is both NI’s greatest strength and its greatest weakness. The strength is that NI has a captive market: we get told what we will get, when, and how much we will pay. The weakness is that if the trust of the LabVIEW community is broken, or we feel that we are being taken advantage of, then long-term structural alternatives will be looked at.
I like programming in LabVIEW. It’s fun. However, I get paid to produce results, and there are already rumblings from management about why we are using LabVIEW, a choice that is becoming progressively more difficult to defend.
3) Does working with more than one loop or timed loops reduce CPU performance by about 75%?
The beauty of LabVIEW is the ability to have multiple loops (and timed loops). However, this appears to come with a large performance penalty. There is no need to know what is happening in the loops – this is a relative performance question. So put in a few simple maths functions in a single loop and run it at full speed. For ease of comparison, whatever maths function you use, duplicate it in the loop so you can readily split it into two loops later.
Ensure the compiler settings are as follows:
Run time options: Optimize for speed
Generate guard code: OFF
Enable debugging: OFF
This should give you the fastest execution. Note, you will need to do certain things such as “disable parallel execution” in order for these options to be allowed. This is OK since we only have one loop. Time how long it takes to do, say, 20,000 loops.
Now, split the simple maths functions into 2 loops. This will force certain compile option changes. Make the minimum changes for this to be accepted. Time how long it takes to do the same number of loops as before.
What is the ratio between these two? I’m getting a 75% reduction. Can you please confirm? If something here is not clear, take your best practical guess on what should happen.
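The measurement procedure above can be sketched as a small timing harness. This is a desktop Python analogue, not LabVIEW Embedded code: it does not reproduce the build specification options and its timings say nothing about the ARM target. It only illustrates the steps (time a single loop containing duplicated maths, time the same work split across two loops, compare), and all names are hypothetical.

```python
import threading
import time


def maths(x, i):
    """Stand-in for 'a few simple maths functions'."""
    return x * 0.5 + i * 0.25


def single_loop(n):
    a = b = 0.0
    for i in range(n):
        a = maths(a, i)
        b = maths(b, i)  # duplicated so it can be split out later


def half_loop(n):
    a = 0.0
    for i in range(n):
        a = maths(a, i)


def split_loops(n):
    """Run the same total work as two parallel loops (threads)."""
    threads = [threading.Thread(target=half_loop, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def timed(fn, *args):
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def rate_reduction(t_single, t_split):
    """Fractional drop in loop rate, given the two elapsed times."""
    return 1 - t_single / t_split


if __name__ == "__main__":
    t1 = timed(single_loop, 20_000)
    t2 = timed(split_loops, 20_000)
    print(f"single: {t1:.4f} s, split: {t2:.4f} s, "
          f"reduction: {rate_reduction(t1, t2):.0%}")
```

With this definition, a run that takes four times as long after the split corresponds to the 75% reduction quoted above, the same as a loop rate falling from 1000 Hz to 250 Hz.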
The answer to this question will tell me how much LabVIEW programming versatility I have.
4) For the provided TCP/IP example, I could only get a maximum loop rate of 34 Hz. Is this consistent with NI's observations?
No need to comment in general. The board is the LM3S8962 Evaluation Board supplied by NI and the code is the TCP/IP example provided by NI. Don’t worry about network traffic – it won’t be an issue.
What loop rate do you get?
What to do
You mentioned “The best thing you can do is code things up the way you need them to work and then try to optimize from there if you need to.”
This is exactly what I have been doing with an evaluation copy of LabVIEW Embedded for ARM. The protocol code was very quick to set up (what else would you expect from LabVIEW). However, the performance has not been sufficient on the LM3S8962. I have spent many multiples of the coding time trying to get the required performance, without success. My application is not too demanding and I know that if it were coded in C there would be more than enough performance. Although I can program in C, I prefer LabVIEW since it's fun. However, as already mentioned, I get paid to produce results.
It’s possible that I could eventually get the application to work using LabVIEW Embedded, however if I fail what do I tell my boss? Do I tell him I’ve wasted $9,500 and several months of development effort and need to start again with C or spend another 4 months converting a Tier 2 board to a Tier 1 board? Sorry, my career is worth too much to risk that.
Of course NI has the power to get me back on track. Introduce the additional Tier 1 board mentioned above. Then all will be well. NI will increase its LabVIEW Embedded for ARM sales, LabVIEW users will feel that NI is looking after its customers and I can develop my application.
I'm wondering, do you get the same type of performance decrease when you add another loop to your code without changing any of the compiler options? This would be a good test to determine whether it's the second loop or the compiler options that's causing this CPU performance decrease.
If I use the best compiler options with one loop I get a certain level of performance. If I introduce a second loop certain compiler options are disabled and the performance drops by 75%. So, it was not the selected changes in the compiler options, but the forced changes in the compiler options that caused the speed degradation.
That is, the introduction of the second loop does not allow certain compiler options, which in turn result in a dramatic performance drop.
To be clear, if the "inferior" compiler options are used then there is only a small difference in performance between having one loop or multiple loops. But if I use one loop I have more compiler options and execution speed is four times faster - but this is the reference speed, so it is more accurate to say that the second loop won't allow certain compiler options thereby resulting in a 75% speed/performance reduction.
In summary, the introduction of a second loop disables certain compiler options, which results in a 75% speed/performance reduction. This is a phenomenal impact.
I agree that it's likely the compiler changes that are causing that speed degradation, rather than the addition of another loop. The addition of another loop only forces you to change those options, which are likely the contributing factor to the performance drop. For testing purposes, have you run a trial to see what kind of performance you get with the "inferior" compiler options on the code with a single loop?
If I run the "inferior" compiler options on either the single loop or the single loop split into two loops, I get similar execution times. That is, it is the compiler options that are causing the speed degradation. Using parallel loops means that "Use stack variables", "Enable expression folding" and "Generate C function calls" cannot be used, and losing these three compile options means a significant performance hit. If you are programming in LabVIEW Embedded and speed is of the essence, then getting everything to happen in one loop, without calls to delay-type functions, gives a significant performance increase.
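One hypothetical way to keep everything in a single loop without delay-type calls is a rate divider: the fast work runs on every iteration and the slow work only on every Nth one. The Python sketch below illustrates the idea only; it is not LabVIEW code, it is not something proposed in this thread, and the names are made up.

```python
def run_single_loop(total_ticks, slow_every, fast_task, slow_task):
    """Run fast_task every tick and slow_task once per `slow_every` ticks."""
    for tick in range(total_ticks):
        fast_task(tick)
        if tick % slow_every == 0:
            slow_task(tick)


# Example: count how often each task runs over 2000 ticks.
counts = {"fast": 0, "slow": 0}


def fast(tick):
    counts["fast"] += 1   # e.g. sample and filter at the full loop rate


def slow(tick):
    counts["slow"] += 1   # e.g. refresh the display once per 1000 ticks


run_single_loop(2000, 1000, fast, slow)
print(counts)
```

The trade-off is that the slow task's execution time is added to the iteration in which it runs, so it must be short enough not to disturb the fast loop's timing.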