Least squares fit alternative?

Ken_Brooks · ‎09-10-2018

In the process of calibrating a tool, we need to find the best line fit for three data points. LabVIEW has a nice Linear Fit.vi tool, but unfortunately it is only part of the Full Development System, not the Base system. Upgrading would cost $3000, which our small company can ill afford just for one library VI. Has anyone out there written a good alternative bit of line-fitting code and would be willing to share it?

Taki1999 · ‎09-10-2018

How accurate does your calibration need to be? With only 3 data points, it might be simplest just to throw out the middle one and use an analytical form on the remaining 2. Alternatively, if you're sure it will be linear, increase your sampling at the endpoints and then use an analytical fit on the average of the endpoints.
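Both shortcuts above come down to the line through two points. A minimal Python sketch (not LabVIEW; the function names and the sample readings are made up for illustration):

```python
def line_through(p1, p2):
    """Return (m, b) of the line y = m*x + b through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two points need different x values")
    m = (y2 - y1) / (x2 - x1)
    b = y1 - m * x1
    return m, b


def line_from_endpoint_averages(x_lo, ys_lo, x_hi, ys_hi):
    """Average repeated readings at each endpoint, then join the two averages."""
    y_lo = sum(ys_lo) / len(ys_lo)
    y_hi = sum(ys_hi) / len(ys_hi)
    return line_through((x_lo, y_lo), (x_hi, y_hi))


# Hypothetical readings: two samples at x = 0 and two at x = 2.
m, b = line_from_endpoint_averages(0.0, [0.9, 1.1], 2.0, [4.9, 5.1])
print(m, b)
```

Averaging repeated readings reduces the noise at each endpoint, but this only makes sense if the response really is linear in between.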

In a pinch, you could also do the linear fit in an external program like Excel (its LINEST function, for example).

Bob_Schor · ‎09-10-2018

Least Squares has the virtue that if you understand the idea and know a little math (say, first year calculus), you can usually "do it yourself" with pencil and paper (to get the formulas) and a "calculator" (which could be LabVIEW code) to do the arithmetic.

You have three data points, which I assume are measured at three different (known) values of an independent variable, which I will call X, and give you three measurements, Y. Imagine you plot the three sets of points x(i), y(i) on a piece of paper. You want to draw a (straight) line that best "fits" those points, i.e. is "closest" to them. But what does "closest" mean? [Note I'm assuming that they do not exactly lie on the line ...]. There are three obvious possibilities -- the line such that the vertical distances between the points (x, y) and the line are minimized, the horizontal distances are minimized, or the "perpendicular" (shortest) distances are minimized (draw a picture and you'll see what I mean).

[Before going further, a parenthetical remark, in square brackets. With three points, you can exactly fit a quadratic that will go through all three points. In general, with N points, you can exactly fit a polynomial of degree N-1, but this is rarely useful. So I'm going to go with a polynomial of degree 1, i.e. a Straight Line. Note you could also fit a polynomial of degree 0, namely a constant -- your choice of arithmetic mean, geometric mean, median, or something else.]

You say you are calibrating your instrument at what I assume are three different X settings. Since you set it, we'll assume you know its value exactly, i.e. there is no uncertainty in X, only in Y. So in our quest to find the line that "best fits" the points, what we want to minimize are the Y deviations, i.e. the vertical distances of our line at the X points from the Y values.

There are (mathematical/statistical) reasons to say what you want to minimize is the square of the distance between the line and the Y values (hence the name "Least Squares"). Let's see how you would do this. I'm going to generalize and say you have N points (here N=3).

Your model is y(i) = m x(i) + b (the equation for a straight line, high school algebra). We want the m and b that minimize the distance between this line and the measured value y(i) at each point x(i). The value of the line at x(i) is m x(i) + b, and the measured value there is simply y(i), so we need to find m and b that minimize the sum over the N points of (y(i) - (m x(i) + b))**2. [Sorry, can't do math here, but I'm just summing the squared differences between my measured Y points and the Y I would get if I plugged the X value into my equation for a straight line.] I want to find values of m and b that minimize this sum.

Did someone say "minimize"? As in "take the derivative and set it = 0"? This is actually a useful exercise, and has a very simple solution. I'm going to give you the formula for the answer, but urge you to try to derive it yourself.

Let's define mean(Y) as the mean of the Y values, and mean(X) as the mean of the X values. Then evaluate these formulas:

m = sum((x(i) - mean(X)) (y(i) - mean(Y))) / sum(sqr(x(i) - mean(X))), where sum means "sum over all the values of i" and sqr means "square".
b = mean(Y) - m mean(X). Note similarity to y = m x + b -- we just took means.
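For anyone who wants to check their own derivation, here is one way the "take the derivative and set it = 0" step can be written out in LaTeX. The symbols S, \bar{x} and \bar{y} are introduced here for the sum of squares, mean(X) and mean(Y); they are not from the post above.

```latex
\begin{align*}
S(m, b) &= \sum_{i=1}^{N} \bigl( y_i - (m x_i + b) \bigr)^2 \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{N} \bigl( y_i - m x_i - b \bigr) = 0
  \quad\Longrightarrow\quad b = \bar{y} - m \bar{x} \\
\frac{\partial S}{\partial m} &= -2 \sum_{i=1}^{N} x_i \bigl( y_i - m x_i - b \bigr) = 0 \\
\text{substituting } b: \quad
  & \sum_{i=1}^{N} x_i \bigl( (y_i - \bar{y}) - m (x_i - \bar{x}) \bigr) = 0 \\
m &= \frac{\sum_{i} x_i (y_i - \bar{y})}{\sum_{i} x_i (x_i - \bar{x})}
   = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i} (x_i - \bar{x})^2}
\end{align*}
```

The last equality holds because \sum_i (y_i - \bar{y}) = 0 and \sum_i (x_i - \bar{x}) = 0, so subtracting \bar{x} times either sum changes nothing.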

So you can code these formulas up in LabVIEW and get the values of m and b that give you the "least-squares best straight line" through your data points. Note that it will work just as well (better, probably) for 100 data points -- the only assumption is that the model (namely a straight-line relationship between X and Y) is valid.
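As a text-language reference to check a LabVIEW diagram against, the two formulas above might be sketched in Python like this. The function name `linear_fit` and the sample data are made up for illustration; this is not the stock Linear Fit.vi.

```python
def linear_fit(xs, ys):
    """Return (m, b) of the least-squares line y = m*x + b."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Denominator: sum of squared x deviations from the mean.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    # Numerator: sum of products of x and y deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Hypothetical three-point calibration data.
m, b = linear_fit([1.0, 2.0, 3.0], [2.0, 4.1, 5.9])
print(m, b)  # roughly 1.95 and 0.1
```

The two guards cover the cases where the formula breaks down: fewer than two points, or all points measured at the same X.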

Bob Schor

cstorey · ‎09-10-2018

Here's a very simple implementation I needed to use in a pinch to get a threshold from an I-V curve. It was simply used as a rough-estimate (pass/fail screening) test before proceeding to further tests on a test setup with LabVIEW basic.

It implements C code from a C++ book I had at the time - https://www.springer.com/gp/book/9781852334888

It's not documented, not pretty and not nearly as thorough as what Bob suggests doing, which is worth the work if you need more than a very rough fit.

Edit: Also worth mentioning is there is lots of code out there you could use to make your own DLL to do a better job - https://stackoverflow.com/questions/5083465/fast-efficient-least-squares-fit-algorithm-in-c

Anyway, it might help.

altenbach · ‎09-10-2018

@cstorey wrote:

It was simply used as rough estimate

Your slope is good but the offset is way off. Not sure where the error is, but compare with the stock linear fit.

For fun, I implemented the correct formulas in G (probably similar to what Bob suggested earlier) and it gives the correct result. As seen from the diagram comment, things need to be cleaned up, and error handling and input validation added.

The stock VI has outputs for error and best Ys and also has inputs for weight, tolerance, bounds, etc. So this is not a replacement. (edit: uploaded new version with correct intercept labeling)
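To give a feel for a few of those extras, here is a rough Python sketch that adds optional weights, the best-fit Y values and an error figure. It is not a reimplementation of the stock VI: the name `weighted_linear_fit` is made up, weights are assumed to be non-negative, and the error here is simply a weighted mean squared error, which may be defined differently from the stock VI's error output.

```python
def weighted_linear_fit(xs, ys, weights=None):
    """Return (m, b, best_ys, mse) for a weighted least-squares line fit."""
    n = len(xs)
    if len(ys) != n:
        raise ValueError("xs and ys must have the same length")
    if weights is None:
        weights = [1.0] * n  # unweighted fit
    if len(weights) != n:
        raise ValueError("weights must match the data length")
    w_sum = sum(weights)
    if w_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted means replace the plain means of the unweighted formulas.
    mean_x = sum(w * x for w, x in zip(weights, xs)) / w_sum
    mean_y = sum(w * y for w, y in zip(weights, ys)) / w_sum
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    if sxx == 0:
        raise ValueError("weighted x values are all identical")
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    best_ys = [m * x + b for x in xs]
    # Assumed error definition: weighted mean of squared residuals.
    mse = sum(w * (y - f) ** 2
              for w, y, f in zip(weights, ys, best_ys)) / w_sum
    return m, b, best_ys, mse


# Hypothetical data: a zero weight makes the fit ignore the last point.
m, b, best_ys, mse = weighted_linear_fit(
    [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 100.0], [1.0, 1.0, 1.0, 0.0])
print(m, b, mse)
```

With all weights equal this reduces to the ordinary least-squares formulas.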

LabVIEW Champion.

Ken_Brooks · ‎09-12-2018

Thanks, cstorey and altenbach combined, you probably saved me a couple of hours of comprehending and coding that math. You're stars! cstorey, I'm curious, why did you produce the X intercept instead of the customary Y intercept?

And I note, even Altenbach's version does not produce an intercept number consistent with what is displayed on the graph (and the graph looks to be the better of the two!) The stated intercept with the given data is -6.16667 but the visual intercept is about 1.75!

RavensFan · ‎09-12-2018

I think he just mislabeled it. The value corresponds to where the Y-intercept would be.

mcduff · ‎09-12-2018

@Ken_Brooks wrote:

And I note, even Altenbach's version does not produce an intercept number consistent with what is displayed on the graph (and the graph looks to be the better of the two!) The stated intercept with the given data is -6.16667 but the visual intercept is about 1.75!

Be careful what you write! It is not often that CA is wrong (and he is not wrong here). Look at the graph: the y-intercept (offset) is the value at x = 0, but on the plot x does not go down to zero; it stops at 1. Replot the graph with the lower x limit set to 0 and see what you get.
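A tiny Python illustration of the same point, using made-up slope and intercept values rather than the numbers from the attached VI:

```python
def line_value(m, b, x):
    """Evaluate the fitted line y = m*x + b at x."""
    return m * x + b


m, b = 2.0, -3.0   # hypothetical fit results
left_edge = 1.0    # smallest x shown on the hypothetical plot

print(line_value(m, b, 0.0))        # the intercept: value at x = 0
print(line_value(m, b, left_edge))  # what you read off at the left edge of the plot
```

The two numbers differ by m times the distance from the left edge to x = 0, so a steep slope can make the gap look large.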

mcduff

cstorey · ‎09-12-2018

I was interested in extracting the slope and threshold current (Ith), which is the X-intercept in a V vs. I curve. I used these values to assess whether the laser I was testing was in fact "lasing" or just producing spontaneous emission. If it was lasing, I carried on with the excruciatingly long, slow test; if it wasn't "lasing" after a certain threshold, I abandoned that device and moved on.
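For reference, the X-intercept follows from the slope and Y-intercept of the fitted line: set y = 0 in y = m x + b to get x = -b / m. A minimal Python sketch (the function name and numbers are made up, not taken from the attached VI):

```python
def x_intercept(m, b):
    """Return the x where the line y = m*x + b crosses y = 0."""
    if m == 0:
        raise ValueError("a horizontal line has no unique x-intercept")
    return -b / m


# Hypothetical fit results: slope 2.0, Y-intercept -3.0.
print(x_intercept(2.0, -3.0))  # 1.5
```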

Yeah, I didn't return the Y-int. Sorry, I should have caught that and made it obvious. But CA's full LabVIEW version is a much more elegant solution, as per usual!

Craig

altenbach · ‎09-12-2018

Yes, the returned value is the Y intercept (where x=0).

(I did not notice and just reused the original label :). I uploaded a corrected version that simply calls it "intercept", like the stock VI output. The picture is still wrong...)

LabVIEW Champion.
