# LabVIEW

# how do I add a 95% confidence ellipse to an XY scatter plot.

How do I add a 95% confidence ellipse to an XY scatter plot?

Message 1 of 4 (8,003 Views)

## Re: how do I add a 95% confidence ellipse to an XY scatter plot.

Just add the ellipse as another plot. The XY graph can accept an array of plots. Please look at the examples that ship with LabVIEW as they show you how to use charts and graphs.

Message 2 of 4 (7,987 Views)

## Re: how do I add a 95% confidence ellipse to an XY scatter plot.

I'm sorry, I did not ask the question I intended to. How do you calculate a 95% confidence ellipse to be added to an XY scatter plot?

Message 3 of 4 (7,979 Views)

## Re: how do I add a 95% confidence ellipse to an XY scatter plot.

http://zone.ni.com/devzone/cda/epd/p/id/5832

possibly some info towards the bottom of this link.

http://zone.ni.com/devzone/cda/tut/p/id/6954

here is a recipe that gives the ellipse (or set of ellipses) that i think your boss wants. in fact, there are two recipes; the first uses several columns in your spreadsheet; the second uses fewer columns but more formulae and is more likely to contain errors because of numerical rounding and because i've made a mistake, but may be easier to write as a reusable formula.

at the end i will explain what the ellipse means.

recipe 1
1. start with two columns `x` and `y`, which are the coordinates of the points in your scatterplot. from those columns, calculate the following variables:
• `sumx = sum(x)`
• `sumy = sum(y)`
• `sumxx = sum(x*x)`
• `sumyy = sum(y*y)`
• `sumxy = sum(x*y)`
• `n = number of points`
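2. from those values, calculate the averages (`xbar`, `ybar`) and the mean squares and mean cross product (called `varx`, `vary` and `covarxy` here; they only become the true variance and covariance once the averages are subtracted, as in recipe 2):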
• `xbar = sumx/n`
• `ybar = sumy/n`
• `varx = sumxx/n`
• `vary = sumyy/n`
• `covarxy = sumxy/n`
3. generate two new columns, `dx` and `dy` where:
• `dx = x-xbar`
• `dy = y-ybar`
these should be the same scatterplot as before, but shifted to be about the origin.
4. calculate the following, which are the same as above, but for the new columns:
• `sumdxdx = sum(dx*dx)`
• `sumdydy = sum(dy*dy)`
• `sumdxdy = sum(dx*dy)`
5. calculate the angle `theta = 0.5 * arctan(2*sumdxdy / (sumdydy - sumdxdx))` which is the angle that the ellipse is "rotated" from the horizontal, and also:
• `c = cos(theta)`
• `s = sin(theta)`
6. generate two new columns `X` and `Y` (if you can't use capitals change the names!) which should be the same scatterplot, but with the rotation removed:
• `X = c*dx - s*dy`
• `Y = s*dx + c*dy`
7. as before, generate the following from the new columns:
• `sumXX = sum(X*X)`
• `sumYY = sum(Y*Y)`
• `varX = sumXX/n`
• `varY = sumYY/n`
8. finally(!) calculate:
• `a = sqrt(varX)`
• `b = sqrt(varY)`
these are the lengths of the semi-major and semi-minor axes (the two principal "radii") of the ellipse (which is which depends on your data). the basic ellipse that you want to plot has that size and is centred on (`xbar`, `ybar`). to draw it in the original coordinates you undo the rotation of step 6; with the sign convention used there, that means rotating the axis-aligned ellipse (`a` along `X`, `b` along `Y`) by `-theta`. see the explanation below for what this means and how to generate other (larger) ellipses.
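
the steps of recipe 1 can be sketched in python as below. this is only a sketch, not part of the original recipe: the function name `ellipse_recipe1` is made up for illustration, the angle uses the form with `sumdxdy` in the numerator and `sumdydy - sumdxdx` in the denominator, and `math.atan2` is swapped in for a plain `arctan` of the ratio so that equal spreads in x and y do not cause a division by zero (the resulting angle can differ by 90 degrees, which only swaps the roles of `a` and `b`).

```python
import math


def ellipse_recipe1(x, y):
    """Follow recipe 1 step by step; returns (xbar, ybar, a, b, theta)."""
    n = len(x)

    # steps 1-2: averages of the two columns
    xbar = sum(x) / n
    ybar = sum(y) / n

    # step 3: shift the points so they are centred on the origin
    dx = [xi - xbar for xi in x]
    dy = [yi - ybar for yi in y]

    # step 4: sums over the shifted columns
    sumdxdx = sum(u * u for u in dx)
    sumdydy = sum(v * v for v in dy)
    sumdxdy = sum(u * v for u, v in zip(dx, dy))

    # step 5: rotation angle ("y minus x" in the denominator);
    # atan2(num, den) is an assumption standing in for arctan(num / den)
    theta = 0.5 * math.atan2(2.0 * sumdxdy, sumdydy - sumdxdx)
    c = math.cos(theta)
    s = math.sin(theta)

    # step 6: rotate the shifted points so X and Y are uncorrelated
    X = [c * u - s * v for u, v in zip(dx, dy)]
    Y = [s * u + c * v for u, v in zip(dx, dy)]

    # steps 7-8: spreads along the rotated axes
    a = math.sqrt(sum(t * t for t in X) / n)
    b = math.sqrt(sum(t * t for t in Y) / n)
    return xbar, ybar, a, b, theta
```

as a sanity check, the five points (1,2), (2,4), (3,6), (4,8), (5,10) lie exactly on a line, so one of the two "radii" collapses: `a` comes out as essentially zero and `b` as `sqrt(10)`, centred on (3, 6).
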
recipe 2
using just the `x` and `y` columns, and appropriate formulae above, you can calculate everything using:
• `vardx = varx - xbar*xbar`
• `vardy = vary - ybar*ybar`
• `covardxdy = covarxy - xbar*ybar`
• `varX = c*c*vardx - 2*c*s*covardxdy + s*s*vardy`
• `varY = s*s*vardx + 2*c*s*covardxdy + c*c*vardy`
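
a python sketch of recipe 2, for comparison. again the name `ellipse_recipe2` is hypothetical. the cross term here carries a factor of 2, which is what you get by expanding `(c*dx - s*dy)**2` and averaging. the angle is written in terms of `vardx`, `vardy` and `covardxdy`, which is an assumption on my part but gives the same ratio as the sums in step 5 because the `1/n` factors cancel. `math.atan2` and the clamp to zero before the square root are additions to guard against division by zero and rounding.

```python
import math


def ellipse_recipe2(x, y):
    """Recipe 2: everything from raw sums; returns (xbar, ybar, a, b, theta)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n

    # mean squares and mean cross product of the raw columns
    varx = sum(xi * xi for xi in x) / n
    vary = sum(yi * yi for yi in y) / n
    covarxy = sum(xi * yi for xi, yi in zip(x, y)) / n

    # variance and covariance of the shifted columns dx, dy
    vardx = varx - xbar * xbar
    vardy = vary - ybar * ybar
    covardxdy = covarxy - xbar * ybar

    # same angle as step 5 of recipe 1 (the 1/n factors cancel in the ratio)
    theta = 0.5 * math.atan2(2.0 * covardxdy, vardy - vardx)
    c = math.cos(theta)
    s = math.sin(theta)

    # variances along the rotated axes; note the factor 2 on the cross term
    varX = c * c * vardx - 2.0 * c * s * covardxdy + s * s * vardy
    varY = s * s * vardx + 2.0 * c * s * covardxdy + c * c * vardy

    # rounding can push a tiny variance slightly below zero, so clamp it
    a = math.sqrt(max(varX, 0.0))
    b = math.sqrt(max(varY, 0.0))
    return xbar, ybar, a, b, theta
```

subtracting `xbar*xbar` from a mean square loses precision when the points sit far from the origin compared with their spread, which is the numerical rounding risk mentioned above; recipe 1 avoids it by shifting the points first.
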
explanation
traditional statistics often assumes that noisy data is distributed as a "gaussian" or "normal" distribution (this is justified by the famous "central limit theorem" that says you get this distribution when life is complicated). the process above is equivalent to fitting a model for that distribution when it describes two, correlated variables (x and y). the final values `a` and `b` are the "standard deviations" of the underlying, uncorrelated distribution (ie with the rotation removed).
chapter 15 of numerical recipes describes things in more detail (especially section 15.6).
if your data really do follow this distribution then a and b give the relative sizes of your ellipse. you can scale that ellipse to include any given fraction of the distribution.
disclaimer - i am not 100% certain about this next bit: in particular, the usual ellipse plotted is one that contains 68% of the data (it's a convention), which you would get by multiplying a and b by the square root of 2.3 (see the reference above to numerical recipes).
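
for the 95% ellipse the thread asks about, the scaling can be worked out directly. assuming the data really are a two-dimensional gaussian, the fraction inside the ellipse with "radii" `k*a` and `k*b` is `1 - exp(-k*k/2)`, so `k = sqrt(-2*ln(1 - p))`. that gives `k*k` of about 2.30 for 68.3% (matching the number quoted above) and about 5.99 for 95%. the sketch below, with the made-up names `coverage_scale` and `ellipse_points`, turns `xbar`, `ybar`, `a`, `b` and `theta` into outline points that can be added as a second plot; it assumes `theta` follows the sign convention of step 6, so the rotation is undone with the inverse transform.

```python
import math


def coverage_scale(p):
    """Factor k for a and b so the ellipse holds a fraction p of a 2-D gaussian."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.sqrt(-2.0 * math.log(1.0 - p))


def ellipse_points(xbar, ybar, a, b, theta, p=0.95, num=100):
    """Return num+1 (x, y) points tracing the ellipse; first and last coincide."""
    k = coverage_scale(p)
    c = math.cos(theta)
    s = math.sin(theta)
    points = []
    for i in range(num + 1):
        t = 2.0 * math.pi * i / num
        # axis-aligned ellipse in the rotated (X, Y) frame
        X = k * a * math.cos(t)
        Y = k * b * math.sin(t)
        # invert step 6 (X = c*dx - s*dy, Y = s*dx + c*dy) to get dx, dy
        dx = c * X + s * Y
        dy = -s * X + c * Y
        # undo the shift of step 3
        points.append((xbar + dx, ybar + dy))
    return points
```

the list of points can be split into an x array and a y array and bundled as an extra plot next to the scatter data. with few points, or data that are clearly not gaussian, the fraction of points actually inside the ellipse can differ noticeably from `p`.
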
posted by andrew cooke at 6:44 AM on April 14, 2006

note on the expression for `theta` in step 5: the numerator should be `sumdxdy` and the denominator is a subtraction:
`theta = 0.5 * arctan( 2*sumdxdy / (sumdydy - sumdxdx) )`
note that it's "y minus x" in the denominator.
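
a short derivation (not in the original post) of why this form, with "y minus x", goes with the rotation used in step 6. the angle is chosen so that the rotated columns are uncorrelated, i.e. so that the sum of `X*Y` vanishes. writing c for `cos(theta)` and s for `sin(theta)`:

```latex
\begin{aligned}
\sum XY &= \sum (c\,dx - s\,dy)(s\,dx + c\,dy) \\
        &= c\,s\left(\sum dx^2 - \sum dy^2\right) + \left(c^2 - s^2\right)\sum dx\,dy \\
        &= \tfrac{1}{2}\sin 2\theta\,\bigl(\mathrm{sumdxdx} - \mathrm{sumdydy}\bigr)
           + \cos 2\theta\;\mathrm{sumdxdy} .
\end{aligned}
\qquad
\sum XY = 0
\;\Longrightarrow\;
\tan 2\theta = \frac{2\,\mathrm{sumdxdy}}{\mathrm{sumdydy} - \mathrm{sumdxdx}} .
```

if step 6 used the opposite rotation direction, the denominator would flip to "x minus y", so the formula for `theta` and the rotation have to be kept together.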
Message 4 of 4 (7,968 Views)