LabVIEW

Member
asmit172
Posts: 4

How do I add a 95% confidence ellipse to an XY scatter plot?

 

[Attachment: 95% ellipse.JPG]

 

 

Knight of NI
Posts: 18,064
0 Kudos

Re: how do I add a 95% confidence ellipse to an XY scatter plot.

Just add the ellipse as another plot. The XY graph can accept an array of plots. Please look at the examples that ship with LabVIEW as they show you how to use charts and graphs.

Member
asmit172
Posts: 4
0 Kudos

Re: how do I add a 95% confidence ellipse to an XY scatter plot.

I'm sorry, I did not ask the question I intended to. How do you calculate a 95% confidence ellipse to be added to an XY scatter plot?

Active Participant
unclebump
Posts: 1,834
0 Kudos

Re: how do I add a 95% confidence ellipse to an XY scatter plot.

http://zone.ni.com/devzone/cda/epd/p/id/5832

possibly some info towards the bottom of this link.

http://zone.ni.com/devzone/cda/tut/p/id/6954

 

Found this info here: http://ask.metafilter.com/36213/Best-Fit-Ellipse

 

here is a recipe that gives the ellipse (or set of ellipses) that i think your boss wants. in fact, there are two recipes; the first uses several columns in your spreadsheet; the second uses fewer columns but more formulae and is more likely to contain errors because of numerical rounding and because i've made a mistake, but may be easier to write as a reusable formula.

at the end i will explain what the ellipse means.

recipe 1
  1. start with two columns x and y, which are the coordinates of the points in your scatterplot. from those columns, calculate the following variables:
    • sumx = sum(x)
    • sumy = sum(y)
    • sumxx = sum(x*x)
    • sumyy = sum(y*y)
    • sumxy = sum(x*y)
    • n = number of points
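  2. from those values, calculate the average (xbar, ybar) and the mean squares and mean product (called varx, vary and covarxy here; they are the variance and covariance only after the means are subtracted, as in recipe 2):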
    • xbar = sumx/n
    • ybar = sumy/n
    • varx = sumxx/n
    • vary = sumyy/n
    • covarxy = sumxy/n
  3. generate two new columns, dx and dy where:
    • dx = x-xbar
    • dy = y-ybar
    these should be the same scatterplot as before, but shifted to be about the origin.
  4. calculate the following, which are the same as above, but for the new columns:
    • sumdxdx = sum(dx*dx)
    • sumdydy = sum(dy*dy)
    • sumdxdy = sum(dx*dy)
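  5. calculate the angle theta = 0.5 * arctan(2*sumdxdy / (sumdydy - sumdxdx)), which is the angle that the ellipse is "rotated" from the horizontal (this is the expression from the correction at the end of the post; the one first posted here was wrong), and also: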
    • c = cos(theta)
    • s = sin(theta)
  6. generate two new columns X and Y (if you can't use capitals change the names!) which should be the same scatterplot, but with the rotation removed:
    • X = c*dx - s*dy
    • Y = s*dx + c*dy
  7. as before, generate the following from the new columns:
    • sumXX = sum(X*X)
    • sumYY = sum(Y*Y)
    • varX = sumXX/n
    • varY = sumYY/n
  8. finally(!) calculate:
    • a = sqrt(varX)
    • b = sqrt(varY)
    these are the lengths of the semi-major and semi-minor axes (the two principal "radii") of the ellipse (which is which depends on your data). the basic ellipse that you want to plot has that size, is centred on xbar, ybar, and is rotated by the angle theta. see the explanation below for what this means and how to generate other (larger) ellipses.
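
For anyone who wants to check the numbers outside a spreadsheet, recipe 1 can be sketched in Python with NumPy roughly as follows. This is only a sketch: the function name `fit_ellipse` is made up for illustration, and theta uses the corrected expression from the end of the quoted post. One deliberate swap: `arctan2` is used instead of `arctan` so that sumdydy equal to sumdxdx does not cause a division by zero. The resulting theta still removes the rotation, but it can differ from the `arctan` form by 90 degrees, which simply swaps a and b.

```python
import numpy as np

def fit_ellipse(x, y):
    """Recipe 1 (hypothetical helper): return (xbar, ybar, theta, a, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # steps 1-3: means, then shift the scatterplot to the origin
    xbar = x.sum() / n
    ybar = y.sum() / n
    dx = x - xbar
    dy = y - ybar

    # step 4: sums over the centred columns
    sumdxdx = np.sum(dx * dx)
    sumdydy = np.sum(dy * dy)
    sumdxdy = np.sum(dx * dy)

    # step 5: corrected theta, "y minus x" in the denominator
    # (arctan2 is a swapped-in variant of arctan, see the note above)
    theta = 0.5 * np.arctan2(2.0 * sumdxdy, sumdydy - sumdxdx)
    c = np.cos(theta)
    s = np.sin(theta)

    # step 6: remove the rotation
    X = c * dx - s * dy
    Y = s * dx + c * dy

    # steps 7-8: standard deviations along the two principal directions
    a = np.sqrt(np.sum(X * X) / n)
    b = np.sqrt(np.sum(Y * Y) / n)
    return xbar, ybar, theta, a, b
```

With points that lie exactly on a straight line, one of a or b should come out as (nearly) zero, which is a quick sanity check.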
recipe 2
using just the x and y columns, and appropriate formulae above, you can calculate everything using:
  • vardx = varx - xbar*xbar
  • vardy = vary - ybar*ybar
  • covardxdy = covarxy - xbar*ybar
  • varX = c*c*vardx - 2*c*s*covardxdy + s*s*vardy
  • varY = s*s*vardx + 2*c*s*covardxdy + c*c*vardy
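
Recipe 2 can be sketched the same way. Two assumptions are baked in and worth checking against your own algebra: the cross terms carry a factor of 2 (it comes from expanding the squares of X = c*dx - s*dy and Y = s*dx + c*dy), and both use covardxdy. The name `fit_ellipse_from_moments` is hypothetical, and `arctan2` is again swapped in for `arctan` to avoid dividing by zero.

```python
import numpy as np

def fit_ellipse_from_moments(x, y):
    """Recipe 2 (hypothetical helper): same outputs as recipe 1, no extra columns."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    xbar = x.sum() / n
    ybar = y.sum() / n
    # named as in recipe 1: these are mean squares / mean product of the raw data
    varx = np.sum(x * x) / n
    vary = np.sum(y * y) / n
    covarxy = np.sum(x * y) / n

    # subtracting the means gives the actual variances and covariance
    vardx = varx - xbar * xbar
    vardy = vary - ybar * ybar
    covardxdy = covarxy - xbar * ybar

    # same angle as sumdxdy / (sumdydy - sumdxdx); the factors of n cancel
    theta = 0.5 * np.arctan2(2.0 * covardxdy, vardy - vardx)
    c = np.cos(theta)
    s = np.sin(theta)

    varX = c * c * vardx - 2.0 * c * s * covardxdy + s * s * vardy
    varY = s * s * vardx + 2.0 * c * s * covardxdy + c * c * vardy

    # rounding can push a near-zero variance slightly negative, so clamp it
    a = np.sqrt(max(varX, 0.0))
    b = np.sqrt(max(varY, 0.0))
    return xbar, ybar, theta, a, b
```

As the post warns, this form is more exposed to rounding: subtracting xbar*xbar from a mean square loses precision when the data sit far from the origin compared with their spread.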
explanation
traditional statistics often assumes that noisy data is distributed as a "gaussian" or "normal" distribution (this is justified by the famous "central limit theorem" that says you get this distribution when life is complicated). the process above is equivalent to fitting a model for that distribution when it describes two, correlated variables (x and y). the final values a and b are the "standard deviations" of the underlying, uncorrelated distribution (ie with the rotation removed).
chapter 15 of numerical recipes describes things in more detail (especially section 15.6).
if your data really do follow this distribution then a and b give the relative sizes of your ellipse. you can scale that ellipse to include any given fraction of the distribution.
disclaimer - i am not 100% certain about this next bit: in particular, the usual ellipse plotted is one that contains 68% of the data (it's a convention), which you would get by multiplying a and b by the square root of 2.3 (see the reference above to numerical recipes).
posted by andrew cooke at 6:44 AM on April 14, 2006
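
To get from a and b to the 95% ellipse asked about in this thread, here is a minimal sketch, assuming the data really are two-dimensional Gaussian and the sample is large enough that a and b can be treated as known. Under that assumption the fraction p of the distribution inside the ellipse scaled by k is 1 - exp(-k*k/2), so k = sqrt(-2*ln(1 - p)). That gives k*k of about 2.30 for 68.3% (matching the square root of 2.3 mentioned above) and about 5.99 for 95%. The names `confidence_scale` and `ellipse_outline` are made up for illustration. The outline is mapped back to data coordinates by inverting the step 6 rotation.

```python
import numpy as np

def confidence_scale(p):
    """Factor k so that semi-axes k*a, k*b enclose fraction p of a 2-D Gaussian."""
    return np.sqrt(-2.0 * np.log(1.0 - p))

def ellipse_outline(xbar, ybar, theta, a, b, p=0.95, num=200):
    """Return (ex, ey) arrays tracing the ellipse, ready to plot as a second XY plot."""
    k = confidence_scale(p)
    t = np.linspace(0.0, 2.0 * np.pi, num)

    # axis-aligned ellipse in the de-rotated (X, Y) coordinates of step 6
    X = k * a * np.cos(t)
    Y = k * b * np.sin(t)

    # invert step 6 (X = c*dx - s*dy, Y = s*dx + c*dy), then shift to the mean
    c = np.cos(theta)
    s = np.sin(theta)
    ex = xbar + c * X + s * Y
    ey = ybar - s * X + c * Y
    return ex, ey
```

The two returned arrays can be fed to the XY graph as an additional plot next to the scatter data, as suggested earlier in the thread. For small samples a larger factor is usually appropriate, since a and b are themselves estimates.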



there's a BIG mistake in the expression for theta. the numerator should be sumdxdy and the denominator is a subtraction:
theta = 0.5 * arctan( 2*sumdxdy / (sumdydy - sumdxdx) )
note that it's "y minus x" in the denominator. sorry!
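
A quick way to convince yourself that the corrected theta is the right one: after the step 6 rotation, sum(X*Y) should be zero, because that is what "rotation removed" means. The sketch below checks this on made-up, already centred data (the numbers are illustrative only, and `cross_term` is a hypothetical helper).

```python
import numpy as np

def cross_term(dx, dy, theta):
    """sum(X*Y) after the step 6 rotation; zero means the rotation is removed."""
    c = np.cos(theta)
    s = np.sin(theta)
    X = c * dx - s * dy
    Y = s * dx + c * dy
    return np.sum(X * Y)

# made-up centred data, for illustration only
dx = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
dy = np.array([-1.0, -1.5, 0.5, 0.5, 1.5])

sumdxdx = np.sum(dx * dx)
sumdydy = np.sum(dy * dy)
sumdxdy = np.sum(dx * dy)

# expression as first posted, and the corrected one ("y minus x")
theta_posted = 0.5 * np.arctan(2.0 * sumdxdx / (sumdydy * sumdxdx))
theta_fixed = 0.5 * np.arctan(2.0 * sumdxdy / (sumdydy - sumdxdx))

print("cross term, posted theta:   ", cross_term(dx, dy, theta_posted))
print("cross term, corrected theta:", cross_term(dx, dy, theta_fixed))
```

Only the corrected angle should drive the cross term to zero (up to rounding).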