- how do I add a 95% confidence ellipse to an XY scatter plot.


10-20-2010 09:38 AM


How do I add a 95% confidence ellipse to an XY scatter plot?

smercurio_fc

Knight of NI

10-20-2010 01:20 PM


10-20-2010 01:50 PM


10-20-2010 03:13 PM


*http://zone.ni.com/devzone/cda/epd/p/id/5832*

*possibly some info towards the bottom of this link.*

http://zone.ni.com/devzone/cda/tut/p/id/6954

Found this info here: http://ask.metafilter.com/36213/Best-Fit-Ellipse

correction: there was a BIG mistake in the originally posted expression for `theta`. the numerator should be `sumdxdy` and the denominator is a subtraction:

`theta = 0.5 * arctan( 2*sumdxdy / (sumdydy - sumdxdx) )`

note that it's "y minus x" in the denominator. sorry!

here is a recipe that gives the ellipse (or set of ellipses) that i think your boss wants. in fact, there are two recipes: the first uses several columns in your spreadsheet; the second uses fewer columns but more formulae and is more likely to contain errors, both because of numerical rounding and because i may have made a mistake, but it may be easier to write as a reusable formula. at the end i will explain what the ellipse means.

recipe 1

start with two columns, `x` and `y`, which are the coordinates of the points in your scatterplot. from those columns, calculate the following variables:

`sumx = sum(x)`

`sumy = sum(y)`

`sumxx = sum(x*x)`

`sumyy = sum(y*y)`

`sumxy = sum(x*y)`

`n =` number of points

then calculate the mean (`xbar`, `ybar`) and the variance and covariance:

`xbar = sumx/n`

`ybar = sumy/n`

`varx = sumxx/n`

`vary = sumyy/n`

`covarxy = sumxy/n`

next, add two columns, `dx` and `dy`, where:

`dx = x-xbar`

`dy = y-ybar`

`sumdxdx = sum(dx*dx)`

`sumdydy = sum(dy*dy)`

`sumdxdy = sum(dx*dy)`

`theta = 0.5 * arctan(2*sumdxdy / (sumdydy - sumdxdx))`

which is the angle that the ellipse is "rotated" from the horizontal, and also:

`c = cos(theta)`

`s = sin(theta)`

then add two more columns, `X` and `Y` (if you can't use capitals, change the names!), which should be the same scatterplot but with the rotation removed:

`X = c*dx - s*dy`

`Y = s*dx + c*dy`

`sumXX = sum(X*X)`

`sumYY = sum(Y*Y)`

`varX = sumXX/n`

`varY = sumYY/n`

`a = sqrt(varX)`

`b = sqrt(varY)`
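for anyone who would rather script recipe 1 than build it in a spreadsheet, the steps above can be sketched in python. this is a minimal sketch and not part of the original post: the function name `fit_ellipse` is made up for illustration, it divides by n as the recipe does, and it swaps `math.atan2` in for plain arctan so that the case `sumdydy == sumdxdx` does not divide by zero. with `atan2` the angle can differ from the arctan form by 90 degrees, which only swaps the roles of `a` and `b`.

```python
import math


def fit_ellipse(xs, ys):
    """Sketch of recipe 1: return (xbar, ybar, theta, a, b) for paired samples."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")

    # Mean of each column.
    xbar = sum(xs) / n
    ybar = sum(ys) / n

    # Deviations from the mean (the dx and dy columns).
    dx = [x - xbar for x in xs]
    dy = [y - ybar for y in ys]

    sumdxdx = sum(u * u for u in dx)
    sumdydy = sum(v * v for v in dy)
    sumdxdy = sum(u * v for u, v in zip(dx, dy))

    # Assumption of this sketch: atan2 replaces arctan(num / den) so that
    # a zero denominator is handled; note "y minus x" in the second argument.
    theta = 0.5 * math.atan2(2.0 * sumdxdy, sumdydy - sumdxdx)
    c = math.cos(theta)
    s = math.sin(theta)

    # Rotated columns X and Y, which should be uncorrelated.
    X = [c * u - s * v for u, v in zip(dx, dy)]
    Y = [s * u + c * v for u, v in zip(dx, dy)]

    # Standard deviations along the rotated axes (dividing by n, as in the recipe).
    a = math.sqrt(sum(u * u for u in X) / n)
    b = math.sqrt(sum(v * v for v in Y) / n)
    return xbar, ybar, theta, a, b
```

with the `atan2` form, `b` comes out as the longer of the two values and `a` as the shorter, so points lying exactly on a straight line give `a` equal to zero (up to rounding).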

the ellipse has semi-axes `a` and `b`, is centred at (`xbar`, `ybar`), and is rotated by the angle `theta`. see the explanation below for what this means and how to generate other (larger) ellipses.

recipe 2

using just the `x` and `y` columns, and appropriate formulae above, you can calculate everything using:

`vardx = varx - xbar*xbar`

`vardy = vary - ybar*ybar`

`covardxdy = covarxy - xbar*ybar`

`varX = c*c*vardx - 2*c*s*covardxdy + s*s*vardy`

`varY = s*s*vardx + 2*c*s*covardxdy + c*c*vardy`
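recipe 2 can be sketched the same way, using only the running sums and no extra columns. again this is an illustration under assumptions rather than the poster's own code: the name `fit_ellipse_from_sums` is hypothetical, `theta` is computed from the variances with `math.atan2` (a substitution for plain arctan that avoids a zero denominator), and the cross terms carry a factor of 2, which is what you get when you expand the variance of `c*dx - s*dy` and of `s*dx + c*dy`. tiny negative variances caused by rounding are clamped to zero before the square root.

```python
import math


def fit_ellipse_from_sums(xs, ys):
    """Sketch of recipe 2: same outputs as recipe 1, from raw sums only."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")

    sumx = sum(xs)
    sumy = sum(ys)
    sumxx = sum(x * x for x in xs)
    sumyy = sum(y * y for y in ys)
    sumxy = sum(x * y for x, y in zip(xs, ys))

    xbar = sumx / n
    ybar = sumy / n
    varx = sumxx / n      # mean of x*x (named as in the recipe)
    vary = sumyy / n      # mean of y*y
    covarxy = sumxy / n   # mean of x*y

    # Variances and covariance about the mean.
    vardx = varx - xbar * xbar
    vardy = vary - ybar * ybar
    covardxdy = covarxy - xbar * ybar

    # Same angle as recipe 1, since the common factor n cancels in the ratio.
    theta = 0.5 * math.atan2(2.0 * covardxdy, vardy - vardx)
    c = math.cos(theta)
    s = math.sin(theta)

    varX = c * c * vardx - 2.0 * c * s * covardxdy + s * s * vardy
    varY = s * s * vardx + 2.0 * c * s * covardxdy + c * c * vardy

    # Rounding can leave a variance slightly below zero; clamp before sqrt.
    a = math.sqrt(max(varX, 0.0))
    b = math.sqrt(max(varY, 0.0))
    return xbar, ybar, theta, a, b
```

subtracting `xbar*xbar` from a mean of squares loses precision when the data sit far from the origin relative to their spread, which is the numerical rounding risk the recipe warns about.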

explanation

traditional statistics often assumes that noisy data is distributed as a "gaussian" or "normal" distribution (this is justified by the famous "central limit theorem" that says you get this distribution when life is complicated). the process above is equivalent to fitting a model for that distribution when it describes two correlated variables (x and y). the final values `a` and `b` are the "standard deviations" of the underlying, uncorrelated distribution (ie with the rotation removed).

section 15 of numerical recipes describes things in more detail (especially section 15.6).

if your data really do follow this distribution then a and b give the relative sizes of your ellipse. you can scale that ellipse to include any given fraction of the distribution.

disclaimer - i am not 100% certain about this next bit: in particular, the usual ellipse plotted is one that contains 68% of the data (it's a convention), which you would get by multiplying a and b by the square root of 2.3 (see the reference above to numerical recipes).

posted by andrew cooke at 6:44 AM on April 14, 2006
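since the question asks for a 95% ellipse, here is a hedged sketch of the scaling step. it assumes the data really are bivariate gaussian and ignores the sampling uncertainty in the estimated `a` and `b`, so treat it as an approximation for small data sets. under that assumption the squared scale factor follows a chi-square distribution with 2 degrees of freedom, whose quantile has the closed form `-2*ln(1 - p)`. the names `coverage_scale` and `ellipse_outline` are made up for this example; the outline is produced by inverting the rotation `X = c*dx - s*dy`, `Y = s*dx + c*dy` used in the recipe.

```python
import math


def coverage_scale(p):
    """Factor to multiply a and b by so the ellipse holds a fraction p
    of a 2-D gaussian (chi-square with 2 degrees of freedom)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.sqrt(-2.0 * math.log(1.0 - p))


def ellipse_outline(xbar, ybar, theta, a, b, p=0.95, steps=72):
    """Return steps + 1 (x, y) points tracing the scaled ellipse.

    The last point repeats the first (up to rounding) so the curve closes
    when drawn as a line plot.
    """
    k = coverage_scale(p)
    c = math.cos(theta)
    s = math.sin(theta)
    points = []
    for i in range(steps + 1):
        t = 2.0 * math.pi * i / steps
        # Point on the axis-aligned ellipse in the rotated (X, Y) frame.
        X = k * a * math.cos(t)
        Y = k * b * math.sin(t)
        # Undo the recipe's rotation: the inverse of
        # X = c*dx - s*dy, Y = s*dx + c*dy.
        dx = c * X + s * Y
        dy = -s * X + c * Y
        points.append((xbar + dx, ybar + dy))
    return points
```

for 68.3% the squared factor comes out at about 2.3, matching the value quoted above, and for 95% it is about 5.99, so `a` and `b` are multiplied by roughly 2.45. the returned points can be added to the XY graph as a second plot drawn as a line.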