I posted
a query over at the AMD forums, and here's what I was told.
I had hoped that the "128b FADD" unit, for example, would be able to do something like the following:
/* "quad" is a hypothetical 128-bit quad precision */
/* floating point number, similar to "long double" */
/* in recent versions of C++: */
quad x, y, z;
x = 1.000000000000000000000000000001;
y = 1.000000000000000000000000000001;
/* the hope was that "128b FADD" could perform the */
/* following 128-bit addition in hardware: */
z = x + y;
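As it happens, you can get this in software today: GCC's __float128 type, backed by its libquadmath library, performs the 128-bit arithmetic in library routines rather than in "128b FADD". A minimal sketch, assuming GCC and linking with -lquadmath:
#include <quadmath.h>
#include <stdio.h>

int main(void)
{
    /* the Q suffix marks a __float128 literal, so the extra */
    /* digits survive beyond double precision: */
    __float128 x = 1.000000000000000000000000000001Q;
    __float128 y = 1.000000000000000000000000000001Q;
    __float128 z = x + y;  /* computed in software, not in "128b FADD" */

    char buf[64];
    quadmath_snprintf(buf, sizeof buf, "%.33Qg", z);
    printf("%s\n", buf);   /* approximately 2.000000000000000000000000000002 */
    return 0;
}
The tradeoff, of course, is that every operation is a library call, so software quad precision runs far slower than a hardware adder would.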
However, the answer I'm getting is that "128b FADD" is just a pair of 64-bit adders running in parallel, capable of adding two vectors of 64-bit doubles element-wise, more or less simultaneously:
double x[2], y[2], z[2];
x[0] = 1.000000000000000000000000000001;
y[0] = 1.000000000000000000000000000001;
x[1] = 2.000000000000000000000000000222;
y[1] = 2.000000000000000000000000000222;
/* Apparently the coordinates of the two "vectors" x & y */
/* can be sent to "128b FADD" in parallel, and the following */
/* two summations can be computed more or less simultaneously: */
z[0] = x[0] + y[0];
z[1] = x[1] + y[1];
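From the programmer's side, this is exactly what the SSE2 intrinsic _mm_add_pd exposes: one packed instruction that performs both 64-bit additions at once. A minimal sketch of the snippet above:
#include <emmintrin.h>
#include <stdio.h>

int main(void)
{
    double x[2] = { 1.000000000000000000000000000001,
                    2.000000000000000000000000000222 };
    double y[2] = { 1.000000000000000000000000000001,
                    2.000000000000000000000000000222 };
    double z[2];

    __m128d vx = _mm_loadu_pd(x);  /* pack x[0] and x[1] into one 128-bit register */
    __m128d vy = _mm_loadu_pd(y);
    _mm_storeu_pd(z, _mm_add_pd(vx, vy));  /* both 64-bit adds in one instruction */

    printf("%g %g\n", z[0], z[1]);  /* 2 4, to double precision */
    return 0;
}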
Thus e.g. "128b FADD", working in concert with "128b FMUL", will be able to [more or less] halve the amount of time it takes to compute a dot product of vectors whose coordinates are 64-bit doubles.
So this "128-bit" circuitry is great if you're doing lots of linear algebra with 64-bit doubles, but it doesn't appear to offer anything in the way of greater precision for people who are interested in precision-sensitive calculations.
By the way, if you're at all interested in questions of precision sensitivity & round-off error, I'd highly recommend Prof Kahan's page at Cal-Berzerkeley:
How Java's Floating-Point Hurts Everyone Everywhere (PDF)
Matlab's Loss is Nobody's Gain (PDF)