processor dependent math results

Oskar_Bosch · ‎02-23-2010

We are getting math results (in this case from matrix multiplication) that differ depending on the Intel processor used. We have seen differences between the Core 2 Duo T7700, the Xeon X5460 and a Pentium 4 based Xeon processor. The OS did not seem to make a difference (we tried XP 32-bit, XP x64, Vista x64 and Windows 7 x64).

See also the source code with an executable here:

http://www.boschjes.com/LV2009ISSUE

We implemented our own (10x slower) matrix multiplication and that gave the same results on all platform (OS / Processor) combinations.
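A minimal sketch of that kind of cross-check, written in Python rather than LabVIEW. The function names `matmul_reference` and `max_abs_diff` are hypothetical and not taken from the linked source code; the idea is only that a plain triple loop with a fixed summation order gives a baseline against which a library result can be compared.

```python
from typing import List

Matrix = List[List[float]]


def matmul_reference(a: Matrix, b: Matrix) -> Matrix:
    """Multiply a (rows x inner) by b (inner x cols) with a fixed summation order."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            # Always accumulate in the same order, so rounding is reproducible.
            for p in range(inner):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


def max_abs_diff(x: Matrix, y: Matrix) -> float:
    """Largest element-wise absolute difference between two equally sized matrices."""
    return max(abs(u - v) for rx, ry in zip(x, y) for u, v in zip(rx, ry))


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    reference = matmul_reference(a, b)
    # In practice the second argument would be the result of the library routine.
    print(max_abs_diff(reference, reference))
```

Comparing `max_abs_diff` against a small tolerance, rather than testing for exact equality, separates ordinary rounding differences from grossly wrong results.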

Has anyone experienced similar issues? If so, in which math function? We have only investigated the matrix multiplication function so far.

These tests were done using LabVIEW 2009 f2. We are still considering comparison tests with LabVIEW 7.1.1 and LabVIEW 8.6.1.

Does anyone have any ideas?

Oskar

Oskar_Bosch · ‎02-24-2010

Has anyone done any experiments with the math routines on different processors?

That is our main issue: different processors give different results.

Any feedback is appreciated.

DSPmchen · ‎02-25-2010

Hi Oskar,

LabVIEW uses the Intel Math Kernel Library (MKL) underneath to accelerate linear algebra calculations. Intel MKL uses differently optimized code on different processors, based on which SSE instructions the CPU supports. MKL also returns different results in 32-bit and 64-bit applications. So it is possible that you will get slightly different results on different CPUs.
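A small illustration of why differently ordered code can give slightly different results, in Python rather than LabVIEW. It says nothing about MKL internals; it only shows that floating-point addition is not associative, so any routine that groups the same sums differently may round differently.

```python
# The same three numbers, added with two different groupings.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two results differ in the last bit of the double-precision value.
print(left_to_right == right_to_left)
print(abs(left_to_right - right_to_left))
```

Differences of this kind are on the order of the last bit or so per operation, which is why a much larger discrepancy points to something other than rounding.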

But based on your data, the difference between the Core 2 and the Xeon X5460 seems really high. Are you using the 32-bit or the 64-bit version of LabVIEW on your Xeon X5460 machine?

If you are using the 32-bit version of LabVIEW on the Xeon X5460 machine, there is a bug in MKL that causes incorrect matrix multiplication results in some specific cases.

http://software.intel.com/en-us/articles/dgemm-and-sgemm-accuracy/

It says that, for 32-bit programs on the Xeon 5400 series, incorrect results occur for input matrices for which all of the following conditions are true:

M>2 for DGEMM

N>1537 for DGEMM

K>256 and not divisible by 256

Your input matrices meet these conditions.
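The three conditions as quoted above can be written as a simple check. This is a sketch in Python with a hypothetical function name, and it assumes the usual GEMM convention in which A is M x K, B is K x N and the result is M x N. It only encodes the DGEMM conditions listed in this post, not the full Intel article.

```python
def meets_quoted_dgemm_conditions(m: int, n: int, k: int) -> bool:
    """Return True when (M, N, K) satisfy all three conditions quoted in this thread."""
    return m > 2 and n > 1537 and k > 256 and k % 256 != 0


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to exercise the check.
    print(meets_quoted_dgemm_conditions(1000, 2000, 300))
    print(meets_quoted_dgemm_conditions(1000, 2000, 512))
```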

One workaround would be to split the input matrix A into two parts and multiply each part by B separately. Then combine the two resulting matrices.
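The split-and-combine idea can be sketched as follows, in Python rather than LabVIEW. The names `matmul` and `matmul_by_row_blocks` and the `block_rows` parameter are hypothetical; the plain triple loop stands in for whatever multiplication routine is actually called, and the sketch only shows that multiplying row blocks of A by B and stacking the pieces reproduces the full product. It does not demonstrate that any particular split avoids the MKL bug.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain triple-loop product, standing in for the library routine."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[p] * b[p][j] for p in range(inner)) for j in range(cols)]
        for row in a
    ]


def matmul_by_row_blocks(a: Matrix, b: Matrix, block_rows: int = 2) -> Matrix:
    """Multiply A by B one row block of A at a time and stack the partial results."""
    if block_rows < 1:
        raise ValueError("block_rows must be at least 1")
    result: Matrix = []
    for start in range(0, len(a), block_rows):
        # Each block of rows of A yields the matching block of rows of the product.
        result.extend(matmul(a[start:start + block_rows], b))
    return result


if __name__ == "__main__":
    a = [[float(i * 3 + j) for j in range(3)] for i in range(5)]
    b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(matmul_by_row_blocks(a, b, block_rows=2) == matmul(a, b))
```

Because each row of the product depends only on the corresponding row of A, splitting A by rows changes the size of each call without changing the arithmetic for any single element.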

Thanks,

Michael

Oskar_Bosch · ‎02-25-2010

We are indeed using the 32-bit version of LabVIEW on all machines.

I'll check up on that bug report.

Would the 64-bit version of LabVIEW 2009 fix this?

If so, I can have that person use that version, assuming the source code is compatible and transferable between the 32-bit and 64-bit versions of LabVIEW.

Thanks for the quick reply!!!!!

DSPmchen · ‎02-25-2010

Hi Oskar,

According to the Intel bug report linked above, 64-bit MKL does not have this problem, so 64-bit LV should give correct results here. However, I cannot verify this because I don't have a computer with a Xeon 5400 series processor. (Note that this Intel bug is only reproducible on some specific CPUs.)

The source code is compatible and transferable between 32-bit and 64-bit LV.

Best Regards,

Michael

Message Edited by DSPmchen on 02-25-2010 08:29 PM

Oskar_Bosch · ‎03-04-2010

Michael,

So the solutions seem to be:

1/ Use the 64-bit version of LabVIEW 2009 on this machine. Possible!

2/ Download and install the latest MKL 10.2.

The problem with option 2, which I would like to try: how do I get MKL 10.2?

And how would I even be able to verify which MKL is currently used on a system?

I guess I would be looking for an MKL runtime?

On the Intel page I did not see any way to download a new version of the MKL.

http://software.intel.com/en-us/intel-mkl

Would that appear once I register? Would we have to purchase the MKL?

So still hunting this down.

I also wonder whether SP1 of LabVIEW 2009 would fix this (would that be compiled with MKL 10.2?).

The problem with installing the 64-bit version of LabVIEW 2009 is that we would no longer be able to build applications on that machine, as most others are using 32-bit Windows (for now...).

Thanks for your quick replies, Michael. Much appreciated!!!

Oskar

DSPmchen · ‎03-08-2010

Hi Oskar,

LabVIEW 2009 SP1 still uses MKL 10.1 because we found some other issues with MKL 10.2. So LabVIEW 2009 SP1 does not fix this.

On http://software.intel.com/en-us/intel-mkl, there is an evaluation option. You can download an evaluation version of MKL that is valid for 30 days.

I created a revised matrix multiply VI which divides matrix A into submatrices and multiplies them by B block by block.

I hope this fixes your problem. Could you please test it on your Xeon X5460 PC?

Best Regards,
Michael

Oskar_Bosch · ‎03-08-2010

This could be a good option. I will evaluate it on the PC in question (the Xeon one).

It does not perform as well, though: it is about a factor of 1.6 slower on a dual-core machine.

I will also check on a quad core. It might be better there.

(Updating that machine right now to SP1.... will take a while).

Good idea to split the matrix into smaller subsets to work around the issue.

I still need to read the Intel article in more detail, but I was wondering whether there might be issues if the second matrix gets bigger than a certain size. I noticed that you only split up the first matrix (into chunks smaller than 1000 rows).

Thanks, very good idea.

Message Edited by Oskar Bosch on 03-08-2010 07:39 AM

George_P_Burdell · ‎03-08-2010

Oskar-

After getting further information from our R&D department: it is not possible to change the version of MKL that a particular version of LabVIEW uses, because the library is built into the compiled code. Therefore, upgrading the MKL on your system will not affect the behavior you see. However, now that you are aware of this, you can anticipate it and architect your code with it in mind. This Intel bug is documented on their website, which also provides a workaround for the incorrect results you are seeing (linked below). With the release of LabVIEW 2009, MKL 10.0 was implemented into our development system, as documented in the LabVIEW 2009 Help: LabVIEW 2009 Features and Changes (linked below). Each new version of LabVIEW may or may not update this library; if it does, the update will be documented in that version's release notes.

Intel Software Network: Incorrect results for DGEMM and SGEMM
http://software.intel.com/en-us/articles/dgemm-and-sgemm-accuracy/

LabVIEW 2009 Help: LabVIEW 2009 Features and Changes
http://zone.ni.com/reference/en-XX/help/371361F-01/lvupgrade/labview_features/

Regards,

Mike S
NI AE

Oskar_Bosch · ‎03-08-2010

Michael,

Checked your solution on the offending hardware (Xeon processor). It seems to work with this fix.

We actually get interesting benchmark results:

CPU                            Built-in matrix mult.   Fixed matrix mult.
Core 2 Quad Q9650 @ 3.0 GHz    8                       11
Xeon 5460 @ 3.16 GHz           16                      13

(times are in msec)

So it seems that this Xeon processor is no math powerhouse. I am surprised by this.

Again, thanks for the help. We'll likely implement your fix!!

Oskar
