Hi Philippe_RSA,
First, in general, when performance matters, Formula Nodes should be avoided. They are slower than LabVIEW primitives and basic functions.
Second, in general, you do not need to convert an (x,y,z) vector into a 4x4 matrix. You only need to append a 1 at the end to get (x,y,z,1). This is the reason the 4x4 rotation-translation matrix was invented in the first place: multiplying it by the 4-element vector applies the rotation and the translation in a single step. You can use the LabVIEW matrix VIs to multiply a matrix by a vector directly.
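To make the (x,y,z,1) idea concrete, here is a minimal sketch in C++ (LabVIEW being graphical, I cannot paste a diagram as text). The names Vec4, Mat4, transform and example_matrix are made up for this illustration, and the matrix is assumed to be stored row-major with the translation in the last column.

```cpp
#include <array>

using Vec4 = std::array<double, 4>;   // homogeneous point (x, y, z, 1)
using Mat4 = std::array<double, 16>;  // 4x4 matrix, row-major (assumption)

// Multiply a 4x4 matrix by a homogeneous column vector.
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 out = {0.0, 0.0, 0.0, 0.0};
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 4; ++c) {
            out[r] += m[4 * r + c] * v[c];
        }
    }
    return out;
}

// Illustrative matrix: rotate 90 degrees about z, then translate by (10, 0, 0).
// The upper-left 3x3 block is the rotation, the last column is the translation.
Mat4 example_matrix() {
    Mat4 m = {0.0, -1.0, 0.0, 10.0,
              1.0,  0.0, 0.0,  0.0,
              0.0,  0.0, 1.0,  0.0,
              0.0,  0.0, 0.0,  1.0};
    return m;
}
```

With this matrix, the point (1,0,0,1) becomes (10,1,0,1): the rotation moves it to (0,1,0) and the trailing 1 picks up the translation column.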
Third, use For Loops with parallelism enabled. Make sure that all the VIs called within the loop are reentrant; otherwise the parallel iterations end up waiting on each other. You might also be able to improve performance by moving the heavy matrix operations into a DLL compiled in C++ and calling it from LabVIEW. By playing with those options, you should be able to get a significant performance improvement.
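As a rough idea of what such a DLL function could look like, here is a sketch that transforms a whole array of points in one call, so the call overhead is paid once rather than per vertex. Everything in it is an assumption for illustration: the function name transform_points, the EXPORT macro, the row-major 4x4 matrix layout, and the packed x,y,z point layout. You would wire it through a Call Library Function Node with the arrays passed as array data pointers.

```cpp
#include <cstdint>

#if defined(_WIN32)
#define EXPORT extern "C" __declspec(dllexport)
#else
#define EXPORT extern "C"
#endif

// Hypothetical export: apply a 4x4 row-major matrix m (16 doubles) to n points.
// in and out each hold 3*n doubles packed as x0,y0,z0,x1,y1,z1,...
// The homogeneous 1 is implicit, so the last matrix column is simply added.
// in and out must not overlap.
EXPORT void transform_points(const double* m, const double* in,
                             double* out, std::int32_t n) {
    for (std::int32_t i = 0; i < n; ++i) {
        const double x = in[3 * i];
        const double y = in[3 * i + 1];
        const double z = in[3 * i + 2];
        out[3 * i]     = m[0] * x + m[1] * y + m[2]  * z + m[3];
        out[3 * i + 1] = m[4] * x + m[5] * y + m[6]  * z + m[7];
        out[3 * i + 2] = m[8] * x + m[9] * y + m[10] * z + m[11];
    }
}
```

The bottom row of the matrix is never read because it is (0,0,0,1) for a pure rotation-translation. Whether this beats the native matrix VIs depends on the number of points, so benchmark it before committing to it.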
In any case, you should not expect to get anywhere near the performance you see when rotating objects in 3D picture controls or in most other 3D applications. Those use the graphics card to manipulate the objects. Vertex rotation is one of the operations that graphics cards are designed for, and they are massively parallel.
Good luck
Marc Dubois