03-23-2022 10:45 AM - edited 03-23-2022 10:49 AM
All arrays are stored in contiguous memory, with an I32 defining the size of each dimension. Calculating the linear offset of a given element from five index integers cannot be infinitely fast, of course (just try it with explicit code!).
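To make the "try it with explicit code" point concrete, here is a minimal sketch of the offset calculation in Python. It assumes row-major order (last dimension varies fastest); the function name `linear_offset` and the example dimension sizes are made up for illustration and are not taken from the original problem.

```python
def linear_offset(index, dims):
    """Return the flat offset of a multi-dimensional index.

    Assumes row-major layout: the last dimension varies fastest.
    """
    if len(index) != len(dims):
        raise ValueError("index and dims must have the same length")
    offset = 0
    for i, n in zip(index, dims):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        # One multiply and one add per dimension (Horner-style evaluation).
        offset = offset * n + i
    return offset


# Hypothetical 5D array of size 4 x 5 x 6 x 7 x 8 (6720 elements).
dims = (4, 5, 6, 7, 8)
print(linear_offset((1, 2, 3, 4, 5), dims))  # 2557
```

Even in this compact form, every single element access costs five multiplies, five adds, and the range checks, and stepping the first index by one moves 5 x 6 x 7 x 8 = 1680 elements through memory.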
So you have a huge linear memory space, and each CPU core accesses elements scattered all over it (far more data than fits in the CPU cache), so I would expect significant memory thrashing too.
So what do the 5 dimensions represent? How often do the dimension sizes change? How do you need to access them? Operate on them? etc. Maybe we can come up with a data structure that is much more suitable. If one of the operations is the thresholding, maybe that could be done as part of one of the other processing steps (e.g. as the array is initially filled or during further processing).
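As a rough sketch of the "threshold while filling" idea, the Python below does both in a single pass over the data. What the thresholding actually needs to produce is not known here, so this assumes it means "record the flat offsets of all elements above a threshold"; the function name `fill_and_threshold` and the sample values are hypothetical.

```python
def fill_and_threshold(values, threshold):
    """Fill a flat buffer and collect offsets of elements above threshold.

    Assumption: "thresholding" means finding elements greater than
    `threshold`. Both steps share one sequential pass over the data.
    """
    data = []
    above = []
    for offset, v in enumerate(values):
        data.append(v)            # the fill step
        if v > threshold:         # the threshold step, on the same element
            above.append(offset)
    return data, above


data, above = fill_and_threshold([0.1, 0.9, 0.5, 0.7], 0.6)
print(above)  # [1, 3]
```

Each element is touched once, in order, while it is already at hand, instead of being revisited later with scattered 5D index lookups.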