Eventually resolved by rearchitecting the processing module so that, instead of spawning 50+ instances of the same module, I run 1 module with an array of processes to do inside. The processes run in parallel loops inside the module, and in each case structure of the MSG handler architecture I define how many of those loops are allowed to run at once (as a % of the number of logical processors, or some other determining factor).
The trade-off is that I no longer run at full CPU usage, so that data takes longer to process.
(Occasionally there are disadvantages to getting all of the bottlenecks out of a processing module: it can run re-entrant so efficiently that it hogs the whole CPU! 🤔)
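
LabVIEW is graphical, so there is no code to paste here, but as a rough text-language analogy the idea could be sketched in Python like this. It assumes a single pool of workers whose size is capped at a fraction of the logical processors. The names `worker_limit` and `process_all` are hypothetical, made up for illustration, and this is not the actual module described above.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def worker_limit(fraction, logical_cpus=None):
    """Return how many loops may run at once, as a fraction of logical CPUs."""
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    # Always allow at least one worker so processing can make progress.
    return max(1, int(logical_cpus * fraction))


def process_all(func, items, fraction=0.5, logical_cpus=None):
    """Run func over every item using one capped pool, not one instance per item."""
    limit = worker_limit(fraction, logical_cpus)
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(func, items))


if __name__ == "__main__":
    # 50 work items, but never more than half the logical CPUs busy at once.
    results = process_all(lambda x: x * x, range(50), fraction=0.5)
    print(len(results))
```

In CPython, threads do not run CPU-bound Python code truly in parallel, so for heavy number crunching `ProcessPoolExecutor` (with a module-level function in place of the lambda) would be the closer match. The capping logic stays the same either way.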
CLD; LabVIEW since 8.0. Currently have LabVIEW 2015 SP1, 2018 SP1 & 2020 installed