The first loop can be replaced by a simple "Index Array". Why would you need to sort the same array twice in parallel? Wouldn't once be enough? Why is the count a DBL instead of an integer? The second loop could use a conditional tunnel and shrink to about 10% of the current code. Why do you wire the lower array twice into the third loop stack instead of using autoindexing?
A good exercise would be to try again and do it all in a single simple loop!