You will find several knowledgeable people willing to help -- all you need to do is to take the LabVIEW Project that shows this behavior, compress the folder containing the Project and all of its VIs, attach the Zip file to your reply, and make sure we know enough to be able to run it.
You have accurately described how asynchronously started VIs run -- they run in parallel with the main VI. However, there "is no free lunch" -- if you are trying to compute something at maximum speed, breaking it up into two pieces and trying to run them "at the same time" will not make it finish any sooner (and may well be slower). There are only so many CPU cycles available -- if both routines "want it all", they will need to "share" the CPU.
Where asynchrony really wins, of course, is when certain tasks have "waiting" times built into them. For example, if you are doing disk I/O, while the disk heads are moving and the disk is spinning, another routine can be doing computation. Similarly, if you are doing A/D sampling of 1000 points at 1 kHz, this routine will, once a second, hand you 1000 points, but 99.9% of the time it will be "waiting", allowing any other process that needs the CPU to use those cycles "for free".
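Since LabVIEW code is graphical, here is a rough text-language sketch of the same idea in Python. It is only an illustration, not LabVIEW's mechanism: `acquire_samples`, `compute`, and the 0.2-second wait are made-up stand-ins for an A/D read and for whatever computation you are doing. The "acquisition" spends nearly all of its time waiting, so running the computation alongside it costs almost nothing extra.

```python
import threading
import time


def acquire_samples(results, wait_s=0.2, n_points=1000):
    """Hypothetical stand-in for an A/D read: wait, then hand back a block of points."""
    time.sleep(wait_s)  # simulated hardware wait; the CPU is free during this time
    results["samples"] = [0.0] * n_points


def compute(results, n=200_000):
    """Hypothetical stand-in for computation that can use the otherwise idle CPU."""
    results["sum"] = sum(i * i for i in range(n))


def run_sequential():
    """Acquire first, then compute: total time is roughly wait time plus compute time."""
    results = {}
    start = time.perf_counter()
    acquire_samples(results)
    compute(results)
    return time.perf_counter() - start, results


def run_overlapped():
    """Start the acquisition in a thread and compute while it is waiting."""
    results = {}
    start = time.perf_counter()
    worker = threading.Thread(target=acquire_samples, args=(results,))
    worker.start()
    compute(results)  # runs while the worker thread is asleep
    worker.join()     # wait for the acquisition to hand over its points
    return time.perf_counter() - start, results


if __name__ == "__main__":
    seq_time, _ = run_sequential()
    ovl_time, _ = run_overlapped()
    print(f"sequential: {seq_time:.3f} s, overlapped: {ovl_time:.3f} s")
```

On most machines the overlapped version should take close to the wait time alone, while the sequential version takes the wait time plus the compute time. If you replaced the waiting routine with a second compute-heavy routine, the two would have to share the CPU and the advantage would largely disappear.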
I suspect your processes are "compute-bound", and might not benefit from a "Divide and Conquer" approach. But without seeing the code, it's only a guess ...
Bob Schor