Some of this is probably pretty system dependent and variable, but I did a quick trial run myself with a desktop PCIe board and found that I had to call DAQmx Control Task with "reserve" or "commit" to force the buffer allocation to actually take place. Even so, the Write still took the majority of the time.
Note, though, that if you "commit", subsequent stop/start cycles will be more efficient because of the DAQmx Task State Model. The savings are probably only in the realm of single-digit milliseconds, though, so they may be negligible if you need to write different data into the buffer before restarting. (If you stop and restart without writing different data, the task will just start over with the buffer of data you wrote previously.)
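To illustrate why an explicit "commit" makes stop/start cycles cheaper, here is a toy Python model of the task state model. It is only a sketch, not the NI-DAQmx API: the `ToyTask` class, its methods, and the `cycle_cost` helper are made up for illustration, and counting state transitions is just a stand-in for time (the real cost of each transition depends on the hardware and driver). It assumes the documented behavior that a stopped task falls back to the last state it was explicitly moved to.

```python
from enum import IntEnum


class State(IntEnum):
    UNVERIFIED = 0
    VERIFIED = 1
    RESERVED = 2
    COMMITTED = 3
    RUNNING = 4


class ToyTask:
    """Toy stand-in for a DAQmx task (hypothetical, not the real API)."""

    def __init__(self):
        self.state = State.UNVERIFIED
        self.explicit = State.UNVERIFIED  # last explicitly requested state
        self.transitions = 0              # single-step transitions performed

    def _move_to(self, target):
        # Every step up or down the state ladder counts as one transition.
        self.transitions += abs(target - self.state)
        self.state = target

    def control(self, target):
        # Models Control Task with "verify", "reserve" or "commit".
        if target == State.RUNNING:
            raise ValueError("use start() to run the task")
        self._move_to(target)
        self.explicit = target

    def start(self):
        # Implicitly walks through any states not yet reached.
        self._move_to(State.RUNNING)

    def stop(self):
        # Falls back only as far as the last explicit state.
        self._move_to(self.explicit)


def cycle_cost(commit_first, cycles=3):
    """Count the transitions spent on repeated start/stop cycles."""
    task = ToyTask()
    if commit_first:
        task.control(State.COMMITTED)
    before = task.transitions
    for _ in range(cycles):
        task.start()
        task.stop()
    return task.transitions - before


if __name__ == "__main__":
    print("without commit:", cycle_cost(commit_first=False))
    print("with commit:   ", cycle_cost(commit_first=True))
```

In this model an uncommitted task climbs and unwinds the whole ladder on every cycle, while a committed task only moves between "committed" and "running". That is the same effect as described above, although the model says nothing about how many milliseconds are actually saved.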
This is interesting! I could reproduce your findings with my board: when reserving the buffer first and committing it, the loading time for a 16 MB buffer drops from 160 ms to 110 ms, which is quite substantial.
As for re-triggering the same task, with or without reloading the buffer, I must be doing something wrong, because I cannot get it to work properly: when restarted, the task exits immediately at the Wait Until Done. Here is a snapshot of my updated test VI in case you can point out my mistake.