Nyquist is not an issue here because the sample rate (2.5 MSPS) is so much higher than the frequency of the signal I’m measuring (60 Hz).
I think the key in that document is figure 10. In my case, the dithering effect is intrinsic because the true value of each sample is not an integer multiple of the LSB. If each sample has a different LSB error (some high, some low), does the LSB error average out, given enough samples, when performing the RMS calculation (N discrete consecutive samples converted to 1 RMS value)? How can the remaining error be calculated for a specified number of samples (e.g. 1 cycle captured by 512 samples producing 1 RMS value)?
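One way to get a feel for the size of this error is a quick simulation. The sketch below is only an illustration under assumptions that are mine, not from the document: an ideal 8-bit quantizer with no clipping and no analog noise, exactly one cycle in 512 samples, and hypothetical helper names (`quantize`, `rms`, `rms_error_lsb`, `survey`). As a rough reference, if the per-sample quantization errors were independent and uniformly distributed, the error in the RMS value would be about LSB/sqrt(12·N), roughly 0.013 LSB for N = 512. Real errors are correlated near the flat peaks of the sine, so the simulated spread may come out larger than that estimate.

```python
import math
import random

def quantize(x, bits, full_scale=1.0):
    """Ideal quantizer: round x to the nearest step of size 2*full_scale / 2**bits.
    Clipping is not modelled (assumption)."""
    lsb = 2.0 * full_scale / (2 ** bits)
    return round(x / lsb) * lsb

def rms(samples):
    """Quadratic mean of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def rms_error_lsb(amplitude, phase, bits=8, n=512):
    """Quantized RMS minus ideal RMS, in LSB, for one sine cycle of n samples."""
    lsb = 2.0 / (2 ** bits)
    ideal = [amplitude * math.sin(2 * math.pi * k / n + phase) for k in range(n)]
    quantized = [quantize(v, bits) for v in ideal]
    return (rms(quantized) - rms(ideal)) / lsb

def survey(trials=300, bits=8, n=512, seed=0):
    """Try random amplitudes and phases; return (worst, typical) RMS error in LSB."""
    rng = random.Random(seed)
    errors = [
        rms_error_lsb(rng.uniform(0.5, 0.99), rng.uniform(0.0, 2.0 * math.pi), bits, n)
        for _ in range(trials)
    ]
    worst = max(abs(e) for e in errors)
    typical = math.sqrt(sum(e * e for e in errors) / trials)
    return worst, typical

worst, typical = survey()
print(f"worst |error| = {worst:.4f} LSB, typical (RMS) error = {typical:.4f} LSB")
print(f"independent-error estimate = {1.0 / math.sqrt(12 * 512):.4f} LSB")
```

Whatever the waveform, the error in the RMS value cannot exceed 0.5 LSB for an ideal rounding quantizer, because the RMS of the per-sample errors is itself at most 0.5 LSB. Changing `n` or `bits` in `survey` shows how the spread depends on the number of samples and the resolution.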
Going back to the example shown in figure 10, if the undithered waveform was captured with 8-bit resolution, what equivalent resolution would you get if you averaged 512 dithered waveforms (8 + 9 bits)? Would the dithered resolution be the same for each point in the waveform regardless of the amplitude? The RMS calculation introduces a similar dithering effect, but it is a quadratic mean rather than a simple average.
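The averaging question can also be explored numerically. The sketch below makes assumptions that are not taken from the document: uniform dither of 1 LSB peak-to-peak, an ideal quantizer working in LSB units, and hypothetical names (`quantize_lsb`, `averaged_error`, `error_spread`). Since averaging M readings with uncorrelated errors reduces the error by sqrt(M), I would expect the gain from 512 captures to be closer to half of log2(512), on the order of 4 to 4.5 bits, than to 9 bits. With this particular dither the residual error also depends on where the true value sits between two codes, not on the amplitude as such.

```python
import math
import random

def quantize_lsb(x):
    """Ideal quantizer in LSB units: round to the nearest integer code."""
    return math.floor(x + 0.5)

def averaged_error(value_lsb, captures, rng):
    """Average `captures` dithered readings of one point; return the error in LSB."""
    total = 0
    for _ in range(captures):
        dither = rng.uniform(-0.5, 0.5)  # assumed: uniform dither, 1 LSB peak-to-peak
        total += quantize_lsb(value_lsb + dither)
    return total / captures - value_lsb

def error_spread(frac, captures=512, trials=200, seed=1):
    """RMS of the averaged error, in LSB, for a point `frac` LSB above code 100."""
    rng = random.Random(seed)
    errs = [averaged_error(100 + frac, captures, rng) for _ in range(trials)]
    return math.sqrt(sum(e * e for e in errs) / trials)

for frac in (0.0, 0.1, 0.25, 0.5):
    # Theory for this dither: variance of one reading is frac*(1-frac) LSB^2,
    # so the average of `captures` readings has spread sqrt(frac*(1-frac)/captures).
    predicted = math.sqrt(frac * (1.0 - frac) / 512)
    print(f"frac={frac:.2f}  simulated={error_spread(frac):.4f} LSB  predicted={predicted:.4f} LSB")
```

For comparison, a single undithered 8-bit reading has an RMS quantization error of about 0.29 LSB (1/sqrt(12)). Triangular dither of 2 LSB peak-to-peak would make the per-reading error variance independent of the input, at the cost of slightly more noise per reading.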