04-12-2007 04:02 PM
04-13-2007 08:42 AM
Still hanging around, but *very* short on discretionary time lately. I briefly tried out your earlier post with milqman's pitch-shift algorithm and the results were audibly pretty similar to what I remembered hearing before you removed the DC offset. Haven't had a chance to dig further.
I only looked quickly at your most recent post -- the format is harder to deal with than the earlier one. All I did was look at the length of the different 1-D arrays that were bundled into clusters and saw that they weren't uniform at 512. I *think* I see what you did, but the data format isn't so easy to work with.
I really don't need you to break down the raw data at all, since I already have it in the .wav file. I only need the result of your voice and pitch detection algorithm to tell me how to break it down. How about the following:
Give me two 1D integer arrays of equal length. Their contents are:
a. # of samples to consider in this chunk. Nominal value is 512. Actual value may be smaller (due to need for integer number of segments in voiced chunks) or larger (samples leftover from previous voiced chunks).
b. # of samples per sub-segment. For voiced chunks, value (a) above must be an integer multiple of value (b) here. For unvoiced chunks, value (b) here should be a special impossible value like 0 or -1 to identify the chunk as unvoiced. Just tell me which.
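To make the proposed format concrete, here is a rough text-language sketch (Python rather than LabVIEW) of how the two arrays could be consumed on my end. The function name `split_chunks` and the argument names are made up for illustration, and it assumes any value of 0 or less in array (b) marks an unvoiced chunk, which is only one of the two options above.

```python
def split_chunks(samples, chunk_lengths, segment_lengths):
    """Split samples into chunks using the two equal-length integer arrays.

    chunk_lengths   -- array (a): number of samples in each chunk
    segment_lengths -- array (b): samples per sub-segment, or <= 0 for
                       an unvoiced chunk (assumed sentinel convention)
    Returns a list of (is_voiced, list_of_segments) tuples.
    """
    if len(chunk_lengths) != len(segment_lengths):
        raise ValueError("the two arrays must have equal length")

    chunks = []
    pos = 0  # running index into the raw sample data
    for n, seg in zip(chunk_lengths, segment_lengths):
        if n <= 0:
            raise ValueError("chunk length must be positive")
        if pos + n > len(samples):
            raise ValueError("chunk lengths exceed the available samples")
        block = samples[pos:pos + n]
        if seg <= 0:
            # Unvoiced chunk: keep it as one undivided block.
            chunks.append((False, [block]))
        else:
            # Voiced chunk: (a) must be an integer multiple of (b).
            if n % seg != 0:
                raise ValueError("voiced chunk is not a whole number of segments")
            segments = [block[i:i + seg] for i in range(0, n, seg)]
            chunks.append((True, segments))
        pos += n
    return chunks


# Example: a short voiced chunk, a longer unvoiced chunk that picks up
# the leftover samples, then a nominal 512-sample voiced chunk.
samples = list(range(1536))
result = split_chunks(samples, [500, 524, 512], [100, 0, 128])
```

With that layout, the raw samples stay in the .wav file and the two arrays carry only the chunk boundaries and pitch periods.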
-Kevin P.
04-13-2007 02:07 PM
04-13-2007 03:21 PM
04-13-2007 04:57 PM
04-13-2007 05:03 PM
04-14-2007 08:30 AM
04-14-2007 12:51 PM
04-14-2007 02:18 PM
04-18-2007 11:54 AM