How can you crossfade (fade in/out) arrays?

Milqman -- I appreciate the vote of confidence, but in fact you're way ahead of me on this.  I still don't have a grasp on exactly what the algorithm should do.  Some parts I'm pretty sure I follow, other parts I'm pretty sure I don't. 

madgreek - here's a screenshot of Milqman's latest post.  I'd like to try to help more now, but I really don't have time available to study & learn the theory.  If I can understand the problem in terms of generic input arrays that need to be manipulated in a specific way to produce output array(s), I may be able to help a bit. 

It would help a lot if the starting array contained default data corresponding to, say, the 'hijacked.wav' file.  Then I could play the result with the Sound vi's to help determine whether the algorithm worked right or not.  Right now, the default data is just 2 rows of mostly 0 values.

I wish I had more spare time available to delve into things more deeply but right now I simply don't. 

-Kevin P.

Message Edited by Kevin Price on 03-27-2007 10:03 PM

Message 31 of 85
Kevin, by default data of "hijacked.wav" do you mean the original sound data or something else?
I guess Milq used the 2D array of zeroes as an input because in the code I posted earlier my final output array is a 2D array, and he just simulated another one as an input. If this is what you are referring to, I can repost my code; if not, I guess I didn't understand correctly what you meant 🙂
Message 32 of 85
Madgreek, are these segments going to be of constant size (are you sampling at a constant multiple of the pitch)?  If so, the problem might be easier.  Until you actually get the code, take a look at the picture Kevin posted and follow what is happening (hard to do without probing, I know).  This gets you a lot closer to your goal, and is currently only fishy around the first and last indexes of each segment.  I bet if you sit down with it and follow along, you can probably iron it out pretty quickly, and that little exercise will teach you how it works (so you can make it better and shiny and such).

Kevin, see the attached Audacity screen shot.  Their default fade in/out tool gives a triangular fade (bad for constant volume).  If you fade it their way, then puff it out with their "envelope" tool (up in the top-left tool tray), you can see that the transition is imperceptible (is that a word?).  Either way, I could not hear the transition and I was listening pretty closely.  Give the envelope tool a couple of tries to see how it works, then puff out the fades like I did; it might work for you too.  Best of luck with your fan-lover and your miserly tactics 😉
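For anyone who wants the idea in text form: here's a quick sketch (Python/NumPy rather than LabVIEW, and the function name is just made up) of why the triangular fade dips in loudness and what an equal-power fade does instead:

```python
import numpy as np

def crossfade(a, b, n, equal_power=True):
    """Crossfade the last n samples of a into the first n samples of b."""
    t = np.linspace(0.0, 1.0, n)
    if equal_power:
        fade_out = np.cos(t * np.pi / 2)  # 1 -> 0; cos^2 + sin^2 = 1, so
        fade_in = np.sin(t * np.pi / 2)   # the summed power stays constant
    else:
        fade_out = 1.0 - t                # triangular: amplitudes sum to 1,
        fade_in = t                       # but power dips to 0.5 mid-fade
    blended = a[-n:] * fade_out + b[:n] * fade_in
    return np.concatenate([a[:-n], blended, b[n:]])

# Example: splice two 440 Hz bursts with a 1000-sample fade
fs = 8000
tone = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
spliced = crossfade(tone, tone, 1000)
```

With equal_power=False you get the dip that the envelope-tool puffing is compensating for.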

This has been a solid educational thread, right on!
~milq

Message Edited by Milqman on 03-27-2007 11:41 PM

Message 33 of 85
Yes, they will be of constant size: unvoiced at 512, and voiced at whatever the pitch is. Even if you change the segment size (e.g. 256), the pitch periods are still going to be the same. That is what I am trying to do right now... going through your code... kinda hard to visualize the whole thing without running it 🙂 ...thank you again
Message 34 of 85
My code does not touch unvoiced stuff.

It starts with a chunk of voiced segments and shifts their pitch, then outputs a composite array.  It will likely be a subVI in whatever big grand thing you are doing.  Have fun; some of the record-keeping math in the inner loop on the right is trixy.

~milq
Message 35 of 85
OK milq... will keep that in mind... 🙂
Message 36 of 85
Kevin
 
If you are still on this post: right now I am trying to rebuild Milq's code in 7.0 and see how it works. I am attaching a quick graph of what the whole thing is about, hoping to help a bit more... I am also attaching my code showing the real data of the .wav file.
 
Kind regards
greek
Message 37 of 85

I'm still checking in on the thread, but at this point don't have much to offer.  Various diagrams and code have given me some hints, but I've got absolutely no background in thinking about speech processing.  There's a lot of terminology that's simply foreign to me, and I'm presently too busy on a project to take the needed time to study up on it. 

Let me summarize what I think I do and don't know here.  I think these relate only to the pitch-shift aspect of processing:

1. Start with big array of sound samples, such as 8-bit samples from 'hijacked.wav'

2. Break array down into some standard chunk size and evaluate & process one chunk at a time.  (Do these chunks overlap?  Are these called either voiced or unvoiced segments, depending on some evaluation criteria?)

3. If chunk is considered "unvoiced", then what?  Does it pass unchanged to become a chunk of the output .wav?

4. If the chunk is considered "voiced", then the chunk is further subdivided into several sub-chunks.  The number and/or length of the sub-chunks is determined by some evaluation of the chunk in question.  It's still unclear to me how to handle the special case of the first and last sub-chunks. 

5. The original mapping of sub-chunks would have them overlap one another by 50%?   Then, depending on the desired ratio of compression / expansion, some number of sub-chunks are either duplicated and interleaved or are deleted.  Regardless of which happens, the new amount of overlap must be set to something different from 50% so that the size of the entire output chunk will be the same as the input size.

6. Once the voiced chunk has had its pitch shifted through the interleaving or deletion of sub-chunks, the resulting chunk is passed out to be part of the overall output .wav?

7. After processing all chunks, the output .wav can be played back and should clearly demonstrate the result of pitch-shifting.

I believe milqman's code tried to address steps 4-6.  I couldn't do anything with it because the default input chunk data didn't have any meaning to me, so neither did the output.  I could probably help some if milqman's code were made into a subVI representing steps 4-6, and steps 1-3 and 7 were implemented around it using a known input like 'hijacked.wav'.  *Then* I'd have feedback that was meaningful to me about how the processing in steps 4-6 affected the final resulting sound file.  
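To pin down what I mean, here's that wrapper structure in rough text form (Python-ish pseudocode rather than LabVIEW; is_voiced() and shift_voiced_chunk() are just placeholders I invented for steps 2-6, not anything from milqman's code):

```python
import numpy as np

def is_voiced(chunk, threshold=0.05):
    # Placeholder guess: the real test would look for a fundamental frequency.
    return np.sqrt(np.mean(chunk.astype(float) ** 2)) > threshold

def shift_voiced_chunk(chunk, ratio):
    # Placeholder for steps 4-6: subdivide into overlapping sub-chunks,
    # duplicate/delete some, then overlap-add back to the original length.
    return chunk

def pitch_shift(samples, chunk_size, ratio):
    """Steps 1-7 as I understand them: chunk, classify, shift, rejoin."""
    out = []
    # Step 2: fixed-size chunks (a trailing partial chunk is ignored here)
    for start in range(0, len(samples) - chunk_size + 1, chunk_size):
        chunk = samples[start:start + chunk_size]
        if is_voiced(chunk):
            out.append(shift_voiced_chunk(chunk, ratio))  # steps 4-6
        else:
            out.append(chunk)                             # step 3: pass through
    return np.concatenate(out)                            # step 7: play this back
```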

-Kevin P.

Message 38 of 85

Hello Kevin

In question 2, the chunks do not overlap with each other, and yes, they are called voiced/unvoiced. 3) The unvoiced ones just pass out to the output "hijacked" wav. 

4) The first sub-chunk can be just taken as is and overlapped only with the sub-chunk to its right.

5) The sub-chunks have to be overlapped by 50%. You are right about everything else.
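In text form (Python/NumPy rather than LabVIEW; the triangular window is just the textbook choice, not necessarily what I need), the 50% overlap I mean looks roughly like this:

```python
import numpy as np

def overlap_add_50(sub_chunks):
    """Overlap-add equal-length sub-chunks at 50% overlap.

    Each sub-chunk starts half a sub-chunk after the previous one, so
    every output sample is shared by two neighbours; a triangular
    window keeps the summed gain roughly constant."""
    n = len(sub_chunks[0])
    hop = n // 2                         # advance half a sub-chunk: 50% overlap
    out = np.zeros(hop * (len(sub_chunks) - 1) + n)
    window = np.bartlett(n)              # triangular weighting
    for i, sc in enumerate(sub_chunks):
        out[i * hop : i * hop + n] += sc * window
    return out
```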

I understand that you are busy with your own work and projects. I want to thank you again for everything.

Madgreek

 

P.S. Is it possible to give me any help on how I can overlap the sub-chunks by 50%? The rest I can try to do in a more "manual" way.

Message 39 of 85
1. Yes.
2. This is madgreek's business (and something I think he/she/they are comfortable with doing in whatever way complements the scope of the work). The evaluation is based on the chunk having a fundamental frequency; my guess is that if a certain amount/percentage of power is present within a certain bandwidth, it can be considered "voiced" (a rough sketch of that guess follows this list).
3. Yes, I think so.
4. If voiced, break it down into chunks that are the size of 2 periods of the fundamental frequency (with a peak in the middle).  The first and last chunks might have to be doctored or ignored if they do not have the same time-domain signature as the rest of the segment.  But this again is madgreek's business and seemingly not where the help is needed.
5. As far as duplicated or deleted goes, take a look at the middle loop in my code; it duplicates where necessary and deletes where necessary.  The overlap is 50% when the ratio is 1/1 or when you are shifting the pitch down and have deleted segments (that last part is a guess, to be honest).
6. Yes.
7. I would think so; all voiced segments should indicate the shift, which (if the ratio were far enough from 1/1) you would be able to hear.
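Here's the rough sketch promised in point 2 (Python/NumPy rather than LabVIEW; the 80-400 Hz band and the 50% power threshold are pure guesses on my part):

```python
import numpy as np

def looks_voiced(chunk, fs, band=(80.0, 400.0), power_frac=0.5):
    """Guess 'voiced' if enough of the chunk's power sits in a band
    where a speech fundamental could plausibly live."""
    spectrum = np.abs(np.fft.rfft(chunk.astype(float))) ** 2
    freqs = np.fft.rfftfreq(len(chunk), d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    total = spectrum.sum()
    return total > 0 and spectrum[in_band].sum() / total >= power_frac
```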

Yes, my code was intended for 4-6; my default array did not carry over into the saved file.  I think an easy sample case to see (and the one I tried) would be where each voiced period looked like a triangle.  This is not accurate, as the real thing would look like a cursive 'w', but it gives you decent peaks to follow along the path.  My sample was short enough that a serious number of the final points were special cases.  If you had a larger sample set, you would be able to better isolate the special cases and deal with them more directly than I was able to.  In order to determine the shift with a set like I did, you should be able to see the number (and frequency) of peaks increase or decrease with the applied pitch-shift ratio.  I would think that 8 periods of 30 or so elements each would be a decent sample for eyeballing the process.  Like I said, there are some rounding problems going on in the inner loop on the right that play havoc with the first and last elements of periods, and seem to not like odd-sized periods (as opposed to even).  If madgreek KNEW what period the voiced information was coming in at, you could hard-code a lot of the index manipulation and cater it a bit to that size (using constant offsets instead of determined offsets, that sort of thing).
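If it helps, here's that eyeball test set in text form (Python/NumPy rather than my actual VI, and the duplicate/delete pick below is a simplified stand-in for the middle loop's bookkeeping):

```python
import numpy as np

period = np.bartlett(30)            # one triangle-shaped fake pitch period
voiced = np.tile(period, 8)         # 8 periods of 30 elements for eyeballing

def shift_period_count(periods, ratio):
    """Duplicate (ratio > 1) or delete (ratio < 1) whole periods.

    The real code then re-overlaps the survivors so the total length
    stays the same; that respacing is what actually moves the pitch."""
    new_count = max(1, int(round(len(periods) * ratio)))
    src = np.minimum((np.arange(new_count) / ratio).astype(int),
                     len(periods) - 1)   # nearest source period per slot
    return [periods[i] for i in src]

periods = np.split(voiced, 8)                 # carve back into the 8 periods
print(len(shift_period_count(periods, 1.5)))  # 12 periods: pitch goes up
print(len(shift_period_count(periods, 0.5)))  # 4 periods: pitch goes down
```

Counting the peaks before and after is the "eyeballing" I mean: more peaks in the same span reads as a higher pitch.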

🙂

~milq
Message 40 of 85