How can you crossfade (fade in/out) arrays?

Milqman · ‎03-26-2007

Whoops, forgot to attach.

madgreek · ‎03-26-2007

Hello Kevin

I will try to answer your questions 1-7 as best I can

1) Yes

2) Yes for the first part. What do you mean by "exactly the same"? If you mean the number of samples, then yes, it is exactly the same. If you are referring to the actual magnitude values, then no, they are not exactly the same.

3-4) About the number of segments to be copied or deleted: the algorithm works for pitch shifts from 50% to 200%, which is basically the range of overlap between adjacent segments as they move left or right. If you move a segment, let's say, 50% to the left, it will sit completely on top of the segment to its left, correct? For those 2 segments that are on top of each other, the pitch has shifted up (after they are added together) by doubling its magnitude, so you get the 200% shift. But since half of those segments overlap completely with each other, the duration of the signal is cut in half, so we need to double the number of segments being used to compensate for the time shift. Shifting the pitch down to 50% works the same way in reverse: you move the segment 50% to the right, so there is no overlap between segments. The pitch shifts down by 50% BUT the duration is doubled, so 50% of the segments have to be deleted. Basically, by changing the distance between the segments, you are pitch scaling the signal. The way I see it, the shifting will have a resolution of 1 period, so you will be very limited in the amount of shifting up or down. I think the ratios will be 1/2, 2/3, 3/4, ..., 4/3, 3/2, 2/1.

Yes the output should still be 512 samples.

5) Time shifting is much easier than pitch shifting. You just delete or replicate segments. Think of each segment as an element in a 1-D array. How would you shrink it? By eliminating a certain number of elements, right? Or, if you want to expand it, by replicating some of them. Each element (segment) has the same size.
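The delete-or-replicate idea in answer 5 can be sketched in a few lines of Python. This is only an illustration, not the LabVIEW code discussed in the thread: the function name `time_scale_segments`, the nearest-index selection rule, and the choice of 8 non-overlapping 64-sample segments are all assumptions made for the example.

```python
import numpy as np

def time_scale_segments(segments, factor):
    """Return round(len(segments) * factor) segments by repeating or
    skipping whole segments (hypothetical helper, nearest-index rule)."""
    n_in = len(segments)
    n_out = max(1, int(round(n_in * factor)))
    # Map each output slot back to the input segment it should copy.
    idx = np.minimum((np.arange(n_out) / factor).astype(int), n_in - 1)
    return [segments[i] for i in idx]

signal = np.arange(512, dtype=float)
segments = np.split(signal, 8)                 # 8 segments of 64 samples each

shorter = time_scale_segments(segments, 0.5)   # keeps segments 0, 2, 4, 6
longer = time_scale_segments(segments, 2.0)    # every segment appears twice
```

Here the segments are treated exactly like elements of a 1-D array, as the answer suggests. Recombining them by overlap-add is a separate step that this sketch leaves out.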

6) I am not quite sure what you mean by this. We are just changing the rate at which the segments occur relative to the original signal, by deleting or adding some of them, and then creating a new signal by overlap-adding them together.

7) I have found several papers that mention using the triangular window as well. Here we are dealing with pitch, which "is" the fundamental frequency, and the only use of the Hanning window is to eliminate any aliasing when you break the 512-sample array into 8 arrays of 64 by fading them in/out.

If there is a novice here, that's me for sure. You have asked me questions and made observations that I spent 3 months trying to answer

and you only saw this message 2 days ago

If I was not clear on some of your questions or made things even harder for you, please forgive me, and feel free to ask me again

Kind regards

Madgreek

madgreek · ‎03-26-2007

Milqman

We posted at practically the same time. Is it possible to get the code in LabVIEW 7.0?

Thank you for all your help and time. I really appreciate it and you deserve 5*E5 stars

Milqman · ‎03-26-2007

My LV is only letting me save as far back as 8.0, maybe Kevin can do 7.0 for you?

Like I said, there are some problems with it, but it is damn close to what you are looking for.

~milq

madgreek · ‎03-26-2007

Ok milq, thank you again

Kevin_Price · ‎03-27-2007

I've only got 8.20 and 7.1 installed. I can't backsave any earlier than 8.0 either.

I'll try to look at milqman's posting later tonight. Meanwhile, I'll follow on with the numbered list of q's & a's.

2. By "exactly the same" I really meant the exact same values. The way I interpreted the "pitch and time scale.png" attachment was some of the original 512 samples belonged to both the last half of segment 3 and also to the first half of segment 4. That's what I meant by calling it 50% overlap.

However, I am right now realizing a silly oversight on my part. Even while I read and typed the value 512, I kept thinking in terms of 256 simply because that's the barrier value I encounter far more commonly. So I figured that 8 segments times 64 samples each was 2x too many, when in reality it's exactly the right number. Looking at the diagram you posted, I'd expect a more relevant starting point would be to define 16 segments of 64 samples each, such that the segments overlap by 50% as shown in the diagram. (I think I gather that for each set of 512 samples, you may come up with a different # of segments to divide it into. For now, let's stick with a single sample set though.)

3-4. Maybe the diagram is throwing me off. Here's how I interpret it: the x-axis overall extends over exactly the original 512 samples. The triangular regions give clues as to what range of indices from the original sample array are used to define the segments, as well as a "relative gain" to apply while cross-fading from one segment to the next. However, after performing the pitch-shift, it appears that the segments no longer overlap by exactly 50% of their width. The overlap may be either less or more depending on how many total segments are used to span the 512 sample range. Is this interpretation wrong?
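To make the "range of indices plus relative gain" reading of the diagram concrete, here is a minimal Python sketch of splitting one 512-sample block into 64-sample segments at 50% overlap, applying a triangular gain, and adding them back together. It is an assumption-laden illustration, not the VI from the thread: the helper names (`triangular_gain`, `split_overlapping`, `overlap_add`) and the test signal are invented for the example.

```python
import numpy as np

SEG_LEN = 64
HOP = SEG_LEN // 2  # 50% overlap between neighbouring segments

def triangular_gain(n):
    """Triangle rising 0 -> 1 over the first half and falling over the second.
    Two copies offset by n/2 samples sum to exactly 1."""
    k = np.arange(n)
    return 1.0 - np.abs(k - n / 2) / (n / 2)

def split_overlapping(x, seg_len=SEG_LEN, hop=HOP):
    """Cut x into segments of seg_len samples whose starts are hop apart."""
    starts = range(0, len(x) - seg_len + 1, hop)
    return [x[s:s + seg_len] for s in starts]

def overlap_add(segments, hop, gain):
    """Weight each segment by gain and sum them at hop-sample spacing."""
    seg_len = len(gain)
    out = np.zeros(hop * (len(segments) - 1) + seg_len)
    for i, seg in enumerate(segments):
        out[i * hop:i * hop + seg_len] += gain * seg
    return out

x = np.sin(2 * np.pi * np.arange(512) / 32.0)   # arbitrary test signal
segments = split_overlapping(x)
y = overlap_add(segments, HOP, triangular_gain(SEG_LEN))
```

With the spacing left at 50%, the cross-faded sum reproduces the input everywhere except the first and last half-segment, which are covered by only one triangle. A single 512-sample block also yields 15 full segments rather than 16 in this sketch, since the 16th would need samples from a neighbouring block. Changing `hop` in `overlap_add` is where a pitch change would come in.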

I'm now inclined to think the following, starting with 16 segments of 64 samples, each segment overlapping by 50% with each of its neighbors. (Note: the final algorithm will need to have some memory so that the last segment of one 512 sample data set can be used to combine with the first segment of the next 512 sample data set. Let's ignore that for the time being though.) To compress by a factor of 0.5, keep only 8 alternating segments, leaving you with 0 overlap and 0 gaps, but exactly 512 total samples. It seems to me there's no cross-fading to do. To expand by a factor of 2, you duplicate all the segments and fit them halfway between existing overlapping segments. Now you'll have 32 segments which overlap their neighbors by 75%, and cross-fading must be done among 3 overlapping segments (2 unique, 1 duplicate).

5-6. It appears that when you perform time-shifting, you specifically do NOT preserve the # of samples at 512. Is this right? Do you produce an output with either <512 or >512?

7. I'm sure a Hanning window has a different spectral response than a Triangular window. I don't really have any good idea whether that difference may be useful or even significant to this processing. I'd suggest making the windowing operation into a sub-vi so it can be more easily changed later on if needed.

Sidebar to milqman on Audacity: As I recall, it's easy to generate true white noise in Audacity. Can you verify my observations? Generate 10 seconds of noise, then copy and paste it as a 2nd audio track. Set the two tracks to overlap by about 1 second and apply the standard linear cross-fade to the tracks for that 1 second. Don't you hear a volume drop during that second? If no, I'd like to learn what I'm doing wrong...
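The volume drop described in the sidebar can be checked numerically. The Python sketch below assumes the two overlapping one-second regions are statistically independent noise (the end of one track and the start of its copy are different stretches of the same noise), and it adds an equal-power (sine/cosine) fade purely as a comparison; that second fade is not something proposed in the thread. The sample rate and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 48000                          # one second at an assumed 48 kHz
tail = rng.standard_normal(n)      # last second of track 1
head = rng.standard_normal(n)      # first second of track 2 (independent noise)

t = np.linspace(0.0, 1.0, n)
linear = (1.0 - t) * tail + t * head
equal_power = np.cos(0.5 * np.pi * t) * tail + np.sin(0.5 * np.pi * t) * head

def rms(x):
    """Root-mean-square level of an array."""
    return np.sqrt(np.mean(x ** 2))

mid = slice(n // 2 - 2000, n // 2 + 2000)   # window around the fade midpoint
print("input RMS       :", rms(tail))
print("linear, mid-fade:", rms(linear[mid]))
print("equal-power, mid:", rms(equal_power[mid]))
```

For uncorrelated signals the powers add, so at the midpoint of a linear fade the expected power is 0.5² + 0.5² = 0.5 of the original, roughly a 3 dB dip. That would be consistent with hearing a drop during the overlapping second.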

-Kevin P.

Milqman · ‎03-27-2007

The overlap does not need to be 50%, in fact, that is only the case when the pitch shift ratio is 1/1 (the general overlap = 1 - 1/(PR*2) ). Think of each triangle (or hanning window) as a peak in a sound wave. In order to increase the pitch, the segments need to get closer; in order to decrease the pitch, the segments need to spread. The hanning window is not very significant here. A triangle would accomplish --nearly-- the same thing. The difference is that the Hanning window is concave up such that the weighting reduces drastically as you leave the center. This meshes with the pitch-shifting algorithm because each segment is 2 periods (which means there is a peak at the beginning, middle, and end). The intent then, is to keep the meat of the middle peak and zero the 2 adjacent peaks (so you can mash them together). A triangular window would allow a decent amount of the adjacent peaks to "leak" into the central peak. This would not be a problem, but they would be "leaking" in at the original frequency/pitch not the shifted frequency (which is a problem, a serious one at that).
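As a quick check of the overlap relation quoted above, the Python sketch below tabulates overlap = 1 - 1/(PR*2) for a few pitch ratios, along with the segment spacing it implies for 64-sample segments. It also compares Hanning and triangular weights at two positions in a segment. The function names and the 64-sample length are assumptions for illustration only.

```python
import math

def overlap_fraction(pitch_ratio):
    """Overlap between neighbouring segments: 1 - 1/(2 * PR)."""
    return 1.0 - 1.0 / (2.0 * pitch_ratio)

def hop_size(seg_len, pitch_ratio):
    """Spacing between segment starts implied by that overlap."""
    return seg_len * (1.0 - overlap_fraction(pitch_ratio))

def hann(k, n):
    """Hanning weight at sample k of an n-sample segment."""
    return 0.5 - 0.5 * math.cos(2.0 * math.pi * k / n)

def triangle(k, n):
    """Triangular weight at sample k of an n-sample segment."""
    return 1.0 - abs(k - n / 2) / (n / 2)

for pr in (0.5, 2 / 3, 1.0, 1.5, 2.0):
    print(f"PR={pr:.3f}  overlap={overlap_fraction(pr):.3f}  "
          f"hop={hop_size(64, pr):.2f} samples")

# Near the edge (k=8) versus nearer the centre (k=24) of a 64-sample segment.
for k in (8, 24):
    print(f"k={k}: hann={hann(k, 64):.3f}  triangle={triangle(k, 64):.3f}")
```

The relation gives 0% overlap at PR = 0.5, 50% at PR = 1, and 75% at PR = 2. Some ratios, such as 3/2, lead to a non-integer spacing, which may be related to the rounding issue mentioned for odd counts. The window comparison shows the Hanning weight falling below the triangle toward the segment edges and rising above it toward the centre.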

If you could tell me quickly how to use those little comment bubbles, I could throw some on my code really quick and probably reduce your "what the hell is he doing here?" time.

Right now, all that is obviously wrong with the code is that the first half of the first triangle is lost (the first triangle is a special case), and every other time where segments of odd-count get added together (rounding issue I think). There might be some items that are off kilter (an index array that is one index off) but it performs alright on my tiny test cases. Either way, the glut of the coding is done, and all that needs to be figured out is some of the little bits. Which of course should be no problem for a LV Jedi like yourself.

I will check the audacity thing at home.

~milq

madgreek · ‎03-27-2007

Hello guys

Kevin, as Milq said, to pitch scale the signal you just need to increase or decrease the overlap between the segments and then add them together to get your scaled signal. About the array size of 512: although I am using 512 samples for each segment, only 256 are finally passed into my code because of the FFT. I know this is irrelevant to our discussion, though. The only case where the overlap stays at 50% is when you are doing time scaling, because there you just delete or duplicate an integer number of segments.

Milq

About the bubbles. If you double-click (left button) anywhere on the front panel or block diagram, you can create a text box where you can write any comment. Is this what you meant by comment bubbles? You can also do this by going to Functions>Decorations>Free Label

Milqman · ‎03-27-2007

Ok, I threw together some comments real quick. For the big comments I have a little letter in the code and the explanation outside the jumble. Hopefully it makes sense, it is a fun little puzzle, but I am a little sad that I didn't quite stick the landing. I am sure Kevin will make it all nice and sparkly.

Next we need to get someone to translate this into a version that greek can see. Maybe he/she/they can stick the landing on their own.

Again, post what you guys come up with 🙂

~milq

madgreek · ‎03-27-2007

Milq, no matter what happens, I want to thank you for everything.

I am sure if I post your VI in the room someone can save it in LabVIEW 7 so I can see it too.
