Using NI USB-6009 DAQ for Speech Recognition with LabVIEW VI

I am trying to use the DAQ to control the speed of a motor. A microphone is connected to the DAQ, and the DAQ is connected via USB to a computer running a LabVIEW VI. Ideally, the motor speed would increase when "faster" is spoken into the microphone and decrease when "slower" is spoken.

 

Essentially, some sort of speech recognition is needed to tell the words apart. I have tried matched filtering in LabVIEW by creating templates, but the output graphs are not clear enough for the matching to work. Is there a simpler way in LabVIEW to do some sort of speech recognition, or to otherwise compare these two words?

0 Kudos
Message 1 of 7
(2,264 Views)

It depends on whether you want to make a real Speech Recognition System using LabVIEW (which you possibly can do, but it would help to have several years of LabVIEW Data Acquisition and Data Analysis experience before attempting this), or whether you want to distinguish two commands, "faster" and "slower".

 

I was recently in Charlotte, NC, and stayed at a hotel near the Convention Center.  When you got into the elevator on the ground floor, the recording said  "Going Up!" in a fairly rapid voice, with the pitch rising on the word "Up", and when you got in on the 10th floor and wanted to go back to the Lobby, the voice said "Going Dowwwnn", drawing out the timing and dropping the pitch on "Down".

 

You could easily say "Faster" at a "normal" speed, and "Slow-er" at "half speed" (it almost "does it itself" because the "w" takes a little longer to say).  It should also be easy to add "Stop!".
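
A rough text-language sketch of this duration idea, in Python rather than LabVIEW. The function name, the 0.1 amplitude threshold, and the 0.5 s split point are all assumptions made for illustration; real values would have to come from recordings of the actual speaker and microphone.

```python
from typing import Optional, Sequence


def classify_by_duration(
    samples: Sequence[float],
    sample_rate: float,
    threshold: float = 0.1,      # hypothetical "loud enough" level
    split_seconds: float = 0.5,  # hypothetical boundary between the two words
) -> Optional[str]:
    """Guess "faster" or "slower" from how long the sound lasts."""
    # Indices of every sample whose magnitude exceeds the threshold.
    active = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not active:
        return None  # nothing was said
    # Time from the first loud sample to the last loud sample.
    duration = (active[-1] - active[0] + 1) / sample_rate
    return "faster" if duration < split_seconds else "slower"
```

A short burst would come back as "faster" and a drawn-out one as "slower". A third, very short command such as "Stop!" could get its own duration band in the same way.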

 

Bob Schor

0 Kudos
Message 3 of 7
(2,221 Views)

Bob,

 

This was my initial solution also. In the long run I will want to implement real speech-to-text, but for this initial stage I am just trying to show a proof of concept. I am currently measuring the amplitude and taking its maximum value. I've found that when I say "faster", the amplitude is higher than when I say "slower". I am now trying to use comparators and logical statements to either increment or decrement by 1 depending on whether I say "faster" or "slower". My problem is generating some sort of counter that can be used as both the input and the output of the increment/decrement.

 

Right now I have a constant 0 as the input, and just a numeric box on the output showing when it turns to +1 and -1. What I want to do is connect a changing value to both inputs, change that value if one of the words is said, then output the value to the DAQ while also looping it back to be used as the next input. Is this possible with this set-up, or do I need to look at a different type of solution?

0 Kudos
Message 5 of 7
(2,135 Views)

Have you taken the LabVIEW tutorials?

LabVIEW Introduction Course - Three Hours
Learn LabVIEW

 

Have you learned about shift registers?  Those features on loops that maintain values from one iteration to the next?

Right now your code does nothing except write a 1 to Numeric 2, or a -1 to Numeric 3.  There is no Count UP or Count DOWN, because you are working on a constant 0 rather than the previous value!
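
For readers who think more easily in a text language, here is a minimal Python analogy of what a shift register does. It is not LabVIEW code, and the function and variable names are made up for illustration. The point is only that the value written at the end of one iteration is the value read at the start of the next.

```python
def run_speed_loop(commands, initial_speed=0):
    """Carry a running speed value from one loop iteration to the next."""
    speed = initial_speed  # plays the role of the shift register
    history = []
    for command in commands:
        if command == "faster":
            speed += 1  # works on the previous value, not on a constant 0
        elif command == "slower":
            speed -= 1
        # Any other input (silence, noise) leaves the speed unchanged.
        history.append(speed)
    return history
```

With the inputs "faster", "faster", nothing, "slower", the stored value goes 1, 2, 2, 1. Replacing `speed` with a constant 0 on every pass reproduces the +1 / -1 behaviour described above.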

 

Run your code in Highlight Execution mode to see how LabVIEW works.

0 Kudos
Message 6 of 7
(2,128 Views)

There are a number of problems in what you are proposing, namely "speech recognition" (as opposed to "distinguishing between two simple utterances, perhaps deliberately made to have different salient features").

  • Utterances need to be distinguished from "non-utterances".
  • You need (or may want) to capture the entire utterance, whose start and stop times you don't know.  This may require some form of continuous sampling and continuous processing to determine onset and offset.
  • Once you have an utterance, you need to classify it.
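
The first two points can be sketched in Python as a simple energy gate: split the recording into fixed-length frames, compute the RMS level of each, and treat the first run of loud frames as the utterance. The frame length and the 0.05 threshold are assumptions for illustration only, and a practical detector would likely also need some tolerance for brief pauses inside a word.

```python
import math


def frame_rms(samples, frame_len):
    """RMS level of each complete, non-overlapping frame."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def find_utterance(samples, frame_len=100, threshold=0.05):
    """Return (start_sample, end_sample) of the first loud run, or None."""
    levels = frame_rms(samples, frame_len)
    start = None
    for idx, level in enumerate(levels):
        if level >= threshold and start is None:
            start = idx  # onset: first frame at or above the threshold
        elif level < threshold and start is not None:
            # offset: first quiet frame after the onset
            return (start * frame_len, idx * frame_len)
    if start is not None:
        # The sound was still going when the recording ended.
        return (start * frame_len, len(levels) * frame_len)
    return None  # a "non-utterance"
```

The samples between the returned start and end are what would then be handed to whatever classification step is chosen.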

Sound, particularly sampled representation of a complex sound, has many parameters, including:

  • Sampling rate
  • Number of samples
  • Sampling Duration (Number of samples / Sampling rate)
  • "Amplitude" (what does this mean?  Max - min?  Mean RMS?)
  • Frequency content
  • "Amplitude variation" (which might be related to "number of syllables")

To distinguish speech, you will probably want to sample at a fairly high rate (say 10-20 kHz) and be sure to catch the entire utterance.  While you should test and verify what criteria would be good at distinguishing between two words ("Faster" vs "Slower"), my guess is that "Amplitude" would be one of the worst.
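
As a small illustration of how a few of these parameters relate, the Python sketch below computes the duration (number of samples divided by sampling rate) and two of the possible meanings of "amplitude" for one block of samples. The function name and dictionary keys are hypothetical. Frequency content is left out because it needs an FFT.

```python
import math


def describe_signal(samples, sample_rate):
    """Compute a few basic descriptors of a block of samples."""
    n = len(samples)
    return {
        # Duration in seconds: number of samples divided by samples per second.
        "duration_s": n / sample_rate,
        # One meaning of "amplitude": largest value minus smallest value.
        "peak_to_peak": max(samples) - min(samples),
        # Another meaning: root-mean-square level over the whole block.
        "rms": math.sqrt(sum(x * x for x in samples) / n),
    }
```

Computing these for several recordings of each word is one way to test which criterion actually separates the two, rather than guessing.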

 

So let's forget this for a moment and concentrate on your last paragraph.  Let's "model" it as follows:

  • Most of the time, nothing is happening (as the Subject does not say anything).  Motor turns at constant speed.
  • At some point, the Subject "says" Faster.  Motor speed increases.
  • At some point, the Subject says Slower.  Motor speed decreases.

Model this as an integer indicator (Motor Speed) and two Boolean Buttons ("Faster" and "Slower").  Add another Boolean Button ("Stop") that stops your program.  Objective -- make Motor Speed increase by 1 when "Faster" is pushed, and decrease by 1 when "Slower" is pushed.

 

Hint -- use the Square Buttons (that look like the "Stop" button, maybe "OK" for Faster and "Cancel" for Slower).  Name them "Faster" and "Slower", right-click and change the Off Text to "Faster" or "Slower".  Do you know what's "special" about these buttons, as opposed to the Push Button or Rocker?  If you know about the Event Structure, (oops, that's almost too much of a hint), you should be able to come up with an elegant one-loop solution.

 

Colleagues -- please let Emily try this herself ...

 

Bob Schor

0 Kudos
Message 7 of 7
(2,124 Views)