LabVIEW

How would I make a voice recognition program using LabVIEW?

Hi, 

I'm new to LabVIEW and am working on a program that, if successful, should take an audio file of a person saying different numbers and reproduce what the person is saying as text. From what I've researched, I need to build some sort of dictionary of pre-recorded sounds that the recording can be compared against, but I'm not sure where to start with that, or whether it's even the best approach. Any help at all would be much appreciated!

0 Kudos
Message 1 of 10
(3,982 Views)

The challenge is that sound is never 100% identical from one utterance to the next. If you say "Please" several times, the sound your body produces will be slightly different each time, even if you say it at the same speed and pitch. So a direct test for equality between two recordings is not sufficient.
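To make that point concrete, here is a minimal Python sketch (not LabVIEW, and not taken from any real recognizer). It assumes two synthetic "takes" of the same tone, each with a little random noise, as a stand-in for repeating a word. The helper names `tone` and `similarity` are made up for this illustration. An exact comparison fails, while a normalized correlation still sees the two takes as nearly the same shape.

```python
import math
import random

def tone(freq_hz, n=400, rate=8000, jitter=0.0, seed=0):
    """Synthetic stand-in for an utterance: a sine wave plus small random noise."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * freq_hz * i / rate) + rng.uniform(-jitter, jitter)
            for i in range(n)]

def similarity(a, b):
    """Normalized correlation at zero lag, in [-1, 1]; 1.0 means identical shape."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0

take1 = tone(440, jitter=0.05, seed=1)   # "the same word", first attempt
take2 = tone(440, jitter=0.05, seed=2)   # "the same word", second attempt
other = tone(950, jitter=0.05, seed=3)   # a different sound

print(take1 == take2)                    # False: the samples never match exactly
print(similarity(take1, take2))          # close to 1
print(similarity(take1, other))          # close to 0
```

Real speech also varies in timing and pitch, which a zero-lag correlation like this does not handle, so treat it only as an illustration of "similar, not equal".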

I consider that a PhD-level task at least. It definitely exceeds the bounds of a 'short term project'. What is your situation?

Please refer to speech recognition for more information on the complexity.

Norbert
----------------------------------------------------------------------------------------------------
CEO: What exactly is stopping us from doing this?
Expert: Geometry
Marketing Manager: Just ignore it.
0 Kudos
Message 2 of 10
(3,920 Views)

Thanks for your reply! I'm interning at a lab and this was the project assigned to me. I have a mentor available to help me, but the expectation is that I complete the project almost entirely on my own. Voice recognition programs have been made in many different contexts, so I was hoping to learn from those and use them as a guide to build a program suited to my needs. I just started and I'm still learning, but so far I'm unsure what exactly I even need in order to build the program. The only thing it has to recognize is a series of numbers from 1 to 10, but spoken by various people and at different signal-to-noise levels. Would each speaker have to create a dictionary of their own numbers to compare later recordings against, or would I just need many different pre-recorded voices to build a general model of how each number sounds, which I could then compare with other, different voices? Any advice you could give me would be greatly appreciated!

0 Kudos
Message 3 of 10
(3,906 Views)

If this is a Windows PC, you can always take advantage of the Microsoft Speech API in LabVIEW. There is at least one example out there, but I can't seem to find it at the moment.

Bill
CLD
(Mid-Level minion.)
My support system ensures that I don't look totally incompetent.
Proud to say that I've progressed beyond knowing just enough to be dangerous. I now know enough to know that I have no clue about anything at all.
Humble author of the CLAD Nugget.
0 Kudos
Message 4 of 10
(3,895 Views)

I strongly recommend that you talk to your mentor. This topic is very complex and definitely exceeds the scope of any kind of internship.

If you can define enough constraints (e.g. the system is trained directly on the recorded voice, no background noise, equal level/pitch, ...), it is rather simple, as comparing waveform envelopes will do ('predefined pattern match'). But anything beyond that is more than an internship can ever realize.
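As a rough illustration of that constrained 'predefined pattern match' idea, here is a minimal Python sketch. It rests on several assumptions: the "words" are synthetic test signals (a carrier shaped by one or two amplitude humps), the envelope is a rectified signal averaged over 20 windows, and the closest stored template wins. All names (`envelope`, `classify`, `synth`, the template labels) are hypothetical and chosen only for this example.

```python
import math
import random

def envelope(samples, frames=20):
    """Rectify the signal and average it over `frames` equal-width windows."""
    width = len(samples) // frames
    env = [sum(abs(s) for s in samples[k * width:(k + 1) * width]) / width
           for k in range(frames)]
    peak = max(env) or 1.0
    return [e / peak for e in env]  # normalize level so loudness does not matter

def distance(a, b):
    """Euclidean distance between two envelopes of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(samples, templates):
    """Return the label of the stored envelope closest to the recording."""
    env = envelope(samples)
    return min(templates, key=lambda label: distance(env, templates[label]))

def one_burst(t):
    return math.sin(math.pi * t) ** 2        # single hump over 0..1

def two_bursts(t):
    return math.sin(2 * math.pi * t) ** 2    # two humps over 0..1

def synth(shape, n=2000, gain=1.0, noise=0.0, seed=0):
    """Hypothetical test signal: a 300 Hz carrier shaped by an amplitude function."""
    rng = random.Random(seed)
    return [gain * shape(i / n) * math.sin(2 * math.pi * 300 * i / 8000)
            + rng.uniform(-noise, noise) for i in range(n)]

# "Training": store one clean envelope per word.
templates = {
    "one": envelope(synth(one_burst)),
    "two": envelope(synth(two_bursts)),
}

# A quieter, slightly noisy repetition of the second pattern.
sample = synth(two_bursts, gain=0.4, noise=0.02, seed=7)
print(classify(sample, templates))
```

This only works because the test signal has the same length and timing as the template. Different speakers, speaking rates, or background noise break those assumptions quickly, which is exactly the limitation described above.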

 

Please refer to this outdated discussion for some information.

 

However, there is good news: Instead of going into the details of sound analysis, you might have the option to include and interface existing speech recognition software.

Norbert
0 Kudos
Message 5 of 10
(3,894 Views)

Thanks for your timely response! I'll definitely consult my mentor, but one last question: if I were to integrate an open-source program written in a different programming language (for example, Java or Python), would it be possible to somehow call it from a LabVIEW program?

0 Kudos
Message 6 of 10
(3,885 Views)

For C/C++ you can use the Call Library Function Node (calls into C/C++ DLLs).

For .NET you can use the .NET nodes (constructor, property, invoke).
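For the Python part of the question, one possible route (an assumption on my side: it requires a LabVIEW version that includes the Python Node, i.e. LabVIEW 2018 or later) is to expose the recognizer as a plain function in a `.py` module and call it by module path and function name. The sketch below shows only the shape such a function could take. `describe_recording` is a hypothetical stand-in that reports duration and RMS level, not an actual recognizer.

```python
import math

def describe_recording(samples, sample_rate):
    """Stand-in for a recognizer entry point: plain numbers in, a string out.

    `samples` is a list of floats (a 1D array on the caller's side) and
    `sample_rate` is in Hz. A real implementation would return the
    recognized text instead of this summary.
    """
    if not samples or sample_rate <= 0:
        return "empty"
    duration_s = len(samples) / sample_rate
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return f"{duration_s:.2f} s, rms {rms:.3f}"

if __name__ == "__main__":
    # Quick check from the command line before wiring it into a VI.
    print(describe_recording([0.5, -0.5, 0.5, -0.5], 4))  # 1.00 s, rms 0.500
```

Keeping the interface to simple types (numbers, arrays, strings) makes it easier to pass data across the language boundary, whichever mechanism you end up using.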

Norbert
0 Kudos
Message 7 of 10
(3,881 Views)

Hi, thanks for your answer! How would I be able to integrate the Microsoft Speech API and what would it do?

0 Kudos
Message 8 of 10
(3,842 Views)

Well, I couldn't find the link to the example - but I do have the example on my computer!

 

I've found that MS Speech likes to guess at commands a lot. This example adds your set of commands to the MS command list, so some of the guesses can range from amusing to annoying to downright dangerous. Unfortunately, it's a lot more difficult to create your own instance of the MS Speech API, and I haven't been able to do that successfully yet.

Bill
0 Kudos
Message 9 of 10
(3,820 Views)
0 Kudos
Message 10 of 10
(3,206 Views)