
High-Precision Sound Recording System Using a 252ch Spherical Microphone Array, Japan

Contact Information

University: Tohoku University

Team Member(s): Jumpei Matsunaga

Faculty Advisors: Shuichi Sakamoto

Email Address: jum.matsu@gmail.com, saka@ais.riec.tohoku.ac.jp

Country: Japan

Project Information

Title:

  High-Precision Sound Recording System Using a 252ch Spherical Microphone Array, Japan

Description:

  We are developing a system that acquires 3D sound-space information and transmits it accurately to a distant place, using a microphone array built on a human-head-sized solid sphere with numerous microphones on its surface. Specifically, we built a 252ch real-time implementation of the SENZI sound-space information acquisition system.

Products:

   PXIe-1071: 1 Unit: 18 Chassis,

   PXIe-8133: 1 Unit: Controller,

   PXIe-7965R: 3 Units: FPGA board

The Challenge:

   To implement this system, the following processing must be performed in real time:

   ・Record the signals arriving at all 252 microphones simultaneously and format the data,

   ・For every channel, cut the signal into 512-point frames and perform a high-speed FFT,

   ・For every channel, multiply each frequency component by a weight coefficient that imitates the HRTF.


The Solution:

1. Background

  There is increasing demand for a device that can transmit high-precision sound-spatial information to a distant location and present it simultaneously to several listeners with a high sense of presence.  By recording with a single microphone and playing back through a single speaker, we can perceive what kinds of sounds surround the recording location.  However, this does not give us the sense of actually being there, nor does it convey sound-source information such as direction and distance.  It is therefore necessary to record and reproduce the sounds from multiple sources precisely, including their directions.

  Human beings obtain spatial information, such as the direction and distance of a sound source and the size of the room, by using their left and right ears.  They do so by learning through experience to interpret the differences between the inputs to the two ears and the changes in the frequency spectrum caused by reflection and diffraction of the sound wave at the walls, the head, and the pinnae.  This information is captured by head-related transfer functions (HRTFs), which are the transfer functions from the position of a sound source to the listener’s ears.  Therefore, once the position of a sound source is known, the HRTF can reproduce the sound arriving from it, including its direction.  We expect that properly using HRTFs will make it possible to record and reproduce sound with a high sense of presence that accurately conveys sound-source information, including direction.
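  As a rough illustration of how an HRTF pair carries direction, the following Python sketch filters a mono signal with a left and a right transfer function in the frequency domain. The helper names (render_binaural, toy_hrtf) and the toy HRTF itself, a pure delay plus a gain, are assumptions made for illustration only; a measured HRTF also encodes the spectral changes caused by the head and pinnae.

```python
import numpy as np

def render_binaural(mono, hrtf_left, hrtf_right):
    """Filter one mono block with a left/right HRTF pair (circular convolution)."""
    n = len(mono)
    spectrum = np.fft.rfft(mono)
    left = np.fft.irfft(spectrum * hrtf_left, n)
    right = np.fft.irfft(spectrum * hrtf_right, n)
    return left, right

def toy_hrtf(n, delay_samples, gain):
    """Hypothetical HRTF: an integer-sample delay and a frequency-independent gain."""
    k = np.arange(n // 2 + 1)
    return gain * np.exp(-2j * np.pi * k * delay_samples / n)

N = 512
click = np.zeros(N)
click[0] = 1.0

# Source assumed to be on the listener's left:
# the right ear receives the click later and quieter than the left ear.
left, right = render_binaural(click, toy_hrtf(N, 0, 1.0), toy_hrtf(N, 30, 0.6))
print(np.argmax(np.abs(left)), np.argmax(np.abs(right)))
```

  The arrival-time and level differences between the two outputs are the kind of interaural cues described above.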

2. Issues to Address

  Because HRTFs differ from listener to listener depending on head shape, the system must reproduce each listener’s own HRTF, for all listeners simultaneously.

  In addition, listeners do not keep still while listening.  They constantly move, for example by rotating or nodding their heads, and they presumably obtain sound-spatial information from these movements as well.  Moving the head also changes the position of the sound source relative to the listener, and the HRTF changes with it.  Therefore, high-precision sound-spatial information can be presented if the listener’s head position is measured by some means and the HRTF is switched according to the movement.
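  Switching the HRTF with head movement can be pictured as a nearest-neighbour lookup. The sketch below is a simplified assumption: it considers horizontal rotation (yaw) only, uses a hypothetical 10-degree grid of stored directions, and the function names are illustrative rather than part of the system described here.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

def nearest_stored_direction(head_yaw_deg, stored_yaws_deg):
    """Index of the stored HRTF set whose direction is closest to the measured yaw."""
    return min(
        range(len(stored_yaws_deg)),
        key=lambda i: angular_distance(head_yaw_deg, stored_yaws_deg[i]),
    )

# Hypothetical grid: one stored HRTF set every 10 degrees of rotation.
stored = [float(a) for a in range(0, 360, 10)]

# A yaw of 356 degrees is closer to the 0-degree set than to the 350-degree set.
index = nearest_stored_direction(356.0, stored)
print(index, stored[index])
```

  Handling the wrap-around at 360 degrees matters here; a plain absolute difference would pick the 350-degree set for this input.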

  Considering these points, it is important to record and present sound in real time using the HRTF suited to each listener while following the listener’s head movement.  We devised an algorithm to achieve this and decided to build a system that actually records sound with a large number of microphones and presents it after real-time signal processing.  The sound-receiving part is a sphere on which the microphones are arranged with axial symmetry, so that the microphones can easily be switched according to the head movement.

  To implement this system, the following processing must be performed in real time:

  ・Record the signals arriving at all 252 microphones simultaneously and format the data

  ・For every channel, cut the signal into 512-point frames and perform a high-speed FFT

  ・For every channel, multiply each frequency component by a weight coefficient that imitates the HRTF

3. Solution

3-1. System Components

  This system consists of a “Recording Part”, a “Signal Processing Part”, and a “Reproduction Part”, as shown in Fig. 1.  The recording part receives sound simultaneously with 252 microphones, formats the data, and sends it to the signal processing part.  The signal processing part synthesizes the sound to be presented from the recorded signals, based on each listener’s HRTF and the information from the head position sensor.  The HRTFs are obtained for each listener in advance, by measurement and numerical calculation.  The reproduction part presents the processed sound to the listener through headphones using binaural reproduction.

  To build the system, we used a spherical array and implemented the SENZI [1,2] algorithm on a controlling PC.  The controlling PC consists of a chassis (PXIe-1071, NI), a controller (PXIe-8133, NI), and FPGA boards (PXIe-7965R, NI).  Their specifications are shown in Table 1.  We prepared three FPGA boards so that the 252 channels can be processed in real time, and each board is assigned its own role.  The system as built is shown in Fig. 2.

Table 1. Microphone and Controlling PC Specifications

Item                 | Product Name      | Specifications
Microphone (Digital) | KUS5147 (Hosiden) | Clock Frequency: 2.4 MHz; S/N Ratio: 58 dB (typ.); Directivity: Non-directive
PXI Controller       | PXIe-8133 (NI)    | CPU: Intel Core i7-820 (1.73 GHz); Memory: 3 GB; OS: Windows XP
FPGA Board           | PXIe-7965R (NI)   | FPGA Chip: Virtex-5 SX95T; On-board Memory: 512 MB


 

  Each FPGA board implements one of the functions shown in the diagrams of Figs. 3 to 5:

  The first FPGA board formats the digital signals sent from the spherical array.  The quantization is 16 bits and the sampling frequency is 48 kHz.
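  The figures above fix the data rate that the first board hands on. The short calculation below works it out from the stated channel count, sampling frequency, and word length. The last line only relates the microphone clock from Table 1 to the output sampling frequency; how the board converts between the two is not described here.

```python
CHANNELS = 252
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 16
MIC_CLOCK_HZ = 2_400_000  # microphone clock frequency from Table 1

samples_per_second = CHANNELS * SAMPLE_RATE_HZ
bits_per_second = samples_per_second * BITS_PER_SAMPLE
bytes_per_second = bits_per_second // 8

print(f"{bits_per_second / 1e6:.3f} Mbit/s")   # 193.536 Mbit/s
print(f"{bytes_per_second / 1e6:.3f} MB/s")    # 24.192 MB/s

# Ratio between the microphone clock and the output sampling frequency.
clock_ratio = MIC_CLOCK_HZ // SAMPLE_RATE_HZ
print(clock_ratio)                             # 50
```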

  The second FPGA board multiplies the incoming signal by a window function and performs a 512-point FFT with 50 % overlap.  Because of resource limits, FFT processors can be allocated for only 4 channels at a time, so high-speed computation is required to process the data of all 252 channels.
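  The framing and the resulting time budget can be sketched in Python as follows. The Hann window is an assumption, since the window function is not named here, and the budget assumes that four FFT processors run in parallel and share the 252 channels evenly; windowed_spectra is an illustrative name, not a function of the actual FPGA design.

```python
import numpy as np

FRAME = 512                # FFT length in samples
HOP = FRAME // 2           # 50 % overlap means a new frame every 256 samples
SAMPLE_RATE_HZ = 48_000
CHANNELS = 252
FFT_PROCESSORS = 4         # assumed to work in parallel

def windowed_spectra(signal, frame=FRAME, hop=HOP):
    """Split one channel into overlapping frames, window them, and apply the FFT."""
    window = np.hanning(frame)  # assumed window function
    count = 1 + (len(signal) - frame) // hop
    frames = np.stack([signal[i * hop : i * hop + frame] for i in range(count)])
    return np.fft.fft(frames * window, axis=1)

# Example: a 3 kHz tone falls exactly on bin 3000 / 48000 * 512 = 32.
t = np.arange(2048) / SAMPLE_RATE_HZ
spectra = windowed_spectra(np.sin(2 * np.pi * 3000 * t))
print(spectra.shape, np.argmax(np.abs(spectra[0, : FRAME // 2])))

# Time available before the next frame of every channel arrives.
hop_seconds = HOP / SAMPLE_RATE_HZ              # about 5.33 ms
ffts_per_processor = CHANNELS // FFT_PROCESSORS  # 63
budget_us = hop_seconds / ffts_per_processor * 1e6
print(f"{budget_us:.1f} microseconds per FFT")
```

  Under these assumptions each processor must finish one 512-point FFT in roughly 85 microseconds to keep up with the input.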

  The third FPGA board reads the weight coefficients computed in advance from each listener’s HRTF, multiplies the data sent from the second FPGA board by these weights in the frequency domain, and sums the data of all 252 channels.  It then performs the inverse FFT on the summed result.  The weight coefficients are provided for the left and right channels, so 2ch data are output in the end.  Because the DRAM holding the weight coefficients is 512 MB and the coefficients for one head position take around 1 MB, many sets of weight coefficients can be stored.  The multi-directional HRTFs can therefore be switched in real time according to the information from a head position sensor.  The head position sensor has not yet been installed in the built system, but once it is, the system will be able to choose the optimum weight coefficients from the head movement information the sensor provides.
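  The multiply, sum, and inverse-FFT step of the third board can be written compactly in Python. This is a sketch of the arithmetic only: the array layout, the function name binaural_synthesis, and the placeholder weights are assumptions, and real weights would be derived from the listener’s HRTF as described in [1,2].

```python
import numpy as np

CHANNELS = 252
BINS = 512

def binaural_synthesis(mic_spectra, weights):
    """Weight every channel per frequency bin, sum over channels, inverse FFT.

    mic_spectra: complex array, shape (channels, bins)
    weights:     complex array, shape (2, channels, bins), index 0 = left, 1 = right
    returns:     real array, shape (2, bins), one time-domain frame per ear
    """
    mixed = np.einsum("ecb,cb->eb", weights, mic_spectra)
    # The imaginary part vanishes when the weights are conjugate-symmetric;
    # any residue is discarded here.
    return np.fft.ifft(mixed, axis=1).real

# Placeholder data: random microphone frames and trivial weights that pass
# channel 0 to the left ear and half of channel 1 to the right ear.
rng = np.random.default_rng(0)
frames = rng.standard_normal((CHANNELS, BINS))
weights = np.zeros((2, CHANNELS, BINS), dtype=complex)
weights[0, 0, :] = 1.0
weights[1, 1, :] = 0.5

out = binaural_synthesis(np.fft.fft(frames, axis=1), weights)
print(out.shape)
```

  With these trivial weights the left output equals the frame of channel 0 and the right output equals half the frame of channel 1, which makes the sketch easy to check.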

3-2. Results

  The LabVIEW front panel of the built system is shown in Fig. 6.  The system lets us verify that processing runs in real time: we can observe the signals presented to both ears and continuously check for missing data and for FIFO overflow.  To make the system responsive to head movement, the weight coefficients for several head directions are written to the DRAM in advance, so that the weights of all channels can be switched during operation.  If the system is extended to select the optimum set from head position sensor information, it will be able to follow the listener’s head movement adequately in real time.
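  A back-of-the-envelope check shows why one set of weights comes to about 1 MB and how many sets fit in the DRAM. The coefficient format (4 bytes per complex value), the use of all 512 bins, the reading of 512 MB as 512 x 1024 x 1024 bytes, and the contiguous layout behind set_offset are all assumptions chosen for illustration; the actual memory layout is not described here.

```python
CHANNELS = 252
EARS = 2
BINS = 512
BYTES_PER_COEFF = 4                 # assumption: 16-bit real + 16-bit imaginary
DRAM_BYTES = 512 * 1024 * 1024      # assumption: 512 MB read as binary megabytes

# Size of the weights for one head direction.
set_bytes = CHANNELS * EARS * BINS * BYTES_PER_COEFF
max_sets = DRAM_BYTES // set_bytes

def set_offset(direction_index):
    """Byte offset of one weight set, assuming sets are stored back to back."""
    if not 0 <= direction_index < max_sets:
        raise IndexError("direction index outside the stored range")
    return direction_index * set_bytes

print(set_bytes)      # 1032192 bytes, close to 1 MB
print(max_sets)       # 520
print(set_offset(3))  # 3096576
```

  Under this layout, switching the head direction during operation amounts to changing the offset from which the weights are read.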

  In this research, we built a system that has not existed before: a high-precision sound recording and presentation system that uses a multi-channel microphone array and operates in real time.  In the future, we would like to deploy it in an actual sound field.  We will also carry out physical and psychological evaluations of the precision of its sound-spatial reproduction, improve the system, and apply the resulting knowledge to a wide range of research.

References

[1] S. Sakamoto, S. Hongo, R. Kadoi, and Y. Suzuki, “SENZI and ASURA: New high-precision sound-space sensing systems based on symmetrically arranged numerous microphones,” Proc. 2nd International Symposium on Universal Communication (ISUC), pp. 429-434 (2008).

[2] S. Sakamoto, J. Kodama, S. Hongo, T. Okamoto, Y. Iwaya, and Y. Suzuki, “A 3D sound-space recording system using spherical microphone array with 252ch microphones,” Proc. 20th International Congress on Acoustics (ICA), 736 (2010).

Acknowledgements

  This work was supported by the Strategic Information and Communications R&D Promotion Programme (SCOPE) No. 082102005 from the Ministry of Internal Affairs and Communications (MIC), Japan, and by a grant for the Tohoku University Global COE Program CERIES from MEXT, Japan.

Comments
Sadaf_Hussain, NI Employee (retired)

Hey SENZI,

Thank you so much for your project submission into the Global NI LabVIEW Student Design Competition. It looks great! Make sure you share your project URL (https://decibel.ni.com/content/docs/DOC-20973 ) with your peers and faculty so you can collect votes ("Likes") for your project and win. If any of your friends have any questions about how to go about "voting" for your project, tell them to read this brief document (https://decibel.ni.com/content/docs/DOC-16409). Be sure to list LabVIEW under the products you used. We are offering students who participate in our Global NI LabVIEW Student Design Competition the opportunity to achieve certification at a fraction of the cost. It's a great opportunity to test your skills and enhance your resume at the same time. Check it out: https://lumen.ni.com/nicif/us/academiccladstudentdiscount/preview.xhtml.

Good luck!

Sadaf in Austin, Texas
