Spatial audio coding

 

Research into spatial audio coding includes the proposed Spatially Squeezed Surround Audio Coding (S3AC), an approach that compresses multichannel audio signals (e.g. 5.1 surround) into a stereo or mono signal. Unlike many existing spatial audio coding solutions, S3AC requires no transmission or storage of side information to enable accurate decoding and synthesis of the original multichannel audio signals. This research was conducted with a former PhD student, Bin Cheng.

S3AC is particularly suited to coding spatial audio containing dynamically localized sound sources. Listed below are example files that were used in subjective listening tests published at ICASSP 2007. They include the original and decoded 5 channel surround audio files and can be played back on a 5 channel audio system using suitable software. In the S3AC approach, the stereo downmix can be further compressed with an existing audio coder, with little loss in quality when decoding back to the original 5 channel audio signals. For the files presented here, the stereo downmix signals were compressed using Advanced Audio Coding (AAC) at 128 kbps. The compressed downmixes were then decoded and converted back to 5 channel surround files using the S3AC decoder.
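Before playing a downloaded file on a surround system, it can help to confirm that it really carries 5 channels. The following is a minimal sketch that assumes the files are uncompressed PCM WAV files; the page does not state the file format, and the function names are hypothetical. It uses only Python's standard `wave` module, which may reject some multichannel WAV header variants on older Python versions.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_s) for a PCM WAV file.

    Assumption: the file is an uncompressed PCM WAV file that the
    standard-library wave module can open.
    """
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        rate = wf.getframerate()
        frames = wf.getnframes()
    return channels, rate, frames / rate


def is_five_channel(path):
    """True if the file carries exactly 5 audio channels."""
    return describe_wav(path)[0] == 5
```

For example, calling `is_five_channel` on one of the original or S3AC coded files should return `True` if the file is a 5 channel PCM WAV file as assumed.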

Sound Scene      Original    S3AC Coded
Airbus           File        File
Ambulance        File        File
Female Speech    File        File
Male Speech      File        File
Mosquito         File        File

 

   The entire file collection can be downloaded here

 

In addition to compressing 5 channel surround audio files, S3AC has been used for binaural reproduction of compressed surround sound files and for coding Ambisonic B-format audio. Further details can be found in the publications below.

 

Relevant Publications

[1]     Cheng, B., Ritz, C. and Burnett, I., “Spatial Audio Coding by Squeezing: Analysis and Application to Compressing Multiple Soundfields”, Proc. EUSIPCO2009, August 2009.

[2]     Cheng, B., Ritz, C. and Burnett, I., “Psychoacoustic-Based Quantisation of Spatial Audio Cues”, IET Electronics Letters, vol. 44, no. 18, pp. 1098-1099, Aug. 2008.

[3]     Cheng, B., Ritz, C., Burnett, I., “Binaural Reproduction of Spatially Squeezed Surround Audio”, Proc. of the 9th International Conference on Signal Processing (ICSP’08), Oct. 26-29, 2008, pp. 1-4.

[4]     Cheng, B., Ritz, C., Burnett, I., “A spatial squeezing approach to Ambisonic audio compression”, to be presented at the IEEE 2008 International Conference on Acoustics, Speech and Signal Processing (ICASSP'2008), Las Vegas, USA, Mar 30 – Apr 4, 2008.

[5]     Cheng, B., Ritz, C., Burnett, I., “Encoding Independent Sources in Spatially Squeezed Surround Audio Coding”, Advances in Multimedia Information Processing – PCM 2007, Proc. of the 8th Pacific-Rim Conference on Multimedia (PCM2007), Hong Kong, December 2007, Lecture Notes in Computer Science 4810, 2007.

[6]     Cheng, B., Ritz, C., Burnett, I., “Principles and Analysis of the Squeezing Approach to Low Bit Rate Spatial Audio Coding”, Proc. IEEE 2007 International Conference on Acoustics, Speech and Signal Processing (ICASSP'2007), Vol. 1, pp. 13-16, Honolulu, USA, 15-20 April 2007.

[7]     Cheng, B., Ritz, C., Burnett, I., “Squeezing the Auditory Space: A New Approach to Multi Channel Audio Coding”, Advances in Multimedia Information Processing – PCM2006, Proc. of the 7th Pacific-Rim Conference on Multimedia (PCM2006), Hangzhou, China, Nov. 2-4, 2006, Lecture Notes in Computer Science 4261, pp. 572 - 581, Springer-Verlag, 2006.