Interactive Media Design at University of Michigan: Ambisonics

I've still got a little bit of build to go, but I'm focusing on the sound creation this week. I think my project can be used in a variety of ways, but one of my biggest interests is the spatialization of sound: recreating 3D sound environments and manipulating them. I think my project lends itself very well to this application. The nature of the interaction with the Bucky is hemispherical, with the round outer edge easily translating to the horizon and the various motions back and forth from the edge mapping nicely to an overhead dome. The data that comes from the onboard IMU maps fairly easily to the coordinates of a virtual soundfield, as I showed in my previous post. The next step is figuring out how to implement the virtual soundfield. I had already picked out a specific library in MaxMSP, but in a conversation Dr. Gurevich brought up a competing implementation, so I thought I should take a look at the two for the sake of thoroughness.

The method I was planning on using is one that I have used before, but only in a binaural (headphone) implementation: the method known as Ambisonics. The other method is a newer system called Vector Base Amplitude Panning (VBAP). They both have their strengths and weaknesses, and I'll try to survey a few of them here. The VBAP system was developed by Ville Pulkki, who I believe is based in Helsinki. From my very cursory research, VBAP seems like an approach based on the traditional stereo panning concept, where the ratio of the intensity of sound coming out of two speakers gives the perception that the sound is actually somewhere in between the two speakers. VBAP allows you to extend this concept out to any number of loudspeakers. By using vector mathematics to set the relative position of the sound and then combining it in a matrix with the loudspeaker positions and distances, a convincing, easy-to-manipulate virtual soundfield can be created. Like the stereo system, it uses ratios of intensity to position the sound, but instead of a stereo pair it uses loudspeaker triplets, with loudspeakers above the plane, to give elevation information. It also smooths the transfer of sound from one triplet to the adjacent triplets in order to cover the full range of motion.
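To make the triplet idea a little more concrete, here is a minimal Python sketch of the gain calculation for a single loudspeaker triplet. It is my own illustration of the general idea, not Pulkki's MaxMSP code: the function names, the angle convention (azimuth counterclockwise from straight ahead, elevation up from the horizon), and the speaker layout are all assumptions. The source direction is written as a weighted sum of the three speaker direction vectors, and the weights are then scaled so the total power stays constant.

```python
import math

def unit_vector(azimuth_deg, elevation_deg):
    """Direction as a Cartesian unit vector (assumed convention: x = front)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def vbap_gains(source, speakers):
    """Solve source = g1*l1 + g2*l2 + g3*l3, then normalise to unit power."""
    l1, l2, l3 = speakers
    # Matrix whose columns are the three speaker direction vectors.
    A = [[l1[r], l2[r], l3[r]] for r in range(3)]
    d = det3(A)
    if abs(d) < 1e-9:
        raise ValueError("speakers do not span 3D space")
    gains = []
    for k in range(3):
        # Cramer's rule: replace column k with the source vector.
        Ak = [row[:] for row in A]
        for r in range(3):
            Ak[r][k] = source[r]
        gains.append(det3(Ak) / d)
    if min(gains) < -1e-9:
        raise ValueError("source lies outside this triplet")
    gains = [max(g, 0.0) for g in gains]  # clean up rounding noise
    norm = math.sqrt(sum(g * g for g in gains))
    return [g / norm for g in gains]

# Hypothetical triplet: front-right, front-left, and one raised speaker.
triplet = [unit_vector(-30, 0), unit_vector(30, 0), unit_vector(0, 45)]
print(vbap_gains(unit_vector(0, 0), triplet))
```

A source straight ahead on the horizon ends up shared equally between the two lower speakers with nothing in the raised one, which is just ordinary stereo panning falling out of the triplet math. A full system would also need to pick which triplet the source currently falls in, which this sketch leaves out.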

Ambisonics, on the other hand, is much less intuitive 'under the hood'; it uses some very cool psychoacoustics that I just barely understand to achieve the spatialization effect. It's an expansion of Alan Blumlein's invention, the Mid-Side microphone technique. Instead of the VBAP technique of actually localizing the sound in a triplet of loudspeakers, the position information is encoded in and emanates from ALL of the loudspeakers through a system of phase cancellation and correlation (I think... even after reading it a hundred times it still seems like magic to me. Probably why I'm a sound lover, this stuff just fascinates me). Ambisonics necessitates encoding and decoding stages on either side of the position determination, which can be processor intensive; this is one of the reasons that 5.1 Surround has surpassed Ambisonics for surround sound in consumer electronics. Up until recently, the processing power needed put the decoder price point way out of range for anyone but enthusiasts.
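The encode and decode stages are easier to picture with a tiny example. Below is a rough Python sketch of first-order encoding into the four traditional B-format channels (W, X, Y, Z), followed by a very simple decode that points a virtual cardioid microphone at each loudspeaker. The function names and the W weighting of 1/sqrt(2) are assumptions on my part (other conventions exist), and this is not how the HOA library is implemented; it is only meant to show that every speaker gets some of the signal.

```python
import math

def encode_b_format(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-format (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)                  # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)   # front-back figure-eight
    y = sample * math.sin(az) * math.cos(el)   # left-right figure-eight
    z = sample * math.sin(el)                  # up-down figure-eight
    return (w, x, y, z)

def decode_to_speaker(b_format, speaker_az_deg, speaker_el_deg):
    """Feed for one speaker: a virtual cardioid mic aimed at that speaker."""
    w, x, y, z = b_format
    az = math.radians(speaker_az_deg)
    el = math.radians(speaker_el_deg)
    directional = (x * math.cos(az) * math.cos(el)
                   + y * math.sin(az) * math.cos(el)
                   + z * math.sin(el))
    return 0.5 * (math.sqrt(2) * w + directional)

# A source straight ahead, decoded to a hypothetical square of speakers.
b = encode_b_format(1.0, 0, 0)
for speaker_az in (45, 135, -135, -45):
    print(speaker_az, round(decode_to_speaker(b, speaker_az, 0), 3))
```

Unlike the triplet approach, the speakers nearest the source are loudest, but the rear ones still carry a little of the signal too, and the decode only needs to know which way each speaker points.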

The two systems, as I understand them, produce relatively similar results, so the differentiation is in the implementation. The usage of one or the other may depend on the situation, with one better for permanent installations and the other for undetermined performance spaces. I don't have enough experience to make that call. The VBAP system necessitates entering each speaker's position and distance from the 'listener', and so is a little more difficult in initial setup. Ambisonics, as far as I can tell, is more speaker position 'agnostic' (at least within reason), making setup much easier and allowing for a variety of performance settings. The problem with Ambisonics is that it has a much tighter 'sweet spot' (though I understand that it is getting much wider as decoding speeds up and more thorough HRTFs are implemented), and there are certain perceived phasing artifacts if the listener moves their head too quickly in the sweet spot.

There are libraries for both methods readily available for MaxMSP. On the VBAP side, the library is written by the method's inventor, Pulkki, and he has a paper on the implementation here. On the Ambisonic side, there are a few libraries out there. This page at Cycling 74 has a couple of the proven ones, including the High Order Ambisonics (HOA) library from CICM, which I have used before.

I think I'm probably still going to go with my initial instinct and use Ambisonics. I would like to give a well thought out reason for this, but it's mostly based on my feeling that Ambisonics is just too cool of a psychoacoustic phenomenon not to play with. Also, the HOA library implementation is very advanced, including objects that aid in connecting the thousand patch lines inherent in spatialization.


Saturday, February 20, 2016
