Somebody recently asked me if there is a subcarrier on the video signal that carries the surround signal, and if so, what frequency it is on. Surround sound sent over standard TV or recorded on CDs or tapes does not work that way (yet). The dialog and surround channels are matrixed into left and right channels of the MTS (multichannel television sound) signal. Here is how it is done:

The cartoon to the right should help you to visualize which speakers respond to what parts of the signal. The Left and Right speakers respond mainly to their own stereo channels. The Dialog (center) speaker responds primarily whenever the left and right channels are in phase with each other. Meanwhile, the Surround speakers are responding primarily to information in the left and right channels that is 180 degrees out of phase. When the left and right channels are 90 degrees out of phase, all of the speakers reproduce that signal.

The Dialog speaker is also sometimes called the Center speaker. In addition, a subwoofer signal is derived from low frequencies in the dialog signal. Notice that both surround speakers carry the same signal. The separation and directional information is actually formed in your ears. The delay used in reproducing the surround signal is essential to this.

Four signals are generated by multichannel cinema/TV mixing boards. They are Left, Right, Dialog, and Surround. These are fed to the MP matrix encoder.

(See Encoding Surround Sound with your Mixer for a way to mix surround sound without the MP encoder).

The encoder first takes the Surround signal, filters it to a maximum of 7 kHz, and encodes it with the Dolby B (TM) noise reduction system (the method used in the link above omits these two steps). Then it matrixes the four signals down to two using the following equations:

Left_Matrix   =   Left  + .707 Dialog + .707 j Surround
Right_Matrix  =   Right + .707 Dialog − .707 j Surround

The factor "j" denotes a leading 90-degree phase shift. This is used to prevent the surround signal from canceling one side more than the other.
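As a rough illustration of the two equations, the sketch below treats each input as a complex phasor (the amplitude and phase of one steady tone), so that multiplying by Python's `1j` stands in for the leading 90-degree shift. The names `mp_encode` and `phase_gap_deg` and the single-tone model are assumptions made for this example. A real encoder uses wideband phase-shift networks, and it also band-limits and Dolby B encodes the surround first, which is left out here.

```python
import cmath
import math

K = 0.707  # the 3 dB factor from the matrix equations


def mp_encode(left, right, dialog, surround):
    """Return (Left_Matrix, Right_Matrix) for phasor inputs (illustrative)."""
    left_matrix = left + K * dialog + K * 1j * surround
    right_matrix = right + K * dialog - K * 1j * surround
    return left_matrix, right_matrix


def phase_gap_deg(a, b):
    """Size of the phase angle between two phasors, from 0 to 180 degrees."""
    return abs(math.degrees(cmath.phase(a / b)))


# Dialog only: the two matrix channels are in phase.
lm, rm = mp_encode(0, 0, 1, 0)
print(phase_gap_deg(lm, rm))

# Surround only: the two matrix channels are 180 degrees apart.
lm, rm = mp_encode(0, 0, 0, 1)
print(phase_gap_deg(lm, rm))

# Dialog and surround together, same level and phase.
# With the 90-degree shift, both sides keep the same magnitude.
lm, rm = mp_encode(0, 0, 1, 1)
print(round(abs(lm), 3), round(abs(rm), 3))

# Without the shift, one side doubles and the other cancels completely.
print(K * 1 + K * 1, K * 1 - K * 1)
```

The last two prints show the point of the "j" term: with it, a sound that appears in both Dialog and Surround loads the two matrix channels equally instead of piling up on one side.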

These two channels are placed on the optical tracks of films, recorded on the stereo tracks of video tape, or transmitted over MTS TV signals. They are also usually put on soundtrack music albums (and have been since 1977).

One way to look at how this occurs is to look at how the signals coming out of the encoder are engraved onto a phonograph record. The groove vibrates the stylus (needle) of a phonograph pickup in different directions, depending on the direction each recorded sound is meant to come from in the final playback.

A matrix decoder can be used to recover the four signals, using these equations:

Left_Speaker      =   Left_Matrix
Right_Speaker     =   Right_Matrix
Dialog_Speaker    =   .707 Left_Matrix  + .707 Right_Matrix
Surround_Speaker  =   .707 Right_Matrix − .707 Left_Matrix

Obviously, the separation is incomplete. The values using .707 are only 3 dB down from the others. These are the contents of each of the four outputs if a simple passive matrix is used. Notice the absence of the diagonally opposite channel from each equation:

Left_Speaker     =  Left     + .707 Dialog + .707 j Surround
Right_Speaker    =  Right    + .707 Dialog − .707 j Surround
Dialog_Speaker   =  Dialog   + .707 Left   + .707 Right
Surround_Speaker =  Surround + .707 Left   − .707 Right
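To see where the 3 dB figure comes from, here is a small sketch that encodes a single source and runs it through the passive equations. It uses a single-tone phasor model, and `mp_encode`, `passive_decode`, and `level_db` are names made up for this example, not anything from a Dolby specification.

```python
import math

K = 0.707  # the 3 dB factor from the matrix equations


def mp_encode(left, right, dialog, surround):
    """Matrix four phasor inputs down to (Left_Matrix, Right_Matrix)."""
    return (left + K * dialog + K * 1j * surround,
            right + K * dialog - K * 1j * surround)


def passive_decode(left_matrix, right_matrix):
    """Apply the fixed passive decoder equations."""
    return {
        "Left": left_matrix,
        "Right": right_matrix,
        "Dialog": K * left_matrix + K * right_matrix,
        "Surround": K * right_matrix - K * left_matrix,
    }


def level_db(x):
    """Level in dB relative to a full-level signal; -inf for silence."""
    mag = abs(x)
    return 20 * math.log10(mag) if mag > 1e-9 else float("-inf")


# A source present only in the Left input.
left_only = passive_decode(*mp_encode(1, 0, 0, 0))
for name, value in left_only.items():
    print(f"{name:9s}{level_db(value):6.1f} dB")

# A source present only in the Dialog input.
dialog_only = passive_decode(*mp_encode(0, 0, 1, 0))
for name, value in dialog_only.items():
    print(f"{name:9s}{level_db(value):6.1f} dB")
```

In the Left-only case the Dialog and Surround outputs sit about 3 dB below the Left output while the Right output is silent. In the Dialog-only case Left and Right each sit about 3 dB down and Surround is silent. That is the "diagonally opposite channel" pattern in the equations above.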

The above is the output of a cheap passive decoder. To increase the separation, the Dolby Pro-Logic (TM) matrix decoder actually changes the equations to match the content of the program. Here are two examples of the momentary coefficients obtained during operation. The first shows a strong signal to the left:

Left_Speaker      =   Left_Matrix
Right_Speaker     =   Right_Matrix
Dialog_Speaker    =   .383 Left_Matrix  + .924 Right_Matrix
Surround_Speaker  =   .924 Right_Matrix − .383 Left_Matrix

Giving these final values:

Left_Speaker     =       Left     + .707 Dialog + .707 j Surround
Right_Speaker    =       Right    + .707 Dialog − .707 j Surround
Dialog_Speaker   =  .924 Dialog   + .383 Left   + .924 Right
Surround_Speaker =  .924 Surround + .383 Left   − .924 Right

The other shows a strong dialog signal:

Left_Speaker      =   .924 Left_Matrix  − .383 Right_Matrix
Right_Speaker     =   .924 Right_Matrix − .383 Left_Matrix
Dialog_Speaker    =   .707 Left_Matrix  + .707 Right_Matrix
Surround_Speaker  =   .707 Right_Matrix − .707 Left_Matrix

Giving these final values:

Left_Speaker     =  .924 Left  + .383 Dialog + .924 j Surround
Right_Speaker    =  .924 Right + .383 Dialog − .924 j Surround
Dialog_Speaker   =  Dialog     + .707 Left   + .707 Right
Surround_Speaker =  Surround   + .707 Left   − .707 Right

Similar conditions occur when the other channels are strong. If all channels are equal, the coefficients of the simple matrix are used. The effect is to reduce the level of the strongest signal in the other speakers. Notice that the strong signal is reduced to .383 in adjacent channels, while that speaker's own desired signal is cut to only .924 of its former level. With proper timing of the matrix changes, they are inaudible. The separation is actually increased more, but I decided to keep the math simpler.
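The coefficients .924 and .383 in the two examples are, to three places, the cosine and sine of 22.5 degrees, just as .707 is the cosine (and sine) of 45 degrees. The sketch below uses that observation to rotate the Dialog/Surround decoding pair by an angle and report how much of a Left-only source leaks through. The function `steered_dialog_surround` and its single `angle_deg` parameter are my own illustration. They are not a description of how a Pro-Logic decoder actually derives or smooths its coefficients.

```python
import math


def steered_dialog_surround(left_matrix, right_matrix, angle_deg):
    """Decode Dialog and Surround with coefficients (sin a, cos a).

    angle_deg = 45 reproduces the passive matrix (.707, .707);
    angle_deg = 22.5 reproduces the (.383, .924) example.
    """
    a = math.radians(angle_deg)
    s, c = math.sin(a), math.cos(a)
    dialog = s * left_matrix + c * right_matrix
    surround = c * right_matrix - s * left_matrix
    return dialog, surround


# Left-only source: Left_Matrix = 1, Right_Matrix = 0.
for angle in (45.0, 22.5):
    dialog, surround = steered_dialog_surround(1.0, 0.0, angle)
    leak_db = 20 * math.log10(abs(dialog))
    print(angle, round(abs(dialog), 3), round(abs(surround), 3),
          round(leak_db, 1))
```

With the passive coefficients the Left source leaks into Dialog and Surround at .707 (about 3 dB down). With the steered coefficients it leaks at .383 (a little over 8 dB down), which matches the first example above.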

After being decoded, the Left, Right, and Dialog channels are sent to their amplifiers and speakers. The Surround channel undergoes some further processing. Since leakage of dialog into the Surround channel is the most objectionable of any leakages, these processing elements are used:

  1. Since sibilants are the most likely to leak from imperfect cancellation, the surround channel is filtered to remove all highs above 7 kHz. This does not cause much of a problem, because the highest frequencies are not directional, and the Left and Right speakers can fill in this information.
  2. The Surround channel is then Dolby B decoded, reversing the encoding done at the studio.
  3. The Surround channel is delayed to arrive at the ears of the listener AFTER the sound from the other speakers gets there.
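
Steps 1 and 3 can be sketched on sampled audio as shown here. Nearly every number in it is an assumption for illustration: the 48 kHz sample rate, the 20 ms delay (no figure is given above), and the simple one-pole low-pass, which rolls off far more gently than a real decoder's filter would. Only the 7 kHz corner comes from the list. Step 2, the Dolby B decoding, is left out entirely.

```python
import math

SAMPLE_RATE = 48_000  # assumed sample rate in Hz
CUTOFF_HZ = 7_000     # corner frequency from step 1
DELAY_MS = 20         # assumed delay; hypothetical value


def low_pass(samples, cutoff_hz=CUTOFF_HZ, sample_rate=SAMPLE_RATE):
    """One-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


def delay(samples, delay_ms=DELAY_MS, sample_rate=SAMPLE_RATE):
    """Delay by prepending silence; the output keeps the input's length."""
    n = int(round(sample_rate * delay_ms / 1000))
    return ([0.0] * n + list(samples))[:len(samples)]


def process_surround(samples):
    """Low-pass, then delay (Dolby B decoding omitted in this sketch)."""
    return delay(low_pass(samples))


# A single click at sample 0 comes out smoothed and 20 ms late.
click = [1.0] + [0.0] * 1999
out = process_surround(click)
first_sound = next(i for i, v in enumerate(out) if v != 0.0)
print(first_sound, first_sound / SAMPLE_RATE * 1000, "ms")
```

The delay is implemented as plain leading silence, which is enough to show the idea that the surround sound reaches the listener after the front speakers' sound does.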

These taken together have these good effects:

  1. Sounds to the side can be heard even when the listener is facing forward. The delay gives better cues to location than the old quadraphonic systems could do.
  2. The dialog is well focused onto the screen, even with the simple matrix.
  3. The surround channel is heard only when it is dominant.

Fancier decoders add a special decorrelation circuit to the Surround channel. It provides two slightly different signals to feed to the two surround speakers. The purpose of this is to keep the listener from locking in to the location of the Surround speakers themselves. Another trick to prevent locking in is the bipolar speaker. This spreads out the sound source and fools the ear so it cannot lock in.

The system called AC-3, or Dolby Digital (TM), actually does have one discrete channel for each of the 6 speakers, including the subwoofer. These channels are encoded into digital broadcasts and DVDs, but cannot be encoded into analog sources.