The Six Elements of a Film Mix

  1. Balance - volume relationships between sonic elements
  2. Frequency Range - having all frequencies properly represented
  3. Panorama - placing a sonic event within the sound field
  4. Dimension - adding ambience to a sound
  5. Dynamics - controlling the volume envelopes of a track or instrument (automated level control)
  6. Interest - making the mix special

Balance

Good balance starts with a good arrangement. It is important to understand arrangement since so much of mixing is subtractive by nature - changing or removing sounds or textures that don't fit together well.

If two elements play in the same frequency band at the same volume at the same time, they fight each other for attention. For example, in music you rarely hear a lead vocal and a guitar solo at the same time: the listener cannot focus on both simultaneously and becomes confused and fatigued as a result.

Elements of Arranging

Voice - The primary focus of most film work. The voice needs to be clear and intelligible at all costs.

Concrete sounds - Those sounds that appear to be connected to the image. In general these belong to the diegesis, pertaining to the reality within the film environment of the characters, whose point of view would incorporate these sounds. This usually implies sound effects, rather than ambient sounds: door slam, audience boos, telephone ring, clock tick, baby cry, etc.

The concrete sounds are usually derived from the visual elements like movement, weight, solidity, resistance, etc. When done properly these sounds should seem invisible, seamless and believable.

At the other extreme, creating a cartoon character in a "real" human world - Roger Rabbit, for example - depends entirely on bringing the character to life by giving him the weight and texture of an inflatable, plastic, stretchy body that can do things a "real" body can't.

Musical sounds - Any of the above sounds might also fall into this category if they become disassociated from the actual environment and turn into a kind of sensorial or emotional element independent of the characters' space-time reality within the story. For example: a ticking clock in a room is a concrete sound, but if it becomes unlinked from the image and begins to pervade the scene with its emotional urgency or relentlessness as the underlying tone, it falls into the area of musical sounds. Ambient sounds can be considered in this division, especially when they are not creating a reaction in the characters but rather setting a general mood. Examples could be: whistling wind, air conditioner hum, pounding surf, playground laughter, crackling fire.

Music - Both diegetic music (singers or musicians on the screen, radio, record player, etc.) and suggestions for scenes where background music or sound design are required. Background sounds and noises can become musical motifs. Often a sound effect can suggest a complementary musical idea: something with a definite periodicity, like windshield wipers or a dripping faucet, can lead to a specific beat in the music. A great musical example is "Money" by Pink Floyd.

Room Tone/Ambience - It is extremely important to have a recording of the on-screen locations to be used to cover edit points. Otherwise room tone has to be artificially inserted on top of the ambient tone that accompanies the dialogue tracks, adding another layer of mud for the voice to cut through.

Limit the number of elements - You seldom want them all playing at the same time.

The ear can perceive many different simultaneous sounds if they are separated by frequency.

Everything needs its own frequency - The arrangement or mix will work best if each element sits in its own frequency range. In general, it is difficult for an audience to absorb more than two separate audio elements at any one moment, and these must have good separation of pitch, timbre, or rhythm. If more elements are added, or if the acoustic characteristics are too similar, the sounds will begin to blend, creating a single information pattern for the brain to interpret. Often, varying the volume of the tracks during the mix can create a psycho-acoustic space between potentially fusing sounds, allowing each to command attention for its moment on the stage. An extreme example of this occurs in Air Force One during the confrontation between the President (Harrison Ford) and the Russian terrorist (Gary Oldman). Fight scenes are especially difficult.

Another problem can occur with contemporary, fast-paced editing, such as the pod race in Star Wars: Episode I. The visual image can be cut faster than the brain can process the sound. As the shots get shorter and shorter, the sound bites get shorter as well, and less distinctive - even though the sounds differed from one pod to another, they became just bursts of noise. There wasn't time for the pitch to be heard; some shots were as short as 9 frames. The sounds were stretched (phase-vocoded) to develop distinctively different timbres and effectively panned to create space between them.

This is why scenes that involve multiple voice parts - people fighting and talking at the same time - are very difficult to mix clearly unless each voice has its own distinctive place.

When two elements clash, there are several remedies:

  1. Change the arrangement and rerecord the track.
  2. Mute the offending sounds so that they never play at the same time.
  3. Lower the level of the problem sound.
  4. Tailor the EQ so the offending sound occupies a different frequency range.
  5. Pan the offending sound to a different location.

Panorama - Sound Placement

The stereo sound system represents sound spatially. Panning lets you select where in space the sound should be placed. It can also create excitement by adding movement to the track and adding clarity to an idea by moving it out of the way of other sounds that may be clashing with it.

Correct panning makes the mix sound bigger, wider and deeper.
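
Pan position maps to per-channel gain. As a rough sketch (the function name and the choice of a constant-power law are my own illustration, not from these notes), a constant-power pan law keeps the total acoustic power steady as a sound moves across the stereo field, avoiding the level dip a naive linear crossfade produces at center:

```python
import math

def constant_power_pan(pan):
    """Constant-power pan law.
    pan: -1.0 = hard left, 0.0 = center, +1.0 = hard right.
    Returns (left_gain, right_gain); L^2 + R^2 is always 1,
    so perceived loudness stays constant as the sound moves."""
    theta = (pan + 1.0) * math.pi / 4.0   # map [-1, 1] onto [0, pi/2]
    return math.cos(theta), math.sin(theta)

# At center each channel sits at about 0.707 (-3 dB), not 0.5.
```

A sound panned to center comes out of both speakers at about -3 dB rather than half volume, which is why it does not sound quieter than the same sound panned hard to one side.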

Stereo was invented in 1931 by Alan Blumlein, an employee at EMI in England. (EMI forgot to renew the patent when it came up in 1959, just as the format was taking off.)

Stereo features a phenomenon known as the phantom center: the output of the two speakers combines to give the impression of a third speaker between them. This phantom center can shift as the balance of the music shifts from side to side, which can be very disconcerting to a listener. As a result, film has always relied on a third speaker channel in the center to keep the sound anchored.

The Power of Place - Keep important elements in the center of the mix, especially for a stereo piece designed to be played in a museum or movie theater; otherwise people sitting or standing toward the edges of the room will have trouble hearing the dialogue.

Three main areas of a panoramic mix - extreme left, extreme right and center.

Big Mono - Be careful not to use too many stereo sounds, especially from pre-recorded sound-effects or sample CDs. If every sound is a stereo file panned hard left and right, the end result loses depth and sounds mono.

Frequency Range

Sub Bass - 16 to 60Hz - sounds that are felt more than heard; they give the mix a sense of power, but too much will sound muddy. In a 5.1 surround mix these are the sounds sent to the subwoofer. Bombs, earthquakes, thunder.

Bass - 60 to 250Hz - contains the fundamental notes of the rhythm section; changing this range can change the musical balance, making it fat or too thin. Too much boost in this range makes the sound boomy.

Low-Mids - 250 to 2000Hz - the low-order harmonics of most instruments; boosting too much here can introduce a telephone-like quality. Boosting between 500 and 1000Hz makes instruments sound horn-like, while boosting in the 1000 to 2000Hz range can make sounds tinny. Excess in this range can cause listening fatigue.

High-Mids - 2000 to 4000Hz - if boosted, these upper-midrange frequencies can mask important speech-recognition cues, introducing a lisping quality to a voice and making the sounds formed by the lips, such as m, b, and v, indistinguishable. Too much boost, especially around 3kHz, can cause ear fatigue (Fletcher-Munson curve). Slightly dipping 3kHz on instruments and slightly boosting 3kHz on the voice can make the vocals more audible without reducing the volume of the instruments in that range, which would otherwise bury the vocal.

Presence - 4kHz to 6kHz - responsible for the clarity and definition of voices and instruments. Boosting in this range can make an instrument seem closer to the listener; reducing 5kHz makes the sound more distant and transparent.

Brilliance - 6kHz to 16kHz - controls the brilliance and clarity of sounds. Too much emphasis in this range can produce sibilance in the vocals.
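
The band boundaries above can be captured in a small lookup table. This is just an illustrative helper - the names `BANDS` and `band_for` are hypothetical, and the edges follow the ranges listed in these notes:

```python
# Frequency bands as listed above: (name, low Hz inclusive, high Hz exclusive).
BANDS = [
    ("sub bass",     16,    60),
    ("bass",         60,   250),
    ("low mids",    250,  2000),
    ("high mids",  2000,  4000),
    ("presence",   4000,  6000),
    ("brilliance", 6000, 16000),
]

def band_for(freq_hz):
    """Return the mix band a frequency falls into, per the table above."""
    for name, lo, hi in BANDS:
        if lo <= freq_hz < hi:
            return name
    return "out of range"

# band_for(3000) -> "high mids"; band_for(100) -> "bass"
```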

Always cut frequencies rather than boost them.

Usually better to increase a small amount at two frequencies than a large boost at one.

Don't EQ in solo mode - it might not fit into the overall mix. Drastic changes in the EQ of one sound will affect other sounds around it.

The fewer the instruments in a mix the bigger they need to be, conversely the more instruments you have the smaller each one must be to fit together with the other instruments of the mix.

If it sounds muddy cut some at 250Hz.

If it sounds honky cut some at 500Hz.

Cut if you're trying to make things sound better. Use a narrow bandwidth when cutting.

Boost if you're trying to make things sound different. Use a wide bandwidth when boosting.

You can't boost something that's not there.

If you want something to stick out, roll off the bottom.

To make something blend in, roll off the top.

Theoretically the ear can hear from about 15 to 20Hz at the low end up to 23kHz - as a child or teenager. The common range is usually stated as 20Hz to 20kHz. By the age of 60 most people can't hear above 8kHz. Most of the action occurs from 100Hz to 6kHz.

Most equalization plug-ins include presets that give a close approximation of speaker simulation. These can be used to re-create or place a sound as if it's coming from a TV or AM radio, and there are settings for things like a voice over a telephone, an LP record, the room next door, etc.
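
As an illustration of what such a preset does (a rough sketch, not any plug-in's actual algorithm), a "voice over a telephone" effect is essentially a band-limit to roughly the 300Hz-3400Hz telephone passband. The helper below approximates it with two one-pole filters; real presets use much steeper filters, and the function names and sample rate are my own assumptions:

```python
import math

def one_pole_coeff(cutoff_hz, sample_rate):
    # Smoothing coefficient for a one-pole filter at the given cutoff.
    return 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)

def telephone_effect(samples, sample_rate=8000):
    """Crude telephone band-limit: one-pole high-pass near 300Hz
    plus one-pole low-pass near 3400Hz."""
    a_lp = one_pole_coeff(3400.0, sample_rate)
    a_hp = one_pole_coeff(300.0, sample_rate)
    low_tracker = lp = 0.0
    out = []
    for x in samples:
        low_tracker += a_hp * (x - low_tracker)  # follows the low frequencies
        high_passed = x - low_tracker            # subtract lows -> high-pass
        lp += a_lp * (high_passed - lp)          # then low-pass the result
        out.append(lp)
    return out
```

Rumble and sub-bass are stripped by the high-pass and "air" by the low-pass, leaving the boxy midrange we associate with a phone line.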

Dimension

Adding effects.

The ambient field in which a track sits. Dimension can be captured in recording, but usually it must be enhanced or created by adding effects such as reverb, delay, or modulated delays such as chorusing or flanging. Dimension can be as simple as creating an acoustic environment, but it can also be the process of adding width or depth to a track or trying to spruce up a boring sound.

Move a track back in the mix - pan is left to right, effects are front to back.

The wetter a track is the further away it will sound. Long delays and reverbs push the sound away if they are loud enough.

Picture the performance in an acoustic space and realistically recreate that space around them. This space needn't be a natural one.

Smaller reverbs and short delays make things sound bigger. Reverb decays under one second and delays under 100 milliseconds tend to create acoustic space around a sound.

EQ tips for reverbs and delays

To make an effect stick out brighten it up.

To make it blend in filter out the highs.

If the part is busy - like drums - roll off the low end of the effect to make it blend in.

If the part is open, add low end to the effect to fill in the space.

If the source part is mono and panned hard to one side make one side of the stereo effect brighter and one side darker. Van Halen's guitar sound from their first two records.

Layer reverbs by frequency with the longest being the brightest and the shortest being the darkest.

Pan reverbs any way but hard left and hard right.

Return the reverb in mono and pan accordingly - all reverbs needn't be returned in stereo.

Get bigness from reverbs and depth from delays, or vice versa.

Use a bit of the longest reverb on all of the major elements of a mix to tie the environments together.

If delays are timed to the tempo of a track they add depth without being noticeable.

If you want a delay to stick out make sure that it's not timed to the tempo of a track.

Reverbs work better if they are timed to the tempo - on a snare, say, so the reverb dies just before the next hit occurs.
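
Timing a delay to tempo is simple arithmetic: one beat (a quarter note) at a given BPM lasts 60000/BPM milliseconds. A small hypothetical helper (the function name is my own):

```python
def delay_time_ms(bpm, note_division=4):
    """Delay time in milliseconds for one note division at a tempo.
    note_division=4 -> quarter note, 8 -> eighth note, 16 -> sixteenth."""
    beat_ms = 60000.0 / bpm            # one quarter-note beat in ms
    return beat_ms * 4.0 / note_division

# At 120 BPM: quarter note = 500 ms, eighth note = 250 ms.
```

The same arithmetic applies to reverb decay: choosing a decay just under the interval between hits keeps the tail from smearing into the next one.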

Dynamics - Compression and Gating

Compression is not as important in classical or jazz music, but in modern music dynamics processing plays a major role, both on CDs and in the movies.

The sound of modern records and films is heavily compressed - automated level control - to get every element as loud as possible. In a film this is essential to keep dialogue at a constant volume.

Gating - keeping a sound muted until it reaches a pre-determined threshold level (volume), at which point it is let through.

Correct compression makes a track seem closer, more aggressive and exciting. Everything is right in your face.
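
The two processes can be sketched per sample. This is a deliberately simplified model (instantaneous response, linear amplitude in the 0-1 range, hypothetical function names); real compressors and gates smooth the level with envelope followers and attack/release times:

```python
def compress(sample, threshold, ratio):
    """Downward compression: above the threshold, the output level
    grows at only 1/ratio of the input rate, evening out dynamics."""
    level = abs(sample)
    if level <= threshold:
        return sample
    squeezed = threshold + (level - threshold) / ratio
    return squeezed if sample >= 0 else -squeezed

def gate(sample, threshold):
    """Hard gate: anything quieter than the threshold is silenced."""
    return sample if abs(sample) >= threshold else 0.0

# compress(0.9, 0.5, 4.0) -> 0.6: the 0.4 of overshoot is reduced to 0.1.
# gate(0.05, 0.1) -> 0.0: low-level noise between lines is cut off.
```

After compression the whole track can be turned up, which is how every element ends up "as loud as possible" without peaks distorting.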

Interest

Figure out the direction of a scene or film - determined by the producer and the performance.

Develop the groove and build it like a house - from the ground up. Find the pulse of the scene and let all of the various sonic elements dynamically breathe with it. Find the most important element and emphasize it!

Motion is created through a subtle combination of panning, equalization, and effects processing. Combined, these elements can create dramatic movement and effects.

Psycho-acoustic Principles

The brain can discern individual simultaneous sounds when there is a separation of frequencies. Make observations to determine what range of pitch will be most appropriate to use in conjunction with the other sounds occurring in the scene.

Fletcher-Munson curve - The ear does not hear all frequencies equally. It accentuates sounds between 2kHz and 5kHz, and even more so between 3kHz and 4kHz. This is the frequency range of human speech, which helps our species communicate effectively.

You can make a long scene play shorter if there is sufficient information to warrant the slower pace. Off-screen sounds can motivate the characters' thought processes and actions. A motivated noise like a window breaking or a baby crying can provide sonic coverage for a visually slow moment. For example, in The Godfather, when Al Pacino's character shoots a man for the first time, an off-screen train that is never seen is used quite effectively.

A scene may also play shorter if you overlap the dialogue or background sound effects between two scenes, in effect compacting the information. This very common technique can serve to propel the audience into the next scene with a greater sense of urgency, curiosity, or humor.

Synchronous vs. Asynchronous Sounds

In the early days of motion pictures the major debate over sound was initially concerned with the use of voice in the "talkies." Filmmakers believed the use of the voice would destroy the universality of film - film as a unique language of silent images and montage. The fear was that dialogue could lead to the filming of theatrical works (recreating plays and vaudeville performances on film), as opposed to staking out a unique idiom which they believed they were doing throughout the silent era. Consequently, some filmmakers began to employ sound effects, but stayed clear of dialogue.

Early theories held that if an image is seen, it wouldn't need to be heard, and if it was heard, it wouldn't need to be seen. This brought about the idea of a more economical use of visual editing. If we could hear certain events, they wouldn't need to be explicitly filmed. For example: a person is looking out of a window, then we hear a door slamming and a car starting up and driving away. This could all be portrayed with the single image of a person looking out of the window, concentrating on their reaction to the events, as opposed to a visual cut to the door slamming and then a cut to the car pulling away before returning to the expression on the actor's face. Those events could be portrayed off screen through the use of sound.

The next step was to bring about a more psychological use of sound by portraying what the actor might hear "internally" instead of recreating the "realistic" sounds that would obviously be present in a scene. The following example written by V. L. Pudovkin will illustrate this idea.

"For example, in actual life you, the reader, may suddenly hear a cry for help; you see only the window; you look out and at first see nothing but moving traffic. But you do not hear the sound natural to these cars and buses; instead you hear still only the cry that first startled you. At last you find with your eyes the point from which the sound came; there is a crowd, and someone is lifting the injured man, who is now quiet. But, now watching the man, you become aware of the din of the traffic passing, and in the midst of its noise there gradually grows the piercing signal of the injured man: his suit is like that of your brother, who, you now recall, was due to visit you at two o'clock. In the tremendous tension that follows, the anxiety and uncertainty whether this possibly dying man may not indeed be your brother himself, all sound ceases and there exists for your perceptions total silence. Can it be two o'clock? You look at the clock and at the same time you hear its ticking. This is the first synchronized moment of an image and its caused sound since you first heard the cry."

From page 97, Film Sound, Theory and Practice edited by Elisabeth Weis and John Belton.

Three excellent books, each pertaining to a different aspect of sound and image are:

Film Sound, Theory and Practice edited by Elisabeth Weis and John Belton. 1985, Columbia University Press, New York.

This book presents a history of sound in film, early and modern critical theories of sound in film, and the economic motivation behind the rapid introduction of sound in film.

Sound Design: The Expressive Power of Music, Voice and Sound Effects in Cinema by David Sonnenschein. 2001, Michael Wiese Productions, Studio City, CA.

This volume offers a systematic method for creating sound design step-by-step, with many examples of how specific sounds were created in many well-known films.

Sound for Film and Television, second edition, by Tomlinson Holman. 2002, Focal Press, Boston.

This book deals with specific equipment, technical information on microphones, recording techniques, file formats, mixing, editing and mastering film sound.