Getting started with sound recording

From Wikiversity

This page is for material on sound recording - what you need to record sound, and how to get started. Please add ideas or information if you have any knowledge or interest in the area.

What is Sound Recording?

Sound recording means capturing sound onto a storage medium so that it can be archived and played back later. It is not just a matter of storing sound but also of preserving its quality: the playback should be as faithful as possible to the original sound. To do this we need the proper equipment, along with some essential skills and knowledge.

What do you record sound with and what do you record into?

Steps in Digital Recording


Digital recording is accomplished using a DAW - Digital Audio Workstation. A DAW can be a stand-alone unit or the combination of a computer and software.

  • Step 1: Sound (vibration of air molecules) is captured by the diaphragm of a microphone and is converted into an electrical signal. This is the analog step.
  • Step 2: As the electrical signal is very low (mic-level), it must be amplified by a pre-amplifier to line level.
  • Step 3: The amplified electrical signal is converted into a stream of digital samples, each a number made of binary digits (bits), by the A/D converter (analog-to-digital converter). Higher voltages are assigned larger numbers, and the range of available numbers depends on the bit depth (CD quality is 16-bit; DVD-Audio supports up to 24-bit). The number of samples taken every second is the sample rate (CD quality is 44.1 kHz; DVD-Audio supports up to 192 kHz).
  • Step 4: The digital information is stored on a medium such as a hard drive.
  • Latency correction: Steps 1 to 4 take time. This is not a problem if you record only one track. If you record track by track (say, guitar first, then voice), you play back the first track while recording the second, and the new track ends up shifted by the latency. To measure the latency, record a track of short clicks, then play the clicks back through the speakers while recording them on a second track with a microphone. Zoom in and select the gap between the first click in track 1 and the first click in track 2. At the bottom of the Audacity window you can read the length of the selection in milliseconds; that is the latency of your recording device (for example, 125 milliseconds). Enter this as a negative value (here, -125 milliseconds) in the latency settings. After this correction, all recorded tracks are in sync.
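The click measurement described in the latency step can also be sketched in code. This is only an illustration, assuming both tracks are already available as lists of samples at a known sample rate. The function names `first_click_index` and `latency_ms`, the threshold and the test data are made up for this example and are not part of Audacity.

```python
def first_click_index(samples, threshold=0.5):
    """Return the index of the first sample whose magnitude reaches the threshold."""
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i
    raise ValueError("no click found above threshold")


def latency_ms(played, recorded, sample_rate, threshold=0.5):
    """Estimate latency as the shift between the first click in each track."""
    shift = first_click_index(recorded, threshold) - first_click_index(played, threshold)
    return 1000.0 * shift / sample_rate


# Hypothetical data: a click at sample 1000 comes back 5513 samples later.
sample_rate = 44100
played = [0.0] * sample_rate
recorded = [0.0] * sample_rate
played[1000] = 1.0
recorded[1000 + 5513] = 0.8

measured = latency_ms(played, recorded, sample_rate)
correction = -measured  # the value to enter as latency correction
print(round(measured), round(correction))
```

With a real recording the clicks are not single samples, so the threshold would need adjusting to the actual click level.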

For listening to your recordings a similar, but reversed, process of step 1 to 4 is used to play back the recorded audio.
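To make Step 3 more concrete, here is a minimal sketch of how a voltage becomes a number and how much data the result takes up. It assumes an idealised converter whose input range is -1.0 to +1.0; the names `quantize` and `bytes_per_second` are hypothetical, and real converters are more involved.

```python
import math


def quantize(voltage, bit_depth, full_scale=1.0):
    """Map a voltage in [-full_scale, +full_scale] to a signed integer sample."""
    max_code = 2 ** (bit_depth - 1) - 1   # 32767 for 16-bit
    min_code = -(2 ** (bit_depth - 1))    # -32768 for 16-bit
    code = round(voltage / full_scale * max_code)
    return max(min_code, min(max_code, code))  # out-of-range input is clamped


def bytes_per_second(sample_rate, bit_depth, channels):
    """Data rate of uncompressed audio."""
    return sample_rate * bit_depth // 8 * channels


# One cycle of a 441 Hz sine wave sampled at 44.1 kHz gives 100 samples.
sample_rate = 44100
samples = [quantize(math.sin(2 * math.pi * 441 * n / sample_rate), 16)
           for n in range(100)]

print(quantize(1.0, 16))               # 32767
print(quantize(1.0, 24))               # 8388607
print(bytes_per_second(44100, 16, 2))  # 176400
```

The last figure shows why bit depth and sample rate matter for storage: CD-quality stereo needs roughly 10 MB per minute before any compression.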

Wikipedia has more information on DAWs.

Digital vs Analog


A sound wave in the air isn't 'sound' until it is captured by a transducer. The human ear is a transducer. It takes the variations in air pressure caused by something, such as the vibration of a guitar string, and converts (transduces) them into something that the brain interprets as sound. A microphone is a transducer that converts the variations in air pressure into electrical signals, which are then an 'analog' of the sound. This audio can be recorded to tape and stay analog, or it can be converted to digital information and stored in a file on a computer hard drive.



At first, you will want convenience, ease of use, and low cost. But you will rapidly come to the limits of this path as you gain more knowledge and experience. You may still want convenience and ease of use, but more features and more flexibility will become more important to you as time goes by.

For most people today, a DAW (Digital Audio Workstation) is the mainstay of recording. In practice, the DAW is your computer running recording software. The sound is captured by a transducer, such as a mic or the pickup of an electric guitar, and sent to an audio interface or sound card, where it is converted to digital, processed by the software and stored on disk.

For a decent setup you will need a capable computer (a modern laptop will do), audio software, a USB or FireWire hardware audio interface, and a transducer (mic, guitar, etc.) connected to the interface with a cable.

Simpler setups work the same way, though the sound quality will usually suffer if you rely on the computer's built-in sound card. You've probably seen the mic input jack on your computer. This will work, but it's not that hard to set up an external hardware audio interface. Prices range from below $200 to thousands of dollars.

The hardware audio interface connects to your computer through USB or FireWire. There are also interfaces in the form of cards that fit into a slot inside the computer, or into a PCMCIA or ExpressCard slot, but these are more expensive. The easiest option is to stick with a USB or FireWire interface.

Some brands of audio interface are MOTU, M-Audio, Echo Digital Audio and Tascam. A quick search on your favorite search engine for 'hardware audio interface' will give an overwhelming list of options. It comes down to budget. Spend the most that you reasonably can, because cutting corners here will lead to problems down the road. This part of the discussion is a whole world in itself, but two or three evenings spent researching the topic should lead you to an option that works within your budget and allows for expansion in the future.

Briefly, the major considerations are:

1) How much audio do I need to record at one time? Am I going to record a whole band at once, or am I going to play one part at a time and layer the recording? This will help to decide how many audio inputs you need. Two inputs, plus a stereo output to feed your stereo system (so you can hear all your hard work), keep the cost down.

2) Consider the highest quality pre-amp your budget will allow. The pre-amp is part of the hardware audio interface. The lower the quality of the pre-amp, the poorer the signal that reaches the digital converter, which will lead to dissatisfaction as you get deeper into the art of recording.

3) Will I need MIDI functionality? MIDI, or Musical Instrument Digital Interface, is a whole topic in itself. Briefly, MIDI allows one to connect a synthesizer or other sound module to the DAW. With the advent of fast CPUs and lots of inexpensive RAM, this is less important for many small studios because of the ability to use plug-ins. See 'Plug-ins' below. One can get a cheaper audio interface without MIDI functionality that has a better pre-amp, and add a separate MIDI interface that connects via USB later if needed.



Wikipedia offers more information on microphones. Microphone choice and position are among the aspects of recording that have the biggest impact on the sound.

Condenser (capacitor) microphones


Condenser microphones require power (typically 48 V, called phantom power) to bias two metallic plates: a thin diaphragm and a fixed backplate, which together form a capacitor. Vibrations in the air move the diaphragm and change the capacitance, converting the vibrations into an electrical signal.

Dynamic microphones


Dynamic microphones can be either moving coil microphones or ribbon microphones. Moving coil microphones such as the popular Shure SM57 are the most durable type of microphone and can often withstand high sound pressure levels. Ribbon microphones are often much more fragile. Dynamic microphones do not need phantom power to operate. In fact, phantom power can damage some ribbon microphones, so it is safest to switch it off before connecting one.

Diaphragm size


The size of the microphone's diaphragm has an effect on its frequency response. Generally, microphones with smaller diaphragms, such as pencil condenser microphones, pick up higher frequencies well. Larger diaphragm microphones can pick up lower frequencies better.

Polar pattern


A microphone's polar pattern determines the direction in which the microphone "hears" sound. Popular polar patterns include:

  • Omnidirectional: picks up sound equally in all directions (360 degrees around microphone).
  • Bidirectional/figure eight: picks up sound mostly from front and back, rejecting it from the sides.
  • Cardioid: picks up sound mostly from the front.
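  • Hypercardioid: a narrower front pickup than cardioid, with a small lobe of sensitivity at the rear.
  • Supercardioid: between cardioid and hypercardioid, tighter than cardioid at the front, with a smaller rear lobe than hypercardioid.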

Microphones can either be fixed pattern (only having one), or may allow the polar pattern to be selected. Ribbon microphones are often figure 8. Moving coil microphones are often cardioid.
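The polar patterns listed above are often approximated by a single textbook formula, sensitivity = a + (1 - a) * cos(angle), where the angle is measured from the front of the microphone. The sketch below uses that idealised formula. The values of `a` are commonly quoted approximations, included here as assumptions; real microphones deviate from them, and their patterns vary with frequency.

```python
import math

# Assumed, idealised values of the omnidirectional fraction `a`.
PATTERNS = {
    "omnidirectional": 1.0,
    "cardioid": 0.5,
    "supercardioid": 0.37,
    "hypercardioid": 0.25,
    "figure eight": 0.0,
}


def sensitivity(pattern, angle_degrees):
    """Relative sensitivity at an angle (0 = front, 180 = back)."""
    a = PATTERNS[pattern]
    return a + (1.0 - a) * math.cos(math.radians(angle_degrees))


for name in PATTERNS:
    front, side, back = (sensitivity(name, deg) for deg in (0, 90, 180))
    print(f"{name:16s} front={front:+.2f} side={side:+.2f} back={back:+.2f}")
```

A negative value means the sound is picked up with inverted polarity, which is what happens at the back of a figure eight microphone.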

Microphone use


An overview of which microphones to use when recording (beginner's guide):

  • Guitar or any amplifier: unidirectional dynamic mic.
  • Drums: unidirectional dynamic mic.
  • Drum overheads: omnidirectional condenser.
  • Vocals: condenser for recording, unidirectional dynamic for live performance. (Although this is not an article about live performance, be aware that there is a large risk of breaking a condenser mic when playing live. They are fragile, and a small drop could destroy one.)
  • Acoustic guitars and other acoustic instruments: condenser, any polar pattern.
  • Other instruments (trumpets, etc.): condenser mic, unless the instrument is overly loud. Unless there is too much background noise, an omnidirectional mic is much easier to use and set up, because you do not have to point the microphone at the instrument and then deal with inconsistent pickup.
  • Non-instruments: same as "other instruments", since any sound source can be treated as an instrument.

Alternative conversions of sound to electrical signal



Microphones may be bypassed with certain electric instruments. For example, an electric guitar may be plugged directly into a recording device if that device supports instrument-level input. If not, the guitar's signal will have to be amplified to line-level by a pre-amplifier before being recorded. A guitar amplifier with a line out can provide this function.

  • Many amplifiers have 1/4" unbalanced TS (tip-sleeve) output jacks.
  • Keyboards/synthesizers have similar output jacks, but may be balanced TRS (tip-ring-sleeve).


  • If you use a computer to record and/or edit, you will need two items: a sound card or hardware interface and recording software.

Sound card


A sound card is an audio interface that plugs into a PCI or PCIe slot on the computer's motherboard. Most sound cards do not provide audio of as high a quality as an interface built for professional sound recording. For the consumer or hobbyist, however, a regular sound card will often suffice.

Audio Interface


The role of the audio interface, like that of a sound card, is to convert electrical signals carrying audio into digital information. It also does the reverse, converting digital information back into electrical signals. One long-established name in professional audio interfaces is Digidesign (now part of Avid), known for its interfaces and its audio editing software, Pro Tools.

Cables, Connectors and Adapters

  • XLR (XLR3): the three-pin connector used by microphones. It carries a balanced signal. The same connector is used to carry digital AES/EBU signals.
  • TRS (tip-ring-sleeve): a 1/4" plug, typically used for balanced signals.
  • TS (tip-sleeve): an unbalanced 1/4" plug.
  • 1/8": a smaller version of the 1/4" plug used by some sound cards. A small adapter can be used to convert from 1/4" to 1/8" or vice versa.
  • RCA: a consumer cable used in pairs to transmit stereo audio. Also used for S/PDIF digital audio.
  • Optical: a fiber cable used to carry ADAT Lightpipe or optical S/PDIF. (TDIF, by contrast, uses a multi-pin electrical connector rather than an optical cable.)

Computer software


Entry-level Consumer Software

  • Audacity
  • GarageBand
  • Goldwave
  • Sound Recorder

Prosumer to Professional Grade Software

  • Ableton Live
  • Acoustica Mixcraft
  • Adobe Audition
  • Ardour
  • Cubase
  • FL Studio
  • Logic
  • Nuendo
  • Pro Tools
  • Reason
  • Samplitude
  • Sequoia
  • Sonar



Plug-ins are pieces of software that integrate into recording software. A plug-in can be a virtual synthesizer, or some kind of sound effect like reverb. (Look down into Audio Signal Processing for more details on different types of effects.)

Plug-ins are the main tool the composer and audio engineer use to shape a recording on a computer or DAW (Digital Audio Workstation). With them, a songwriter can have access to a full complement of instruments and sounds, and can add a reverb that emulates a club, a church or even an open-air stadium.

Using plug-ins, one could lay down a drum track with a drum machine plug-in, add a bass line the same way with a synthesizer plug-in, then record vocals through a mic with a little reverb from another plug-in, and have a complete song, all on a laptop!

Plug-ins do require a powerful computer to work well, especially when many are used in real time: each one needs processor time and memory. Such computers are not hard to find these days, so this is usually not a real limitation.

On slower computers, or for personal, technical or even creative reasons, plug-ins can be used "off-line" to process a recording and then "turned off". For example, one might record a dry vocal with dry drums and bass (dry meaning not processed) so the software can maximize the computer's power to get a good quality audio recording of the vocals. Then the composer or engineer can process the dry audio with a plug-in and store it as the new recording. Then when it is played back, one will get the sound with the effect without using any more of the computer's resources. This is a strategy used all the time by audio engineers to layer and shape the final product.

Audio Signal Processing


Dynamics processing


Dynamics processors are also called insert effects, because the entire signal passes through them before going further.

  • Compression/limiting
  • Expansion/gating

Dynamic Range - To help the understanding, we need some terms defined. Audio has a dynamic range. Simply put, this is the difference between the highest and the lowest levels in a recording. For newcomers, level is the volume of the recorded sound, also known as amplitude. An amp, or amplifier, amplifies the amplitude: it increases the volume and raises the level.

That said, Dynamic Processing is the manipulation of the amplitude of the sound signal.
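Levels and dynamic range are usually expressed in decibels (dB). The following sketch shows the standard conversion from amplitude to dB and uses it to compute a dynamic range. The function names are hypothetical, and the bit-depth figure is the usual idealised estimate (about 6 dB per bit), not a measurement of any real device.

```python
import math


def to_db(amplitude, reference=1.0):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(amplitude / reference)


def dynamic_range_db(loudest, quietest):
    """Difference between the highest and lowest levels, in dB."""
    return to_db(loudest) - to_db(quietest)


def ideal_range_for_bit_depth(bits):
    """Idealised dynamic range of a digital system with the given bit depth."""
    return 20.0 * math.log10(2 ** bits)


# A recording whose loudest peak is 1.0 and quietest passage is 0.001:
print(round(dynamic_range_db(1.0, 0.001)))       # 60
print(round(ideal_range_for_bit_depth(16), 1))   # 96.3
print(round(ideal_range_for_bit_depth(24), 1))   # 144.5
```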

LIMITER - A limiter is nothing more than a special kind of compressor. It turns down the volume of the recording when the level comes within a preset amount of the maximum that the system can handle.

Sometimes the dynamic range is too much for the wires, speakers, digital processors, etc., and the signal clips. Just as the word implies, the level goes too high and the peaks get cut off, like the top of a tree when the pruning shears are applied. When that happens, part of the sound is lost and the result is heard as distortion. A limiter turns the level down before that point is reached, so the signal doesn't clip and the audio isn't lost.

Limiters can be external hardware that the audio passes through as an analog signal, or plug-ins where the signal is adjusted digitally. One example of its use is to place a limiter between the microphone and the recording medium. As the vocalist pops his p's or sputters, or just plain belts it out, the limiter works to turn down the volume just as it reaches the "limit" and keeps the audio from "punching a hole through the tape". This keeps the recording within limits and makes it easier to get a good recording that could otherwise be ruined even if only one part of it actually clips (that is, gets too loud and goes over the top).
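The difference between clipping and limiting can be shown with a small sketch. This is a deliberately simplified model, not how any particular limiter is built: `hard_clip` cuts the peaks off, while `simple_limiter` (a hypothetical name) turns the gain down instantly when a peak would exceed the ceiling and lets it recover gradually afterwards. The `ceiling` and `release` values are arbitrary assumptions.

```python
import math


def hard_clip(samples, ceiling=1.0):
    """What happens without a limiter: peaks above the ceiling are cut off."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def simple_limiter(samples, ceiling=0.9, release=0.999):
    """Turn the gain down when a peak would pass the ceiling, then recover slowly."""
    gain = 1.0
    out = []
    for s in samples:
        peak = abs(s)
        # Gain needed to hold this sample at the ceiling (1.0 if it is already below).
        needed = ceiling / peak if peak > ceiling else 1.0
        # Let the gain drift back towards 1.0, but never above what is needed now.
        recovered = 1.0 - release * (1.0 - gain)
        gain = min(needed, recovered)
        out.append(s * gain)
    return out


# A sine wave that is 50% too loud for the system.
signal = [1.5 * math.sin(2 * math.pi * n / 50) for n in range(200)]
clipped = hard_clip(signal)
limited = simple_limiter(signal)

print(max(abs(x) for x in clipped))
print(max(abs(x) for x in limited))
```

In the clipped version the tops of the wave are flattened; in the limited version the whole wave is scaled down around the peaks, so its shape is better preserved.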

COMPRESSOR - Reduces the dynamic range by bringing down the high levels, which in turn allows the low levels to be raised. Going back to the concept of dynamic range, when a recording is made it has quieter and louder parts. The compressor turns down the louder bits, and make-up gain can then bring the whole signal, including the quiet bits, back up. It's an art to make a compressor sound good, but it is virtually required in most recordings, especially if they are to have commercial exposure. All the wires, speakers, stereos, radio broadcast equipment, etc. have a range in which they generally work well. If the dynamic range is too great, the quiet parts get lost in electronic hiss and the loud parts can distort or even damage equipment. So, compression is a vital tool of the audio engineer.

A compressor can also be placed in the analog part of the signal (where the sound is in the wires and cables) as a piece of outboard gear. Or it can be inserted as a plug-in on the computer and process the sound digitally.
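The core of a compressor is a simple rule applied to the level in decibels. The sketch below shows that rule only, under assumed settings (threshold of -20 dB, ratio of 4:1); the function name `compress_db` is hypothetical, and a real compressor also has attack and release times that smooth the gain changes.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0, makeup_db=0.0):
    """Return the output level in dB for a given input level in dB."""
    if level_db > threshold_db:
        # Above the threshold, every `ratio` dB of input gives 1 dB of output.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db


# A loud passage at 0 dB and a quiet one at -40 dB: 40 dB of dynamic range.
print(compress_db(0.0))     # -15.0
print(compress_db(-40.0))   # -40.0

# With 15 dB of make-up gain the loud part is back at 0 dB and the quiet
# part has come up to -25 dB: the dynamic range is now 25 dB.
print(compress_db(0.0, makeup_db=15.0))    # 0.0
print(compress_db(-40.0, makeup_db=15.0))  # -25.0
```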

EXPANDER - An expander effectively does the opposite of a compressor. It takes the lowest levels and makes them lower, and the highest levels and makes them higher. This is a simplified explanation, but it is in essence what happens. Generally expanders are used for more technical reasons, but with an expander plug-in, anyone can experiment on their DAW and see what they come up with.

GATE - A sound processor that lets the signal through or blocks it, depending on whether it is above or below a preset level. For example, one might have a recording of drums in which, between the hits, one can hear the guitar coming through, though a lot quieter. Since the guitar has its own track in this example, it's not needed in the drum recording. Besides, later, when the engineer wants to compress the drums to make them punchier and generally louder, that guitar bleed could become quite distracting. So, one can set the gate to cut off everything between the drum hits that is quieter than the hits themselves, thereby cleaning up the drum tracks so that further processing can take place.

Creatively, a gate that opens and closes slowly can be used to make an electric bass sound a lot like a cello, or to make a vocalist sound like an alien. (Imagine you were making a sci-fi video track and the creature that has come to take over the earth has to sound really weird.) In these situations, one could make the gate open slowly as the bass is played so that, instead of sounding plucked, it sounds more like it was bowed, just like a cello.
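Both uses of a gate can be sketched with one small function. This is a simplified model with hypothetical names and values: it looks at each sample on its own, whereas a real gate follows the overall level of the signal and usually has a hold time so it does not flutter.

```python
def noise_gate(samples, threshold=0.1, open_step=1.0, close_step=1.0):
    """Pass samples at or above the threshold and block quieter ones.

    Steps of 1.0 open and close the gate instantly; smaller steps fade it.
    """
    gain = 0.0
    out = []
    for s in samples:
        if abs(s) >= threshold:
            gain = min(1.0, gain + open_step)
        else:
            gain = max(0.0, gain - close_step)
        out.append(s * gain)
    return out


# Drum hits (loud) with guitar bleed (quiet) in between.
drums = [0.8, 0.7, 0.05, 0.04, 0.9, 0.03]
print(noise_gate(drums))  # [0.8, 0.7, 0.0, 0.0, 0.9, 0.0]

# A slowly opening gate softens the start of a plucked note.
note = [1.0, 1.0, 1.0, 1.0, 1.0]
print(noise_gate(note, open_step=0.25))  # [0.25, 0.5, 0.75, 1.0, 1.0]
```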



The following are often called "send effects", because part of the signal is sent to them to be processed.

  • Reverb
  • Delay (Phasing, Chorusing, Flanging)
  • Distortion/overdrive
  • EQ

These kinds of effects are called send effects because a portion of the dry signal is sent to the processor. To picture this in the real world: if you go to a concert, you hear part of the sound directly from the performers and part of it reflected around the venue. A reverb is a send effect that takes part of a dry signal and emulates a club, a hall, or whatever the engineer chooses. The result is mixed with the dry signal to give the same basic feel as being in a live situation.


REVERB

Reverb is everywhere; the whole world is full of reverb. Everywhere you go, your ears are picking up reflections of the sounds around you. In fact, reverb is part of how the brain works out the direction of a sound. As the sound reflects back to your ears, it arrives at a different phase of its vibration, and the brain processes these differences to decide what direction it's coming from.

Of course, the brain being what it is, it can be tricked. Judicious reverb can make the brain think it's in a stadium when actually it's in some guy's living room between stereo headphones. Too much reverb can make the sound muddy and messy, with everything washed together.

Reverb can even be set so that the reflections it creates reinforce the rhythm of the music. If it is not applied well, it can actually disrupt the rhythm of the recording, because the sound reflections are out of sync with the beat.


DELAY

Delay is a repetition of the sound and is often called an echo in everyday speech. However, it is used very specifically in audio. The rock and roll sound of the fifties uses a short delay, sometimes called a slap delay, that is so short you wouldn't even know it was there until it is removed. It is a classic use of delay that fattens up the sound and creates that unique fifties rock and roll feel.

More involved delay can be heard in the characteristic sound of Pink Floyd. This involved lots of longer delays that repeat and wash into the sound. Sometimes delays are used to make a repeating loop that an artist can play along with and add to, building an ever-expanding texture.
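A repeating delay can be sketched as a short buffer that feeds part of its output back into itself. The code below is a minimal illustration with hypothetical names and settings; it also includes the common rule of thumb for matching a delay time to the tempo (one beat lasts 60000 / BPM milliseconds).

```python
def delay_ms_for_tempo(bpm, beats=1.0):
    """Delay time in milliseconds that lasts the given number of beats."""
    return 60000.0 / bpm * beats


def feedback_delay(samples, delay_samples, feedback=0.5, mix=0.5, tail=0):
    """Mix the dry signal with echoes that repeat and fade.

    delay_samples must be at least 1; tail adds silence so the echoes can ring out.
    """
    buffer = [0.0] * delay_samples  # circular delay line
    pos = 0
    out = []
    for s in list(samples) + [0.0] * tail:
        delayed = buffer[pos]
        buffer[pos] = s + delayed * feedback  # feed part of the echo back in
        pos = (pos + 1) % delay_samples
        out.append(s + delayed * mix)
    return out


print(delay_ms_for_tempo(120))  # 500.0

# A single click followed by silence: each echo is half as loud as the last.
echoes = feedback_delay([1.0], delay_samples=3, feedback=0.5, mix=1.0, tail=8)
print(echoes)  # [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.5, 0.0, 0.0]
```

A slap delay corresponds to a short delay time with little or no feedback; the long, washing delays use longer times and more feedback.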

DISTORTION and OVERDRIVE or better yet 'Overdrive Distortion'

Distortion and overdrive are interesting hybrid send effects. Following on from the discussion of dynamics processing above, distortion is what happens when the audio signal gets too big for its wires and starts to turn into noise. An overdrive effect behaves like a form of compression: instead of letting the signal get too loud, the processor pushes it into a clipped-sounding state without the recording itself clipping. Rock of the sixties took advantage of the smooth clipping of tube amplifiers when guitarists and keyboardists pushed the volume past the limits of the equipment to get that distorted or over-driven rock guitar sound. The overdriven sound was then recorded at the proper levels onto tape, so the tape wasn't distorted, just the sound going onto it. Done correctly, it has its place. Done incorrectly, the recording is ruined. Overdrive distortion effects make the signal distort, but the resulting signal is recordable.

Overdrive distortion is a hybrid send effect in that part of the signal gets distorted and part of it stays dry, depending on the settings of the effect. For instrumentalists, distortion can be an effects pedal in the audio cable from the instrument, or it can be applied digitally in the DAW (Digital Audio Workstation).
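The smooth clipping described above is often modelled with a curve that flattens gradually instead of cutting the peaks off. The sketch below uses the hyperbolic tangent for this, which is one common choice and not a model of any specific amplifier or pedal. The names `soft_clip` and `overdrive` and the `drive` and `mix` values are assumptions for illustration.

```python
import math


def hard_clip(x, limit=1.0):
    """Abrupt clipping: anything beyond the limit is cut off flat."""
    return max(-limit, min(limit, x))


def soft_clip(x, drive=1.0):
    """Smooth clipping: the output approaches +/-1 but never reaches it."""
    return math.tanh(drive * x)


def overdrive(x, drive=5.0, mix=0.7):
    """Blend the dry sample with its distorted version."""
    return (1.0 - mix) * x + mix * soft_clip(x, drive)


for level in (0.1, 0.5, 1.0, 2.0):
    print(level, hard_clip(level), round(soft_clip(level, drive=2.0), 3))
```

Raising `drive` pushes more of the signal into the flat part of the curve, and `mix` sets how much of the distorted signal is blended with the dry one.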

EQ

First, a quality pre-amp can vastly improve the tonal balance of a recording. No amount of EQ adjustment can properly fix an inadequate recording.

EQ, or equalization, is like the tone controls on your stereo, just more complex. Here you can boost or reduce the audio level at specific frequencies. For example, if the guitar is too "tinny" one might pull out 1 kHz while boosting gently around 2.8 kHz. Or for vocals, one might round out a thin sound by boosting 800 Hz and rolling off 8 kHz.

The frequency of a sound is its rate of vibration, measured in hertz, or Hz for short (vibrations per second). The human ear can hear from about 20 Hz up to 20,000 Hz under ideal conditions. In practice the ear hears a smaller range, but vibrations outside the range of hearing can still affect the recording. Typically, an audio engineer might roll off all the sound below maybe 40 Hz and above perhaps 16 kHz. (The k is shorthand for thousands, so 16 kHz is 16,000 hertz.)

Just like the other effects, EQ can be a piece of outboard gear that affects the analog signal or it can be a plug-in that affects the digital signal inside the DAW.
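Rolling off the low end, as mentioned above, can be sketched with the simplest kind of digital filter. This is a first-order high-pass filter, a gentle textbook design chosen here for brevity; EQ plug-ins generally use steeper and more flexible filters. All names and test frequencies are assumptions for the example.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate):
    """First-order high-pass filter: reduces content below the cutoff frequency."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def sine(freq_hz, sample_rate, seconds):
    """Generate a sine wave with a peak level of 1.0."""
    count = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]


def peak(samples):
    return max(abs(s) for s in samples)


sample_rate = 44100
rumble = high_pass(sine(20, sample_rate, 1.0), 40, sample_rate)
tone = high_pass(sine(1000, sample_rate, 1.0), 40, sample_rate)

# Compare levels in the second half, after the filter has settled.
half = sample_rate // 2
print(round(peak(rumble[half:]), 2), round(peak(tone[half:]), 2))
```

The 20 Hz rumble comes out at less than half its original level, while the 1 kHz tone passes almost unchanged.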


Recording sounds


These should probably be separate pages

  • How to record vocals
  • How to record ambient noise
  • Issues in recording sounds

External resources

  • Tips on sound recording
  • Guide to the Home and Project Studio - from Tweakheadz
  • Recording live to a two-track - by Bruce Bartlett
  • The Basic Program
  • Newbie Guide to Recording
  • Audio Engineering online course under Creative Commons Licence

Where can we take this?


Now that you've learned a bit about recording sound, where would you like to go? You may wish to learn, teach, discuss and apply what you can do by going into more depth in Wikiversity's Audio Engineering Department which is developing the technology needed for producing the soundtrack for Wikiversity the Movie. Perhaps you are a musician interested in Jamming Online in a music genre such as Basic Blues & Rock with a GarageBand or the like. Maybe you are a linguist and would like to work on the development of audio resources for the Wikimedia Commons like spoken versions of important texts. The possibilities are endless.

You may have a completely novel application for advanced sound recording. Wouldn't it be great to be on the ground floor of Wikiversity's sonic potential? Hmmm. Just a thought. Advanced User Modules