How we turn a raw recording into a polished podcast

When we receive a recording that needs editing and enhancing, it’s usually an MP3 file. The first thing we do is open it in Adobe Audition and convert it to a WAV file so that each time we save our work, we’re saving an uncompressed file. In other words, we’re not throwing away more audio data by re-encoding the recording as an MP3 with every save. We need it to have enough information to “manipulate” (i.e., enhance) so that it sounds its very best when we’re finished. Next, we adjust the volume so the recording sounds as loud as possible without sounding distorted or unnatural. We keep adjusting the volume as we work on the recording, so volume adjustment is an ongoing procedure.
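For the curious, here is a minimal Python sketch of the simplest kind of volume adjustment, peak normalization. It is only an illustration of the idea of "as loud as possible without distortion," not what Audition does under the hood. The function name `peak_normalize` and the target of 0.89 (roughly 1 dB below full scale) are our own assumptions for the example.

```python
def peak_normalize(samples, target_peak=0.89):
    """Scale float samples (range -1.0 to 1.0) so the loudest one reaches target_peak.

    target_peak=0.89 is an assumed ceiling, roughly 1 dB below full scale,
    which leaves a little headroom so nothing clips.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]

# A very quiet (hypothetical) snippet of audio
quiet = [0.0, 0.1, -0.25, 0.2]
louder = peak_normalize(quiet)
```

Every sample is multiplied by the same gain, so the balance between loud and soft moments stays the same. That is why a single pass is rarely enough and we keep adjusting by ear as the edit goes on.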

After that first volume adjustment in Audition, we open it in SpectraLayers and remove low frequency rumble (usually 80 Hz and below). If there are obvious plosives (popped P’s) that we can see on the spectral display (they look like little downward spikes usually below the 100 Hz line), we erase those with the eraser tool. We save the file and move on to the next and most important software we use.
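To give a feel for what removing rumble below 80 Hz means, here is a rough Python sketch of a basic first-order high-pass filter. This is an assumption-laden illustration, not the method SpectraLayers uses. Dedicated tools use much steeper filters, and all of the function names here are hypothetical.

```python
import math

def high_pass(samples, sample_rate=44100, cutoff_hz=80.0):
    """First-order high-pass filter: attenuates content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # filter time constant
    dt = 1.0 / sample_rate                  # time between samples
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out

def sine(freq_hz, seconds=1.0, sample_rate=44100):
    """Generate a full-scale test tone."""
    n = int(seconds * sample_rate)
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def settled_peak(samples):
    """Peak level of the second half, after the filter has settled."""
    half = samples[len(samples) // 2:]
    return max(abs(s) for s in half)

rumble_peak = settled_peak(high_pass(sine(20)))    # 20 Hz rumble
voice_peak = settled_peak(high_pass(sine(1000)))   # 1 kHz, well inside the voice range
```

With these settings, the 20 Hz test tone should come out at roughly a quarter of its original level, while the 1 kHz tone passes almost untouched. A gentle filter like this only turns the rumble down, which is one reason we reach for a spectral editor instead.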

We open the recording in iZotope RX 8 (the current version at this posting) and start the meticulous task of cleaning and editing the recording. We start by sampling the room noise (when no one is speaking) using the Spectral De-Noise tool. We’re careful not to reduce the noise by more than 12 dB because we want the people speaking to sound natural. Over-manipulating a recording can give it a robotic sound.
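If decibels feel abstract, this small Python sketch converts a dB change into an amplitude ratio using the standard formula (ratio = 10 raised to the power of dB divided by 20). The function name is our own, for illustration only.

```python
def db_to_amplitude_ratio(db_change):
    """Convert a level change in decibels to a linear amplitude ratio."""
    return 10 ** (db_change / 20)

# A 12 dB reduction in room noise
ratio = db_to_amplitude_ratio(-12)
print(f"Noise amplitude after a 12 dB cut: {ratio:.2f} of the original")
```

A 12 dB cut leaves the noise at about a quarter of its original amplitude. That is already a big reduction, which is why pushing further tends to cost more in naturalness than it gains in quiet.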

When we’re satisfied that we have reduced the background room noise sufficiently, we start at the very first word spoken in the recording and move word by word through the recording, removing clicks, breath noise, thumps, cars passing, computer alerts (like incoming email), and verbal flubs. So, basically all sounds other than voice are removed or reduced in the recording. Sometimes we rearrange words so the person speaking sounds coherent. For example, if a person forgot to add an S to a word meant to be plural, we copy an S from elsewhere in the recording and paste it to the end of the mispronounced word. Our job is to make the people talking sound their very best. Just a quick note here: we don’t rearrange words in an old recording (for example, someone’s grandmother speaking) because we are trying to preserve authenticity.

After we’ve finished combing through a recording and feel confident we’ve edited and enhanced it to sound the best it can, we mix in an intro and outro if one is available. By the way, Audiobag also creates intros and outros. We then save the final presentation in the format requested by the customer. We give the customer options to choose from on our Script and Instructions form (which the customer filled out when making a purchase from us). We explain on the form that a 320 kbps MP3 file is the highest MP3 quality they can get, but a 128 kbps MP3 file will download and stream more quickly and smoothly. Of course, the customer can also choose to have the finished presentation delivered as an uncompressed WAV file and then convert it to whatever format(s) they want.
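To make the tradeoff between those formats concrete, here is a quick Python sketch that estimates file sizes. It assumes a 45-minute recording, constant-bitrate MP3s, and a CD-quality WAV (44.1 kHz, 16-bit, stereo), and it ignores file headers and metadata. The function names are hypothetical, and "MB" here means one million bytes.

```python
def mp3_size_mb(minutes, kbps):
    """Approximate size of a constant-bitrate MP3 in megabytes."""
    bits = minutes * 60 * kbps * 1000
    return bits / 8 / 1_000_000

def wav_size_mb(minutes, sample_rate=44100, bit_depth=16, channels=2):
    """Approximate size of an uncompressed WAV in megabytes."""
    total_bytes = minutes * 60 * sample_rate * (bit_depth // 8) * channels
    return total_bytes / 1_000_000

minutes = 45  # assumed episode length
print(f"128 kbps MP3: {mp3_size_mb(minutes, 128):.1f} MB")  # 43.2 MB
print(f"320 kbps MP3: {mp3_size_mb(minutes, 320):.1f} MB")  # 108.0 MB
print(f"WAV:          {wav_size_mb(minutes):.1f} MB")       # 476.3 MB
```

Under these assumptions, the 320 kbps file is two and a half times the size of the 128 kbps file, and the WAV is more than four times larger again. That is the tradeoff between quality and download speed.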

Many people don’t realize that a 45-minute recording can take 3 or 4 days to edit and enhance. You can find a faster turnaround time from other editing companies, but the recording probably won’t be as clean as it could be. Like a fine wine, great editing takes time. If you’d like to learn more about Audiobag’s editing service, visit our Editing and Enhancing web page, where you can also hear samples of our work and place an order.

When should you consider buying a new microphone?

I can tell when a podcaster is using a poor-quality microphone (or an internal computer or smartphone microphone) versus a high-fidelity microphone. The voice sounds tinny on the poor-quality mic. In other words, the high frequencies are there but few, if any, low frequencies are present. Also, the dreaded popped P, known as a plosive, rears its ugly head. And often there is an ear-piercing sound on words with an S (“sibilance”). As an audio editor at Audiobag, I can remove or reduce plosives and reduce sibilance with post-production enhancing. However, I can’t add something that’s not there: low frequencies. So, the voice is going to be missing warmth. And there’s only one way to correct this problem. You need a decent microphone. The type of mic you choose depends on the sounds around you other than your voice. Oh yeah, and your budget, of course.

The first microphone I purchased for our old studio back in 1987 was a dynamic microphone, which is not as sensitive to sound as a condenser microphone. I chose a dynamic mic back then because we were recording our voices in the same room where our reel-to-reel recorder was located (yes, Audiobag started out in the days before digital audio) in a studio in downtown Georgetown right next door to a fire station. Talk about noise! So, I purchased a Shure SM7B*, a dynamic mic. All these years later, the Shure SM7B is one of the most recommended microphones for podcasters. One reason is that most podcasters are not recording in a soundproof room. A dynamic microphone tends to knock out distant noises (like your kids yelling, the dog barking, or the sound of the furnace in the background). These days we use a condenser microphone for the voice work we do at Audiobag because we are in a quiet soundproof room (with no analog equipment, thank you very much!). A condenser microphone has a thin diaphragm, which makes it more sensitive to detailed sound than a dynamic microphone. In other words, the words that come out of our mouths are picked up quite nicely by a condenser microphone, but so are other sounds. Luckily, there are no other sounds in our soundproof room.

So what microphone should you buy? Well, that comes down to your budget. I recommend spending at least $100 on a new microphone. And be sure to get a wind filter while you’re at it to knock out the popped P’s. If money is not a major concern, then you might want to start around $400. With that said, I’ve done a lot of testing of microphones and I’ve found that spending more doesn’t necessarily mean you’re going to get a better-sounding microphone. If the microphone doesn’t sound right to you, send it back. One side note here: I’ve enhanced podcasts where the podcaster used an expensive microphone and yet I still needed to roll off some of the low frequencies because the voice was too bassy. So, keep in mind that adjustments to your voice can be made in post-production to make you sound better. And if you need that done, as well as removal of verbal flubs and extraneous noise, check out Audiobag’s editing and enhancing service. Yep, that’s a not-so-hidden plug for what I do for a living. I make podcasters sound their best.

Listen back to your recording. Is it the best your voice can sound? Think about your listeners. They expect quality.

* Note: As an Amazon Associate, we earn from qualifying purchases.

How to remove a spike (crackle or pop) noise in your podcast


I was listening to a podcast the other day on one of my late afternoon walks (yes, we can take walks in the middle of winter here in Central Texas), and I was amazed that the podcast had an annoying spike noise throughout the show. I contacted the podcaster and offered up some quick advice on how to easily remove the noise. I thought I’d pass it along here as well.
