
BOSS Audio: Normalizing Your Audio

What is normalization and is there a standard for it? You’ve probably asked those questions, and this week’s episode has answers. Anne and VO Tech Guru Tim Tippets talk about peak normalization, RMS normalization, understanding your noise floor, and why louder does not always mean better. Check out this episode to learn optimal input levels, the difference between amplification and normalization, and how to avoid digital distortion. Use these tips to create dynamic audio and rock your business like a BOSS.


Quick Concepts from Today’s Episode:

  1. The broadcast standard is to “normalize your audio to -3 dB”. This is so people can hear you at a standard level, no matter what device they are on.

  2. Peak normalization takes the peaks of your audio and sets them to the level you specify. Everything else in your audio moves relative to this.

  3. RMS (root mean square) normalization averages the overall volume of your audio to the level you specify. This is standard for audiobooks, but not for traditional voiceover auditions. 

  4. Emotions in your voice show up as peaks and valleys in your audio

  5. Making your audio louder also brings up your noise floor and any background noise along with it

  6. If your input levels are at -6 to -12 dB when you are recording, and you are monitoring with your headphones, you can hear if there is any background noise

  7. You should be aware of the numerical values of your levels, not just the colors as they appear in your DAW and/or on your interface

  8. A dynamic VO that is done correctly will win out over an audition that is simply louder

  9. When everything is loud, nothing is unique, and there’s no way to discern emotion and nuance

  10. You can get plug-ins for certain interfaces that will process your audio in real-time

  11. A combination of your performance, software control, mic technique, and normalization can create a winning audition and great audio to submit to your client for the final product

  12. You should focus on quality of audio and performance when sending in your auditions. Don’t try and win the loudness war

Referenced in this Episode

Direct links to things we brought up

  1. Look at all of the gear and plug-ins that Anne recommends!

  2. Check out all of the equipment that Tim recommends

  3. These cables are studio gold!

  4. Recorded on ipDTL


>> It’s time to take your business to the next level, the BOSS level! These are the premier Business Owner Strategies and Successes being utilized by the industry’s top talent today. Rock your business like a boss, a VO BOSS! Now let’s welcome your host, Anne Ganguzza.

Anne: Hey everyone, welcome to the VO BOSS podcast. I’m your host, Anne Ganguzza, along with the audio engineer extraordinaire, Mr. Tim Tippets. Hey Tim Tippets! How are you?

Tim: I’m good, how are you, Anne?

Anne: I’m doing great. Tim, I got a question for you, which I think you’ve probably received multiple times in the past. But it brings me back to when I first started in voiceover, and I was on the Voice123 pay-to-play. And the first time I saw this term was to normalize my audio to -3 dB.

Tim: Right.

Anne: And I get the question all the time from people new to the industry like “what, how do I normalize? What is normalization? Do I need to do it, and at what level do I need to do it?” So I thought it would be a good time to maybe talk about, at least begin the conversation about normalization, and what it is and why we need to do it. [laughs] I know that I have been normalizing my audio to -3 dB, but maybe I don’t need to. Thoughts?

Tim: Well, why did you think you needed to normalize to -3 dB? Back in the day, what did they tell you?

Anne: They basically said all auditions had to be sent in and normalized to -3 dB. And my understanding was so that, you know, all of the submissions came in at roughly the same volume level.

Tim: Right.

Anne: And that was what helped to do that.

Tim: So the -3 dB is so that people can hear you at a certain level —

Anne: Right.

Tim: — regardless of the device that they’re on, right? But these days now, you’re hearing, what, anything from —

Anne: Oh my goodness! I’ve heard, you know, 0. I think next in line, I heard 0. Then I heard -6, -1. I’ve heard all different values. And so, is there a standard? Is there a right answer for that?

Tim: Well, let me ask you a question, ok? So -3 dB has always been the broadcast standard for normalization, right? Which means whatever the loudest peak is in your audio, it’s going to look at that, and if it’s louder than -3 dB, then it’s going to lower that. Like let’s say it came in at -1 dB.

Anne: Right.

Tim: It’s going to lower that to -3 dB, right, which is a total of 2 dB. And it’s going to take all of the other audio down with it 2 dB, right? So that’s what peak normalization is all about. Now if you were well under at -6, and that was your loudest peak, then it’s going to lift that 3 dB to get it to -3 dB. And the rest of the audio is also going to be lifted by 3 dB. Make sense?

Anne: That’s peak normalization.

Tim: That’s peak normalization.
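The arithmetic Tim walks through can be sketched in a few lines of Python. This is a minimal illustration, assuming samples are floats where 1.0 is full scale (0 dB on a digital meter); the function names and the two sample takes are made up for the example and do not come from any particular DAW.

```python
import math

def db_to_linear(db):
    """Convert a level in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)

def linear_to_db(amplitude):
    """Convert a linear amplitude (1.0 = full scale) to decibels."""
    return 20 * math.log10(amplitude)

def peak_normalize(samples, target_db=-3.0):
    """Apply one gain to every sample so the loudest peak lands on target_db."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples), 0.0  # pure silence: nothing to scale
    gain_db = target_db - linear_to_db(peak)
    gain = db_to_linear(gain_db)
    return [s * gain for s in samples], gain_db

# Hypothetical take whose loudest peak is at -1 dB: lowered by about 2 dB.
hot_take = [0.1, -0.4, db_to_linear(-1.0)]
hot_normalized, hot_gain_db = peak_normalize(hot_take)

# Hypothetical take whose loudest peak is at -6 dB: lifted by about 3 dB.
quiet_take = [0.05, -0.2, db_to_linear(-6.0)]
quiet_normalized, quiet_gain_db = peak_normalize(quiet_take)

print(round(hot_gain_db, 2), round(quiet_gain_db, 2))
```

The same gain is applied to every sample, which is why the quieter parts of the take move up or down by exactly as much as the peak does.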

Anne: Right. So there’s a different — there’s another normalization too that I learned about when I was doing some audiobook work.

Tim: Right, and that’s RMS, which stands for root mean squared. It’s a fancy way of saying averaging. So you’re averaging the overall volume of the entire thing. And that’s fine for audiobooks, right? But when we talk about normalizing whether it’s for an audition, or for a job, or whatever, let’s just talk about — let’s say it’s for auditions. Let’s just put it there, because that’s really where the conversation is today is, is at what level should I be normalizing so that the people on the other end are listening to me at level X? Ok? So you’re hearing anything from -0.1, which is the absolute highest that you can pretty much go, or -6 dB, right? If you come in at -3 dB, and we’ve got two Annes, ok, not with the same talent level — if one comes in at -1 dB and is not so good, but the other Anne comes in at -3 dB, which one do you think they’re gonna choose?

Anne: Oh [laughs] that’s a good question. I never thought of it in that way! [laughs]

Tim: If you’re not so good, yeah, if you’re not so good, do you just want to suck 2 dB louder?

Anne: Yeah, I don’t want to amplify that. [laughs]

Tim: We’re assuming a lot when we say “hey, send it in at -1 dB” or whatever. You’re assuming that everyone has, is equally talented.

Anne: Right. Or the other thing too, Tim, is that I’ve had people, when my students are submitting homework, and I have them record mp3s, I always make sure they normalize it so I can at least — I don’t have to play with the volume all the time. But sometimes that will bring up not just their voice but the, all that audio and that noise in the studio as well.

Tim: Right.

Anne: So not just the talent [laughs] maybe not being as good, but the talent and all of the audio environment as well being brought up to like — a lot of times, the students will be like, “what? I don’t know where that came from. Let me just — if I don’t normalize it, it sounds better.” [laughs]

Tim: This is one of the misunderstandings that we should really clear up, because if your input levels are at -6 to -12 dB when you’re voicing and you’re watching your meter, we’re now near that -3 dB level. If we’re listening, we can hear if there are noise makers, right, because we are up at that level. I’ve actually had some people record at very, very low levels because they say “hey, it lowers my noise floor.”

Anne: Sure.

Tim: That’s all good and fine, but reality is you have to bring it up to -3 dB, and using the math, if you lower it to where your peaks are at -22 dB let’s say, ok, without getting too far into the math of it, and your noise floor is quiet, well when you go to normalize it to -3 dB so that it can be heard nice and loud, all that noise floor goes up with it. Whatever the difference between -22 and -3 is, that same level of noise floor is now going to be increased by that same dB level. All that noise floor is just gonna get louder and louder. So that’s why you — that’s one of the reasons why you want your input levels to come in somewhere between -12 and -6 dB. Not consistently. You can go over -6 here and there and under -12. We’re just talking about an average. But we want to understand what our noise floor is —

Anne: Right.

Tim: — before we voice, so that when we do normalize, we don’t have this big problem of all of this noise coming back up with it.
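Tim leaves the math out, but it is short enough to write down. The sketch below assumes a noise floor of -70 dB, a number invented for illustration, and assumes that floor stays put no matter how low the input level is set. That holds for noise added after the preamp and may not hold for room noise, which falls along with the input gain.

```python
def normalization_gain_db(peak_db, target_db=-3.0):
    """Gain that peak normalization applies to move peak_db onto target_db."""
    return target_db - peak_db

def noise_floor_after(noise_floor_db, peak_db, target_db=-3.0):
    """Noise floor once the whole file has been shifted by the normalization gain."""
    return noise_floor_db + normalization_gain_db(peak_db, target_db)

ASSUMED_NOISE_FLOOR_DB = -70.0  # hypothetical value, not taken from the episode

# Peaks recorded very low at -22 dB: everything, noise included, rises 19 dB.
low_level_gain = normalization_gain_db(-22.0)                           # 19.0
low_level_noise = noise_floor_after(ASSUMED_NOISE_FLOOR_DB, -22.0)      # -51.0

# Peaks recorded at -6 dB: everything rises only 3 dB.
healthy_level_noise = noise_floor_after(ASSUMED_NOISE_FLOOR_DB, -6.0)   # -67.0

print(low_level_gain, low_level_noise, healthy_level_noise)
```

Under those assumptions, the take recorded with peaks at -22 dB ends up with a noise floor 16 dB louder than the take recorded with peaks at -6 dB, even though both are delivered at -3 dB.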

Anne: Sure, and you know what? You’re the first person that I’ve actually like heard talk about “this is where your voice should come in at.” I’ve always thought I look at my levels and I definitely don’t want to go into the red. I’m looking at the colors of my levels. I’ve never actually looked at the values. That’s a good place to shoot for in terms of where we should be recording at in the first place, so that when we do normalize, it’s not gonna — well, we’re not surprised, let’s say, with a lot of noise or a lot of other unexpected things that we didn’t think we heard.

Tim: Right, well if your interface, if the interface that you’re going into has an indicator of green, yellow, red, you don’t want to be going in the red on that, because that means you’re clipping that interface.

Anne: Right, right, which is what I was always taught to look for. Just don’t go into the red. Be in the yellow. That’s what I was —

Tim: Right, but you have to look in two places. You have to look at your interface to make sure that that’s not happening, and then you need to look inside your software —

Anne: Yes, yes.

Tim: — to make sure that’s not happening.

Anne: Oh, good point. Good point, yeah.

Tim: You can go into the red in the software. That’s not a problem, like I said. You can go over -6, and you can hit -3 here and there. But you just don’t want to go to 0. You just don’t want to clip, because that is the digital equivalent of distortion that a human would hear in real life with dB, right? If something were 140 dB, and you were next to it and listening to it, your hearing would just, I mean, it would hurt, ok? And 0 is the digital equivalent of that. The problem is is we have no scale above that for the computer to relate to, because what is pain? Pain has to be something, so let’s call it 0, and then we’ll move backwards from there. Because in real life, dB can go infinitely louder and louder and louder.

Anne: Right, right.

Tim: Like if a star were to explode, the dB level would be absolutely insane, right? There’s no way to measure that beyond 0.

Anne: [laughs] That’s a good example. I like that.

Tim: Yeah, so we have to use this negative in order to kind of get an idea of where we’re at with things. But anyway regardless, if you do go into the red here and there, that’s fine. That’s not a problem. Just don’t clip.

Anne: That’s in the red on your software, correct?

Tim: Yeah, in your software, in your meter, yeah. Just don’t clip. Clipping is when you hit absolute zero, and you look at the waveform, and it’s, you know, it’s beyond the ruler.

Anne: Right.

Tim: And that’s not good because it will begin to distort.
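What "hitting zero" means for the samples themselves can be shown with a small sketch. It assumes floating-point samples where a magnitude of 1.0 is full scale (0 dB on the meter); the helper names and the sample values are hypothetical.

```python
import math

FULL_SCALE = 1.0  # assumed: largest magnitude a sample can hold, i.e. 0 dB

def peak_db(samples):
    """Level of the loudest sample, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20 * math.log10(peak)

def hard_clip(samples):
    """Flatten anything beyond full scale, since there is no value above it."""
    return [max(-FULL_SCALE, min(FULL_SCALE, s)) for s in samples]

def count_clipped(samples):
    """Count samples sitting at (or beyond) full scale."""
    return sum(1 for s in samples if abs(s) >= FULL_SCALE)

# A hypothetical signal that tries to go louder than the scale allows.
too_hot = [0.2, 0.9, 1.3, 1.6, -1.2, 0.4]
recorded = hard_clip(too_hot)

print(recorded)                 # [0.2, 0.9, 1.0, 1.0, -1.0, 0.4]
print(count_clipped(recorded))  # 3
print(peak_db(recorded))        # 0.0
```

The three loudest samples all come out at the same flat value, so the shape of the original waveform is lost there. That flattened top is what is heard as distortion, and lowering the level afterwards does not bring the shape back.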

Anne: Gotcha.

Tim: Yeah.

Anne: So then normalization is a thing that we must do. And so if we have to pick a number, what number do you pick?

Tim: I use -3 dB again, again. We’ll use me as an example, ok? If I come in talking like this, and here’s my audition, and hey, please hire me because I could really use the money. And I come in saying that —

Anne: [laughs] You got it, Tim!

Tim: I got it, yeah. You know, so the other Tim comes in and says, you know, “Tim Tippets,” slates, “hey, I’d really like this job,” and it’s for Mercedes Benz or whatever, and it comes in at -3 dB — sorry. The guy at -3 dB is going to get the job because he’s not lame. Again, this concept of coming in at, you know higher and higher levels, this is part of what’s called the loudness war, as far as how I look at it. And the loudness war started in the late 80’s or 90’s or something like that. A lot of us audiophiles really didn’t appreciate it, because what radio stations were doing is they were maximizing the volume across the board to be as loud as they possibly could, because louder is perceived as better.

Anne: Oh interesting.

Tim: Right? So I believe it was Metallica, I believe, I think, it’s the loudest record in history. And I took a listen to it, and it doesn’t sound good.

Anne: Wow, ok.

Tim: Having heard Metallica from the 80s and listening to that album, I was just like, “wow, you guys are really like, you’re doing it just to do it.” It was like a Spinal Tap moment, it was almost like a comedic thing.

Anne: Sure.

Tim: Ok? And when you listen to dynamic music that is well mixed, and it has really nice lows and really nice highs, you get this sense of, you know, being brought down to that emotion where it needs to be.

Anne: Sure.

Tim: Just like if I’m whispering here and then trying to convey an emotion. And then if I get louder, and I get angry, you get an idea that there’s dynamic range there. If everything that I just did there came in at the same volume level, you’re not, the stuff isn’t gonna be conveyed correctly.

Anne: So it’s interesting to me that you mention that louder was thought to be better.

Tim: Well… [laughs] So…

Anne: Back in the day.

Tim: Louder being better is just a thing. It’s human nature, ok?

Anne: Yep, yep.

Tim: That said, when you take something that is louder, that is consistently loud, and you put it over something that is meant to be dynamic, and then you mix them together versus a dynamic VO that is not necessarily consistently louder, you are going to have more of an impact overall with the one that was done correctly, where there are dynamics —

Anne: Dynamics.

Tim: — versus the one that is just crushing, hitting the wall consistently and is just driving things as loud as it possibly can.

Anne: Yeah, and when everything is loud, like nothing sticks out. Nothing is unique. Nothing is brought to light, so I think it’s hard for the ear to discern, you know, different emotions and nuances like you’re saying.

Tim: [quietly] What do you mean, Anne?

Anne: [laughs] Now my question is — so we talked about peak normalization. When — I heard that you use the RMS when you’re using audiobooks, and why? Why is that done?

Tim: Because — ok, so RMS, the reason that you want or they want consistent volume across the board is because people will be listening on their iPhones, in their cars, on their computers, etc. And while audiobook quality is important, what is more important to audiobook producers is that the end user is not constantly reaching for the volume dial —

Anne: Right.

Tim: — to increase or decrease the volume.

Anne: That makes a lot of sense.

Tim: Yeah, they want it to be consistent. They don’t want it crushed, right?

Anne: Right.

Tim: Like we just talked about, but they do want it to be consistent, and RMS provides that, because it averages the loudness across the board versus using one target value, and then using that as the measure.
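For comparison with peak normalization, here is a minimal sketch of the RMS version. It measures the root mean square of the whole take and applies one gain so that the average level lands on a target. The -18 dB target and the sample values are arbitrary picks for illustration, not a requirement quoted in the episode.

```python
import math

def rms_db(samples):
    """Root-mean-square level of the whole take, in dB relative to full scale."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return -math.inf if mean_square == 0 else 10 * math.log10(mean_square)

def rms_normalize(samples, target_rms_db):
    """Apply one gain so the take's RMS level lands on target_rms_db."""
    current = rms_db(samples)
    if current == -math.inf:
        return list(samples), 0.0  # pure silence: nothing to scale
    gain_db = target_rms_db - current
    gain = 10 ** (gain_db / 20)
    return [s * gain for s in samples], gain_db

# Hypothetical take with an RMS level of about -20 dB.
take = [0.1, -0.1, 0.1, -0.1]
normalized, gain_db = rms_normalize(take, target_rms_db=-18.0)

print(round(rms_db(take), 2), round(gain_db, 2), round(rms_db(normalized), 2))
```

Because the gain comes from the average and not from the loudest sample, an RMS-normalized file can still have peaks that land above the level you wanted, so peaks are worth checking separately afterwards.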

Anne: Here’s a very elemental question then. Why not use, in some audio editors, like I have amplify. Right? What’s the difference between amplify, amplifying it in a negative direction or doing a normalization?

Tim: Well, if you use amplify, you don’t really have kind of a under the hood thing to look at, right? And/or if you do, you really need to know what you’re doing, because amplifying something may get you the levels that you’re looking for, but you may not have the ear or the knowledge to be able to get there responsibly. Right?

Anne: Amplify responsibly. [laughs]

Tim: Yeah, amplify responsibly. But are we really amplifying, or are we averaging? The answer is we’re averaging.
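One way to read the difference Anne is asking about: amplify applies whatever gain you type in, while normalize measures the audio first and works out the gain for you. The sketch below assumes that reading and uses made-up function names and sample values; real editors may define their amplify and normalize effects differently.

```python
import math

def amplify(samples, gain_db):
    """Apply a gain chosen by the user, without looking at the audio."""
    gain = 10 ** (gain_db / 20)
    return [s * gain for s in samples]

def peak_db(samples):
    """Level of the loudest sample, in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20 * math.log10(peak)

def normalize(samples, target_db=-3.0):
    """Measure the peak, then amplify by exactly the gain that reaches target_db."""
    level = peak_db(samples)
    if level == -math.inf:
        return list(samples)  # pure silence: nothing to scale
    return amplify(samples, target_db - level)

take_a = [0.1, -0.25, 0.5]  # hypothetical take, peak near -6 dB
take_b = [0.2, -0.5, 0.9]   # hypothetical take, peak near -0.9 dB

# The same +3 dB amplify suits take_a but pushes take_b past full scale.
amplified_a = amplify(take_a, 3.0)
amplified_b = amplify(take_b, 3.0)

# Normalize picks a different gain for each take and both end at -3 dB.
normalized_a = normalize(take_a)
normalized_b = normalize(take_b)

print(round(peak_db(amplified_a), 2), round(peak_db(amplified_b), 2))
print(round(peak_db(normalized_a), 2), round(peak_db(normalized_b), 2))
```

With amplify, the result depends on knowing where your peaks already are. With normalize, the measuring step is done for you, which is the "under the hood" part Tim refers to.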

Anne: Right. That’s the question. Because I will actually take something, if I happen to have a peak that just is one peak in my audio. I will de-amplify just that one peak.

Tim: Right, but now, you are running on the Manley VoxBox —

Anne: True.

Tim: We have that just a little bit of control with that compression, right?

Anne: Right. I haven’t done that since I’ve been using the Manley, you’re right.

Tim: And we’ve had that conversation about responsible compression. Now responsible compression is going to take care of that little peak that you have so you don’t have to go in there manually and fix it. That’s the whole idea.

Anne: That’s right! I love it. [laughs]

Tim: Yeah, so again, when people say, you know, should you send in your audio effected? If you know what you’re doing or you have someone you’re working with who knows what they’re doing, then the answer is yes, because it’s going to save you a ton of time. It’s going to emulate better mic control. You’re going to sound better because of the EQ and so on. Again not the “hey, I’m gonna go on YouTube and learn this version of it,” but an actually educated and/or person helping you who knows what they’re doing.

Anne: I think that there’s also something that comes into play when we’re talking about having to normalize our audio before we send it off, and that is a little bit of mic technique, in terms of if you’re going to be overly emotional or excited or a little bit loud. I think that mic technique really plays a role in how your normalization takes, because your peaks might be higher if you’re not using proper mic technique. Thoughts on that?

Tim: Well yeah, that’s correct. If you back off the mic, and you do not have an effects rack in play — I will do it now. I will get far away from my mic. And I will get very loud. Now you can hear my room.

Anne: Yep.

Tim: Ok?

Anne: Yep.

Tim: And that’s a problem.

Anne: Right, exactly.

Tim: If I were able to decrease my output volume, which I’ll do right now, ok, with realtime processing, and I get that loud, then you don’t hear my room.

Anne: Correct.

Tim: So it’s really all a matter — in this case, I’m on the Manley VoxBox with the Apollo Solo. It makes all the difference in the world when you’re able to do that.

Anne: So that’s controlling it, software. Now what about physical? Like if I just — I think also we can turn away from the mic a little bit. Like if I were, like right now I’m kind of in the mic, a little bit side, slightly angled. But if I were gonna be a little bit louder, I’d turn a little bit more. Thankfully I don’t have room noise because you built my studio.

Tim: Yeah, right.

Anne: But I think that for me would be quicker than me like doing — using a software control. But then again, I never even thought, because it’s my first month or so with the Manley. So I didn’t even think about doing the software control.

Tim: Yeah, well, that’s the thing. I did a commercial for Bazooka brands where I did a very, very quiet read, you know, the leaning in type of read on my 416, and then I had to yell “Bazooka” very loud, and all I did was simply take the output of my Manley, and I turned it down.

Anne: Aha, yup.

Tim: Yeah, and the people in London, I was like “how’s my level?” They were like, “fine.” I go, “great. Let’s do it,” right? And then I turned it back up for the end of that commercial, so I could do the softer parts again, because it goes soft, loud, soft. A lot of that does have to do with mic technique as well.

Anne: Sure.

Tim: Because a lot of people will have the mic right in front of them, and then you get plosives, plosives, like that.

Anne: Right, exactly.

Tim: But if you angle it off to the side like this, around 30 to 45 degrees, depending, and crosstalk the mic, then plosives, plosives, Peter Piper all day long, no problem, right?

Anne: Exactly, and that’s what I had to do before. And maybe we should just explain about the Manley, because not everybody has the Manley VoxBox. Explain like what that is and how voice talent will be able to use that.

Tim: Sure, ok. The Manley is a plug-in, and that’s software inside of the Apollo console. Apollo makes various interfaces. They have the Twin, the Duo, the Solo, etc. And these devices come with plug-ins that act like real-time studio rack units, ok? It could be compression. It could be EQ. It could be all sorts of things. But the Manley in particular has a built in de-esser, it has a built in EQ and a built in compressor, so that when you’re hearing me, you’re hearing me in realtime being dialed in to sound the best that I can sound or as much like I can like myself, right? Or to say it correctly, to sound as close to my real voice in real life.

Anne: Sure.

Tim: Because each mic has its own properties that will make anyone sound any different way. And that may not necessarily be optimized.

Anne: Right. And so being able to control our loudness with a Manley VoxBox is specific to the Apollo interfaces. Can you use, is that plug-in available anywhere else?

Tim: No.

Anne: Ok, so it’s only for people that have Apollo interfaces. But I assume, is there something equivalent in other interfaces, or?

Tim: Yeah, sure, sure.

Anne: Plug-ins?

Tim: Yeah, sure. There are units out there, and I have experimented with those. Of course I work with clients all the time who have various units. I’m just gonna tell you in my personal opinion, there’s nothing that even comes close. It’s, you know, I mean, it’s good, don’t get me wrong. We also have people who are using DBX units and racks that sound fairly decent, but again, nowhere near the type of control that you have here for several reasons. One is again, the Manley VoxBox is a $4600 unit in real life. This is a faithful emulation of it. When it’s not on sale, it’s $300.

Anne: Right, right.

Tim: When it does go on sale, which I think you got it on sale —

Anne: I did.

Tim: — for like $129 or something. Ok? We talked about that in the interfaces episode. And yeah, it’s just you have control right here in real time. The thing pops up, and it’s like the rack except you take your mouse and move the knobs around and you adjust them until it sounds fantastic.

Anne: So then a combination of normalization, and depending on your performance, right, if it’s super dynamic, super loud, super soft, those sorts of things, a combination of your, either Manley Vox or your software control and mic technique and normalization can create a winning audition and even just something that I would submit to a client as my final product.

Tim: Yeah, as final product, they’re more than likely going to want it at -3 dB because you need to give them what’s called head room. You need to give them some room to work with. If they have to lower you overall, that’s going to be an annoyance. I know it is for me when I have to do a project, and someone sends it up to me, and it’s just been blasted and it’s clipping and all that. The bottom line is this. When you’re normalizing, and you’re sending your stuff out, if your agent, or if the P2P is saying, “look, you must normalize to this level,” then go ahead and normalize to that level. That’s fine. That’s what they want, ok? But again, what you should really be focusing on when it comes to sending in your audio is quality of audio and performance, bottom line. This concept of 2 dB louder or whatever is distracting everybody, in my opinion. I’m hearing it a lot lately, and I just, I don’t think it’s something we should be focusing on. So as far as normalization goes, that’s all I have to say about it.

Anne: Alright. Well, that sounds great. Well, thanks for clearing a lot of that up. I think it’s gonna help a lot of people that have had questions in the past and are wondering, what do I worry about, what number? I think -3 dB is a good place to be. That along with proper mic technique and a good environment is going to get you that winning gig.

Tim: Bingo. Bingo.

Anne: Good stuff. So I’d like to give a great big shout-out to our sponsor, ipDTL. We can communicate in a very controlled quality loudness situation. And you can find out more at Alright, guys, you have a great week, and we’ll see you next week. Thanks so much. Bye!

Tim: Bye!

>> Join us next week for another edition of VO BOSS with your host Anne Ganguzza. And take your business to the next level. Sign up for our mailing list at and receive exclusive content, industry revolutionizing tips and strategies, and new ways to rock your business like a BOSS. Redistribution with permission. Coast to Coast connectivity via ipDTL.