So far, if you're not familiar with music production, you're probably still with us. You might be able to imagine, roughly, what writing, rehearsing, and recording look like. But now we're onto mixing and this is where it starts to resemble witchcraft.
A bit of background first: each microphone's signal is recorded onto a separate track (or channel). This means it can be manipulated independently of the others, i.e. we can adjust its volume, its location in the stereo field, and all manner of other effects and processing. Now, with modern computers and software, you can have as many channels as it takes to melt your CPU, so some people's productions span hundreds of channels. Our needs are a bit simpler: usually we've got maybe 8-10 channels, depending on exactly how we've mic'ed things and how many overdubs (like backing vocals) have gone on.
Our earlier recordings were limited in this fashion by being made on 8-track tape. This is what gives them their Spartan sound, since we physically didn't have room for more. That said, there are tricks you can play to wring more tracks out of an 8-track. You leave a couple of tracks unused and, as you go, mix three or more of the others down to them; once a group of parts has been "bounced" like this, their original tracks can be erased and reused. So after the first round you can have recorded up to six parts yet still have six tracks left to fill. It's pretty rock 'n' roll how you can make 6 + 6 = 8! The catch is that a bounced balance is set in stone, so you have to commit to the blend of those parts early. This is how we achieved Marionette's layers of sound with the humble 8-track.
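For the technically minded, the bounce arithmetic can be sketched as a few lines of bookkeeping. This is just a toy illustration, not real tape machinery, and the track names are invented:

```python
# Toy bookkeeping for the bounce trick on an 8-track machine.
# Names and counts are illustrative, not a real session.
total_tracks = 8

# Round one: record six parts, keeping two tracks in reserve.
round_one = ["drums L", "drums R", "bass", "guitar 1", "guitar 2", "vocal"]
spare = 2                            # tracks left free for the bounce

# Bounce: the six parts are mixed down onto the two spare tracks...
bounce_tracks = spare                # the six parts now live here
freed = len(round_one)               # ...and their old tracks can be erased

# ...so a second round of six overdubs fits on the freed tracks.
round_two = freed
distinct_parts = len(round_one) + round_two

assert bounce_tracks + round_two <= total_tracks
print(distinct_parts)                # 12 parts squeezed onto 8 tracks
```

The cost, as noted above, is that the six bounced parts are now frozen into two tracks and can never be re-balanced individually.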
The goal of mixing is to take these separate tracks and blend them together. The result, in the case of conventional music production, is a pair of channels representing the left- and right-hand sides of the signal. In film, you might blend down to five, seven, or even eleven channels, to be played back through a surround-sound system. In the early days of recording, pretty much everything was in mono, so the final result would have been a single channel. Indeed, mono was still king in the early days of stereo, so some people insist that the mono mixes of, say, Sgt. Pepper's, are superior, since the band and the production crew concentrated mostly on them, with stereo a gimmicky afterthought. For a short while, quadraphonic mixes, with four resulting channels, were a thing, allowing surround sound at home. They never really caught on in the mainstream, presumably because the added expense and inconvenience of a four-speaker system put it out of reach for many. Doubling the channel count also meant, on tape formats at least, halving the amount of music you could fit.
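At its core, a stereo mixdown is just each track scaled by a volume and split between the two sides by a pan position, then summed. Here's a minimal sketch of that idea in Python; the tracks, gains, and the simple linear pan law are all invented for illustration:

```python
# A minimal stereo mixdown sketch: each track is (samples, gain, pan).
# Everything here is a toy example, not how our console actually works.
def mixdown(tracks):
    """tracks: list of (samples, gain, pan) with pan in [-1 (L) .. +1 (R)]."""
    n = max(len(t[0]) for t in tracks)
    left = [0.0] * n
    right = [0.0] * n
    for samples, gain, pan in tracks:
        # Simple linear pan law: split the signal between the two sides.
        l_gain = gain * (1.0 - pan) / 2.0
        r_gain = gain * (1.0 + pan) / 2.0
        for i, s in enumerate(samples):
            left[i] += s * l_gain
            right[i] += s * r_gain
    return left, right

# Two toy "tracks": a bass dead centre, a guitar panned hard right.
bass = ([1.0, 1.0, 1.0], 0.8, 0.0)
guitar = ([0.5, 0.5, 0.5], 1.0, +1.0)
left, right = mixdown([bass, guitar])
```

The bass ends up equally in both channels, while the guitar appears only on the right: exactly the "location in the stereo field" mentioned earlier.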
The most important tool for balancing your mix is the relative volume of each track. We tend to lay our mixes out on an analogue mixing console, so this means physically grabbing the channel's volume fader and putting it where it sounds right. There's no science to this; you just have to listen carefully, but with enough detachment that you don't miss the forest for the trees, a rule that applies to mixing in general. We spend quite a lot of time just getting these basic volume settings right.
The next tool we reach for is the equalizer, Denzel Washington, or EQ. This gives you the ability to cut or boost different frequencies within the signal. So if the bass drum doesn't quite thump enough, you can boost its low end, or if the driven guitars are cutting into your ears like razors, you can cut away the nasty frequencies. At first it's tempting to think you can use EQ to shape any sound into anything else, but it doesn't really work that way: what you're really doing is enhancing what's already there. It's also important to keep each sound in context with the overall mix, so you wield the EQ to subtly change the various parts to sit better with each other. Small changes here and there often add up to a drastic change in the final mix. Since EQ is effectively a volume change on a portion of the sound, you might be nudging the volume faders a bit as well. Sometimes it can feel that something isn't right (say, a vocal isn't "bright" enough) and you'll be tempted to boost the higher frequencies, but often the real problem is incorrect volume: just turning the vocals up a bit might fix the issue.
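To make "boosting a frequency range" concrete, here's a crude one-band EQ sketch: a first-order low-pass filter isolates the low end, which gets scaled before being recombined with the rest. The coefficient and gain values are invented for illustration; a real EQ uses far better filters than this:

```python
# A crude low-shelf EQ sketch. boost > 1 lifts the lows, boost < 1 cuts
# them, and alpha (a made-up smoothing coefficient) sets roughly where
# the shelf sits. Not a production-quality filter!
def low_shelf(samples, boost=2.0, alpha=0.1):
    out = []
    low = 0.0
    for s in samples:
        low += alpha * (s - low)         # one-pole low-pass: the "lows"
        highs = s - low                  # what's left over is the top end
        out.append(boost * low + highs)  # recombine with the lows scaled
    return out
```

Note that with boost = 1 the signal passes through essentially unchanged, which reflects the point above: EQ only reshapes what's already in the signal, it can't conjure content that isn't there.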
The final tool we'll discuss is audio compression (not to be confused with file compression). What it does is turn down the signal whenever it exceeds some volume threshold. If that's too abstract for you, another way of putting it is that a compressor helps "level out" a sound's volume. Some instruments are exceedingly dynamic, particularly vocals, meaning the loudest bits can end up too loud while the quietest bits are inaudible. The exact way the compression is applied to the signal varies from compressor to compressor, and with the settings you use, giving each one a distinct sound. Some give a very "smooth" result while others seemingly exaggerate the energy in a signal. This makes them brilliant creative tools: you can make the edginess of a ragged rock vocal stand out, create a nice, even rhythm guitar pattern, or shape the attack of different drums. Counterintuitively, compression can also be used to make things louder. There's a limit to the peak level of the signal, above which you get distortion, so if you want to raise the average level (the one by which we actually judge loudness), you need to turn down the peaks relative to the average and then turn the whole lot up. This is exactly what compression does. You have to be careful, though: too much compression and you squeeze the life out of the sound. It's a civilized weapon from a more civilized age.
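The threshold-and-turn-up idea can be sketched in a few lines. This is a bare-bones, sample-by-sample toy (real compressors track a smoothed envelope and have attack and release times), and the threshold, ratio, and makeup values are invented for illustration:

```python
# A bare-bones compressor sketch. Above the threshold, the overshoot is
# divided by the ratio; makeup gain then lifts the whole signal back up.
# Settings here are illustrative, not real-world values.
def compress(samples, threshold=0.5, ratio=4.0, makeup=1.0):
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            # Anything over the threshold is scaled down by the ratio.
            level = threshold + (level - threshold) / ratio
        out.append(makeup * level * (1 if s >= 0 else -1))
    return out

# A quiet passage with a loud peak: the peak comes down first, then the
# makeup gain raises everything, so the average gets louder while the
# peaks still stay below the distortion ceiling of 1.0.
loud = compress([0.1, 0.9, -1.0, 0.2], threshold=0.5, ratio=4.0, makeup=1.5)
```

The quiet samples come out 1.5 times louder, while the peak that was sitting right at the ceiling now lands comfortably under it, which is exactly the "turn down the peaks, turn the whole lot up" trade described above.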
The mixing process usually involves listening to the song over and over while using these tools (perhaps along with more exotic things we haven't mentioned, like delay effects or de-essers) to sculpt the track into something that feels right. You need to take frequent breaks to stop your ears getting too used to whatever you're hearing. You can also listen to other music to "re-calibrate". Both of these help you keep perspective on the mix.