Copyright Michael Karbo, Denmark, Europe.
Chapter 7. Sound compression
Data media such as DVDs and CDs are expensive and have a limited capacity, so it matters how much space the sound data takes up. Likewise, transporting data, for example via the Internet, costs time and also kroner (or euro).
The music business’ distribution of music on CD, SACD and DVD-A does not require that the music be compressed. But in almost all other cases there are good reasons to compress music data, and it is easy to do. We will take a look at this in this chapter, where I describe the mechanics behind the sort of compression found in Sony’s MiniDisc, mp3 files and the digital soundtracks of DVD films.
Need for compression
The problem with uncompressed sound is that it takes up a lot of space. It’s easy to calculate how much. If we, for example, sample for three minutes at 44,1 kHz with 16-bit resolution, the result is almost 16 megabytes of data per channel, or more than 30 MB for a stereo recording. The calculation looks like this (remember that 16 bits = 2 bytes):
Figure 32. Three minutes of music fills more than 30 MB in an uncompressed CD format.
If music is sampled at a higher frequency and resolution, the amount of data will be correspondingly greater.
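The arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name is my own, one megabyte is counted as 1 000 000 bytes, and the 96 kHz / 24-bit case is a hypothetical example of "higher frequency and resolution", not a figure from this booklet.

```python
def pcm_size_bytes(seconds, sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Size of uncompressed PCM sound in bytes."""
    bytes_per_sample = bits_per_sample // 8   # 16 bits = 2 bytes
    return seconds * sample_rate_hz * bytes_per_sample * channels

mono = pcm_size_bytes(180, channels=1)        # 15 876 000 bytes, almost 16 MB
stereo = pcm_size_bytes(180)                  # 31 752 000 bytes, more than 30 MB
# Hypothetical higher-resolution recording: 96 kHz, 24 bits, stereo.
hi_res = pcm_size_bytes(180, sample_rate_hz=96_000, bits_per_sample=24)  # 103 680 000 bytes

for label, size in [("mono", mono), ("stereo", stereo), ("96 kHz / 24 bit", hi_res)]:
    print(f"{label}: {size / 1_000_000:.1f} MB")
```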
Digital data is suitable for transmission and copying, for example via computer networks. But if digital music is to be transmitted via a network, then too much data won’t do.
Figure 33. Approximate times for the transmission of 30 minutes of stereo music, which takes up 317 MB in uncompressed CD quality.
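Transmission times of this kind follow from a simple division. The sketch below assumes three hypothetical connection speeds (they are my own illustration, not the values from Figure 33), counts 1 Kbps as 1000 bits per second, and ignores protocol overhead, so real transfers would take somewhat longer.

```python
def transfer_time_minutes(size_bytes, link_kbps):
    """Minutes needed to send size_bytes over a link of link_kbps kilobits per second."""
    bits = size_bytes * 8
    seconds = bits / (link_kbps * 1000)
    return seconds / 60

# 30 minutes of stereo sound at 44,1 kHz and 16 bits (2 bytes per sample, 2 channels).
size = 30 * 60 * 44_100 * 2 * 2   # 317 520 000 bytes

# Hypothetical connection speeds, chosen only for illustration.
for name, kbps in [("56 Kbps modem", 56), ("512 Kbps line", 512), ("2000 Kbps line", 2000)]:
    print(f"{name}: {transfer_time_minutes(size, kbps):.0f} minutes")
```

With the 56 Kbps example the result is 756 minutes, or 12,6 hours, for half an hour of music.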
Because of the explosive development of the Internet in the 1990s, among other things, a need for compressed sound data arose, so that music, etc. could be distributed more easily.
Compression without audible loss
All types of data can be compressed, and there are fundamentally two forms of compression: with or without loss. When the data is sound (like music) then it is not possible to compress very much, unless you accept some loss. The aim is, therefore, to compress sound data with as little audible loss as possible.
You can make a comparison with the world of graphics, where digital colour photographs are compressed with lossy JPEG algorithms. Even though the image data is compressed very heavily, the photographs still look very good. It is the same with digital sound. With the use of different algorithms (methods), details that we can’t hear anyway can be removed from the sound data.
Figure 34. The principle behind compression. The trick is to remove the sound information that can’t be heard anyway.
Try to imagine an extreme situation. We are going to make a three-minute stereo recording with two microphones. The sound is sampled at the familiar 44,1 kHz and with 16-bit resolution. But what we are going to record is silence! A three-minute stereo recording will take up 30 MB in PCM format (as described in Figure 32).
So three minutes of silence takes up more than 30 MB, which is an enormous waste because no sound information has been recorded. The principle is the same with normal music – there is a lot of superfluous data, which can easily be cut out without the quality of the music being reduced.
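The waste is easy to demonstrate. The sketch below builds three minutes of perfectly silent PCM data (all samples zero) and packs it with Python's general-purpose zlib module. Note the assumptions: zlib is a lossless compressor for any kind of data, not a sound codec like mp3, and a real recording of a quiet room would contain some noise and would not shrink nearly as much.

```python
import zlib

def silent_pcm(seconds, sample_rate_hz=44_100, bytes_per_sample=2, channels=2):
    """Raw PCM data for digital silence: every sample is zero."""
    return bytes(seconds * sample_rate_hz * bytes_per_sample * channels)

raw = silent_pcm(180)            # three minutes of stereo silence
packed = zlib.compress(raw)      # lossless, general-purpose compression

print(f"uncompressed: {len(raw)} bytes")
print(f"compressed:   {len(packed)} bytes")

# Nothing was lost: unpacking gives back exactly the same data.
assert zlib.decompress(packed) == raw
```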
With the help of special software, uncompressed PCM sound can be processed so that “superfluous” data is removed in a number of ways. One example is noise: it takes a huge amount of data to keep noise down.
Noise, however, is only a problem if the music otherwise has a low sound level, which is why sound is encoded so that more noise is accepted in passages with a high sound level. The good signal-to-noise ratio is kept only in the soft passages of the music. This is just one of many mechanisms that can compress digital sound data.
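The link between the amount of data and the noise level can be illustrated with a small experiment. The sketch below rounds a full-scale test tone to a given number of bits per sample and measures the resulting signal-to-noise ratio. It is only meant to show that every bit spent per sample pushes the noise down by about 6 dB; it is not how mp3 or any other codec actually distributes its bits, and the 997 Hz test tone and the function name are my own choices.

```python
import math

def quantization_snr_db(bits, sample_rate_hz=44_100, tone_hz=997):
    """Signal-to-noise ratio in dB of a full-scale sine rounded to `bits` bits."""
    step = 2.0 / (2 ** bits)          # size of one quantization step for the range -1..1
    signal_power = 0.0
    noise_power = 0.0
    for n in range(sample_rate_hz):   # one second of samples
        x = math.sin(2 * math.pi * tone_hz * n / sample_rate_hz)
        q = round(x / step) * step    # nearest value representable with `bits` bits
        signal_power += x * x
        noise_power += (x - q) ** 2   # rounding error is heard as noise
    return 10 * math.log10(signal_power / noise_power)

for bits in (16, 12, 8):
    print(f"{bits} bits per sample: about {quantization_snr_db(bits):.0f} dB")
```

A loud passage can hide far more noise than a soft one, so an encoder can afford to spend fewer bits there and save them for the quiet passages.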
As mentioned, these compression mechanisms are used in a number of sound formats, for example Sony’s MiniDisc, mp3 files and the digital soundtracks of DVD films.
Common to all compressed sound formats is that software, and thus hardware too, is needed to encode and decode them. These mechanisms (codecs) have to be built into the sound device that manages the digital sound:
Figure 35. Software is required if digital sound is to be encoded and decoded.
A more detailed description of the mp3 format and its possibilities follows later in this booklet, but as mentioned, all methods are built on the same principle: to remove as much as possible of the sound information that we cannot hear anyway. You can also read more about codecs later in the booklet (in chapter 24).
If we try to illustrate the conversion from analog sound to digital sound and the reproduction of it, then it could look like this:
Figure 36. The path from sound recording to reproduction via, for example, the mp3 format goes from analog signals to digital data with compression and back to analog signals.
Sound compression isn’t done in one particular way. Just as digital photographs can be compressed at different JPEG quality levels, digital sound files can be encoded to varying degrees.
In practice, many of the compression algorithms can be adjusted to work more or less powerfully, so sound can be compressed to a higher or lower degree. The more powerful the compression, the worse the sound quality; that’s the way it is. Compression removes information and, all other things being equal, can only reduce the sound quality. This reduction is experienced in several ways:
What is brilliant about a format like mp3 is that you yourself can decide how good the quality should be. If you choose a weak compression, you will get a compressed sound file that is almost identical to the original recording. With a slightly more powerful compression, you will get a minimal reduction in quality.
Varying the compression requires an adjustable bit rate. This means deciding in advance how much room the final sound file should take up: you specify the amount of data to be played per second. The amount of data is measured in bits per second or, more precisely, in kilobits per second.
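Because the bit rate fixes the amount of data per second, the size of the finished file can be worked out before encoding. A minimal sketch, assuming the bit rate stays constant through the whole file and counting one kilobit as 1000 bits (the function name is my own):

```python
def compressed_size_bytes(bit_rate_kbps, seconds):
    """File size in bytes for sound encoded at a constant bit rate."""
    bits = bit_rate_kbps * 1000 * seconds
    return bits / 8                     # 8 bits = 1 byte

three_minutes = 180
size_128 = compressed_size_bytes(128, three_minutes)   # 2 880 000 bytes
size_256 = compressed_size_bytes(256, three_minutes)   # 5 760 000 bytes
print(f"128 Kbps: {size_128 / 1_000_000:.2f} MB")
print(f"256 Kbps: {size_256 / 1_000_000:.2f} MB")
```

Three minutes of music thus shrinks from more than 30 MB uncompressed to just under 3 MB at 128 Kbps.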
In Figure 37 you can see the bit rates, which Windows normally can work with (when a sound card is installed in the computer):
Figure 37. Different bit rates for compression of sound in Windows.
A bit rate is a measure of bandwidth. In the table in Figure 33 I have listed the bandwidth of different kinds of Internet connections measured in kilobytes per minute. Music, however, is compressed at the above-mentioned kilobits per second, which is written as Kbps, as also seen in Figure 37.
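The two units are easy to convert between, which makes it possible to compare a bit rate with the bandwidth of a connection. The sketch below assumes that one kilobyte is 1000 bytes and one kilobit is 1000 bits; the 56 Kbps connection is a hypothetical example of my own.

```python
def kbps_to_kilobytes_per_minute(kbps):
    """Convert kilobits per second to kilobytes per minute."""
    return kbps * 60 / 8      # 60 seconds per minute, 8 bits per byte

music = kbps_to_kilobytes_per_minute(128)      # 960 kilobytes per minute
connection = kbps_to_kilobytes_per_minute(56)  # 420 kilobytes per minute

print(f"128 Kbps music needs {music:.0f} kilobytes per minute")
print(f"a 56 Kbps connection delivers {connection:.0f} kilobytes per minute")
print("can be played while it arrives:", connection >= music)
```

In this example the connection delivers less than half of what the music needs, so the file has to be downloaded first and played afterwards.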
Figure 38. Mp3 compression with different bit rates.
Please note the bit rate of 6,5 Kbps in the next-to-last line and, further up, the bit rates of 28 Kbps. These are very powerful compressions, which give really small files. They are not suitable for music.
Music is compressed to bandwidths between 64 and 320 Kbps. Most experts agree that a compression of 256 Kbps gives a sound reproduction as close to the original as possible. In practice, however, a bit rate of 128 Kbps is the most common. This gives, in my opinion, an acceptable sound quality:
Figure 39. Bit rates, used for compression of stereo sound.
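To get a feel for how hard these bit rates squeeze the data, they can be compared with the bit rate of uncompressed CD sound, which follows from the sampling figures used earlier in this chapter (44,1 kHz, 16 bits, two channels). A small sketch, with names of my own choosing:

```python
# Bit rate of uncompressed CD sound: samples per second x bits per sample x channels.
CD_KBPS = 44_100 * 16 * 2 / 1000      # 1411,2 Kbps

def compression_ratio(bit_rate_kbps):
    """How many times smaller the compressed data is than uncompressed CD sound."""
    return CD_KBPS / bit_rate_kbps

for rate in (64, 128, 256, 320):
    print(f"{rate} Kbps: about {compression_ratio(rate):.1f} times smaller than CD sound")
```

At the common 128 Kbps the data is reduced to roughly one eleventh of the original amount.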