Can ffmpeg detect the actual bitrate of an audio file?

ffmpeg

Can ffmpeg tell me whether an audio file has been upscaled, i.e. re-encoded at a higher bitrate than its source? I see in this article that they used a spectrum analyzer to check for the real bitrate of an audio file. Can ffmpeg determine the real bitrate and encode files appropriately? By appropriately, I mean encoding at the bitrate that matches the audio quality the file actually contains.

Best Answer

The page you linked to is very imprecise in what it states. The actual bitrate of a signal may indeed not be the same as the amount of information encoded therein, but the cutoff of the frequency spectrum is not directly related to the bitrate.
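
As an aside, what FFmpeg can readily report is only the nominal bitrate stored in the file's headers, for example via ffprobe (a sketch; input.mp3 is a placeholder, and the output format depends on your ffprobe version):

$ ffprobe -v error -show_entries format=bit_rate -of default=noprint_wrappers=1 input.mp3

This prints the declared bitrate, which says nothing about how much actual information survived any earlier encodes.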

See, I can choose to encode a pristine original file with full spectral resolution at 192 kBit/s with an MP3 encoder, cut off the highest frequencies at 16 kHz using a filter or encoder setting, and you would, based on that article, conclude that it is only 128 kBit/s.

Put differently, the absence of certain high frequencies does not necessarily imply a low-bitrate encode masked by another round of encoding at an arbitrary bitrate. It only shows that high frequencies are indeed missing, which can be perceived as a muffled sound.

FFmpeg indeed has a spectral analyzer built in, but its output lacks a scale on the axes and is therefore not as useful as a dedicated tool such as Spek.
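
For completeness, here is how you could render a spectrogram with FFmpeg's showspectrumpic filter (a sketch; the filter must be present in your build, and the file names are just examples):

$ ffmpeg -i test-128.mp3 -lavfi showspectrumpic=s=1024x512 spectrum-128.png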


To give you a demonstration, I generated a file in Audacity that contains 30 s of sine waves at 20, 18, 16, and 14 kHz, with an amplitude of 0.8, overlaid into a mono 16-bit PCM WAV file.
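
If you want to reproduce this without Audacity, something similar can be generated with FFmpeg's aevalsrc source (a sketch; I'm assuming an amplitude of 0.2 per sine so that the sum stays at 0.8):

$ ffmpeg -f lavfi -i "aevalsrc=0.2*sin(2*PI*14000*t)+0.2*sin(2*PI*16000*t)+0.2*sin(2*PI*18000*t)+0.2*sin(2*PI*20000*t):s=44100:d=30" test.wav

This writes a mono 16-bit PCM WAV file at 44.1 kHz, which is FFmpeg's default for WAV output.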

I then used LAME to encode them to MP3 at different bitrates:

for b in 320 192 128 96 64 32; do lame -b "$b" test.wav "test-$b.mp3"; done
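
Since the question asks about ffmpeg: if your build includes libmp3lame, the same series of encodes can be produced through it as well (a sketch of the equivalent loop):

for b in 320 192 128 96 64 32; do ffmpeg -i test.wav -c:a libmp3lame -b:a "${b}k" "test-$b.mp3"; done

Note that ffmpeg does not print LAME's internal filter messages, which is why the lame CLI is more convenient for inspecting the chosen low pass.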

We can see from LAME's console messages that instead of the "default" 20 kHz, LAME indeed applies a low pass filter at 16–17 kHz for the 64 kBit/s file. For 32 kBit/s, it switches to an 8 kHz filter.

Let's put those into a spectrum analyzer such as Spek and see what happens. As expected, even for 96 kBit/s, you get the full frequency spectrum: see the red line appearing at 20 kHz. You also, of course, get harmonics at multiples of the original frequencies, which are due to the MP3 compression.

[Spectrogram: 96 kBit/s]

For 64 kBit/s, you can see the 16 kHz filter kicking in, which results in the red line appearing for the 16 kHz sine wave. But there are still frequencies above that, since the low pass filter does not completely attenuate the frequencies above its cutoff point.

[Spectrogram: 64 kBit/s]

Finally, for 32 kBit/s, you can see the 8 kHz filter in action. Here, indeed, the frequencies above the cutoff are attenuated enough that they're not visible. However, despite that filter, you can still hear something in the file.

[Spectrogram: 32 kBit/s]

So, what we've seen is that the absence of high frequencies hints at a filter being used, but the cutoff frequency cannot be used to determine the bitrate a file was encoded at.

Want proof? Let's generate a file cut off at 8 kHz with 320 kBit/s:

$ lame --lowpass 8000 -b 320 test.wav test-fake.mp3
LAME 3.99.5 64bits (http://lame.sf.net)
Resampling:  input 44.1 kHz  output 22.05 kHz
Using polyphase lowpass filter, transition band:  7913 Hz -  8180 Hz
Encoding test.wav to test-fake.mp3
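
The ffmpeg equivalent would look roughly like this (a sketch; ffmpeg's lowpass filter is a different design than LAME's polyphase filter, so the transition band will differ):

$ ffmpeg -i test.wav -af lowpass=f=8000 -c:a libmp3lame -b:a 320k test-fake.mp3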

In its spectrogram you can see the cutoff, but you'll also notice that due to the high bitrate, the amount of aliasing (noise) is reduced significantly.

When you listen to the files, the fake test file sounds a lot "cleaner" than the 32 kBit/s file with the same frequency spectrum.
