Sox: Split audio on silence but keep silence

sox

I've got several audiobooks stored as large mp3 files, and I'm trying to split them into multiple smaller files.

I've found a tool that can detect silence in audio files and split audio files based on this "delimiter".

Here is an example:

sox -V3 audiobook.mp3 audiobook_part_.mp3 \
silence 1 0.5 0.1% 1 0.5 0.1% : newfile : restart

This splits audiobook.mp3 into audiobook_part_001.mp3, audiobook_part_002.mp3, … wherever there is at least 0.5 seconds of silence.

Now the problem is that this command not only splits the file but it also removes the silence.

Therefore when you play the new files in a playlist the tracks/paragraphs sound squeezed together.

So how do you tell sox to only split the file but to keep the silence (at the end of each track)?

Best Answer

You can preserve all the silences in the split parts with some small changes. Starting with your original command:

silence 1 0.5 0.1%   1 0.5 0.1% 

The first triplet of values means: remove silence, if any, at the start, until there are 0.5 seconds of sound above 0.1%. The second triplet means: stop when there are at least 0.5 seconds of silence below 0.1%. The rest of your command, : newfile : restart, then starts a new output file and begins again to look for sound at the start. So the first file ends when the silence begins, and the second file starts when the silence ends.

The simplest option available to improve this is silence -l. It preserves the 0.5 seconds of silence that triggered the end of the file. Unfortunately, any longer silence is still removed, because it falls at the start of the next file. An easy way to keep a longer gap is to combine -l with a longer detection time, e.g. 2 seconds:

silence -l  1 0.5 0.1%   1 2.0 0.1%

You will now only split if there are at least 2 seconds of silence, but you will preserve the first 2 seconds of the gap. To avoid losing any silence at all, simply remove the detection of silence at the start. To do that, replace the first triplet with a single 0:

silence -l  0   1 2.0 0.1%

If you want to play with simple sound files to see how sox handles different situations, you can easily create two sound files: one consisting of 1 second of a tone, and one consisting of 1 second of silence. Join them together as you wish and present the result as input to the silence effect. For example, create:

sox -n gap.wav   trim 0 1
sox -n tone.wav  synth 1.001t sine C5

then join gap-tone-gap-tone and create out.wav using your effect and listen to the result:

sox gap.wav tone.wav gap.wav tone.wav out.wav silence 1 0.5 0.1%
play out.wav