I looked at the following link: Trim audio file using start and stop times
But this doesn't completely answer my question. My problem is: I have an audio file such as abc.mp3 or abc.wav. I also have a text file containing start and end timestamps:
0.0 1.0 silence
1.0 5.0 music
6.0 8.0 speech
I want to split the audio into three parts using Python and sox/ffmpeg, resulting in three separate audio files.
How do I achieve this using either sox or ffmpeg?
Later I want to compute the MFCCs corresponding to those portions using librosa.
I have Python 2.7, ffmpeg, and sox on an Ubuntu Linux 16.04 installation.
Best Answer
I've just had a quick go at it, with very little in the way of testing, so maybe it'll be of help. The code below relies on ffmpeg-python, but it wouldn't be a challenge to write with subprocess anyway. At the moment the time input file is just treated as pairs of times, start and end, and then an output name. Missing names are replaced as linecount.wav
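
For anyone who would rather avoid the extra dependency, here is a rough sketch of the subprocess route mentioned above. It is not the ffmpeg-python version, just one way the same idea might look. The function names (parse_segments, ffmpeg_command, split_audio) are made up for illustration, and the naming rule is my reading of "missing names are replaced as linecount.wav". It assumes ffmpeg is on the PATH and uses -ss with -t (start plus duration) so the end time is unambiguous.

```python
import subprocess


def parse_segments(lines):
    """Parse 'start end [label]' lines into (start, end, filename) tuples."""
    segments = []
    for count, line in enumerate(lines, 1):
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        start, end = float(fields[0]), float(fields[1])
        # Assumed naming rule: use the label if present, else the line count.
        name = fields[2] if len(fields) > 2 else str(count)
        segments.append((start, end, name + ".wav"))
    return segments


def ffmpeg_command(source, start, end, target):
    """Build an ffmpeg call that extracts [start, end) of source into target."""
    return ["ffmpeg", "-y", "-i", source,
            "-ss", str(start), "-t", str(end - start), target]


def split_audio(source, timestamp_path):
    """Write one audio file per segment listed in timestamp_path."""
    with open(timestamp_path) as handle:
        segments = parse_segments(handle)
    for start, end, target in segments:
        # check_call raises CalledProcessError if ffmpeg exits with an error.
        subprocess.check_call(ffmpeg_command(source, start, end, target))
```

Calling split_audio("abc.wav", "times.txt") with the three lines from the question should then produce silence.wav, music.wav and speech.wav in the current directory (times.txt is a placeholder name for the timestamp file). Note that -y overwrites existing files without asking, and two segments with the same label would overwrite each other.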