Ubuntu – Convert speech (mp3 audio files) to text

software-recommendation speech-recognition

I am looking for a simple converter from MP3 to text. I have tried julius, CMU Sphinx, … without success. In the past 4 hours I have not found a way to use them (or even to install them properly).

What I am looking for is something like:

    $ converterapp -infile myspeech.mp3 -outfile myspeech.txt

I am also fine with a GUI application, since I only have a few files to convert and can click around.

With the help of the answer to "Speech-recognition app to convert MP3 to text?" I managed to get it working, but it produces no useful output. Well, actually it produces a couple of blank lines (no words detected)…

Best Answer

pocketsphinx will do speech to text from an existing audio file. Depending on the initial format of the mp3, you may need two separate commands.

First convert your existing audio file to the mandatory input format:

    ffmpeg -i file.mp3 -ar 16000 -ac 1 file.wav
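If recognition later produces empty output, one thing worth ruling out is a WAV file that did not end up as 16 kHz mono. Below is a minimal sketch using ffprobe, which ships in the ffmpeg package. The function name is_16k_mono is made up for illustration, and it only inspects the first audio stream.

```shell
# is_16k_mono is a hypothetical helper name, not part of ffmpeg.
# Succeeds only if the first audio stream is 16000 Hz with 1 channel.
is_16k_mono() {
    local info
    # Ask ffprobe for just the two fields we care about, as key=value lines.
    info=$(ffprobe -v error -select_streams a:0 \
        -show_entries stream=sample_rate,channels \
        -of default=noprint_wrappers=1 "$1") || return 2
    # Both lines must match exactly for the check to pass.
    printf '%s\n' "$info" | grep -qx 'sample_rate=16000' &&
        printf '%s\n' "$info" | grep -qx 'channels=1'
}
```

It could be used like this:

    is_16k_mono file.wav && echo "format looks right"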

Then run pocketsphinx:

    pocketsphinx_continuous -infile file.wav 2> pocketsphinx.log > myspeech.txt

The resulting file, myspeech.txt, will contain the recognized text, while pocketsphinx's diagnostic messages go to pocketsphinx.log.
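The two commands can be wrapped into a single function that comes close to the one-line interface asked for in the question. This is only a sketch: the name mp3_to_txt and the choice to write the log next to the output file are assumptions for illustration, not anything provided by pocketsphinx or ffmpeg.

```shell
# mp3_to_txt is a hypothetical helper name, not part of pocketsphinx or ffmpeg.
# Usage: mp3_to_txt input.mp3 output.txt
mp3_to_txt() {
    local infile="$1" outfile="$2" tmpdir status
    tmpdir=$(mktemp -d) || return 1

    # Step 1: resample to 16 kHz mono WAV inside a temporary directory.
    if ffmpeg -nostdin -loglevel error -i "$infile" -ar 16000 -ac 1 "$tmpdir/audio.wav"; then
        # Step 2: recognized text goes to the output file, diagnostics to a log.
        pocketsphinx_continuous -infile "$tmpdir/audio.wav" \
            2> "$outfile.log" > "$outfile"
        status=$?
    else
        status=1
    fi

    # Remove the intermediate WAV whether or not recognition succeeded.
    rm -rf "$tmpdir"
    return "$status"
}
```

With only a few files to convert, a loop over the function may be enough:

    for f in *.mp3; do mp3_to_txt "$f" "${f%.mp3}.txt"; done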

If you are new to Ubuntu, you can install the above programs with this command:

    sudo apt install pocketsphinx pocketsphinx-en-us ffmpeg
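After installing, it may be worth confirming that the commands used above are actually on your PATH before running anything. A small sketch follows; missing_tools is a made-up helper name, and it prints each command it cannot find.

```shell
# missing_tools is a hypothetical helper name.
# Prints the name of every given command that is not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For the commands in this answer, that would be:

    missing_tools ffmpeg pocketsphinx_continuous

No output means both commands were found.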