macOS – How to run “say --output-file” without it hanging (and worse) with more than 310 bytes of input

audio, command line, macos, text to speech

For my non-commercial curiosity, I'm interested in turning some of Lewis Carroll's work into machine-generated speech. When sending the output to an audio device, the say command can do this even with very large amounts of input:

$ wc ~/Downloads/lewis-carroll.txt 
    7066   55439  311589 /Users/xxxx/Downloads/lewis-carroll.txt

$ date; time say -f ~/Downloads/lewis-carroll.txt; date
Wed Oct  3 00:24:38 EDT 2018

real    368m11.986s
user    0m0.009s
sys     0m0.011s
Wed Oct  3 06:32:50 EDT 2018

When sending the output to a file, it appears to also work with very small amounts of input text:

$ date; head -c 310 ~/Downloads/lewis-carroll.txt | time say -o lewis-carroll.aac; date; ls -l lewis-carroll.aac
Thu Oct  4 08:46:18 EDT 2018
        0.17 real         0.08 user         0.02 sys
Thu Oct  4 08:46:18 EDT 2018
-rw-r--r--  1 xxxx  staff  81426 Oct  4 08:46 lewis-carroll.aac

With any more input, it doesn't work:

$ date; time head -c 311 ~/Downloads/lewis-carroll.txt | say -o lewis-carroll.aac; date; ls -l lewis-carroll.aac
Thu Oct  4 08:46:40 EDT 2018

(hangs!)

^C

real    0m30.243s
user    0m0.090s
sys     0m0.028s
Thu Oct  4 08:47:11 EDT 2018
-rw-r--r--  1 xxxx  staff  80865 Oct  4 08:46 lewis-carroll.aac

That's only the beginning of the problems. Further attempts to run say also hang, no matter how small the input or where the output is supposed to go (e.g., say Hello). Worse still, as soon as the first say command hangs, Chrome starts beachballing. Happily, there's a straightforward workaround to return the system to normal operation:

$ pkill speechsynthesisd say

(Workaround found here.)
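
Because a hung say also blocks later invocations, it may help to run it under a time limit and apply the pkill cleanup only when that limit is hit. The following is a minimal sketch, not a tested fix for the hang itself. run_with_timeout is a hypothetical helper name, and it is built from sleep and kill on the assumption that no timeout command is available on a stock macOS 10.13 install.

```shell
# Hypothetical helper: run a command, terminating it if it exceeds a time limit.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
# The command runs in the background, so give it input with -f, not a pipe.
run_with_timeout() {
    limit=$1
    shift
    "$@" &
    cmd_pid=$!
    # Watchdog: after the limit, terminate the command if it is still running.
    # Its output is discarded so a leftover sleep cannot hold a pipe open.
    ( sleep "$limit"; kill "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    watchdog_pid=$!
    wait "$cmd_pid"
    status=$?
    # The command has finished (or was killed); stop the watchdog.
    kill "$watchdog_pid" 2>/dev/null
    wait "$watchdog_pid" 2>/dev/null
    return "$status"
}
```

A call might look like: run_with_timeout 60 say -f ~/Downloads/lewis-carroll-lf.txt -o lewis-carroll.aac || pkill speechsynthesisd say. The 60-second limit is an arbitrary assumption. The helper returns the command's own exit status, which is non-zero when the watchdog had to kill it.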

I can understand Apple wanting to put rate limits (or something) on speech synthesis, to prevent people from generating cheap audiobooks. (Which would be fine; that's not what I'm trying to do.) But if that's what is happening, it's a pretty awful way to implement rate limiting.

What I don't understand is why this fails so badly that other software (such as Chrome) gets messed up.

I did something like this back in 2012 (on a couple of kilobytes of text) without running into anything like this. I don't have enough history to reproduce that.

Is there any way around this mess?

Run on: macOS 10.13.6 (17G65)

Update:

Like @ashley, I'm able to convert a big chunk of the dictionary to speech:

$ date; time head -c 2000 /usr/share/dict/words | say -o words.aac; date; ls -l words.aac
Mon Oct  8 09:54:50 EDT 2018

real    0m2.552s
user    0m0.555s
sys     0m0.111s
Mon Oct  8 09:54:53 EDT 2018
-rw-r--r--  1 xxxx  staff  543542 Oct  8 09:54 words.aac

Looking more carefully at my input, I discovered it was in DOS format (lines end with CR-LF) instead of macOS's native Unix format (lines end with LF). I made a copy in the latter format, removing six CR characters from the beginning of my file … and now say can handle six fewer characters before hanging:

$ date; head -c 304 ~/Downloads/lewis-carroll-lf.txt | time say -o lewis-carroll.aac; date; ls -l lewis-carroll.aac
Mon Oct  8 09:49:51 EDT 2018
        0.18 real         0.09 user         0.02 sys
Mon Oct  8 09:49:52 EDT 2018
-rw-r--r--  1 xxxx  staff  81426 Oct  8 09:49 lewis-carroll.aac
$ date; head -c 305 ~/Downloads/lewis-carroll-lf.txt | time say -o lewis-carroll.aac; date; ls -l lewis-carroll.aac
Mon Oct  8 09:49:55 EDT 2018
Command terminated abnormally.
       29.72 real         0.09 user         0.02 sys
Mon Oct  8 09:50:25 EDT 2018
-rw-r--r--  1 xxxx  staff  80865 Oct  8 09:49 lewis-carroll.aac

(I'll add more info about my input at the end of this question.)
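
The post does not show how the LF copy was made. One common way to check for and strip CR characters is with tr, sketched below; count_cr and to_lf are hypothetical helper names, not commands from the original session.

```shell
# Count carriage-return (CR) bytes in a file.
# tr -cd deletes every byte except CR; wc -c counts what is left.
count_cr() {
    tr -cd '\r' < "$1" | wc -c | tr -d ' '
}

# Write a copy of the first file to the second with every CR removed,
# which turns CR-LF line endings into plain LF.
to_lf() {
    tr -d '\r' < "$1" > "$2"
}
```

For example, to_lf ~/Downloads/lewis-carroll.txt ~/Downloads/lewis-carroll-lf.txt would produce the LF copy, and count_cr on the result should then report 0. Note that this removes every CR, including any stray ones that are not part of a line ending.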

As to @ashley's other suggestions:

  • I've tried breaking down the input into smaller files and converting them individually. This is very helpful for exploring the text I'm playing around with, but I need to jump through many, many hoops to make it work. (I can document this further if it helps.)

  • I was hoping to do all this from the command line, without resorting to audio capture. Still, audio capture may end up being my best option for making one big audio file.

  • I can reproduce this problem with the Alex voice — the default for me, and my preference at the moment — but not the Daniel voice (though I get 79699 instead of 69867):

    $ date; head -c 305 ~/Downloads/lewis-carroll-lf.txt | time say -v Daniel -o lewis-carroll.aac; date; ls -l lewis-carroll.aac
    Mon Oct  8 19:53:11 EDT 2018
            0.68 real         0.08 user         0.03 sys
    Mon Oct  8 19:53:11 EDT 2018
    -rw-r--r--  1 xxxx  staff  79699 Oct  8 19:53 lewis-carroll.aac
    $ date; head -c 305 ~/Downloads/lewis-carroll-lf.txt | time say -v Alex -o lewis-carroll.aac; date; ls -l lewis-carroll.aac
    Mon Oct  8 19:53:21 EDT 2018
    Command terminated abnormally.
           21.75 real         0.08 user         0.02 sys
    Mon Oct  8 19:53:43 EDT 2018
    -rw-r--r--  1 xxxx  staff  80865 Oct  8 19:53 lewis-carroll.aac
    

    This suggests an obvious workaround…. I'll try it in a little while.


Here's my current input:

$ head -n 11 ~/Downloads/lewis-carroll-lf.txt
Alice was beginning to get very tired of sitting by her sister on the
bank, and of having nothing to do: once or twice she had peeped into the
book her sister was reading, but it had no pictures or conversations in
it, 'and what is the use of a book,' thought Alice 'without pictures or
conversations?'

So she was considering in her own mind (as well as she could, for the
hot day made her feel very sleepy and stupid), whether the pleasure
of making a daisy-chain would be worth the trouble of getting up and
picking the daisies, when suddenly a White Rabbit with pink eyes ran
close by her.
$ head -n 11 ~/Downloads/lewis-carroll-lf.txt | od -c
0000000    A   l   i   c   e       w   a   s       b   e   g   i   n   n
0000020    i   n   g       t   o       g   e   t       v   e   r   y    
0000040    t   i   r   e   d       o   f       s   i   t   t   i   n   g
0000060        b   y       h   e   r       s   i   s   t   e   r       o
0000100    n       t   h   e  \n   b   a   n   k   ,       a   n   d    
0000120    o   f       h   a   v   i   n   g       n   o   t   h   i   n
0000140    g       t   o       d   o   :       o   n   c   e       o   r
0000160        t   w   i   c   e       s   h   e       h   a   d       p
0000200    e   e   p   e   d       i   n   t   o       t   h   e  \n   b
0000220    o   o   k       h   e   r       s   i   s   t   e   r       w
0000240    a   s       r   e   a   d   i   n   g   ,       b   u   t    
0000260    i   t       h   a   d       n   o       p   i   c   t   u   r
0000300    e   s       o   r       c   o   n   v   e   r   s   a   t   i
0000320    o   n   s       i   n  \n   i   t   ,       '   a   n   d    
0000340    w   h   a   t       i   s       t   h   e       u   s   e    
0000360    o   f       a       b   o   o   k   ,   '       t   h   o   u
0000400    g   h   t       A   l   i   c   e       '   w   i   t   h   o
0000420    u   t       p   i   c   t   u   r   e   s       o   r  \n   c
0000440    o   n   v   e   r   s   a   t   i   o   n   s   ?   '  \n  \n
0000460    S   o       s   h   e       w   a   s       c   o   n   s   i
0000500    d   e   r   i   n   g       i   n       h   e   r       o   w
0000520    n       m   i   n   d       (   a   s       w   e   l   l    
0000540    a   s       s   h   e       c   o   u   l   d   ,       f   o
0000560    r       t   h   e  \n   h   o   t       d   a   y       m   a
0000600    d   e       h   e   r       f   e   e   l       v   e   r   y
0000620        s   l   e   e   p   y       a   n   d       s   t   u   p
0000640    i   d   )   ,       w   h   e   t   h   e   r       t   h   e
0000660        p   l   e   a   s   u   r   e  \n   o   f       m   a   k
0000700    i   n   g       a       d   a   i   s   y   -   c   h   a   i
0000720    n       w   o   u   l   d       b   e       w   o   r   t   h
0000740        t   h   e       t   r   o   u   b   l   e       o   f    
0000760    g   e   t   t   i   n   g       u   p       a   n   d  \n   p
0001000    i   c   k   i   n   g       t   h   e       d   a   i   s   i
0001020    e   s   ,       w   h   e   n       s   u   d   d   e   n   l
0001040    y       a       W   h   i   t   e       R   a   b   b   i   t
0001060        w   i   t   h       p   i   n   k       e   y   e   s    
0001100    r   a   n  \n   c   l   o   s   e       b   y       h   e   r
0001120    .  \n                                                        
0001122

Best Answer

I've tried, but I can't reproduce this problem.

On my machine (also running 10.13.6 17G65):

$ date; time head -c 2000 /usr/share/dict/words | say -o words.aac; date; ls -l words.aac
Sun  7 Oct 2018 21:17:52 BST
real    0m2.630s
user    0m0.519s
sys     0m0.152s
Sun  7 Oct 2018 21:17:55 BST
-rw-r--r--  1 ashley  staff  532880  7 Oct 21:17 words.aac

I'm using /usr/share/dict/words (see /usr/share/dict/README) because I don't have lewis-carroll.txt. I've been unable to make say hang.

Perhaps say is choking on something in lewis-carroll.txt (but only when sending the output to a file, which seems odd)?

Two ideas off the top of my head to work around this, if the above doesn't help...

  1. Send one sentence to say at a time, then recombine the output files.

  2. Or, have say send to the audio output device, but record that with, e.g., Audio Hijack.
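
The first idea could be sketched roughly as follows. This version splits on blank-line-separated paragraphs rather than sentences, since paragraphs are easier to find from the shell. split_paragraphs, synthesize_chunks, the chunk-NNNN file names, and the SAY override variable are all hypothetical, and whether per-paragraph input avoids the hang is untested.

```shell
# Hypothetical helper: write each blank-line-separated paragraph of the input
# file ($1) to $2/chunk-0001.txt, chunk-0002.txt, ... (expects LF line endings).
split_paragraphs() {
    mkdir -p "$2" || return 1
    # RS = "" puts awk in paragraph mode: one record per paragraph.
    awk -v dir="$2" 'BEGIN { RS = "" }
    {
        file = sprintf("%s/chunk-%04d.txt", dir, NR)
        print > file
        close(file)
    }' "$1"
}

# Hypothetical helper: run say on every chunk in directory $1, writing an
# AIFF file next to each text file. SAY is an assumed override hook so the
# loop can be dry-run on a machine without say.
synthesize_chunks() {
    for chunk in "$1"/chunk-*.txt; do
        "${SAY:-say}" -f "$chunk" -o "${chunk%.txt}.aiff" || return 1
    done
}
```

Usage would be along the lines of split_paragraphs ~/Downloads/lewis-carroll-lf.txt chunks followed by synthesize_chunks chunks. Recombining the resulting .aiff files is left out here; an audio tool such as ffmpeg or sox could probably do it, but that is an assumption rather than something verified.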

(Nicely written question, by the way: lots of relevant detail, concisely presented.)