Must it be done with wc? I ask because I've run into a very nice attempt to use a regex as a csplit
pattern. I don't have a system to test it on right now, but the regex itself seems to do the job.
The expression looks like this:
csplit input-file.txt '/([\w.,;]+\s+){500}/'
The command I think you're looking for is called fmt.
$ fmt loremipsum.txt
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam vel
lectus ac enim venenatis porttitor in et est. Curabitur ut eros quis risus
consequat dictum a a lectus. Integer ut risus quis augue lobortis molestie
vel id nibh. Aliquam sit amet mattis lorem, vel ornare felis. Donec
pulvinar tempus lorem, at porta sem pretium ut. Cras ut lorem tincidunt,
scelerisque nunc vitae, posuere augue. Vestibulum iaculis libero id congue
ultrices. Nullam mauris ipsum, aliquet eget nisl non, venenatis euismod
enim. Phasellus a eleifend velit. Aenean molestie venenatis turpis,
consectetur convallis velit fringilla non.
You can control the output, such as the line width, through its options:
$ fmt --help
Usage: fmt [-WIDTH] [OPTION]... [FILE]...
Reformat each paragraph in the FILE(s), writing to standard output.
The option -WIDTH is an abbreviated form of --width=DIGITS.
Mandatory arguments to long options are mandatory for short options too.
  -c, --crown-margin        preserve indentation of first two lines
  -p, --prefix=STRING       reformat only lines beginning with STRING,
                              reattaching the prefix to reformatted lines
  -s, --split-only          split long lines, but do not refill
  -t, --tagged-paragraph    indentation of first line different from second
  -u, --uniform-spacing     one space between words, two after sentences
  -w, --width=WIDTH         maximum line width (default of 75 columns)
      --help     display this help and exit
      --version  output version information and exit
With no FILE, or when FILE is -, read standard input.
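As a quick illustration of the width option, here is a minimal sketch. The sample text and the variable names (text, wrapped, longest) are made up for the example; the only thing taken from fmt itself is the -w option listed in the help above.

```shell
# Hypothetical sample input: a single long line.
text='Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam vel lectus ac enim venenatis porttitor in et est.'

# Refill the text so that no output line exceeds 40 columns.
wrapped=$(printf '%s\n' "$text" | fmt -w 40)
printf '%s\n' "$wrapped"

# Length of the longest output line, to check the limit was respected.
longest=$(printf '%s\n' "$wrapped" | awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }')
printf 'longest line: %s columns\n' "$longest"
```

Keep in mind that fmt limits the width of each line in columns, not the number of words per line.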
Best Answer
Assuming your definition of a word is a sequence of non-blank characters separated by blanks, here's an awk solution for your single-line file: