For my research I have a number of text (txt or doc) files. These have a large quantity of newspaper clippings in them.
I'd like to split these text files. Each clipping starts with
document X of Y
I know of the split command line tool. Is there some way to use split to divide the large text file into the Y separate files indicated in the single large (doc or txt) file generated by LexisNexis?
Best Answer
The split that ships with macOS (the BSD version) accepts a regexp pattern through its -p option, which GNU split does not have, so simply:
split -p pattern longfile.doc
This starts a new output file each time a line matches the pattern, with the matching line becoming the first line of the new file. Nailing down which regexp matches your specific files might be better suited for http://stackoverflow.com, but perhaps you already know how to craft a regexp and just didn't realize split could match a pattern.
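If your split has no -p option, the same idea can be sketched in a few lines of Python. This is only a rough illustration, not a tested recipe for LexisNexis exports: it assumes the file is plain text and that every clipping begins with a line consisting of "document X of Y" (any letter case, optional surrounding whitespace). The names split_clippings, write_clippings, and the clipping_NNN.txt output naming are made up for this example.

```python
import re
from pathlib import Path

# Assumed header format: a whole line such as "document 3 of 120".
# Adjust the pattern if the real export looks different.
HEADER = re.compile(r"^\s*document\s+\d+\s+of\s+\d+\s*$", re.IGNORECASE)


def split_clippings(text):
    """Return a list of chunks, starting a new chunk at each header line."""
    chunks = []
    current = []
    for line in text.splitlines(keepends=True):
        # A header closes the chunk collected so far (if there is one).
        if HEADER.match(line) and current:
            chunks.append("".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("".join(current))
    return chunks


def write_clippings(source, out_dir):
    """Write each chunk of `source` to out_dir/clipping_NNN.txt."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    text = Path(source).read_text(encoding="utf-8", errors="replace")
    paths = []
    for i, chunk in enumerate(split_clippings(text), start=1):
        path = out / f"clipping_{i:03d}.txt"
        path.write_text(chunk, encoding="utf-8")
        paths.append(path)
    return paths
```

Calling write_clippings("longfile.txt", "clippings") would then produce one file per clipping. Any text that appears before the first header line ends up in a chunk of its own, so the number of files can be one more than Y.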