macOS – Splitting a LexisNexis text file

command line, macos, terminal

For my research I have a number of text (txt or doc) files. These have a large quantity of newspaper clippings in them.

I'd like to split these text files. Each clipping starts with

document X of Y

I know of the split command line tool – is there some way to use split to divide the large text file (doc or txt) generated by LexisNexis into Y separate files, one per clipping?

Best Answer

The macOS (BSD) version of split can break a file on a regexp pattern match with the -p option, so simply:

split -p pattern longfile.doc

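This starts a new output file at each line that matches the pattern (an extended regular expression); the matching line becomes the first line of the new file. The pieces are named xaa, xab, and so on unless you give a prefix after the file name. Nailing down exactly which regexp matches your specific files might be better suited for http://stackoverflow.com, though something like 'document [0-9]+ of [0-9]+' may be a reasonable starting point. Perhaps you already know how to craft a regexp and just didn't realize split could match a pattern.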
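If you want numbered output names, or you are on a system whose split lacks -p (GNU split, for instance), a short awk script can do a similar job. This is only a sketch and uses awk rather than split. The sample file contents, the regexp, and the clipping_NNN.txt names are all assumptions made for illustration, so adjust them to your actual export (the match is case-sensitive as written).

```shell
#!/bin/sh
# Hypothetical sample input standing in for a LexisNexis export.
cat > longfile.txt <<'EOF'
document 1 of 3
First clipping text.

document 2 of 3
Second clipping text.
More of the second clipping.

document 3 of 3
Third clipping text.
EOF

# Start a new output file at each line that looks like "document X of Y".
# The output names (clipping_001.txt, ...) are an arbitrary choice for this sketch.
awk '
  /^[ \t]*document [0-9]+ of [0-9]+/ {
    if (out != "") close(out)   # close the previous piece before opening the next
    n++
    out = sprintf("clipping_%03d.txt", n)
  }
  out != "" { print > out }     # lines before the first marker are skipped
' longfile.txt

ls clipping_*.txt
```

Unlike split -p, this version drops any text that appears before the first marker line; remove the out != "" condition and set an initial file name if you need to keep it.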