Splitting a large binary file into sections determined by context patterns

files split

I have a large (2GB) file that looks like this:

^%%-=-=-=-=-=-=-=-=-=-=-=-=-=-%%^
<binary data>
^%%-=-=-=-=-=-=-=-=-=-=-=-=-=-%%^
<binary data>
^%%-=-=-=-=-=-=-=-=-=-=-=-=-=-%%^
<binary data>
...

The ^%%-=-=-=-=-=-=-=-=-=-=-=-=-=-%%^ lines are separators. The binary segments are large. There are about fifty of them in the file.

I am trying to extract the binary parts of this file. Each binary segment needs to go into its own file.

I tried using csplit,

csplit --digits=2 --prefix=out stu.ear '/\^%%-=-=-=-=-=-=-=-=-=-=-=-=-=-%%\^/'

but received the following output and two out?? files,

1
2097951144

Is there a tool for this job (perhaps a csplit implementation that works with binary files)?

Best Answer

The following will work:

      awk '/\^%%-=-=-=-=-=-=-=-=-=-=-=-=-=-%%\^/{n++}{print > ("out" n ".ear")}' stu.ear
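
Note that the awk approach copies each separator line into its output file, and how well it copes with NUL bytes depends on the awk implementation. If that is a problem, a byte-offset approach is one possible alternative. The following is only a sketch, not a tested solution for your file: it assumes GNU grep (for `-a`, `-b`, `-o`), a `head` that supports `-c`, that every separator is followed by exactly one newline, and that the separator string never occurs inside the binary data. The function name `split_on_separator` and the `out01`, `out02`, ... naming are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: split_on_separator FILE PREFIX
# Writes the bytes between separators to PREFIX01, PREFIX02, ...
# Assumptions: GNU grep, head -c, one newline after each separator,
# and no occurrence of the separator inside a binary segment.
split_on_separator() {
    file=$1
    prefix=$2
    sep='^%%-=-=-=-=-=-=-=-=-=-=-=-=-=-%%^'
    seplen=${#sep}
    size=$(wc -c < "$file")
    n=0
    start=    # offset of the current segment; empty until a separator is seen

    # grep prints "OFFSET:MATCH" for every separator; the file size is
    # appended as a final boundary so the last segment is written too.
    { LC_ALL=C grep -a -b -o -F -e "$sep" "$file" | cut -d: -f1
      echo "$size"; } |
    while read -r off; do
        if [ -n "$start" ]; then
            len=$((off - start))
            if [ "$len" -gt 0 ]; then
                n=$((n + 1))
                # tail -c +K starts at byte K (1-based); head -c keeps len bytes
                tail -c "+$((start + 1))" "$file" | head -c "$len" \
                    > "$(printf '%s%02d' "$prefix" "$n")"
            fi
        fi
        # the next segment begins after the separator and its newline
        start=$((off + seplen + 1))
    done
}
```

Calling `split_on_separator stu.ear out` would then write one file per segment. Any newline that precedes the next separator stays at the end of the segment, since the sketch cannot tell whether it belongs to the data.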