Given a file with the following contents:
1111,2222,3333,4444
aaaa,bbbb,cccc,dddd
I want to produce a file identical to the original but without the n-th column, e.g. for n = 2 (or 3, if counting from 1):
1111,2222,4444
aaaa,bbbb,dddd
or, for n = 0 (or 1, if counting from 1):
2222,3333,4444
bbbb,cccc,dddd
A real file can be gigabytes in size and have tens of thousands of columns.
As always in such cases, I suspect command line magicians can offer an elegant solution… 🙂
In my actual case I need to drop the first two columns, which could be done by dropping the first column twice in succession, but I suppose it would be more interesting to generalise a bit.
Best Answer
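I believe the --complement option is specific to cut from GNU coreutils.
Normally you specify the fields you want via -f, but adding --complement reverses the meaning, so the listed fields are dropped and everything else is kept. 'man cut' describes --complement as: "complement the set of selected bytes, characters or fields".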
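As a minimal sketch of how this could look for the sample data in the question (assuming GNU cut; sample.csv is a hypothetical file name used only for illustration, and note that cut numbers fields from 1):

```shell
# Recreate the two example rows from the question in a hypothetical file.
printf '1111,2222,3333,4444\naaaa,bbbb,cccc,dddd\n' > sample.csv

# Drop the 3rd column: -d sets the delimiter, -f names the field,
# --complement (GNU extension) inverts the selection.
cut -d, -f3 --complement sample.csv

# Drop the first two columns in one pass.
cut -d, -f1,2 --complement sample.csv

# Portable alternative for the same case: keep field 3 onwards.
cut -d, -f3- sample.csv
```

cut processes its input line by line, so it should cope with very large files. Where --complement is not available, an open-ended range such as -f3- or a list like -f1,2,4- expresses the same selection by naming the fields to keep.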
One caveat: if any field itself contains a comma, cut will get it wrong, because cut is not a CSV parser the way a spreadsheet is; it simply splits on every delimiter. Different parsers also disagree on how to handle escaped commas in CSV. For simple CSV on the command line, though, cut is still the way to go.
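To illustrate the caveat, here is a small sketch (again assuming GNU cut; the input row is made up for the example) in which a quoted field contains a comma:

```shell
# Logically this row has three fields: "a,b" then c then d.
# cut splits on every comma, so it sees four: "a | b" | c | d
printf '"a,b",c,d\n' | cut -d, -f2 --complement
# prints: "a,c,d
```

Instead of removing the second logical field (c), cut removes the second comma-separated piece (b") and leaves a dangling quote behind.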