Using a high-throughput microscope, we produce thousands of images. Let's say our system names them:
ome0001.tif
ome0002.tif
ome0003.tif
ome0004.tif
ome0005.tif
ome0006.tif
ome0007.tif
ome0008.tif
ome0009.tif
ome0010.tif
ome0011.tif
ome0012.tif
...
We would like to alternately insert `c1` and `c2` according to the numerical value of the images, and then change the original numbering so that each successive `c1`/`c2` pair shares the same incremental number, respecting numerical order (1, then 2… then 9, then 10) rather than alphanumeric order (1, then 10, then 2…).
In my example, that would give:
ome0001c1.tif
ome0001c2.tif
ome0002c1.tif
ome0002c2.tif
ome0003c1.tif
ome0003c2.tif
ome0004c1.tif
ome0004c2.tif
ome0005c1.tif
ome0005c2.tif
ome0006c1.tif
ome0006c2.tif
...
We have not been able to do this from the terminal command line (biologist speaking…).
Any suggestion would be greatly appreciated!
Best Answer
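`rename` performs bulk renaming, and it can do the arithmetic you need.
Different GNU/Linux distributions have different commands called `rename`, with different syntax and capabilities. In Debian, Ubuntu, and some other OSes, `rename` is the Perl renaming utility `prename`. It is quite well suited to this task.
First I recommend telling `rename` to just show you what it would do, by running it with the `-n` flag: `rename -n 's/\d+/sprintf("%04dc%d", int(($& - 1) \/ 2) + 1, 2 - $& % 2)/e' ome????.tif`
That should show you each file paired with its intended new name: `ome0001.tif` becoming `ome0001c1.tif`, `ome0002.tif` becoming `ome0001c2.tif`, `ome0003.tif` becoming `ome0002c1.tif`, and so on.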
Assuming that's what you want, go ahead and run it without the `-n` flag (i.e., just remove `-n`): `rename 's/\d+/sprintf("%04dc%d", int(($& - 1) \/ 2) + 1, 2 - $& % 2)/e' ome????.tif`
That command is somewhat ugly, though still more elegant than using a loop in your shell, and perhaps someone with more Perl experience than I have will post a prettier solution.
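For comparison, here is a rough sketch of what the shell-loop approach might look like. It is not part of the `rename` solution. The function name `rename_pairs` is hypothetical, and the sketch assumes every file is named exactly `ome`, four digits, then `.tif`. It only prints the planned renames unless you pass `apply` as the second argument.

```shell
#!/bin/sh
# Sketch only: rename_pairs is a hypothetical helper, not an existing tool.
# Usage: rename_pairs DIR         preview the renames (nothing is changed)
#        rename_pairs DIR apply   actually rename the files
rename_pairs() {
    dir=$1
    mode=${2:-preview}
    for f in "$dir"/ome[0-9][0-9][0-9][0-9].tif; do
        [ -e "$f" ] || continue              # the glob matched nothing
        base=${f##*/}                        # e.g. ome0007.tif
        digits=${base#ome}                   # e.g. 0007.tif
        digits=${digits%.tif}                # e.g. 0007
        # Strip leading zeros so that 0008 is not read as an octal number.
        n=${digits#"${digits%%[!0]*}"}
        n=${n:-0}
        pair=$(( (n - 1) / 2 + 1 ))          # 1,2 -> 1; 3,4 -> 2; ...
        chan=$(( 2 - n % 2 ))                # odd -> 1, even -> 2
        new=$(printf 'ome%04dc%d.tif' "$pair" "$chan")
        if [ "$mode" = apply ]; then
            if [ -e "$dir/$new" ]; then
                printf 'skipping %s: %s already exists\n' "$base" "$new" >&2
                continue
            fi
            mv "$f" "$dir/$new"
        else
            printf '%s -> %s\n' "$base" "$new"
        fi
    done
}

# Example (assumed to be run from the image directory):
#   rename_pairs .          # preview
#   rename_pairs . apply    # rename for real
```

As with `rename -n`, it is worth reading the preview carefully before running it with `apply`.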
I highly recommend Oli's tutorial, "Bulk renaming files in Ubuntu; the briefest of introductions to the rename command", for a gentle intro to writing `rename` commands.
How that specific `rename` command works:
Here's what `s/\d+/sprintf("%04dc%d", int(($& - 1) \/ 2) + 1, 2 - $& % 2)/e` does:
`s` means to search for text to replace.
`/\d+/` matches one or more (`+`) digits (`\d`). This matches your `0001`, `0002`, and so forth.
The replacement text `sprintf("%04dc%d", int(($& - 1) / 2) + 1, 2 - $& % 2)` is built. `$&` represents the match. `/` normally ends the replacement text, but `\/` makes a literal `/` (which is division, as detailed below).
`/e` means to evaluate the replacement text as code. (Try running it with just `/` instead of `/e` at the end, but make sure to keep the `-n` flag!)
Thus your new filenames are the return values of `sprintf("%04dc%d", int(($& - 1) \/ 2) + 1, 2 - $& % 2)`. So what's going on there?
`sprintf` returns formatted text. Its first argument is the format string into which the remaining values are placed. `%04d` consumes the first value and formats it as an integer 4 characters wide, padded with zeros. `%4d` would pad with spaces rather than leading zeros, hence `%04d` is needed. Not being covered by any `%`, `c` means just a literal letter `c`. Then `%d` consumes the second value and formats it as an integer (with default formatting).
`int(($& - 1) / 2) + 1` subtracts 1 from the number extracted from the original filename, divides it by 2, truncates the fractional portion (`int` does that), then adds 1. That arithmetic sends `0001` and `0002` to `0001`, `0003` and `0004` to `0002`, `0005` and `0006` to `0003`, and so forth.
`2 - $& % 2` takes the remainder of dividing the number extracted from the original filename by 2 (`%` does that), which is 0 if it's even and 1 if it's odd. It then subtracts that from 2. This arithmetic sends `0001` to `1`, `0002` to `2`, `0003` to `1`, `0004` to `2`, and so forth.
Finally, `ome????.tif` is a glob that your shell expands to a list of all the filenames in the current directory that start with `ome`, end in `.tif`, and have exactly four of any characters in between.
This list is passed to the `rename` command, which will attempt to rename (or with `-n`, tell you how it would rename) all of the files whose names contain a match to the pattern `\d+`.
If the glob picks up files that should not be renamed (for instance, names whose four middle characters are not all digits), you can replace `\d+` with `\d{4}` in the regular expression appearing in the commands shown above, to ensure they aren't renamed, or just inspect the output produced with `-n` carefully, which you should be doing anyway.
I used `\d+` instead of `\d{4}` to avoid making the command more complex than necessary. (There are many different ways to write it.)
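If you want to convince yourself of the arithmetic before touching any files, the two formulas can be tried on their own. The sketch below is plain shell rather than Perl, and `new_name` is a hypothetical helper name used only for illustration. It prints the planned name for image numbers 1 to 12 and also shows the difference between `%4d` and `%04d`.

```shell
#!/bin/sh
# Sketch only: new_name is a hypothetical helper that applies the same two
# formulas as the rename expression to a plain image number (no leading zeros).
new_name() {
    num=$1
    pair=$(( (num - 1) / 2 + 1 ))    # 1,2 -> 1; 3,4 -> 2; 5,6 -> 3; ...
    chan=$(( 2 - num % 2 ))          # odd -> 1, even -> 2
    printf 'ome%04dc%d.tif\n' "$pair" "$chan"
}

# Print the old -> new mapping for the twelve example files.
i=1
while [ "$i" -le 12 ]; do
    printf 'ome%04d.tif -> %s\n' "$i" "$(new_name "$i")"
    i=$((i + 1))
done

# %4d pads with spaces, %04d pads with zeros.
printf '[%4d] [%04d]\n' 7 7          # prints "[   7] [0007]"
```

The first lines of output are `ome0001.tif -> ome0001c1.tif` and `ome0002.tif -> ome0001c2.tif`, and the last is `ome0012.tif -> ome0006c2.tif`, matching the list in the question.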