Command-Line – How to Add Alternating Strings to Filenames and Renumber Them Pairwise

batch-rename, command-line

Using a high-throughput microscope, we produce thousands of images. Let's say our system names them:

ome0001.tif
ome0002.tif
ome0003.tif
ome0004.tif
ome0005.tif
ome0006.tif
ome0007.tif
ome0008.tif
ome0009.tif
ome0010.tif
ome0011.tif
ome0012.tif
...

We would like to insert c1 and c2 alternately, following the numerical order of the images, and then renumber the files so that each consecutive c1/c2 pair shares the same number. The ordering should be numerical (1, then 2… then 9, then 10) rather than alphanumeric (1, then 10, then 2…).

In my example, that would give:

ome0001c1.tif
ome0001c2.tif
ome0002c1.tif
ome0002c2.tif
ome0003c1.tif
ome0003c2.tif
ome0004c1.tif
ome0004c2.tif
ome0005c1.tif
ome0005c2.tif
ome0006c1.tif
ome0006c2.tif
...

We have not been able to do this from the command line (biologist speaking…).

Any suggestion would be greatly appreciated!

Best Answer

rename performs bulk renaming, and it can do the arithmetic you need.

Different GNU/Linux distributions have different commands called rename, with different syntax and capabilities. In Debian, Ubuntu, and some other OSes, rename is the Perl renaming utility prename. It is quite well suited to this task.

First I recommend telling rename to just show you what it would do, by running it with the -n flag:

rename -n 's/\d+/sprintf("%04dc%d", int(($& - 1) \/ 2) + 1, 2 - $& % 2)/e' ome????.tif

That should show you:

rename(ome0001.tif, ome0001c1.tif)
rename(ome0002.tif, ome0001c2.tif)
rename(ome0003.tif, ome0002c1.tif)
rename(ome0004.tif, ome0002c2.tif)
rename(ome0005.tif, ome0003c1.tif)
rename(ome0006.tif, ome0003c2.tif)
rename(ome0007.tif, ome0004c1.tif)
rename(ome0008.tif, ome0004c2.tif)
rename(ome0009.tif, ome0005c1.tif)
rename(ome0010.tif, ome0005c2.tif)
rename(ome0011.tif, ome0006c1.tif)
rename(ome0012.tif, ome0006c2.tif)

Assuming that's what you want, go ahead and run it without the -n flag (i.e., just remove -n):

rename 's/\d+/sprintf("%04dc%d", int(($& - 1) \/ 2) + 1, 2 - $& % 2)/e' ome????.tif

That command is somewhat ugly--though still more elegant than using a loop in your shell--and perhaps someone with more Perl experience than I have will post a prettier solution.

For a gentle introduction to writing rename commands, I highly recommend Oli's tutorial, "Bulk renaming files in Ubuntu; the briefest of introductions to the rename command".


How that specific rename command works:

Here's what s/\d+/sprintf("%04dc%d", int(($& - 1) \/ 2) + 1, 2 - $& % 2)/e does:

  • The leading s means to search for text to replace.
  • The regular expression /\d+/ matches one or more (+) digits (\d). This matches your 0001, 0002, and so forth.
  • The command sprintf("%04dc%d", int(($& - 1) / 2) + 1, 2 - $& % 2) is built. $& represents the match. / normally ends the replacement text, but \/ makes a literal / (which is division, as detailed below).
  • The trailing /e means to evaluate the replacement text as code.
    (Try running it with just / instead of /e at the end, but make sure to keep the -n flag!)

Thus the digits in each filename are replaced by the return value of sprintf("%04dc%d", int(($& - 1) / 2) + 1, 2 - $& % 2). So what's going on there?

  • sprintf returns formatted text. Its first argument is the format string into which the remaining arguments are placed. %04d consumes the next argument and formats it as an integer at least 4 characters wide, padded with leading zeros. %4d would pad with spaces instead, hence %04d is needed. Not being part of any % specifier, c is just a literal letter c. Then %d consumes the following argument and formats it as an integer (with default formatting).
  • int(($& - 1) / 2) + 1 subtracts 1 from the number extracted from the original filename, divides it by 2, truncates the fractional portion (int does that), then adds 1. That arithmetic sends 0001 and 0002 to 0001, 0003 and 0004 to 0002, 0005 and 0006 to 0003, and so forth.
  • 2 - $& % 2 takes the remainder of dividing the number extracted from the original filename by 2 (% does that), which is 0 if it's even and 1 if it's odd. It then subtracts that from 2. This arithmetic sends 0001 to 1, 0002 to 2, 0003 to 1, 0004 to 2, and so forth.
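If you'd like to check that arithmetic without touching any files, here is a minimal sketch of the same two formulas in plain POSIX shell. The function name pair_label is made up for this illustration, and the sample numbers are just a few taken from the listing above. The shell's integer division truncates, which stands in for Perl's int here.

```shell
# Hypothetical helper (not part of rename): map an original image number
# to its new "NNNNcK" label, mirroring int(($& - 1) / 2) + 1 and 2 - $& % 2.
pair_label() (
    # Strip leading zeros first; otherwise the shell would read 0008 as
    # an (invalid) octal number.
    n=${1#"${1%%[!0]*}"}
    n=${n:-0}
    printf '%04dc%d\n' "$(( (n - 1) / 2 + 1 ))" "$(( 2 - n % 2 ))"
)

# Dry run on a few sample numbers; nothing is renamed.
for num in 0001 0002 0003 0004 0009 0010; do
    printf 'ome%s.tif -> ome%s.tif\n' "$num" "$(pair_label "$num")"
done
```

The printed pairs should agree with what rename -n reports for the same files.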

Finally, ome????.tif is a glob that your shell expands to a list of all the filenames in the current directory that start with ome, end in .tif, and have exactly four characters (of any kind) in between.

This list is passed to the rename command, which will attempt to rename (or with -n, tell you how it would rename) all of the files whose names contain a match to the pattern \d+.

  • From your description, it doesn't sound like you have any files in that directory named that way but with some of the characters not digits.
  • But if you do, you can replace \d+ with \d{4} in the regular expression appearing in the commands shown above, to ensure they aren't renamed. Either way, inspect the output produced with -n carefully, which you should be doing anyway.
  • I wrote \d+ instead of \d{4} to avoid making the command more complex than necessary. (There are many different ways to write it.)
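To see the difference between the ? glob and a digits-only pattern, here is a small sketch using the shell's own pattern matching. The function names and the sample filenames omeab12.tif and ome12345.tif are invented for illustration. The bracket pattern ome[0-9][0-9][0-9][0-9].tif is a shell-side counterpart to \d{4}, not something the commands above use.

```shell
# Hypothetical helpers: test a name against each pattern with case,
# which uses the same pattern syntax as filename globbing.
matches_glob() { case $1 in ome????.tif) return 0 ;; *) return 1 ;; esac; }
four_digits()  { case $1 in ome[0-9][0-9][0-9][0-9].tif) return 0 ;; *) return 1 ;; esac; }

# Classify a few made-up names; no files need to exist.
for name in ome0001.tif omeab12.tif ome12345.tif; do
    if four_digits "$name"; then
        echo "$name: matched by both patterns"
    elif matches_glob "$name"; then
        echo "$name: matched only by ome????.tif"
    else
        echo "$name: matched by neither"
    fi
done
```

A name like omeab12.tif passes the ? glob, and \d+ would then match its 12, which is the case the \d{4} variant guards against.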