Batch Image Processing

It may initially seem counter-intuitive, but sometimes one needs to process an image file without actually viewing it. This is particularly the case if one has a very large number of image files and a uniform change is required. The slow way is to open each image file individually in whatever application one is using, make the required changes, save, open the next file, and so forth. This is time-consuming, boring, and prone to error.

Avoiding such activities is why computers were invented. Computers are extremely good at accurate and fast automation of computational tasks (and what can be automated should be automated), leaving humans to carry out the tasks of innovation, invention, discovery, and aesthetics. Regular activities can be easily automated with shell script loops, which are used here.

The following scratches the surface of certain automation tasks that I have encountered in the past. I am not a photographer, a graphic designer, or anything of the sort. I consider my skills in such endeavours as sorely lacking. However, I do have a working knowledge of how to get a GNU/Linux-based system to do things with a minimal amount of work, and I strongly believe in reducing the amount of work that others have to do.

Install Necessary Software

I will start by assuming that the gentle reader of this document is using Ubuntu 18.04 LTS. It is not necessarily my Linux distribution of choice, but it is usually the one that people are initially exposed to. People who are using other distributions will find that the installation process is slightly different (e.g., yum install on RedHat/CentOS) or, if they are very keen and performance-sensitive, may prefer to install the software from source rather than from packages.

Four software applications are suggested here: UFRaw (Unidentified Flying Raw), which converts camera RAW images to standard image files; Ghostscript, a PostScript and PDF language interpreter and previewer; ImageMagick, for image file manipulations; and poppler-utils, for modifying PDFs (see a previous post about how awesome they are).

It is always worth getting a system's package information up to date first:
sudo apt-get update

Install the applications:
sudo apt-get install ufraw-batch
sudo apt-get install ghostscript
sudo apt-get install imagemagick
sudo apt-get install poppler-utils
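
Once the packages are installed, a quick check confirms that the commands are actually on the PATH. The following is only a minimal sketch: the function name missing_tools is hypothetical, and the command names in the example call (ufraw-batch, gs, convert, pdfunite) are assumed to be the ones these packages provide on Ubuntu 18.04.

```shell
# Print the name of every listed command that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call (prints nothing when everything is installed):
# missing_tools ufraw-batch gs convert pdfunite
```

Any name that is printed points to a package that still needs installing.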

Converting Raw Images to Standard Image Files

Photographers will like this one. The best tool for converting raw images to standard image files is ufraw-batch. The following example parses over all the files in a directory with the suffix .CR2, a raw camera image created by Canon digital cameras. The website imaging-resources provides a good collection of some example files; they are quite large!

It is unfortunately common for spaces and other troublesome characters to make their way into filenames these days. When working with the command-line, life is a lot easier if spaces are removed from filenames, because many core applications read the space as a delimiter; 'My File' will be read as two files, "My" and "File"! The following loop command can be used to get rid of these unnecessary spaces.

for item in ./*" "*; do mv "$item" "$(echo "$item" | tr -d " ")"; done
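
The one-liner above will happily overwrite a file if two names collapse to the same result ('a b.jpg' and 'ab.jpg', say). A slightly more defensive sketch follows; the function name strip_spaces is hypothetical, and it assumes a mv that accepts -- to end option parsing, as the GNU version does.

```shell
# Remove spaces from the names of files in a directory (default: current one).
# Names without spaces are left alone, and an existing file is never overwritten.
strip_spaces() {
    dir=${1:-.}
    for item in "$dir"/*" "*; do
        [ -e "$item" ] || continue           # the glob matched nothing
        base=${item##*/}                     # file name without the directory
        new="$dir/$(printf '%s' "$base" | tr -d ' ')"
        if [ -e "$new" ]; then
            echo "skipping '$item': '$new' already exists" >&2
            continue
        fi
        mv -- "$item" "$new"
    done
}
```

Running strip_spaces with no argument works on the current directory; passing a directory name works on that directory instead.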

The following loop will, for the current working directory, loop over each and every file with a .CR2 suffix, and run the ufraw-batch command, outputting a new file in the jpg format. This could also be ppm, tiff, or fits; jpeg is accepted as an alternative spelling of jpg.

for item in ./*.CR2; do ufraw-batch --out-type jpg "$item" ; done

There is an excellent range of manipulation options with ufraw-batch; check the manual page (man ufraw-batch) for some examples.
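
When new raw files are added to a directory that has already been processed, re-running the loop revisits every file, including those already converted. The following is a minimal sketch of one way around that, not a definitive recipe: the function name convert_raw and the default output directory jpg are hypothetical, and it assumes that ufraw-batch names each output file after its input with a .jpg extension and that the --out-path option places it in the given directory.

```shell
# Convert every .CR2 file in the current directory to a JPEG in an output
# directory, skipping files that were already converted on an earlier run.
convert_raw() {
    outdir=${1:-jpg}                         # hypothetical default directory name
    mkdir -p "$outdir"
    for item in ./*.CR2; do
        [ -e "$item" ] || continue           # no .CR2 files present
        name=${item##*/}                     # file name without the leading ./
        target="$outdir/${name%.*}.jpg"      # assumed output name (see above)
        [ -e "$target" ] && continue         # converted on an earlier run
        ufraw-batch --out-type=jpg --out-path="$outdir" "$item"
    done
}
```

Keeping the converted files in their own directory also means the raw originals and the results never get mixed up.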

As an aside, the output format that one uses should depend on what the file is being used for. In brief, JPEG (jpg) is a lossy compressed format that is handy for web-published photographs because the files are small. In comparison, PNG is a lossless compressed format, which produces larger files; it is more typically recommended for drawings. TIFF is a lossless format that is typically uncompressed and used for print publications.

Or, rather than typing everything in detail, just use the flowchart by Allen Hsu.

Modifying Existing Image Files

As the aside in the previous section suggests, sometimes one might not have the right image format for the job that one wants to do.

The following simple loops allow for mass conversion of files from one format to another, or force particular characteristics into a file. Each of them makes use of the convert utility that is part of ImageMagick.

for item in ./*.jpg ; do convert "$item" "${item%.*}.png" ; done
for item in ./*.jpg ; do convert "$item" -monochrome "$item"; done
for item in ./*.jpg; do convert "$item" -define jpeg:extent=512kb "${item%.*}.jpg" ; done
for item in ./*.jpg; do convert "$item" ../logo.jpg -gravity southeast -geometry +10+10 -composite "${item%.*}logo".jpg ; done

The first loop converts all jpg files in a directory to png files.

The second loop replaces each jpg file in a directory with a monochrome version of itself. The '-monochrome' dither is clearer, but other options that could be used for a similar effect include '-threshold xx%' or '-remap pattern:gray50', for less contrast but retaining more information.

The third loop re-compresses all jpg files so that each is no larger than a target file size (512kb).

The fourth loop adds a logo (logo.jpg) to all jpg files in a directory. Note that the logo file is a directory level above the image folders; otherwise, the loop would engage in the sort of horrible recursion where the logo is placed in the logo, which would be weird.
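
Note that the second and third loops write over the original files. Where that is not wanted, the results can go to a separate directory instead. The following is a minimal sketch under that assumption; the function name batch_convert and the directory names in the example calls are hypothetical.

```shell
# Apply one set of convert options to every jpg in the current directory,
# writing the results to a separate directory so the originals are untouched.
# Usage: batch_convert OUTDIR [convert options...]
batch_convert() {
    outdir=$1
    shift
    mkdir -p "$outdir"
    for item in ./*.jpg; do
        [ -e "$item" ] || continue           # no jpg files present
        convert "$item" "$@" "$outdir/${item##*/}"
    done
}

# Example calls:
# batch_convert mono -monochrome
# batch_convert small -define jpeg:extent=512kb
```

Each output file keeps the name of its original, so the two directories can be compared side by side before anything is deleted.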

The following is the sort of insane request that one gets from managers: "could you please put the following jpg files into a PDF? In order?"

Such a request involves two steps: converting the jpg files to PDFs, and then combining the PDF files. The assumption here is that the files are each prefixed with an ordered value and that there are no spaces or special characters in the file names, because the loop parses the output of ls.

for item in $(ls -v *.jpg); do convert "$item" "${item%.*}.pdf"; done
pdfunite $(ls -v *.pdf) output.pdf
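
If the file names cannot be trusted to be free of spaces, the same ordering can be had without parsing ls. The following is a sketch under a few assumptions: GNU sort (for the -z and -V options) and an xargs that supports -0 are available, convert is given several input images and writes them as pages of a single PDF, and the function name jpgs_to_pdf is hypothetical.

```shell
# Combine all jpg files in the current directory into one PDF, in natural
# (version) order, without parsing the output of ls.
jpgs_to_pdf() {
    output=${1:-output.pdf}
    set -- ./*.jpg
    [ -e "$1" ] || { echo "no jpg files found" >&2; return 1; }
    # NUL-separated names survive spaces; sort -V orders 2.jpg before 10.jpg.
    printf '%s\0' "$@" | sort -z -V |
        xargs -0 sh -c 'out=$1; shift; convert "$@" "$out"' sh "$output"
}
```

This loads every image in a single convert call, so for a very large set of files the two-step approach with pdfunite may be kinder on memory.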

There certainly is a great deal more that one can do with the now-installed applications and with scripts. ImageMagick, for example, provides excellent documentation on its command-line tools. The scripting methods used here were primarily simple for-loops. There are far more sophisticated means of engaging in such automation (conditional tests, reads from a file descriptor, continue/break statements, etc). But for now, these examples should serve as a useful short introduction on how to modify dozens or hundreds of images in batch mode without even looking at a single image.
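
To give a taste of those more sophisticated means, the following sketch combines a conditional test, a read from a file descriptor, and continue/break in one loop. It is an illustration only: the function name process_list and the list file in the usage note are hypothetical.

```shell
# Run a command on each file named in a list (one name per line),
# skipping blank lines, comment lines, and files that do not exist.
# Usage: process_list LISTFILE COMMAND [ARGS...]
process_list() {
    list=$1
    shift
    status=0
    # The list is read on file descriptor 3, leaving stdin free for the command.
    while IFS= read -r item <&3; do
        case $item in
            ''|'#'*) continue ;;             # blank line or comment
        esac
        if [ ! -f "$item" ]; then
            echo "skipping missing file: $item" >&2
            continue
        fi
        "$@" "$item" || { status=$?; break; }   # stop at the first failure
    done 3<"$list"
    return "$status"
}
```

For example, process_list raw-files.txt ufraw-batch --out-type=jpg would convert only the raw files named in raw-files.txt, rather than everything in the directory.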


From Michael Deegan
One quite useful tool within my workflow is jpegtran/exiftran - some viewers (especially those commonly found on (or in) Windows) don't pay proper attention to EXIF rotation flags. So normally I take it out of the equation at earliest opportunity, via a `jhead -autorot -ft *jpg` against images fresh off my camera's memory card.