This post is the fifth in a series about the Ancestry.com Image Processing Pipeline (IPP). The IPP is the part of the content pipeline that is responsible for digitizing and processing the millions of images we publish to our site. In this post we finally get to the good part – the part of the pipeline in which we process the images.
The purpose of the IPP is to correct and enhance the images in a way that improves their legibility without, in the process, inadvertently damaging other images. In this and the next couple of posts I will present some details on the kinds of processing we do in the Image Processor and how these operations help improve legibility.
The diagram in the following figure shows the context for the Image Processor.
The Image Processor consumes a source image from the scanner, performs a number of operations on it, and then saves out the processed image along with a thumbnail and a zoomed-in snippet. These artifacts are used in the subsequent Image Quality Editor step, where an operator reviews the quality of the image and, if necessary, makes changes to fix any problems found.
To illustrate one of the important operations performed by the Image Processor (auto-normalizing), we will use an image histogram, a simple graphical tool for analyzing the distribution of pixel values in an image. It gives you a sense of how frequently each grayscale value occurs in the image. On the x-axis are the 256 possible grayscale values (zero corresponds to black; 255 corresponds to white). The height of the vertical line at each of these 256 values corresponds to the frequency of that pixel value in the image.
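The histogram described above is straightforward to compute. Here is a minimal sketch (not the IPP's actual code), assuming the image is represented as a flat list of 8-bit grayscale pixel values:

```python
# A minimal sketch of computing an image histogram, assuming the image is a
# flat list of 8-bit grayscale pixel values (0 = black, 255 = white).
def grayscale_histogram(pixels):
    counts = [0] * 256  # one bin per possible gray level
    for p in pixels:
        counts[p] += 1
    return counts

# A tiny 4-pixel "image": two white pixels, one black, one mid-gray.
hist = grayscale_histogram([255, 255, 0, 128])
print(hist[255], hist[0], hist[128])  # → 2 1 1
```

Plotting `counts` as vertical bars over the 0–255 range produces exactly the kind of histogram shown in the figures below.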
We first consider a portrait photograph. In most natural grayscale images you would expect to see most or all 256 levels. The following “Lena” test image uses most of the available range: 241 gray levels. It’s only missing the extreme white pixels, which I have indicated with the yellow shading on the histogram.
In images of historical records, such as in the following figure, you hope for a bimodal distribution, one peak (on the left) corresponding to the dark textual content and the other peak (on the right) corresponding to the light background.
Although there are some pixels with values in the midrange, almost all of this image’s pixels are concentrated at the white extreme (255). I have highlighted in yellow the two extreme ends of the histogram. It’s not obvious from the histogram alone, but it is the black border that allows the histogram to show anything at all at the black extreme. If you crop off this black border, 92% of the pixels have a value of 253, 254, or 255. This image contains a little bit of gray ink on a very white background.
Although you would hope that the previous image had more black (or dark gray) pixels coming from the printed text, it is an example of an image with pretty good contrast. Contrast is the difference in luminance between, in this case, the printed text and the page background. If the pixels corresponding to the black printed text are near zero, and the pixels corresponding to the white background are near 255, then the image is said to have high contrast, which is a good thing, since it helps with legibility.
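The definition of contrast above can be made concrete with a toy calculation. This is an illustrative sketch only; the names and sample gray levels are my own, not from the IPP:

```python
# Hedged sketch: contrast as the luminance difference between the typical
# background gray level and the typical text gray level (both 0-255).
def simple_contrast(text_level, background_level):
    return background_level - text_level

# Near-black text on a near-white page: high contrast.
print(simple_contrast(10, 250))   # → 240
# Midrange text on a midrange background: low contrast.
print(simple_contrast(110, 160))  # → 50
```

A spread of 240 gray levels between text and background makes the text easy to separate; a spread of 50 makes it blend in, which is exactly the problem the next example image exhibits.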
Now consider the following more typical image of a historical record:
Notice that the pixel values are concentrated in the mid-section of the allowed range. Without a bimodal distribution, or at least more spread in the distribution, the printed text and the document background share roughly the same grayscale values. This image is said to have low contrast, which is a bad thing, since the printed text blends into the background, making it difficult for either a human or an image-analysis tool to extract that text.
An important operation that helps improve the contrast in an image is called Auto-Normalization. The goal of auto-normalization is to improve the contrast of the image by “stretching” the range of intensity values to span a greater range of (luminance) values. Auto-normalization is performed on every image we process. It’s a linear, lossless operation that can take a low-contrast image and make it more legible. On images that have good contrast to begin with, auto-normalization has little to no effect.
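The core idea of linear normalization (often called contrast stretching) can be sketched in a few lines. This is a simplified illustration of the general technique, not the Image Processor's actual algorithm, and it assumes the image is a flat list of 8-bit grayscale values:

```python
# A minimal sketch of linear auto-normalization (contrast stretching),
# assuming the image is a flat list of 8-bit grayscale values. The IPP's
# production algorithm is not published here; this shows only the basic
# linear stretch the post describes.
def auto_normalize(pixels):
    lo, hi = min(pixels), max(pixels)
    if lo == hi:
        return list(pixels)  # flat image: nothing to stretch
    # Map the occupied range [lo, hi] linearly onto the full [0, 255] range.
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]

# A low-contrast "image" concentrated in the midrange gets stretched
# out to span the full dynamic range.
print(auto_normalize([100, 120, 140, 160]))  # → [0, 85, 170, 255]
```

Note the property mentioned above: if the image already spans the full range (its minimum is near 0 and its maximum near 255), the stretch is nearly the identity mapping, so a good-contrast image passes through essentially unchanged.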
The following figure shows the result of auto-normalizing the previous (low-contrast) image.
The following figure shows zoomed-in snippets before and after the image was auto-normalized.
From the previous two figures it’s quite clear that the auto-normalizing operation is having the intended effect of increasing contrast in the image and thus (somewhat) improving the legibility of the text by using more of the dynamic range available in the image’s luminance values.
In my future posts I will present other operations in the Image Processor that work in concert with our auto-normalization algorithm to help improve the quality of the images we process.
About Michael Murdock
Michael Murdock is a senior software development manager at Ancestry.com where he has worked for the last 9 years. He holds 8 patents in the areas of image and signal processing, and loves drinking 7-Up while thinking about the cool products he has helped create at the 6 companies he's worked for since graduating a long time ago from the University of Utah. He occasionally runs a 5K with one of his 4 children, recently finishing 3rd in his age group. He loves to read and found time to finish 2 books recently. He loves to travel with his wife, the 1 and only love of his life.