Posted by on May 28, 2013 in Image Processing and Analysis

This post is the fifth in a series about the Image Processing Pipeline (IPP). The IPP is the part of the content pipeline that is responsible for digitizing and processing the millions of images we publish to our site.  In this post we finally get to the good part – the part of the pipeline in which we process the images.

The purpose of the IPP is to correct and enhance the images in a way that improves their legibility without inadvertently damaging other images in the process. In this and the next couple of posts I will present some details on the kinds of processing we do in the Image Processor and how these operations help improve legibility.

The diagram in the following figure shows the context for the Image Processor.

Figure 1. Context diagram for the Image Processor component of the IPP.

The Image Processor consumes a source image from the scanner, performs a number of operations on it, and then saves the processed image along with a thumbnail and a zoomed-in snippet. The thumbnail and snippet are used in the subsequent Image Quality Editor step, where an operator reviews the quality of the image and, if necessary, makes changes to fix any problems found.

Image Histograms

To illustrate one of the important operations performed by the Image Processor (auto-normalizing), we will use an image histogram, a simple graphical tool for analyzing the distribution of pixel values in an image. It gives you a sense of how frequently each grayscale value occurs in the image. On the x-axis are the 256 possible grayscale values of an 8-bit image (0 corresponds to black; 255 corresponds to white). The height of the vertical line at each of these 256 values corresponds to the frequency of that pixel value in the image.
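To make the idea concrete, here is a minimal sketch of how such a histogram can be computed. It is not the IPP's code; the function name `gray_histogram` and the tiny sample data are made up for illustration, and the image is assumed to be an 8-bit grayscale image flattened into a list of integers.

```python
def gray_histogram(pixels):
    """Count how often each 8-bit gray value (0..255) occurs."""
    counts = [0] * 256          # one bin per possible gray value
    for value in pixels:
        counts[value] += 1
    return counts

# Made-up sample data: a handful of pixel values, not a real scan.
sample_pixels = [0, 0, 12, 128, 254, 255, 255, 255]
hist = gray_histogram(sample_pixels)

# hist[0] is the number of pure black pixels, hist[255] the pure white ones.
print(hist[0], hist[255])
```

Plotting `hist` as one vertical line per bin gives the kind of chart shown in the figures in this post.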

We first consider a portrait photograph. In most natural grayscale images you would expect to see most or all of the 256 levels. The following “Lena” test image uses most of the available range: 241 gray levels. It’s only missing the extreme white pixels, which I have indicated with the yellow shading on the histogram.


Figure 2. The standard Lena test image with its grayscale histogram showing the distribution of pixel values.
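Counting how many gray levels an image actually uses, and where the unused ends of the range lie, takes only a few lines. The sketch below is illustrative only: `gray_level_usage` is a hypothetical helper and the pixel values are invented, not taken from the test image.

```python
from collections import Counter

def gray_level_usage(pixels):
    """Return (distinct gray levels used, darkest value, brightest value)."""
    counts = Counter(pixels)
    return len(counts), min(counts), max(counts)

# Invented sample values; a real image would have many more pixels.
pixels = [3, 3, 40, 41, 200, 240, 240]
used, darkest, brightest = gray_level_usage(pixels)

# Levels above `brightest` (here 241..255) are the unused white end.
print(used, darkest, brightest)
```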

In images of historical records, such as in the following figure, you hope for a bimodal distribution, one peak (on the left) corresponding to the dark textual content and the other peak (on the right) corresponding to the light background.


Figure 3. An example image with a strongly bimodal distribution.

Although there are some pixels with values in the midrange, almost all of this image is concentrated at the white extreme (255). I have highlighted in yellow the two extreme ends of the histogram. It’s not clear from this histogram, but it’s the black border that allows the histogram to even show anything at the black extreme. If you crop off this black border, 92% of the pixels have a value of 253, 254, or 255. This image contains a little bit of gray ink on a very white background.
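The effect of the border on this kind of statistic can be sketched as follows. This is an assumption-laden toy example: the helper names `crop_border` and `fraction_at_or_above`, the fixed one-pixel margin, and the 6x6 "page" are all hypothetical, and the image is represented as a plain list of rows.

```python
def crop_border(image, margin):
    """Drop `margin` pixels from every side of a row-major image."""
    return [row[margin:len(row) - margin]
            for row in image[margin:len(image) - margin]]

def fraction_at_or_above(image, threshold):
    """Fraction of pixels whose value is >= threshold."""
    values = [v for row in image for v in row]
    return sum(1 for v in values if v >= threshold) / len(values)

# Toy page: black border, near-white paper, two pixels of gray ink.
page = [
    [0,   0,   0,   0,   0, 0],
    [0, 255, 254, 255, 253, 0],
    [0, 255, 120, 255, 255, 0],
    [0, 254, 255, 110, 255, 0],
    [0, 255, 255, 255, 253, 0],
    [0,   0,   0,   0,   0, 0],
]
inner = crop_border(page, 1)

# Share of near-white pixels (253..255) with and without the border.
print(fraction_at_or_above(page, 253), fraction_at_or_above(inner, 253))
```

With the border in place, the black pixels dilute the near-white share; once it is cropped, the page turns out to be almost entirely near-white.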


Image Contrast

Although you would hope in the previous image to have more black (or dark gray) pixels coming from the printed text, that is an example of an image having pretty good contrast. Contrast is the difference in luminance between, in this case, the printed text and the page background. If the pixels corresponding to the black printed text are near zero, and the pixels corresponding to the white background are near 255, then the image is said to have high contrast, which is a good thing, since it helps with legibility.

Now consider the following more typical image of a historical record:


Figure 4. An example image with low contrast as indicated in the histogram by a very compressed range of pixel values.

Notice that the pixel values are more concentrated in the mid-section of the allowed range. Without a bimodal distribution, or at least more spread in the distribution, the printed text and the document background share roughly the same grayscale values. This image is said to have low contrast, which is a bad thing, since it makes the printed text blend into the background and makes that text difficult to extract, whether by a human or by an image analysis tool.
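One rough way to put a number on this notion of contrast is to split the pixels into a darker group (text) and a lighter group (background) and compare their average values. The sketch below does that by splitting at the mean pixel value. Both the split rule and the function name `text_background_contrast` are assumptions made for illustration, not how the IPP measures contrast.

```python
def text_background_contrast(pixels):
    """Difference between the mean of the light and dark pixel groups.

    Pixels are split at the overall mean, which is a crude stand-in
    for a real text/background segmentation.
    """
    mean = sum(pixels) / len(pixels)
    dark = [v for v in pixels if v < mean]
    light = [v for v in pixels if v >= mean]
    if not dark or not light:       # flat image: no contrast at all
        return 0.0
    return sum(light) / len(light) - sum(dark) / len(dark)

# Invented samples: dark text on white paper vs. a washed-out scan.
high_contrast = [5, 10, 250, 255]
low_contrast = [110, 120, 140, 150]

print(text_background_contrast(high_contrast))
print(text_background_contrast(low_contrast))
```

The washed-out sample scores far lower because its "text" and "background" values sit close together in the middle of the range.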



An important operation that helps improve the contrast in an image is called Auto-Normalization. The goal of auto-normalization is to improve the contrast of the image by “stretching” the range of intensity values to span a greater range of (luminance) values. Auto-normalization is performed on every image we process. It’s a linear, lossless operation that can take a low-contrast image and make it more legible. On images that have good contrast to begin with, auto-normalization has little to no effect.
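As a rough illustration of the stretching idea, the following sketch applies a plain min-max linear stretch to 8-bit values. This is an assumption on my part about the simplest form such an operation could take; the exact algorithm used in the Image Processor is not described here, and the name `auto_normalize` is hypothetical.

```python
def auto_normalize(pixels):
    """Linearly stretch 8-bit gray values so they span 0..255.

    A simple min-max stretch: the darkest input value maps to 0 and
    the brightest to 255. Integer arithmetic with round-half-up.
    """
    lo, hi = min(pixels), max(pixels)
    if hi == lo:                    # flat image: nothing to stretch
        return list(pixels)
    span = hi - lo
    return [((v - lo) * 255 + span // 2) // span for v in pixels]

# Invented low-contrast sample: values crowded into 100..150.
low_contrast = [100, 110, 120, 150]
stretched = auto_normalize(low_contrast)
print(stretched)
```

Because the stretch factor is at least 1, distinct input values stay distinct in the output, which is why a stretch of this kind does not merge gray levels. An image that already spans 0 to 255 comes out unchanged.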

The following figure shows the result of auto-normalizing the previous (low-contrast) image.


Figure 5. The auto-normalized output of the Image Processor with the histogram showing a greater spread in the luminance distribution.

The following figure shows zoomed-in snippets before and after the image was auto-normalized.


Figure 6. Zoomed-in snippets showing the improvement to the contrast due to the auto-normalizing operation in the Image Processor.

From the previous two figures it’s quite clear that the auto-normalizing operation is having the intended effect of increasing contrast in the image and thus (somewhat) improving the legibility of the text by using more of the dynamic range available in the image’s luminance values.

In my future posts I will present other operations in the Image Processor that work in concert with our auto-normalization algorithm to help improve the quality of the images we process.


