Competition as Collaboration – Ancestry.com Handwriting Recognition Competition

Posted by Michael Murdock on March 14, 2014 in Image Processing and Analysis

We are excited to announce that the Ancestry.com handwriting recognition competition proposal was accepted as one of seven official International Conference on Frontiers in Handwriting Recognition (ICFHR-2014) competitions. As part of our competition on word recognition from segmented historical documents, we are announcing the availability of a new image database, ANWRESH-1, which contains segmented and labeled… Read more

Document Analysis and Recognition – What is Document Analysis?

Posted by Michael Murdock on October 5, 2013 in Image Processing and Analysis

I recently attended the three-day biennial International Conference on Document Analysis and Recognition (ICDAR-2013) in Washington, DC. ICDAR is sponsored by the International Association for Pattern Recognition and is the premier event for those working in the field of Document Analysis (DA). Primarily the attendees are from corporate and university labs; they are professors, graduate students and technologists involved… Read more

Image Processing at Ancestry.com – Part 6: Auto-Sharpening

Posted by Michael Murdock on July 17, 2013 in Image Processing and Analysis

This post is the sixth in a series about the Ancestry.com Image Processing Pipeline (IPP). The IPP is the part of the content pipeline that is responsible for digitizing and processing the millions of images we publish to our site. The core functionality of the IPP is illustrated in the following diagram. In this post I continue with the material from… Read more

Throttling Image Processing

Posted by Tyler Jensen on June 21, 2013 in Distributed Computing, Image Processing and Analysis

Ancestry.com, like any other site with millions of subscribers, experiences predictable load patterns throughout the day. To maximize site performance and customer satisfaction, we make every effort to schedule maintenance during off-peak intervals. Content processing, especially our repository of hundreds of millions of images, on the other hand, is a constant ongoing effort, and in… Read more

Image Processing at Ancestry.com – Part 5: Auto-Normalization

Posted by Michael Murdock on May 28, 2013 in Image Processing and Analysis

This post is the fifth in a series about the Ancestry.com Image Processing Pipeline (IPP). The IPP is the part of the content pipeline that is responsible for digitizing and processing the millions of images we publish to our site. In this post we finally get to the good part – the part of the pipeline in which we process the… Read more

Image Processing at Ancestry.com – Part 4: Microfilm Scanning

Posted by Michael Murdock on May 13, 2013 in Image Processing and Analysis

This post is the fourth in a series about the Ancestry.com Image Processing Pipeline (IPP). The IPP is the part of the content pipeline that is responsible for digitizing and processing the millions of images we publish to our site. In this post I will present a bit of information about our microfilm scanning process.… Read more

Image Processing at Ancestry.com – Part 3: Where Do Images Come From?

Posted by Michael Murdock on May 3, 2013 in Image Processing and Analysis

This post is the third in a series about the Ancestry.com Image Processing Pipeline (IPP). The IPP is the part of the content pipeline that is responsible for digitizing and processing the millions of images we publish to our site. In part 1 of this series, The Good, the Bad, and the Ugly, I gave… Read more

Distributed Parallel Computing at Ancestry.com

Posted by Tyler Jensen on April 24, 2013 in Distributed Computing, Image Processing and Analysis

About 450 years ago John Heywood wrote, “many hands make light work.” The same can be said of image and data processing. Distributed parallel computing (DPC) makes it possible for us to do the work described by Michael Murdock in his series on the image processing pipeline. If you haven’t already, take a moment to… Read more

Image Processing at Ancestry.com – Part 2: Living in the Mesosphere

Posted by Michael Murdock on April 20, 2013 in Image Processing and Analysis

Last week I began this series of blog posts about the Ancestry.com Image Processing Pipeline (IPP) by briefly describing how the IPP is the part of the Ancestry.com Content Pipeline that is responsible for digitizing and processing the millions of images we publish to our site. With this blog post I would like to… Read more

Image Processing at Ancestry.com – Part 1: The Good, The Bad, and the Ugly

Posted by Michael Murdock on April 10, 2013 in Image Processing and Analysis

Images of original historical records play a key role in the way Ancestry.com presents family history information to the user. An image of a historical record is much more than evidentiary support for some family history assertion. An image can become the anchor for an engaging and compelling historical narrative. A properly captured and… Read more