Posted on 25 November 2010 in General, Record Collections

Authored by Jack Reese from Ancestry

In 2005 we finished scanning the UK Census collection from the microfilm reels that were available at the time and discovered that the writing on many of the images was very faint and, in many cases, unreadable. A certified genealogist who was then head of Ancestry’s Indexing department noted that as many as 730,000 names from 1841 and 1851 had been categorized as missing, damaged, or of poor quality. While some of the original pages were known to be unavailable, it was believed that hundreds of thousands of these missing names could be recovered by imaging directly from the original documents rather than from microfilm. Knowing that these damaged records would only continue to deteriorate, we were determined to rescue and preserve images of these documents before the pages were lost completely.

In September 2005, we went to The National Archives in Kew to inspect the damaged 1841 and 1851 census pages first hand. We found that by using powerful, high-resolution digital cameras with low-light photography techniques, across a range of specifications, we were able to reveal hints of the faded writing on the pages (Figure 1).


However, it quickly became apparent that the project was going to be more difficult than we had anticipated. The image processing technique we were using to make the script readable again was time-consuming for each image, and the “enhanced” writing was still so faint that reading or transcribing it would have been too difficult and too error-prone.
 
Fortunately, we had already set in motion plans to deal with more difficult content that might require analysis of the effects of imaging at different wavelengths – also known as ‘spectral analysis’. After a couple of days of limited success using the traditional image processing techniques, we proceeded with our spectral analysis. For those of you who may be unfamiliar with HDR (high dynamic range) imaging, it is the process of taking overexposed and underexposed images and combining them into a single image that reveals more detail throughout – especially in the dark and bright regions.
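To make the HDR idea described above concrete, here is a minimal sketch in Python. It is not the processing Ancestry used. It is a simplified exposure blend that assumes grayscale images stored as rows of values between 0 and 1, and it favours whichever exposure is closest to mid-gray at each pixel. The function names, the `sigma` parameter, and the sample values are all made up for illustration.

```python
import math

def well_exposedness(value, sigma=0.2):
    """Weight a pixel by how close it is to mid-gray (0.5) on a 0..1 scale.

    The Gaussian shape and sigma=0.2 are illustrative assumptions.
    """
    return math.exp(-((value - 0.5) ** 2) / (2 * sigma ** 2))

def fuse_exposures(exposures, sigma=0.2):
    """Blend same-sized grayscale images (lists of rows of floats in 0..1).

    Each output pixel is a weighted average of the input pixels, so detail
    from the better-exposed image dominates at every position.
    """
    height = len(exposures[0])
    width = len(exposures[0][0])
    fused = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [img[y][x] for img in exposures]
            weights = [well_exposedness(v, sigma) for v in values]
            total = sum(weights)  # always > 0 because exp() is positive
            row.append(sum(w * v for w, v in zip(weights, values)) / total)
        fused.append(row)
    return fused

# Hypothetical one-row images: one underexposed, one overexposed.
under = [[0.02, 0.10, 0.45]]
over = [[0.40, 0.90, 1.00]]
fused = fuse_exposures([under, over])
```

In the dark first pixel the overexposed image carries most of the weight, and in the bright last pixel the underexposed image does, which is the "more detail in dark and bright regions" effect in miniature.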

Using the best forensic document analysis equipment available, we analyzed foreground (ink) and background (paper) regions of numerous sample pages. Within the first few minutes of analysis the writing began to emerge (Figures 2-3).


Encouraged by the initial results, we proceeded to complete the spectral analysis, testing hundreds of combinations of light sources and light source filtering, combined with filtering of the wavelengths reflected from, and sometimes fluoresced by, the document. The term ‘fluorescing’ refers to a material emitting light after absorbing electromagnetic radiation, usually at a longer wavelength than the radiation it absorbed.

With data captured from dozens of sample documents from the collection, we identified the most effective configurations for revealing the information on the pages. Given that the census enumerators used a variety of writing instruments, a variety of ultraviolet (UV) and infrared (IR) imaging techniques proved effective (Figure 4).
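One way to picture how lighting and filter configurations might be compared is to score each one by how well the ink stands out from the paper. The sketch below is only an illustration under stated assumptions, not the method or equipment described in this post. It assumes that intensity readings (0 to 1) have been sampled from ink and paper regions under each configuration, and it uses a simple Michelson-style contrast score. The configuration names and readings are invented.

```python
from statistics import mean

def ink_paper_contrast(ink_pixels, paper_pixels):
    """Michelson-style contrast between mean ink and mean paper intensity."""
    ink = mean(ink_pixels)
    paper = mean(paper_pixels)
    if ink + paper == 0:
        return 0.0  # both regions completely dark: no measurable contrast
    return abs(paper - ink) / (paper + ink)

def rank_configurations(samples):
    """Return configuration names sorted from highest to lowest contrast.

    `samples` maps a configuration name to (ink_pixels, paper_pixels).
    """
    scores = {
        name: ink_paper_contrast(ink, paper)
        for name, (ink, paper) in samples.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Made-up readings for three hypothetical configurations.
samples = {
    "visible light, no filter": ([0.58, 0.60, 0.62], [0.63, 0.65, 0.64]),
    "UV source, visible-pass filter": ([0.20, 0.25, 0.22], [0.70, 0.72, 0.68]),
    "IR source, IR-pass filter": ([0.40, 0.42, 0.41], [0.66, 0.64, 0.65]),
}
ranking = rank_configurations(samples)
```

With these invented numbers the UV configuration ranks first and plain visible light last. Because different inks respond differently, a real comparison would need to repeat the ranking per document or per writing instrument.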


After completing our analysis, it was clear that no commercially available system was going to meet the resolution, spectral sensitivity, lighting, filtering, size, and speed requirements necessary to enable us to digitize these damaged documents. Determined to capture and preserve what information remained on these documents, we set out to build a bespoke camera – our very own DaRC (Document Restoration Camera) system (Figure 5), specially designed to safely and efficiently capture images revealing previously hidden information on these damaged documents.

 

To read more about the restoration work carried out on these records and to start searching them, go to www.ancestry.co.uk/Manchester.

Jack Reese is a Digital Imaging expert in Preservation and spearheaded the restoration of the damaged 1851 Census records.

6 Comments

Mac McCree 

Wow, a fascinating read is a bit of an understatement to say the least, congratulations to the whole team who have made this possible, I can see why Jack is working with Ancestry, one look at his impressive CV would be enough in my estimation for him to succeed at anything he decides to do, I now know who to call when I have a problem with my box brownie.

Look at this and see why Jack is where he is!

http://www.linkedin.com/in/jackreese

best wishes,
Mac

25 November 2010 at 7:19 pm
Tweets that mention How we restored more than 16,500 water-damaged records from the 1851 Census -- Topsy.com 

[...] This post was mentioned on Twitter by Old PostcardsEtc and Photos Reunited, Ancestry.co.uk. Ancestry.co.uk said: How we restored more than 16,500 water-damaged records from the 1851 Census: Authored by Jack Reese from Ancestr… http://bit.ly/dKlrR3 [...]

25 November 2010 at 8:08 pm
bromaelor 

Amazing! They take all that time and effort to restore these census pages and then Ancestry produce another shoddy piece of transcription!

26 November 2010 at 4:00 pm
JUY 

Considering that these documents and the information thereon were effectively lost, not only to the UK as a whole, but the whole world and ancestry have found a way to give us back the data that was lost I think the above statement is unfair. yes the transcriptions might not be the best but it is a GREAT IMPROVEMENT to what we had before from these DAMAGED UNREADABLE DOCUMENTS and that was a big fat ZERO.

praise should be given to their tenacity and persistence to give back this slice of history.

well done ancestry

4 December 2010 at 3:43 am
jaykay 

Not true to say that what was available before was a “big fat ZERO” – A transcription of these records has been available through Manchester & Lancashire Family History Society for some years. It was published first on CD, then through the FamilyHistoryOnline website (now defunct) of the Federation of Family History Societies, and is currently available on one of the opposition websites. The work was undertaken by volunteers from that society, firstly at the Public Record Office (PRO) in Chancery Lane, London, and later at the PRO now The National Archives at Kew also in London.

Furthermore whilst the technology currently available, and the digital images produced by Ancestry, is much more sophisticated than was available to MLfhs, ultraviolet light was certainly used for much of the work which could not be read using natural or electric light.

It is also rather disingenuous to suggest that Ancestry have somehow completed the task: not quite true as much of what they have now produced and indexed was already transcribed – apparently with more accuracy. In any case some of this census will always be “missing” as the original documents are in such a dreadful condition.

I must agree that the availability of digital images is a definite advantage, and much to be praised – any faults of indexers can now be re-interpreted by users. Which is all to the good. This kind of technology is far too expensive for the average family history society, and any service provider, such as Ancestry, who can fund such projects, is also deserving of praise. Pity those providers who don’t use their subscriptions to provide digital images of originals, despite charging high amounts for access.

Unfortunately, even with these images, I still cannot find great grandfather in this census – another lost soul!

JK

5 December 2010 at 3:10 am
Chagoi 

The title of this piece is “How we restored more than 16,500 water-damaged records …”. On Ancestry’s ‘homepage’ for the 1851 Census under the heading “Known problems with the 1851 Census,” it is claimed they salvaged 165,000 names. Does one so-called record equal ten names, or is there a simple typo somewhere?

Like #5 jaykay I can’t find my lot, although in my case I KNOW they’re there and I really wanted to see them for myself. Both the MLFHS and FindMyPast have them listed in their indexes, and I have a transcript showing the whole family’s full names, ages, occupations and birthplaces so the page must have been legible to some degree.

I note that the list of Salford Enumeration Districts jumps from District 1m to District 1u. Does that mean that not all of these water-damaged pages have been put on-line yet? Do the Districts 1n-1t exist?

Must say that it’s been fascinating looking at the enumeration district descriptions. Some of them have a tiny piece of a map stuck to the page showing exactly what streets were to be covered.

:-)

5 December 2010 at 12:12 pm