The end goal of an indexing project such as ours is to make records findable by those who are researching their family history. But the reality of the records we deal with – spellings, misspellings, abbreviations, terrible handwriting, faint and blurry images, etc. – makes them challenging to key. And the inconsistency – record types, time periods, form types, languages, customs, and on and on – makes it a monumental task for us to provide consistent instructions across all projects when so many things seem to be an exception to whatever rules we try to establish. But we soldier on – always trying to make the best decisions we can with the information available at any given moment. And we appreciate that you do as well.
With that said, we (the content managers, the Indexing team, the AWAP team, the post-production team, the search engineers, and everyone else in the company who touches any database that ultimately will go live on Ancestry.com) are constantly working to systematize, normalize and otherwise make consistent those rules by which we produce content. You have come to play a large part in that endeavor as it relates to AWAP projects. You constantly “keep us on our toes” by questioning and pushing back when we ask you to key things in certain ways or when we change the rules on you. We appreciate that as well.
The Key As Seen blog post generated quite a few comments, some personal emails and some message board posts. A lot of really great issues were raised, and I will do my best to address them in the coming weeks. Right now I just want to say one more thing about key as seen, and it stems from a quote I love that is often attributed to Socrates:
“Tell me sufficiently why a thing should be done and I will move heaven and earth to do it.”
So this blog post, rather than telling you WHAT to do, will, I hope, give you a glimpse of my view of WHY we do it.
There are two main reasons we employ the key as seen rule and have come to rely on it so heavily. The first is, of course, the integrity of the record. We want to create an index that remains true to the information as it was originally recorded.
Additionally, we often don’t have enough context to “correct” what we see. How do you know that Jno is short for John? In my family I have parents who named one child John and one child Jonathan (and both were still living). In another family there was a Cathleen and a Catherine; which one gets to be Cath on a record? How do you know that Arh is Arthur and not Archibald? Are you sure that Jos is Joseph, not Joshua? Is Em for Emily or Emma? Is Mart for Margaret or Martha or maybe even Marguerite?
If we key what we see, we can leave it up to the “magic of search” (we’ll talk more about that in another post) to find the variations, so that the descendants who are searching for these individuals can find them. Those descendants will likely have better context (and more records to compare) to determine the true name of each individual.
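To make that ambiguity concrete, here is a minimal sketch in Python. It is only an illustration, not how our search actually works: the `CANDIDATES` table and the `search_variants` function are hypothetical, and the name lists are simply the examples from above.

```python
# Hypothetical lookup table (an assumption for illustration only):
# each abbreviation, as keyed, maps to every full name it might stand for.
CANDIDATES = {
    "jno": ["John", "Jonathan"],
    "cath": ["Cathleen", "Catherine"],
    "arh": ["Arthur", "Archibald"],
    "jos": ["Joseph", "Joshua"],
    "em": ["Emily", "Emma"],
    "mart": ["Margaret", "Martha", "Marguerite"],
}


def search_variants(keyed_name: str) -> list[str]:
    """Return the keyed value plus every name it might stand for.

    The keyed value always comes first and is never replaced, so the
    index stays true to the record as it was written.
    """
    lookup = keyed_name.strip().rstrip(".").lower()
    return [keyed_name] + CANDIDATES.get(lookup, [])


print(search_variants("Jno"))      # ['Jno', 'John', 'Jonathan']
print(search_variants("William"))  # ['William']
```

The point of the sketch is that the keyer never has to choose: the record keeps “Jno”, and a searcher looking for either John or Jonathan can still be shown it.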
Then there are the place names. For one thing, standardized spelling is a relatively new phenomenon – and even today it hasn’t been standardized across languages, or even across different cultures that share a language. For another, places might not always be where we think they are. Here in the U.S. the abbreviation IA didn’t always stand for Iowa; it used to stand for Indiana.
On top of that, if we only allowed people with geographical and historical knowledge of a place at a specific time to key those records, not a whole lot would get done. So we key as seen to allow more people to participate in making more records accessible. We can then “interpret” that abbreviation or “mis” spelling in post-production and/or search, so that the record surfaces correctly.
The second reason we rely on key as seen so heavily is really just a matter of consistency. If we were only keying each record once and then reviewing your contribution, it really wouldn’t matter all that much whether you keyed the month out in full or not. It likely wouldn’t matter too much whether you put a 0 in front of the single digit days or put a space between Mc and Donald in the surname. No one would care too much that you spelled out Lieutenant or corrected a record-keeper’s misspelling. Between post-production and the magic of search on our website, all of those things would be normalized and accounted for, so the right records would surface to the right people when they need them.
Right now, we double-key records and then send them along to arbitration. That means we strive for some level of consistency in how records are keyed so that our computers can match as many fields as possible. Then our faithful arbitrators can spend their time comparing handwriting interpretations and fixing the occasional typo instead of choosing one keyer over another due to a standardization issue.
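Here is a rough sketch, in Python, of why that matters. It is an assumption about the general idea only, not our actual matching software: the function `fields_for_arbitration`, the field names and the sample values are all made up for illustration.

```python
def fields_for_arbitration(key_a: dict[str, str], key_b: dict[str, str]) -> list[str]:
    """Return the names of the fields where two independent keyings disagree."""
    all_fields = sorted(set(key_a) | set(key_b))
    return [f for f in all_fields if key_a.get(f, "") != key_b.get(f, "")]


# Two keyers who both key exactly what is on the image.
as_seen_a = {"given": "Jno", "surname": "Mc Donald", "day": "3", "month": "Jan"}
as_seen_b = {"given": "Jno", "surname": "Mc Donald", "day": "3", "month": "Jan"}

# A second keyer who reads the image the same way but "tidies up" every field.
tidied_b = {"given": "John", "surname": "McDonald", "day": "03", "month": "January"}

print(fields_for_arbitration(as_seen_a, as_seen_b))  # []
print(fields_for_arbitration(as_seen_a, tidied_b))   # ['day', 'given', 'month', 'surname']
```

In the second comparison both keyers read the handwriting identically, yet every field lands in front of an arbitrator. None of those four mismatches is a real disagreement about what the record says.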
I really hope this all makes sense. I stay awake nights thinking about these things and how to best communicate them to you. I spend hours keying and arbitrating so I have a better understanding of the records and the various issues you see every day. I am (in this season of gratitude, in particular) very grateful for those of you who understand just how important it is that we do this, and that we do this right. Your time and effort, your care and concern are noticed and appreciated, certainly by this seasoned genealogist.
For those of you who are interested, I’ll keep posting and I’ll keep “listening” to your comments as they relate to quality issues of all shapes and sizes. As I see it, we still need to talk about the normalization that occurs in post-production, the magic of search, and keyer feedback. If you see anything else you’d like to discuss, just leave a comment and I’ll see what I can do.
Until next time – Happy Keying!