Key As Seen: Revisited


The end goal of an indexing project such as ours is to make records findable by those who are researching their family history. But the reality of the records we deal with – spellings, misspellings, abbreviations, terrible handwriting, faint and blurry images, etc. – makes them challenging to key. And the inconsistency – record types, time periods, form types, languages, customs, and on and on – makes it a monumental task for us to provide consistent instructions across all projects when so many things seem to be an exception to whatever rules we try to establish. But we soldier on, always trying to make the best decisions we can with the information available at any given moment. And we appreciate that you do as well.

With that said, we (the content managers, the Indexing team, the AWAP team, the post-production team, the search engineers, and everyone else in the company who touches any database that ultimately will go live on Ancestry.com) are constantly working to systematize, normalize and otherwise make consistent those rules by which we produce content.  You have come to play a large part in that endeavor as it relates to AWAP projects.  You constantly “keep us on our toes” by questioning and pushing back when we ask you to key things in certain ways or when we change the rules on you.  We appreciate that as well.

The Key As Seen blog post generated quite a few comments, some personal emails and some message board posts. There were a lot of really great issues raised, and I will do my best to address them in the coming weeks. Right now I just want to say one more thing about key as seen, and it stems from a quote I love that is often attributed to Socrates:

“Tell me sufficiently why a thing should be done and I will move heaven and earth to do it.”

So this blog post, rather than telling you WHAT to do, will, I hope, give you a glimpse of my view into WHY we do it.

Record Integrity

There are two main reasons we employ the key as seen rule and have come to rely on it so heavily.  The first is, of course, the integrity of the record.  We want to create an index that remains true to the information as it was originally recorded.

Additionally, we often don’t have enough context to “correct” what we see. How do you know that Jno is short for John? In my family I have parents who named one child John and one child Jonathan (and both were still living). In another family there was a Cathleen and a Catherine. Which one gets to be Cath on a record? How do you know that Arh is Arthur and not Archibald? Are you sure that Jos is Joseph, not Joshua? Is Em for Emily or Emma? Is Mart for Margaret or Martha or maybe even Marguerite?

If we key what we see and leave it up to the “magic of search” (we’ll talk more about that in another post) to find the variations, the descendants who are searching for these individuals can find them. They will likely have better context (and more records to compare) to determine the true name of that individual.

Then there are the place names. For one thing, standardized spelling is a relatively new phenomenon, and even today it hasn’t been standardized across languages or even across different cultures with a common language. For another, places might not always be where we think they are. Here in the U.S. the abbreviation IA didn’t always stand for Iowa; it used to stand for Indiana.

On top of that, if we only allow people with a geographical and historical knowledge of that place in a specific time to key those records, not a whole lot would get done.  So we key as seen to allow more people to participate in making more records accessible.  We can then “interpret” that abbreviation or “mis” spelling in post-production and/or search, so that the record surfaces correctly.
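To make the idea a little more concrete, here is a rough sketch in Python. It does not describe the actual post-production or search systems; the variant table, the function names and the record layout are all hypothetical and exist only for illustration. The point it shows is that the keyed value is stored exactly as seen, while the possible interpretations are attached alongside it rather than replacing it.

```python
# Hypothetical variant table (illustration only): abbreviations as they might
# appear on a record, mapped to every full name they could stand for.
VARIANTS = {
    "jno": ["John", "Jonathan"],
    "jos": ["Joseph", "Joshua"],
    "cath": ["Catherine", "Cathleen"],
    "em": ["Emily", "Emma"],
}


def index_entry(keyed_value):
    """Keep the keyed value untouched and attach possible expansions."""
    candidates = VARIANTS.get(keyed_value.strip().lower(), [])
    return {"as_keyed": keyed_value, "also_matches": candidates}


def matches(entry, query):
    """True if the query equals the keyed value or any candidate expansion."""
    q = query.strip().lower()
    names = [entry["as_keyed"]] + entry["also_matches"]
    return any(q == name.lower() for name in names)


entry = index_entry("Jno")
print(entry["as_keyed"])            # the record still says Jno
print(matches(entry, "John"))       # a search for John can find it
print(matches(entry, "Jonathan"))   # and so can a search for Jonathan
```

Because nothing was "corrected" at keying time, a descendant looking for either John or Jonathan can still reach the record and decide, with their own context, which one it really is.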

Consistency

The second reason we rely on key as seen so heavily is really just a matter of consistency. If we were only keying each record once and then reviewing your contribution, it really wouldn’t matter all that much whether you keyed the month out in full or not. It likely wouldn’t matter too much whether you put a 0 in front of the single digit days or put a space between Mc and Donald in the surname. No one would care too much that you spelled out Lieutenant or corrected the spelling of the record-keeper who spelled it wrong. Between post-production and the magic of search on our website, all of those things would be normalized and accounted for so the right records would surface to the right people when they need them.

Right now, we double-key records and then send them along to arbitration. That means we strive to have some level of consistency in how records are keyed so that our computers can match the greatest number of fields possible. Then our faithful arbitrators can spend their time comparing handwriting interpretations and fixing the occasional typo instead of choosing one keyer over another due to a standardization issue.
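As a minimal sketch of why this matters, the Python below compares two keyings of the same record field by field. It assumes an exact, character-for-character comparison and uses made-up field names and values; the real matching rules are not described in this post, so treat every name here as hypothetical.

```python
def compare_keyings(keyer_a, keyer_b):
    """Split fields into those both keyers agree on and those needing arbitration."""
    agreed, disputed = {}, {}
    for field in sorted(set(keyer_a) | set(keyer_b)):
        a = keyer_a.get(field, "")
        b = keyer_b.get(field, "")
        if a == b:
            agreed[field] = a          # identical: no arbitrator time needed
        else:
            disputed[field] = (a, b)   # different: goes to arbitration
    return agreed, disputed


# Keyer A keyed as seen; Keyer B expanded the name and spaced the surname.
keyer_a = {"given": "Jno", "surname": "McDonald", "month": "Jan"}
keyer_b = {"given": "John", "surname": "Mc Donald", "month": "Jan"}

agreed, disputed = compare_keyings(keyer_a, keyer_b)
print(agreed)
print(disputed)
```

In this example only the month matches. The other two fields land in front of an arbitrator even though both keyers read the handwriting the same way, which is exactly the kind of standardization dispute a shared key as seen rule is meant to avoid.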

 

I really hope this all makes sense.  I stay awake nights thinking about these things and how to best communicate them to you.  I spend hours keying and arbitrating so I have a better understanding of the records and the various issues you see every day.  I am (in this season of gratitude, in particular) very grateful for those of you who understand just how important it is that we do this, and that we do this right.  Your time and effort, your care and concern are noticed and appreciated, certainly by this seasoned genealogist.

For those of you who are interested, I’ll keep posting and I’ll keep “listening” to your comments as they relate to quality issues of all shapes and sizes. As I see it, we still need to talk about the normalization that occurs in post-production, the magic of search, and keyer feedback. If you see anything else you’d like to discuss, just leave a comment and I’ll see what I can do.

Until next time – Happy Keying!


Reader Comments

As always, thank you Crista! I am on a mission to see completion to the image project I began working last summer, for personal reasons (genealogical and personal in nature). It is a challenge, just as it was when piecing my family genealogy beginning three years ago. I suspect the issues you have written have been discussed before; it can become frustrating to continue keying when at times I do not know what I am keying incorrectly, and perhaps need to review the process for completing an image correctly more often. It can be just as easy not to review the correct keying rules here, as I have been a keyer for other websites that may have different rules. Plus, mother of a part pit-bull, and stubborn at times, well, what can I say–someday someone will be keying me off an image :) What is most frustrating to accept is reminding myself that I am the keyer on this project (maybe later an arbitrator), not the genealogist. I respect the petitioner’s written name in the end when necessary, and remember that it is beyond my control once I submit a completed image. I know in my heart that I’ve done my best to help someone in another time to construct his family’s genealogy, and thank you for the challenge of working this archive project on multiple levels. Sue

Thank you for your explanation; especially about the post-production and how it may normalize some of the interpretation of the handwriting.

I think this was one of the best posts I’ve seen on this blog. I think this is what most contributors are looking for: definitions, explanations and a little guidance to remind them why they do it.
Thank You!

Crista, Thank you for understanding why most of us keyers and arbitrators want to get it right! Additionally, for acknowledging that it is A LOT of fun.

Thanks Crista for this further revisiting of this hot topic.

I fully understand the reasons why “key as seen” is adopted and I can also fully understand the need for standardisation, i.e. don’t add a 0 to e.g. 1 Jan to make it 01 Jan, etc.

A number of difficulties have been expertly discussed and I have seen all of them: blurred images, faded images, bad (very bad) handwriting, torn pages, etc. Unfortunately, we have to live with these.

Apart from these, the main problem for both keyers and arbiters is that the “key as seen” rule is made quite prominent. However, what causes more difficulty is that the rule is actually implemented as “key as seen, BUT …”. The “buts” are less prominent and it tends to be the BUTs which generate most heat (expand months Jan to January, don’t key periods after abbreviations, etc, etc.).

As we work through the data sets, Arbiters tend to see more records and wish that keyers would follow the rules, but have no direct way of communicating with the keyer. I also key, often multiple data sets; and my accuracy sometimes does go down. The problem then, is which data set am I getting wrong and what am I doing wrong on that data set?

I think that it needs to be made easier for both the keyers and the arbitrators to know which “BUTs” apply to any given data. The wiki is a good step forward for this: provided that it is used.

Thank you Crista, this helps more than you know.

Have a wonderful day,
Ann

Thank you for the explanation. The thing that bugs me the most about “key as seen” is in the state area. I have keyed in only the state, which means the red flag does not appear, and I’ve “keyed as seen” which means a searcher may have info previously unknown–but there’s that red flag! I’ve done it both ways, and am beginning to lean toward the “red flag solution”, but am still not sure what is correct.

“Key as seen”. Pretty simple when the handwriting is clear or typewritten and there is simply no question as to what the characters are. However, in the cases of horrific handwriting or poor quality images, you could have a whole group of people look at a record and you might get a whole bunch of different answers as to what it says.

This is my biggest pet peeve. I might be keying what I see, but Keyer B is also keying what THEY see. To attach an accuracy rating to indexed records that are really up to interpretation is wrong – and honestly, the arbitrator might look at what both keyers entered and say “Nah, it’s not that, OR that, it’s THIS” and type in their own interpretation of the word. Who’s to say the arbitrator isn’t wrong too?

It’s a very muddy field we’re tromping through when it comes to “Key as Seen”.

I loaded the World Archives nearly a year ago, but due to personal and health problems, I’ve only started to key in the last few days. I have followed the instructions for keying and for improving stats and have thrown in several years of experience with old records for good measure. I am horrified with myself for the low score I have achieved. Is there any way I can get an error report to see where I am going wrong and stop it before you kick me out or I quit in frustration? I would really appreciate the help.

I don’t have a problem with “key as seen” as others seem to. I do know it is frustrating to key “Switzerlan” when I know it is “Switzerland” and that sometimes I’m supposed to use the correct country using the pull down menus. However, I know when I am searching for my own ancestors and those of others that I help, that I often get all the spellings and mis-spellings of these names and places. I don’t care about my “score” so much, as long as it is fairly high so that I know I am basically getting things correct.

I enjoy doing this to help myself and others be able to have access to this valuable information that we would otherwise perhaps not even know existed.

Kudos to all of the teams at Ancestry that deal with these databases! Kudos to the keyers and arbitrators who persevere!

A tip to new keyers…Study all the help in the learning center regarding old handwriting…that has helped me improve my scores. And even now I learn more about it as I read the blogs.

Happy Holidays to you all!

Thank you for the reasonable update. As for “key as seen”, it makes sense. Yes, keying Oklohomo was a bit hard. (Buffalo Soldiers) It is also part of the story. My personal problem is when I am arbitrating and find mistakes that a little research could have corrected. While there are many reasons to key the records, the history lesson received is worth any of the time spent squinting at fuzzy tiny writing. I too am a big fan of the feedback to the keyers. Thanks for all the hard work!

Amen to Patti’s comments!
I’ve been working on the NM census images and the handwriting is generally terrible. Often I “see” half a dozen possibilities for what a name might be, and have a hard time deciding whether to attempt to “key as seen” or just enter “illegible.” Any choice I make is likely to impact my “accuracy” rating!

Thank you for these articles. They do help a lot.
I don’t know whether the Welsh census records were done in the same way, by double keying then arbitration, but I doubt it. There are awful, glaring errors all over the place, making searches virtually impossible to be accurate. They were certainly not done as key what you see, or those errors would not occur. The Welsh records really need to be rechecked.

In a perfect world, WAP would just forget about accuracy ratings for handwritten records, but I realize that logistically speaking, that would be a nightmare. It’s all or none, I suppose. I guess we just need to take into consideration that when Keyer A keys what THEY see and Keyer B keys what THEY see and the arbitrator decides what THEY see – the arbitrator always wins, whether or not what they’re seeing is actually what the record says.

I do arbitrate, so I know this scenario comes into play frequently. Sometimes I am seeing something completely different from the entries each keyer made. Does that mean I’m right? Nope. It just means that the handwriting was atrocious and we all “saw” different things. We could all be wrong – which makes rating accuracy in such situations kind of silly. Just my two cents.

To back pedal a bit, though, I do enjoy keying, and will continue to do so, even if my ratings are dropping. I’ll just tell myself that I’m not a terrible keyer, it’s just that different eyes see different things.

I find if I can’t read something it sometimes helps to go away from it for a couple of minutes and then when you come back what was as clear as mud suddenly becomes as clear as day.

Crista –

An ounce of explanation and reasoning goes a long way toward that pound of cure we are all looking for. Thanks for sharing your viewpoint.

Key as seen is certainly the fundamental principal, and I think we all understand and respect that. My problem00 and that of many other keyers, I suspect, is when to apply the second principle — use common sense. If the records are Pennsylvania church registers, to make up an example, and the record says residence is “Phila, Penna” common sense suggests that that is Philadelphia Pennsylvanis. That, however, is still a guess, and “key as seen” should be the rule that’s applied.

On the other hand, if the typesetter in a Cleveland Ohio City Directory has screwed up and set a town name as Kirtland Oiho, common sense dictates that keying an obvious typo as seen is just plain wrong.

So what do we, as intelligent human beings do? I, for one, try to Key as seen tempered by the directions for the project I am working on, and a small dose of common sense.

The key for WAP is making those directions EXPLICIT–consistent would be nice, but explicit is essential. For example– the following phrase appears frequently in Location Field helps. “Use the provided dictionary to assist you, but otherwise, key as seen.” I would interpret this to mean “If you can almost read the name of the town, look in the pull-down listing to help figure out the possibilities. If that doesn’t work, key what you see. If, on the other hand, the town name is written
(or printed) clearly but simply doesn’t appear in the pull-down just key what you see–don’t try to make it fit the choices we gave you.” The problem is, that’s an interpretation–a reasonable and conservative one, but an interpretation nonetheless.

As a writer an editor for a living
(I work for the government) I have discovered that all the rules of spelling an capitalization and what not that I learned and thought we “standard” are anything but. What’s more, what my young analysts were taught and what I was taught are sometimes radically different. Consequently, I find myself spelling things out in the most explicit terms possible even at the risk of sounding like an idiot myself.

In short, I think the best thing WAP can do for keyers and arbitratore alike is to go back over the Field helps–not to make new decisions about how to key things, or even to clarify exceptions–but to make the directions as explicit as possible. If you want us to spell the names of states properly when we can be 100% sure of the state in question, say so, if you want us to key “Oklahomo” even if Oklahoma is the only possible choice, say so. Trust me, we wont take offense :-)

Arrgh –

Apologies for the many typographical problems in the previous comment. My editorial head is hanging in shame. The only excuse is that this page comes up in a tiny font on my laptop and my aging eyes are, well, aging.:-(

Thanks Crista.

The additional information is always welcome. I can echo many of the comments that have been made. My issues center around getting different information from different sources within the organization itself. For example, one source indicates to type only state/country for birthplace while a second source indicates type exactly as seen including city or county.

The purpose is to be as accurate as is possible. That’s hard when one gets mixed signals.

There are several good, helpful points in this blog, and several good comments. I would add to something Virginia said:
“If you can almost read the name of the town, look in the pull-down listing to help figure out the possibilities. If that doesn’t work, key what you see.”
With several projects, my favorite is a good example (Buffalo Soldiers), a little research can do wonders when trying to find that “almost readable” place name. Our project doesn’t have a places drop-down chart, so we’re on our own with it. Therefore, the research is a huge help. If we know what it is “supposed” to be, it makes it much easier to figure out what is actually written in the record (spelling errors and all).
I only have one other thing – I know we’ve been asking for (polite, concise, anonymous) feedback between keyers and arbitrators for a long time, and it hasn’t happened yet. We still really, really wish it would happen! I’m sure it would vastly improve the quality of work we do!
As the saying goes, Happy Keying!

Crista, thanks for your sincere efforts and your willingness to communicate. I suppose to some degree it is the mode of communication, Blog Statements, Forums and Boards which tend to muddy the water.
I also understand what you took the time to explain in making the standards for each project. However, I think in most cases the “rules” per project are generally not sufficiently thought out or updated so we end up with all this confusion and frustration.
Please re-read the last paragraph of Mary Ann’s previous post and seriously think about it. And, if you just can’t do that, perhaps you could address the issue in some way.
All said and done, like so many of you- regardless of all the above I key because I enjoy the history lessons and I want to help others find information that would have been locked away by geography or an actual lock and key…I try to keep my scores up but when they dip I expel a few expletives and move on..they come right back up…happy keying and Holidays to all…John N

I’m doing a page of the Invercargill NZ directory and almost every forename is abbreviated, by leaving out most vowels. Imagine trying to figure out what Cath stands for. It could be Cath, or Cathy, or Catherine, or Cathleen or any of a dozen spellings. Is Sid short for Sidney or is it just Sid. It doesn’t really matter, but I wondered why there were so many librarians in this city. I suspect they were labourers. Actually it is hard to type what is there, since the tendency is to spell out Frank rather than Frnk (or is it Franklin?)

I am working on the Canadian pay lists and keying ranks is a nightmare. Key as seen would produce Private, Pte, Pvt. Sometimes one or all of these are in the drop down list and sometimes they are not. Thus deciding how to key in say Pvt when private and Pte are included on a drop down list is confusing. Also many of the ranks are not included such as col. Sgt, Pioneer, Pion Sgt, B’ndsman although bandsman is included. Maybe a useful exercise would be to set out a schedule of all ranks and their shortened forms.

I am keying NSW Convict Indents. I have come across header? information on ship arrivals. I have details of two (2) boats arriving in Sydney on two different days followed by a list of names and details. There is no indication of who is on which boat. How do I record the arrival data and the convict details

If we are “keying as seen,” and the name is difficult to make out, can’t the Keyer type in as much of the name as s/he can see like this: Mou?thouse. The other Keyer would key their interpretation and then the Arbitrator would review and make their choice.
When I search for ancestors, I would rather see a partial name than an incorrect one.
The ability to give/receive feedback would help tremendously. I’ve just started keying and the more I do, the more I understand, especially since I am keying the same type of records (military paylists). Feedback from an Arbitrator after the first few image sets would shorten the learning curve and lessen future errors.

A common problem is that keyers do insert some entries on the NM census, such as HEAD when the line is blank. Most keyers type out Male and Female when the instructions say to use M or F. I am a keyer and arbitrator and am being penalized as a keyer when I use M or F. Also, key as seen is so important. We, as keyers, should not alter a historical record.

re: Key as seen
Well, I typed my first set yesterday and must admit I was unprepared for how difficult it would be. (How the Heck did I get myself into this? I’ll never, I mean NEVER do this again! Oh God will it ever end???…Yes it did, but it took about 8 hours) And now, after reading all these posts, I may just set a new all time high on error rates!
I keyed NY but New York ‘popped up’ and I used it.. spell it out kept popping up so I did… Jos became Joseph… ugh.. I am so ashamed…
So from a newbie perspective (and a professional project manager with a teaching background) could I make a few observations?
1-Create teaching videos that people can watch on YouTube or Ancestry.com (Getting started, Top 10 Errors/Mistakes, Arbitration and Ratings…) Nothing fancy needed.

2- Create an on-line test we can take and be scored on. We all need to see our errors with the correct responses in order to improve. If this is not possible, take a few original archives and put the final typed version beside it for people to review.

3-Type as seen or spell it out? Pick One.

4-There really needs to be a way to highlight a field that you are guessing on. I would rather guess than mark it illegible, but I WANT someone to really look at it. Maybe the original document is a bit cleaner?

It’s a new day, and I have had about 6 cups of coffee. Maybe I’ll try this again…next week. :)