This is a test. This is only a test.


We occasionally see this warning on television directing us to a specific station to receive information in case of an emergency. Although this is not an emergency, we are conducting a test. Ancestry’s ultimate goal is to get more content onto the site to help you, others and future generations have greater success in finding that missing link.

This test is to determine if a slight change to our indexing process in the World Archives Project keying tool will get us closer to our goal and still maintain an acceptable quality standard.  We have taken a few rolls of previously keyed content and placed them into a new review format to test a Single Key with Review option.  We would like to invite those of you who are arbitrators to help us test out this process.

The two collections we have chosen are located in the “Arbitration Projects” tab under the project names:

TESTING – Jacksonville, Florida Area City Directions

TESTING – Pennsylvania, U.S. Naturalization Originals

For more complete instructions click here.

Once you have finished reviewing a few image sets, we would love to hear your thoughts about the new process. Please send feedback to worldarchivesfeedback@ancestry.com.

Thank you!

Reader Comments

The testing Jacksonville project has been “in processing” for some time now. The only keyer is “testtgn”, with 250,000 records keyed.

Only arbitrators Anna F and Christa C of WAP.

Also, it does not (as of the time of this note) appear in the arbitration tab for me.

The instructions of the Jacksonville project state it’s one of two using the A – review system.

But it’s not: it’s one of 5. The other 3 were released some time ago to a small group of test keyers. (One is now In Processing already)

My comments about this so far: Jacksonville is going to be SUPER frustrating to arbitrate because missing records will not be so easy to find in the match records phase. You will catch them only in the Review phase.

Because of a bug (you can call it intended behavior if you want; it’s still bad), any time you have to go back to the match records (or Review) portion, you must RE-arbitrate all records up to the point where you had to insert a record.

So what I say to A-review for more than a handful of records per image: forget about it. Or fix the tool so records can be added, deleted and moved without having to trash previous work.

If it sounds like I’m peeved at WAP right now, that would be a correct impression. For dumping the State data we keyed in the Illinois Nat index and replacing it with a default. And the wrong default.

More here: If your test data in Jacksonville is in fact keyings done early in the original project, there is another complication. Early on, we did not key spouses, and we did key ads. It’s going to be very painful to add all that missing data in review mode.

By the way, if this is only being released for arbitration, who gets to key all the rejects?

I totally disagree with a one keyer/one reviewer type system. It may get your data online quicker, but at what cost? So that the integrity of the data suffers? You can have an arbitrator who has not even keyed a project arbitrate it without knowing how it should be properly keyed.

I would like to give this a try, but where are they? I do not see that option.

I also disagree with the one keyer / one reviewer system. Admittedly, it could work well with projects that have both experienced keyers and experienced reviewers. However, it will work very badly with both inexperienced keyers and reviewers.

It ought to be fairly obvious what will happen in the worst cases: keying instructions disregarded (not intentionally, perhaps) and data entered in raw data sequence into keying fields regardless of their title; and wrong keying forms selected. If the one keyer makes a mistake or leaves out a record, under this new system the arbitrator becomes the keyer for that record and there is no arbitration – well, not until the end user finds the problem and reports it.

Arbitrators having to correct both keyers or rejecting data sets from both keyers does happen; and undoubtedly arbitrators sometimes get it wrong.

With an experienced / inexperienced pair, either the keyer will have good data rejected or the arbitrator will have excessive work.

I would also suggest that it will involve WAP in more work both before the keying starts and at the post-processing stages. More effort will be needed to remove ambiguity from keying instructions and more time will be needed afterwards cleaning up the data.

WAP don’t pay for keying and arbitration (since we do it free) so the only costs at this stage appear to be server costs. So I don’t see any savings at this stage; and if pre and post processing costs go up, what benefit is there?

Whoops, I think I might have got hold of the wrong end of the stick. Having looked at the instructions, it’s still a two keyer system.

HOWEVER, those test sets don’t appear in my arbitration list, so I can’t try them out.

I also feel I wasted all that time looking at many posts and discussions trying to figure out the correct way to key the residence information for the Illinois naturalization indexes, when in the end, it all got thrown out of the window.

I have just randomly checked the live indexes and Illinois also has a problem with the place of birth. All of the entries we keyed as Great Britain; Ireland or Ireland; Great Britain default to just Ireland, which is really misleading. If they wanted to pick one they should have used Great Britain. For a common name, if you know they didn’t come from Ireland, you wouldn’t bother checking out the original.

I also disagree with one keyer/one reviewer.

I wasn’t participating in the beginning of the project, but if I remember correctly, keyers were told to record only one country in case of multiple ones. But I’ve checked the index and it looks to me that the separation method did not matter at all in the end processing. All kinds are there, with or without spaces, using semicolons or hyphens, and keeping the & symbol apparently did not cause any problems either.

As often happens with testing, we have encountered a small glitch. You should be seeing those projects on the Arbitration tab shortly.

Thanks for your patience. And we look forward to even more feedback after you’ve had a chance to look at the process.

I’m not wild about this one keyer/one reviewer arbitration deal. For one thing, I can do this already, as there are a lot of keyers that don’t key all the fields.

2nd, I’m afraid I’ll miss mistakes. With 2 keyers it’s highlighted and accuracy is going to be higher.

This is like having a keyer and a ‘superkeyer’ and if I’m going to key, I’d just as soon type it all in. I don’t see doing much arbitrating if this is the way it will proceed.

One other thought: You’re speeding up indexing, but slowing down arbitrating. Given the likelihood of lower accuracy, is there all that much to be gained?

But then again, it’s early. Maybe it will grow on me (but I seriously doubt it)

One more quick thought: I’m arbitrating a set now where the 1st keyer has entered name and birthdate and nothing more.

This makes me the sole keyer for 50% of the fields on this set. At what point do we reject? Also, who is checking my accuracy?

Just sayin’ :)

I totally disagree with a one keyer/one reviewer type system. I have just been arbitrating Pennsylvania records and find many records where not all the fields are keyed in. In this case the arbitrator would have to key in the data again. WAP have just got to accept that some projects will take longer to complete than others. Might it be better to split, say, the Pennsylvania records into certain periods rather than trying the period 1795 to 1972 in one go? It was very difficult to do Pennsylvania records at first because of the unreadable handwriting, but it is better now that a lot of the records are typewritten.

The reviewer might as well do the keying too if all the records are as badly filled in as the ones I have just done!!!!

This system will take forever.

I’m sorry but I don’t think this is going to work. The records are taking too long to arbitrate.

We are looking into the error message that appears during the “Match Record” process. In the meantime, please focus your testing on the Pennsylvania Naturalization Originals, which do not seem to have the problem.

We appreciate your willingness to work through these issues with us as we conduct this test.

The reason the city directories are crashing is more than 100 records per image. It would happen on the other sets given the right number of records in an image. I’m disappointed that no one found out in the several weeks this project lived “In-house” only. It really was (and still is) stuck in the “in processing” section of the dashboard, just waiting for someone to notice; it has been there for weeks.

The reason the directories are “badly” keyed is that the test data was pulled from early on. When the rules were different. Mainly that spouses were not keyed (and ads were). In short you can expect the vast majority of these to be missing spouses. Which works out to about 30-40% of the records. These indices would have been acceptable, and preferred when originally submitted. But not now.

The silver lining: we have an opportunity to add back spouses in a section that was missing them. The tarnish on the silver lining is this is either going to mean a lot of work in Review mode, or a lot of rejects, and 100% rekeys.

When they fix the tool, it’s slightly less overall work to add the spouses. However, I feel that rejection is also perfectly acceptable for these.

I am further disappointed that this was either not tested with the update (1.1.0.69) or that, if it was, the release of said update was not postponed pending resolution of this quite serious issue.

Every time you push the new version, a small group has problems with it. And a flood of support calls. So why go through it twice in rapid succession?

I too have found a lot of keyed records where only the name has been entered for Pennsylvania Immigration records. Obviously some lazy person who is trying to get their number of keyed records higher the easy, slovenly way. About time there was feedback to such keyers. Name and shame them.
If the new method of arbitration is introduced, I will not take part.

I find the new system may actually improve the accuracy of arbitrating, as it requires me to look at each section, which ensures I don’t miss anything. With the previous system, it can be a little too easy for some people to miss errors, especially when both keyers make the same error, and the system doesn’t flag it. With the new system, nothing is in the final box until you actually put it there, so things can’t just slip through.

I am opposed to the simple review.

It is sometimes helpful to have a “second opinion.” And one of the two usually has the correct entry, so keying is not required. There would be no one to correct my keying, which isn’t too serious with the occasional entry but could become a problem if many fields have to be keyed.

I have my keying tool set to review every record, so the new system would change nothing there.

Please don’t change from the arbitrated 2-keyer system.

The only cases where this is doable are typed documents with only a single record. I have been arbitrating post returns and this would be a nightmare. Comparing the two keyers is a very useful tool to ensure all the names are entered. And then there is the spelling problem with handwritten records.

I’m thinking the practical limit will be under 10 records per image. I’ve rarely found a city directory with zero missing records. Even with excellent keyers, it’s just too easy to miss one record in 70 or so.

I do not understand why they re-released the Jacksonville project before they fixed the tool. What’s been going on at WAP lately?!