Automated Entity Extraction Making German Historical Records Searchable

Posted by Laryn Brown on May 6, 2016 in Big Data, Image Processing and Analysis

For several years now Ancestry has been publishing collections of records from the U.S. that have been “transcribed” using a method we call Entity Extraction. One example is the U.S. City Directory collection. A precursor to modern telephone books, city directories listed all of the inhabitants of a city, along with their address, occupation, and Read More

Visualizing Data with Tableau

Posted by Camille Penrod on November 11, 2015 in Big Data, Technical Management

At Ancestry we quickly analyze billions of rows of data to deliver insights from our massive database to internal and external audiences. To do this we need tools that have the right capabilities and can be easily adopted by our tech team. In a recent Q&A video, Bill Yetman, VP of Commerce, Data and Analytics Read More

Data Science Thought Series: How to Embrace Technology Evolution

Posted by Lei Wu on October 27, 2015 in Big Data, Data Science

Technology evolution is a very common and natural process throughout human history. Some people like it while some people are hesitant. Despite attitudes towards technology evolution, new technology will eventually come. That is how technology trends advance, that is how quality of life advances and that is how human civilization advances. As a technology Read More

Over one million DNA samples and many new scientific research findings

Posted by Julie Granka on October 2, 2015 in Big Data, DNA, Science

Ancestry science team gives two platform presentations on Friday, October 9th at human genetics conference ASHG. With over one million customers who have submitted their DNA, AncestryDNA has the fastest-growing and one of the largest collections of human genetic data around the world. That amount of DNA data powers the AncestryDNA science team to perform Read More

Lessons Learned Building a Messaging Framework

Posted by Xuyen On on July 1, 2014 in Big Data

We have built out an initial logging framework with Kafka 0.7.2, a messaging system developed at LinkedIn. This blog post will go over some of the lessons we’ve learned by building out the framework here at Ancestry.com. Most of our application servers are Windows-based and we want to capture IIS logs from these servers. However, Read More

Adventures in Big Data: Commodity Hardware Blues

Posted by Bill Yetman on June 20, 2014 in Big Data

One of the real advantages of a system like Hadoop is that it runs on commodity hardware. This will keep your hardware costs low. But when that hardware fails at an unusually high rate it can really throw a wrench into your plans. This was the case recently when we set up a new cluster Read More

Ancestry.com to Present Jermline on DNA Day at the Global Big Data Conference

Posted by Jeremy Pollack on April 9, 2014 in Big Data, Data Science, Development, DNA, Science

Interested in genealogy? Curious about DNA? Fascinated by the world of big data? If so, come check out my talk at the Global Big Data Conference on DNA Day this Friday, April 25 at 4 pm PT in the Santa Clara Convention Center! I’ll cover Jermline, our massively scalable DNA matching application. I’ll talk about our business, give a run-through Read More