Every week, AncestryDNA analyzes thousands of people’s DNA, decoding their family origins and finding their long-lost relatives. To do that, we used GERMLINE, an algorithm for finding hidden family relationships within a pool of DNA. However, the reference implementation of GERMLINE didn’t scale, and we were running up against its limitations. This Thursday, I’ll be giving a talk at HBaseCon about how we solved this problem. Find out how AncestryDNA leveraged Hadoop and HBase to build a scalable cleanroom implementation of the GERMLINE algorithm, resulting in a 1700% performance improvement.
Thursday, June 13, 2013, 5:20pm – 5:40pm
Presented by Jeremy Pollack (Ancestry.com)
At the San Francisco Marriott Marquis, in the Yerba Buena 13-15 room
*Update: View the video of my presentation here.