Posted by Bill Yetman on May 27, 2014 in Events

Interest in direct-to-consumer DNA testing has grown dramatically in the past few years. When you’re measuring more than 700,000 DNA markers for each individual, how do you analyze all that data across a rapidly growing database while providing actionable results for your customers? At the Hadoop Summit next week, I will be discussing how we use Hadoop and other open source projects (HBase, Azkaban, etc.) to handle, at scale, the ethnicity predictions and processing behind the AncestryDNA genetic cousin matching algorithm. The talk will also cover how the development team grew and matured as we worked with Hadoop. As our DNA pool continues to grow and we continue to improve our algorithms, we face new scientific and technical problems that need to be overcome.

Please join me as I walk through the science behind processing several hundred thousand DNA samples and how we leveraged both science and technology to solve a business problem. This is a unique application that shows how versatile Hadoop can be. We’ll dig in and show how the science and development teams collaborated on this project, through various updates, to deliver a significantly improved user experience. We’ll also cover what’s next for our team.

We hope to see you next week!

Session Info:

Thursday, June 5, 2014 (3rd day)

11:00 am PT

Hadoop Driven Business Track

San Jose Convention Center

Bill Yetman

Bill Yetman has served as VP of Engineering since January 2014. Bill has held multiple positions with the company since August 2002, including Senior Director of Engineering, Director of Sites, Mobile and APIs, Director of Ad Operations and Ad Sales, Senior Software Manager of eCommerce and Senior Software Developer. Prior to joining, he held several developer and programmer roles with Coresoft Technologies, Inc., Novell/Word Perfect, Fujitsu Systems of America and NCR. Mr. Yetman holds a B.S. in Computer Science and a B.A. in Psychology from San Diego State University.