Tech Roots – Ancestry.com Tech Roots Blog
http://blogs.ancestry.com/techroots

Lessons Learned from a Monster Artist
By Dan Lawyer | Wed, 19 Nov 2014

Yes, we made monsters out of clay.

If you happened to be in Midway, Utah at the very end of September, you might have bumped into the Ancestry product team holding our annual product summit. About 80 of us gathered for an action-packed two-day event filled with team building, strategic conversations, and a few non-conventional, outside-of-the-box, squeeze-your-mind activities aimed at keeping creativity top of mind in our work. It was thrilling to see the passion for our users and our business in the hearts and minds of this talented group.

One of the unique and memorable activities was a session led by Howard Lyon, a professional artist. We were lucky enough to have him come in and share some thoughts on how process aids creativity. On the surface, many people believe process and creativity are not compatible. Howard made a great case for how process is critical to creating works of art. He walked us through some fascinating insights into the processes the masters used, and then shared his own process, accompanied by visuals of every stage from inception to finished masterpiece.

Things got real for the team when Howard passed out modeling clay and gave us a simplified process for how to build a monster. First we answered a series of questions: “What is the name of my monster?” “Where does my monster live?” “Is it a monster for kids or adults?” “What does it eat?” Then we each drew a small sketch of the monster based on the answers to our questions. Once we had our sketches, we paired up with another team member, got our hands dirty (literally), and created some of the masterpieces below. Once we finished our monsters we experimented with the effects of lighting by taking photos of our models from different angles against a green backdrop. We then submitted our photos to our designers to add in some final effects.

At the end of our summit we took home our cool monsters and a renewed determination to build excellent experiences for our amazing users.

Monitoring progress of SOA HPC jobs programmatically
By Chad Groneman | Fri, 17 Oct 2014

Here at Ancestry.com, we currently use Microsoft’s High Performance Computing (HPC) cluster to do a variety of things.  My team alone uses an HPC cluster for several purposes, and interestingly enough, no two of our job types communicate with HPC in exactly the same way.  We use the Service Oriented Architecture (SOA) model for two of our use cases, but even those communicate differently.

Recently, I was working on a problem where I wanted our program to know exactly how many tasks in a job had completed (not just the percentage of progress), similar to what can be seen in HPC Job Manager.  The code for these HPC jobs uses the BrokerClient to send tasks.  With the BrokerClient, you can “fire and forget,” which is what this solution does.  I should note that the BrokerClient can retrieve results after the job is finished, but that wasn’t my use case.  I thought there should be a simple way to ask HPC how many tasks had completed.  It turns out that this is not as easy as you might expect when using the SOA model, and I couldn’t find any documentation on how to do it.  I found a solution that worked for me, and I thought I’d share it.

[Screenshot: HPC Session Request Breakdown, as shown in HPC Job Manager]

With a BrokerClient, your link back to the HPC job comes from the Session object used to create the BrokerClient.  From a Scheduler, you can get your ISchedulerJob that corresponds with the Session by matching the ISchedulerJob.Id to the Session.Id.  My first thought was to use ISchedulerJob.GetTaskList() to retrieve the tasks and look at the task details.  It turns out that for SOA jobs, tasks do not correspond to requests.  The tasks don’t have any methods on them to indicate how many requests they’ve fulfilled, either.

I found my solution while looking at the results of the ISchedulerJob.GetCustomProperties() method.  I was surprised to find it there, since the MSDN documentation states that these are “application-defined properties.”

I found four name-value pairs which may be useful for knowing the state of tasks in a SOA job, with the following keys:

  • “HPC_Calculating”
  • “HPC_Calculated”
  • “HPC_Faulted”
  • “HPC_PurgedProcessed”

I should note that some of these properties don’t exist when the job is brand new, with no requests sent to it yet.  Also, I was disappointed to find no key corresponding to the “incoming” requests, since some applications might not be able to calculate that number themselves.

With that information, I was able to write code to monitor the SOA jobs.
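
To make that concrete, here is a minimal sketch of the monitoring code, assuming the HPC Pack scheduler API (Scheduler.Connect, OpenJob, GetCustomProperties); the head node parameter and the HPC_ prefix filter are illustrative:

using System;
using Microsoft.Hpc.Scheduler;

class SoaJobMonitor
{
    // Open the job backing a SOA session and dump its HPC_* custom
    // properties (HPC_Calculating, HPC_Calculated, HPC_Faulted,
    // HPC_PurgedProcessed). Some keys are absent on a brand-new job
    // with no requests sent to it yet.
    static void PrintRequestCounts(string headNode, int sessionId)
    {
        IScheduler scheduler = new Scheduler();
        scheduler.Connect(headNode);

        // For SOA jobs, the ISchedulerJob.Id matches the Session.Id.
        ISchedulerJob job = scheduler.OpenJob(sessionId);
        job.Refresh();

        foreach (INameValue prop in job.GetCustomProperties())
        {
            if (prop.Name.StartsWith("HPC_"))
                Console.WriteLine("{0} = {1}", prop.Name, prop.Value);
        }
    }
}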

With all that said, I should also say that our other SOA HPC use case monitors the state of the tasks itself and is capable of more detailed real-time information.  We do this by creating our own ChannelFactory and channels (sketched below).  With that approach, the requests are not “fire and forget” – we get results back from each request individually as it completes, so we know how many outstanding requests there are and how many have completed.  If we wanted to, we could still use the solution presented for the BrokerClient to find out how many are in the “calculating” state.
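
For comparison, a hedged sketch of that channel-per-call pattern follows; IMyService, Process, and the endpoint address are placeholders, not our real contract:

using System.Collections.Generic;
using System.ServiceModel;

// Placeholder contract – the real service interface looks different.
[ServiceContract]
public interface IMyService
{
    [OperationContract]
    string Process(string request);
}

public static class ChannelMonitorExample
{
    public static void Run(string serviceAddress, List<string> requests)
    {
        var factory = new ChannelFactory<IMyService>(
            new NetTcpBinding(), new EndpointAddress(serviceAddress));
        IMyService channel = factory.CreateChannel();

        int outstanding = requests.Count;
        foreach (string request in requests)
        {
            string result = channel.Process(request); // one result per request
            outstanding--;                            // exact real-time progress
        }

        ((IClientChannel)channel).Close();
        factory.Close();
    }
}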

One last disclaimer:  These “Custom Properties” are not documented, but they are publicly exposed.  Microsoft could change them.  If they ever do, I hope they would consider it a breaking change, and document it.  There are no guarantees of that, so use discretion when considering this solution.

2 Talks and 4 Posters in 4 Days at the ASHG Annual Meeting
By Julie Granka | Wed, 15 Oct 2014

For the AncestryDNA science team, October brings more than fall foliage and pumpkins.  It also brings us the yearly meeting of the American Society of Human Genetics (ASHG), the main conference of the year in our field.

On Saturday, we’ll arrive in San Diego to join thousands of other scientists for a four-day conference to discuss topics in genetics, exchange ideas with colleagues, listen to talks and presentations – and importantly, to give some presentations of our own.

We’re always on the lookout for ways that we can translate the latest scientific findings into future features for AncestryDNA customers.  The ASHG Annual Meeting is a chance for all of us to soak up the newest advancements in human genetics.

This year, the number and variety of presentations that we are giving at ASHG attests to the fact that AncestryDNA, too, plays a role in these advancements.

This year, we’re proud to be giving two platform presentations – only 8% of applications for platform presentations at ASHG were accepted. Keith Noto will be giving a platform talk entitled “Underdog: A Fully-Supervised Phasing Algorithm that Learns from Hundreds of Thousands of Samples and Phases in Minutes,” discussing the workings behind an impressive algorithm we’ve developed to phase genotype data extremely quickly and accurately. Yong Wang’s platform talk will reveal a few fascinating discoveries about U.S. population history from studying patterns of ethnicity and identity-by-descent among AncestryDNA customers.

We’ll also be giving a number of poster presentations.  Mathew Barber will be presenting the method behind another algorithm that we’ve developed to better identify true identical-by-descent DNA matches.  I’ll be presenting a method we’ve developed to reconstruct the genomes of ancestors from genotype data of their descendants.  Jake Byrnes will be presenting a poster with a collaborator from Stanford University about inferring sub-continental local genomic ancestry. Finally, Eunjung Han and Peter Carbonetto will each present results from previous research they conducted at the University of California, Los Angeles and the University of Chicago, respectively.

We’re looking forward to engaging in insightful dialogue about our work with the scientific community. Even if we won’t see much fall foliage in San Diego.

External APIs: To Explode, or Not to Explode, That is the Question
By Harold Madsen | Mon, 29 Sep 2014

[Image: William Shakespeare – hs-augsburg.de]

Shakespeare might not approve of my taking liberties with his play Hamlet, though prince Hamlet was essentially saying the same thing as I was feeling last year:

To be, or not to be, that is the question—

Whether ’tis Nobler in the mind to suffer

The Slings and Arrows of outrageous Fortune,

Or to take Arms against a Sea of troubles…

Would Hamlet go on, or cease from all? Yes, I may have felt just as Hamlet did in the Nunnery Scene when I thought about my “sea of troubles” just one year ago. Well, maybe I’m waxing a bit too dramatic, but there were real concerns on my part regarding last year’s events (oh, how “smart a lash” that memory doth make!). What was that worrisome memory? Allow me now to retrace my steps to that challenging day and give you context to my soliloquy.

This story begins last fall and there are many actors on the stage of events. Yes, my story begins with our mobile app and our external API and reaches its climax when seemingly all users download their family trees all at once! Oh the misery. Let us begin our tale of woe.

Ancestry.com (the bastion of family history) has an external API that is used by our mobile apps and other strategic initiatives to share and update a person’s family tree, events, stories, and photos. Our external API has been most important in our mobile efforts (11 million downloads) and in working with the popular TV series, “Who Do You Think You Are?” Our mobile team had successfully grown our mobile usage to such an extent that I began to worry it might actually tax our systems. That concern was beginning to bubble up from my subconscious last fall, and it leads us to that disquieting day. Last year, our iOS mobile app was promoted in the Apple App Store because of the updates we had made for the release (along with other Ancestry promotions). Those promotions led to large numbers of simultaneous family-tree downloads, and they weakened the mighty knees of our backend services. We endured a week of extreme traffic and were forced to throttle (limit the usage of) our mobile users – the API team saved the day by throttling usage and thus preserving the backend services. After experiencing that calamitous week, we might well have cried, “Get thee to a nunnery!” or “O, woe is me,” but we repressed such impulses.

OK, it wasn’t actually a “calamitous week”; I was just getting into my Hamlet character. Given that the impact to our website was quite minimal, most of our users had a good experience. However, it was a bit frustrating for many of our mobile users – it took too long for many of them to successfully download their family tree to their mobile device. The growth in our mobile traffic really is great news, but we realized that we must architect a plan to take us through the next round of application and user growth. Here’s how it happened:

That experience caused us to reconsider how we deploy our mobile apps, how our mobile apps interact with the API, how we call our backend services, how we deliver a tree download, and whether we should continue to aggregate our services at the API layer. Each of these areas of the company came under review to see how we might optimize our systems. After holding periodic meetings, discussions, and code reviews over several months, a plan began to gel. Below is a list of some changes made to our systems and application:

  • Pass Through: Rather than aggregate our services at the API layer, we took the strategy of creating a “pass-through” layer back to our backend services. This put the responsibility directly on our services to further optimize their code, and in some cases, create new endpoints specifically with mobile usage in mind. This methodology also enabled our mobile teams to more effectively cache data according to their needs and Service team recommendations. More on that below.
  • Mobile Usage: As our users became more mobile, traffic through our APIs from mobile devices increased. Last fall our mobile usage at Ancestry.com reached critical mass and put serious pressure on our services (especially during big promotions and app updates). Because mobile usage differs from website usage in important ways, it was time to address this in our backend services. After several meetings involving cross-functional teams, a few service calls were designed with mobile usage in mind. One of the results was that downloading your entire family tree became much faster: downloading a tree with 10,000 persons (and all their associated events, etc.) went from several minutes to under one minute.
  • Caching: Because we changed our API model to a pass-through, our mobile app could now cache data from each call at appropriate intervals, thus taking pressure off our backend services and network. This meant fewer calls (in the long run) to the external API.
  • Mobile App Optimization: One area of review was our mobile application. After the code review we theorized that our app might have put undue pressure on our services. What was the root cause? Apple has two new, interesting features:
    • Apple can automatically download and install new applications on your iPhone or iPad
    • Apple can wake up apps in the background and do tasks

When we released our app last year, we believe it was automatically downloaded by Apple (onto most Apple devices), and that the app then, in the background, automatically downloaded each user’s main tree. To be sure, this process would have happened anyway once the end user opened the app manually (that was required for that app update), but doing it manually would have helped spread the traffic over several days rather than all at once. Of course this is just a theory, but we wanted to ensure it was not happening and would not happen next time.

  • User Queuing: As you know, queue is just another word for getting in a line. People get in line to buy a new iPhone or to buy tickets for a concert. That’s what we do when there are too many requests at a given moment. Anticipating high traffic from our new 6.0 mobile app (plus other site promotions at that time of year), we created a new way of throttling too-high traffic. Rather than throttling a percentage of calls to our API (making it hard for any one user to successfully download their family tree to their mobile device), we created a system called User Queuing, which allowed a certain number of users into our system at one time. Allowing X users into our systems for 10 minutes of uninterrupted usage ensured each would have a pristine experience, and it also protected our backend services from being overloaded. We could adjust on the fly how many users were allowed through our API at any one moment. Thus more individuals would have a better experience, and the others would be invited to return in a few minutes. We would only turn on User Queuing if too many users made requests at the same moment (a sketch of the idea follows this list).
  • Load Tests: To ensure our systems and new service calls could handle beyond-expected peak calls we ran them through a gauntlet of load tests. These series of tests ensured we had proper capacity.
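
Here is a hypothetical sketch of the User Queuing idea (illustrative only, not our production code): admit up to a configurable number of users, give each a 10-minute window of uninterrupted usage, and politely turn everyone else away.

using System;
using System.Collections.Concurrent;

public class UserQueue
{
    private readonly int maxUsers;
    private readonly TimeSpan window = TimeSpan.FromMinutes(10);
    private readonly ConcurrentDictionary<string, DateTime> admitted =
        new ConcurrentDictionary<string, DateTime>();

    public UserQueue(int maxUsers)
    {
        this.maxUsers = maxUsers;
    }

    // Returns true if the user may proceed, false if they should retry later.
    public bool TryAdmit(string userId)
    {
        DateTime expires;
        if (admitted.TryGetValue(userId, out expires) && expires > DateTime.UtcNow)
            return true; // still inside their 10-minute window

        // Drop expired windows to free up slots.
        foreach (var entry in admitted)
        {
            DateTime removed;
            if (entry.Value <= DateTime.UtcNow)
                admitted.TryRemove(entry.Key, out removed);
        }

        // A benign race between the count check and the add is fine for a sketch.
        if (admitted.Count >= maxUsers)
            return false; // system is full; invite the user back in a few minutes

        admitted[userId] = DateTime.UtcNow.Add(window);
        return true;
    }
}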

Now, once our app was approved by Apple, we could have immediately released our app but there were things to consider. Here is how we timed the successful release:

  • We received permission from Apple to release our app in the app store the day before the Apple promotion – thus helping us take some of the steam off of the release.
  • We decided to release at a time of day when we anticipated traffic would be somewhat low.
  • We decided to release when our engineers and database administrators were all available in case we needed to react quickly and also to monitor traffic.

Finally, the day arrived and we were ready. All hands on deck. User Queuing ready to trigger. There was great excitement – and nerves. How would our systems hold out? Which internal system might buckle under pressure or show up with a previously undiscovered bug? How long after the launch would we need to kick in User Queuing, and how many users would be temporarily turned away by the queue? Did we have enough servers, memory, or database throughput? On the other hand, we had tested our code so well – how could it fail? There was much excitement in the air.

All engineers were readied…and…the button was pushed to release our new mobile app!

Did it all collapse? Were there cascading failures? Was the load too much to bear? Did everything explode?

Nope!

Nothing happened. OK, it seemed like nothing. The load gradually increased over the next few hours, but our systems held up wonderfully. No strain, no collapse, no running low on memory, no bottlenecks. Nothing. Yes, there were a few minor bugs to fix, but most customers had a great experience and it went very smoothly. The time, effort, and planning paid off. It worked!

We were so happy – and relieved. We had done our job. In the coming days several teams went to lunch to celebrate the successful execution of months of planning and work. Some of the engineers actually smiled on the day that nothing happened. Even Hamlet dropped by and asked me a question: “Didst thou not explode with a sea of troubles?” And I said, “not on your life!”

Big Data for Developers at Ancestry
By Seng Lin Shee | Thu, 25 Sep 2014

Big Data has been all the rage. Business, marketing, and project managers like it because they can plot out trends to make decisions. To us developers, Big Data is just a bunch of logs.  In this blog post, I would like to point out that Big Data (or logs with context) can be leveraged by development teams to understand how our APIs are used.

Developers have implemented logging for a very long time. There are transaction logs, error logs, access logs, and more. So, how has logging changed today? Big Data is not all that different from logging. In fact, I would consider Big Data logs to be logs with context. Context allows you to perform interesting things with the data. Now we can correlate user activity with what’s happening in the system.

A Different Type of Log

So, what are logs? Logs are records of events, frequently created by applications with very little user interaction. It goes without saying that many logs are transaction logs or error logs.

However, there is a difference between forensics logs and business logs. Big Data is normally associated with the events, actions, and behaviors of users interacting with the system.  Examples include records of purchases, which are linked to a user profile and span time. We call these business logs.  Data and business analysts would love to get hold of this data, run some machine learning algorithms, and predict the outcome of a certain decision to improve user experience.

Now back to the developer. How does Big Data help us? On our end, we can utilize forensics logs. Logs get more interesting and helpful when we can combine records from multiple sources. Imagine hooking in and correlating IIS logs, method logs, and performance counters together.

Big Data for Monitoring and Forensics

I would like to advocate that Big Data can and should be leveraged by web service developers to:

  1. Better understand the system and improve the performance of critical paths
  2. Investigate failure trends which might lead to errors or exacerbate current issues

Logs can include:

  1. Method calls (including the context of the call – user login, IP address, parameter values, return values, etc.; see the sketch after this list)
  2. Execution time of method
  3. Chain of calls (e.g. method names, server names etc.)
    This can be used to trace where method calls originate
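
As referenced above, here is a hypothetical sketch of what logging a method call with context might look like; the wrapper and field names are illustrative, not our actual framework:

using System;
using System.Diagnostics;

public static class ContextualLog
{
    // Wrap a service method so each call records its context (user,
    // correlation ID), outcome, and latency as one structured line.
    public static T Timed<T>(string method, string correlationId,
                             string userId, Func<T> body)
    {
        Stopwatch sw = Stopwatch.StartNew();
        string status = "OK";
        try
        {
            return body();
        }
        catch (Exception ex)
        {
            status = "FAIL: " + ex.Message; // keep the error with its context
            throw;
        }
        finally
        {
            // One line per call, ready to ship to the logging pipeline.
            Console.WriteLine("{0:o} method={1} corr={2} user={3} status={4} ms={5}",
                DateTime.UtcNow, method, correlationId, userId, status,
                sw.ElapsedMilliseconds);
        }
    }
}

// Usage: var tree = ContextualLog.Timed("GetTree", corrId, userId,
//                                       () => treeService.GetTree(treeId));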

With this variety of data being logged for every single call, it is important that the logging system is able to hold and process huge volumes of data; Big Data has to be handled on a whole different scale. The screenshots below are charts from Kibana. Please refer here to find out how to set up data collection and dashboard display using this suite of open source tools.

Example Usage

Based on what kind of monitoring is required, the relevant information (e.g., context, method latency, class/method names) should be included in Big Data logs.

Detecting Problematic Dependencies

Plotting the time spent in classes of incoming and outgoing components gives us visibility into the proportion of time spent in each layer of the service. The plot below revealed that the service was spending more and more time in a particular component, thus warranting an investigation.

[Chart: Time in Classes]

Discovering Faulty Queries

Logging all exceptions, together with the appropriate error messages and details, allows developers to determine the circumstances under which a method fails. The plot below shows that MySql exceptions started occurring at 17:30. Because the team includes parameters within the logs, we were able to determine that invalid queries were used (typos and syntax errors).

[Chart: Exceptions]

Determining Traffic Patterns

Tapping into the IP addresses of incoming requests reveals very interesting traffic patterns. In the example below, the graph indicates a spike in traffic. However, upon closer inspection, the graph shows that the spike spanned ALL countries. This tells us that the spike in traffic was not due to user behavior, which led us to investigate other possible causes (e.g., DoS attacks, simultaneous updates of mobile apps, errors in logs, etc.). In this case, we found out it was a false positive: repeated reads by log forwarders within the logging infrastructure.

[Chart: Country Traffic With Indicator]

Determining Faulty Dependents (as opposed to Dependencies)

Big Data logs can be enhanced to include IDs that track the chain of service calls from clients through the various services in the system. The first column below indicates that traffic from the iOS mobile app passes through the External API gateway before reaching our service. Other columns indicate different flows, giving developers enough information to detect and isolate problems in different systems if needed.

[Chart: Event Flows]

Tracking Progression Through Various Services

Ancestry.com has implemented a Big Data framework across all services to support call tracking across different services. This helps developers (who are knowledgeable about the underlying architecture) to debug whenever a scenario doesn’t work as expected. The graph below depicts different methods being exercised across different services, where each color refers to a single scenario. Such data provides full visibility into the interaction among different services across the organization.

[Chart: Test Tracking]

Summary

Forensics logs can be harnessed with Big Data tools and frameworks to greatly improve the effectiveness of development teams. By combining various views (such as the examples above) into a single dashboard, we are able to provide developers with a health snapshot of the system at any time, in order to determine failures or to improve architectural designs.

By leveraging Big Data for forensics logging, we as developers are able to determine faults and reproduce error messages without conventional debugging tools. We have full visibility into the various processes in the system (assuming we have sufficient logs). Gone are the days when we needed to instrument code on LIVE boxes because an issue occurred only in the LIVE environment.

All of this work is done independently of the business analysts and is, in fact, crucial to the team’s ability to react quickly to issues and to continuously improve the system.

Do your developers use Big Data as part of daily development and maintenance of web services? What would you add to increase visibility in the system and to reduce bug-detection time?

Ancestry Opens Its Doors for NewCo.SF
By Melissa Garrett | Mon, 08 Sep 2014

Ancestry was selected as a 2014 NewCo.SF host company. Come join us at our San Francisco office on Thursday, Sept. 11 at 4:30pm PT to hear from Eric Shoup, EVP of Product at Ancestry.com. He will provide an inside look at the unique and meaningful business of family history and the tech, science, and product experience that enables millions of people to make powerful discoveries about their ancestors and in turn, themselves.

 

How do I sign up?

Register for a NewCo.SF general admission pass for free here.  Then sign up for the session at Ancestry’s office: http://sched.co/1pbfVb3

Are there perks to coming, aside from listening to the speaker?

Yes. Enjoy free appetizers, beer, and wine. We’ll also be giving away ten AncestryDNA kits, each paired with an Ancestry.com membership.

Session details

Date & time: Thursday, September 11 from 4:30pm – 6:00pm

Location:

153 Townsend St., Ste. 800
San Francisco, CA 94107
Ph. 415-795-6000

*Please note that parking may be limited due to a Giants’ game.

Speaker: Eric Shoup has over fifteen years of experience in high-tech development, product management, professional services, and general business management, combining exceptional technical understanding and analytical ability with outstanding product marketing. He has served as Ancestry.com’s Executive Vice President of Product since February 2012, having joined the company in August 2008 as Vice President of Product. Prior to joining the family history giant, Eric was at eBay for more than five years, where he focused on growing the eBay Stores product and ProStores business unit and also assembled and led eBay’s global mobile product team. Before eBay, Eric was a Director of Product Management at Commerce One, a leading provider of B2B e-commerce solutions, and worked at US Interactive, designing and managing consumer e-commerce and marketing websites for established companies such as Lexus and Wellcome Supermarkets (Hong Kong). Eric holds a B.A. from the University of California, Los Angeles.

We look forward to welcoming you!

Data Driven: Presentation on the Journey to Self-Serve Analytics
By Adam Davis | Fri, 05 Sep 2014

Data analytics and visualization are everywhere we turn. We find them in websites, newspaper articles, smartphone apps, and more.  With so many options for data tools, it can seem overwhelming at times to know what will and won’t work for your data goals.  At Ancestry.com we constantly evolve to find or create the tools that fit our culture and our need to move quickly and nimbly with our data discovery, analytics, and reporting.  I invite you to come listen to our presentation at the Tableau Conference 2014, where we will share the story of how Ancestry became a more data-driven organization and how using Tableau helped the process. Bill Yetman and I will discuss how we implemented, adopted, and adapted Tableau to fit our needs and unique work – not only as a data discovery tool, but as a corporate enterprise BI tool as well.  We will talk about the changes in our data culture and strategy, challenges faced with a self-service model, and how we adapted the tool to integrate into our current data ecosystem.

We hope to see you there!

Session Info:

Tuesday, September 9, 2014

4:00-5:00pm PT

Washington State Convention Center

Stop using anchors as buttons!
By Jason Boyer | Tue, 02 Sep 2014

Semantic buttons and links are important for usability as well as accessibility. Hyperlinks indicate a URL change, whereas buttons are used to perform an action. I thought up this post in response to a question asked on Stack Overflow over five years ago.

Which one should you use?

  1. <a href="#" onclick="doSomething()">Do Something</a>
  2. <a href="javascript:void(0);" onclick="doSomething()">Do Something</a>
  3. <a href="javascript:doSomething()">Do Something</a>
  4. <a href="" ng-click="app.doSomething()">Do Something</a>

Answer: None of those!

All elements in HTML (except for <div> and <span>) have a semantic meaning. This semantic meaning is interpreted by browsers, screen readers, and SEO crawlers. Browsers use this semantic value to properly display the elements and to bind interactions. Screen readers also use this semantic value to bind additional different keyboard shortcuts and interactions. Lastly, SEO crawlers use this meaning to measure importance of content and define relationships.

The above examples are using the <a> element to solely perform a JavaScript action. This is incorrect, as the semantic meaning of an <a> element as defined by the W3C is as follows:

The <a> element represents a hyperlink.

A hyperlink is “a link from a hypertext file or document to another location or file.” Since those examples are not linking to another location, they are incorrect.

So what is the correct element to use when its sole purpose is to perform a JS action? That would be the <button> element.

The <button> element with a type attribute whose value is "button" represents a button with no additional semantics.

I feel that this documentation could be slightly improved to imply that a “button” means the element should be “clickable” or “interactive.” Applying these definitions, the correct answer would be marked up like so:

<button type="button" onclick="doSomething()">Do Something</button>

Semantic buttons provide a cross-browser focusable area and indicate an interaction is expected. If you read no further than this, please only use an <a> element if you can answer yes to this question:

1. Does the element change the URL?

If it does not, please use the <button> element. Now for some further explanations:


Why is it so commonplace to use the <a> element to perform JS actions?

The <a> element provides some key benefits:

  1. It is styleable.
  2. It is focusable (you could tab to it)
  3. It provides cross-browser support for :hover, :active, & :focus states

Way back in the day, the <button> element only provided the latter two benefits. It was impossible to style it consistently cross-browser (just think of trying to style some of the new HTML5 inputs cross-browser). This has long been remedied. At Ancestry, we created a generic “.link” class to style a <button> exactly like a text link, when needed. This allows us to use the proper HTML markup and retain control of the appearance. The CSS is similar to this:

.link { -webkit-appearance:none; background:none; border:0; -webkit-box-shadow:none; box-shadow:none; color:#03678B; cursor:pointer; display:inline; font-size:inherit; margin:0; padding:0; text-align:left; text-decoration:none; }
.link:active,
.link:hover,
.link:focus { color:#03678B; text-decoration:underline; }

This resets the styles for all our supported browsers and gives us the flexibility we need in the markup to use the proper <button> element rather than the <a>.

One note about using the <button> element – don’t leave off the [type="button"] attribute. The default type of a button in most browsers is submit. This means that without the [type="button"] attribute/value pair, users could accidentally submit a form.


What about using the <a> as a named target?

Hyperlinks can link to sections of content in a document, as well as external links. Again, way back in the day, markup like <a name="myRegion"></a> was the only cross-browser way to link to a section on the page. All modern browsers (even back to at least IE7) allow you to link to any section that has the id attribute.

Old way:

<a href="#my-section">Go to my section</a>
...Content...
<a name="my-section"></a>
<div>My section content</div>

Better way:

<a href="#my-section">Go to my section</a>
...Content...
<div id="my-section">My section content</div>

Simply use an id attribute on the containing element of the region you want “linkable.” Don’t use an empty <a name="my-section"></a> element.


Are there any reasons you would bind a JS action to an <a>?

Of course! A common one is a hyperlink that also fires off a tracking pixel. Something like the following:

<a href="http://www.ancestry.com" onclick="fireTracking('clicked home link');">Home</a>

In the case of single page apps, you definitely need to prevent the browser from navigating away from your page, but you should still use an <a> element since the result would be changing the URL. For example:

<script>
function doPushStateStuff(newPage) {
	// pushState magic here
	return false;
}
</script>
...
<a href="page-2" onclick="return doPushStateStuff('page-2');">Page 2</a>

This actually brings up the other key rule for <a> elements:

  1. Users should always be able to open links in a new tab (middle-click or right-click to select “open link in new tab”)

I loathe single page apps that do not allow opening pages in a new tab. It’s a huge usability issue as it prevents users from using your site in their normal routine.


In summary:

  1. Semantic buttons (<button type="button">) should be used when an element’s sole purpose is to perform a JavaScript action.
  2. <a> elements should only be used when changing the page URL and should always be openable in a new tab.
  3. You should never write HTML like this: <a href="#" onclick="doSomething()">Do Something</a> or this: <a href="javascript:void(0);" onclick="doSomething()">Do Something</a>

Note: Use unobtrusive JavaScript rather than onclick attributes; the onclick attributes here simply shortened the examples.
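
For instance, the first summary example could be bound unobtrusively like this (a minimal sketch reusing the hypothetical doSomething handler from above):

<button type="button" id="do-something">Do Something</button>
<script>
  // Bind the handler in JavaScript instead of inline in the markup.
  document.getElementById('do-something').addEventListener('click', doSomething);
</script>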

The DNA matching research and development life cycle
By Julie Granka | Tue, 19 Aug 2014

Research into the matching patterns of over a half-million AncestryDNA members translates into new DNA matching discoveries

Among over 500,000 AncestryDNA customers, more than 35 million 4th cousin relationships have been identified – a number that continues to grow at an exponential rate.  While that means millions of opportunities for personal discoveries by AncestryDNA members, it also means a lot of data that the AncestryDNA science team can put back into research and development for DNA matching.

At the Institute for Genetic Genealogy Annual Conference in Washington, D.C. this past weekend, I spoke about some of the AncestryDNA science team’s latest exciting discoveries – made by carefully studying patterns of DNA matches in a 500,000-member database.

 

[Graph: growth in the number of 4th cousin matches between pairs of AncestryDNA customers over time]

DNA matching means identifying pairs of individuals whose genetics suggest that they are related through a recent common ancestor. But DNA matching is an evolving science.  By analyzing the results from our current method for DNA matching, we have learned how we might be able to improve upon it for the future.

 

[Diagram: life cycle of AncestryDNA matching research and development]

The science team targeted our research of the DNA matching data so that we could obtain insight into two specific steps of the DNA matching procedure.

Remember that a person gets half of their DNA from each of their parents – one full copy from their mother and one from their father.  The problem is that your genetic data doesn’t tell us which parts of your DNA you inherited from the same parent.  The first step of DNA matching is called phasing, and determines the strings of DNA letters that a person inherited from each of their parents.  In other words, phasing distinguishes the two separate copies of a person’s genome.

 

[Diagram: observed genetic data only reveals the pairs of letters that a person has at a particular genetic marker; phasing determines which strings of letters of DNA were inherited as a unit from each parent]

If we had DNA from everyone’s parents, phasing someone’s DNA would be easy.  But unfortunately, we don’t.  So instead, phasing someone’s DNA is often based on a “reference” dataset of people in the world who are already phased.  Typically, those reference sets are rather small (around one thousand people).

Studies of customer data led us to find that we could incorporate data from hundreds of thousands of existing customers into our reference dataset.  The result?  Phasing that is more accurate, and faster.  Applying this new approach would mean a better setup for the next steps of DNA matching.

The second step in DNA matching is to look for pieces of DNA that are identical between individuals.  For genealogy research, we’re interested in DNA that’s identical because two people are related from a recent common ancestor.  This is called DNA that is identical by descent, or IBD.  IBD DNA is what leads to meaningful genealogical discoveries: allowing members to connect with cousins, find new ancestors, and collaborate on research.

But there are other reasons why two people’s DNA could be identical. After all, the genomes of any two humans are 99.9% identical. Pieces of DNA could be identical between two people because they are both human, because they are of the same ethnicity, or because they share some other, more ancient history.  We call these pieces of DNA identical by state (IBS), because the DNA could be identical for a reason other than a recent common ancestor.

We sought to understand the causes of identical pieces of DNA between more than half a million AncestryDNA members.  Our in-depth study of these matches led us to find that in certain places of the genome, thousands of people were being estimated to have DNA that was identical to one another.

What we found is that thousands of people all having matching DNA isn’t a signal that all of them are closely related to one another.  Instead, it’s likely a hallmark of a more ancient shared history among those thousands of individuals – or IBS.

 

[Diagram: finding places in the genome where thousands of people all have identical DNA is likely a hallmark of IBS, not IBD]

In other words, our analysis revealed that in a few cases where we thought people’s DNA was identical by descent, it was actually identical by state.  These striking matching patterns were only apparent after viewing the massive amount of matching data that we did.

So while the data suggested that our algorithms had room for improvement, that same data gave us the solution.  After exploring a large number of potential fixes and alternative algorithms, we discovered that the best way to address the problem was to use the observed DNA matches to determine which were meaningful for genealogy (IBD) – and distinguish them from those due to more ancient shared history.  In other words, the matching data itself has the power to help us tease apart the matches that we want to keep from those that we want to throw away.

The AncestryDNA science team’s efforts – poring through mounds and mounds of DNA matches – have paid off.  From preliminary testing, it appears that these latest discoveries relating to both steps of DNA matching may lead to dramatic DNA matching improvements. In the future, this may translate to a higher-quality list of matches for each AncestryDNA member: fewer false matches, and a few new matches too.

In addition to the hard work of the AncestryDNA science team, the huge amount of DNA matching data from over a half-million AncestryDNA members is what has enabled these new discoveries.  Carefully studying the results from our existing matching algorithms has now allowed us to complete the research and development “life cycle” of DNA matching: translating real data into future advancements in the AncestryDNA experience.

Core Web Accessibility Guidelines
By Jason Boyer | Wed, 13 Aug 2014

How do you ensure accessibility on a website that is worked on by several hundred web developers?

That is the question we continually ask ourselves, and we have made great strides toward answering it. The approach we took was to document our core guidelines and deliver presentations and trainings to all involved. This included not only our small team of dedicated front-end web developers but also the dozens of back-end developer teams that also work within the front-end. This article is the first in a series going in-depth on a variety of web accessibility practices.

The following core guidelines, though encompassing hundreds of specific rules, have helped focus our accessibility efforts.

A website should:

  • Be Built on Semantic HTML
  • Be Keyboard Accessible
  • Utilize HTML5 Landmarks and ARIA Attributes
  • Provide Sufficient Contrast

Our internal documentation goes into detail as to why these guidelines are important and how to fulfill each requirement. For example, semantic HTML is important because it allows screen readers and browsers to properly interpret your page and helps with keyboard accessibility. Landmarks are important because they allow users of screen readers to navigate past blocks of content. Contrast is important because people need to be able to see your content!
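
As a small, illustrative taste of the landmark guideline (the element choices here are a sketch, not our production markup):

<header role="banner">Site-wide header</header>
<nav role="navigation" aria-label="Main">...</nav>
<main role="main">
  <h1>Page title</h1>
  <p>Body content with sufficient contrast.</p>
</main>
<footer role="contentinfo">Site-wide footer</footer>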

Do our current pages meet all of these requirements? Nope. That’s why we’ve documented them so that we can provide structure to this effort and have measurable levels of success.

We have been learning a lot about accessibility during the past few months. The breadth of this topic is amazing. Lots of good people in the web community have made tremendous efforts in helping others learn. W3C’s Web Accessibility Initiative documentation and WebAIM’s explanations are definitely worth the time to study.

In my following posts, I will outline many of the rules with practical examples for each of our core guidelines. Some of the items that I’ll describe are:

  • Benefits of Semantic HTML
    Should all form elements be wrapped in a form element? Do I need to use the role attribute on HTML5 semantic elements? How to decide between these: <a> elements vs. <button> elements, input[type="number"] vs. input[type="text"], <img> elements vs. CSS backgrounds, and more.
  • Keyboard Accessibility 101
    Should I ever use [tabindex]? Why/when? How should you handle showing hidden or dynamically loaded content? Should the focus be moved to popup menus?
  • HTML5 Landmarks
    Why I wish landmarks were available outside of screen reader software. Which landmarks should I use? How do I properly label them?

Web accessibility is the reason semantic HTML exists. Take the time to learn how to make your HTML, CSS, and JavaScript accessible. If you’re going to take the time to create an HTML page, you may as well do it correctly.
