Tech Roots » Testing
Ancestry.com Tech Roots Blogs
http://blogs.ancestry.com/techroots

Big Data for Developers at Ancestry
By Seng Lin Shee | Thu, 25 Sep 2014
http://blogs.ancestry.com/techroots/big-data-for-developers-at-ancestry/
Big Data has been all the rage. Business, marketing, and project managers like it because they can plot trends to inform decisions. To us developers, Big Data is just a bunch of logs. In this blog post, I would like to show that Big Data (or logs with context) can be leveraged by development teams to understand how our APIs are used.

Developers have implemented logging for a very long time. There are transaction logs, error logs, access logs, and more. So how has logging changed? Big Data is not all that different from logging. In fact, I would describe Big Data logs as logs with context. Context allows you to do interesting things with the data: now we can correlate user activity with what's happening in the system.

A Different Type of Log

So, what are logs? Logs are records of events, frequently created by applications with very little user interaction. It goes without saying that many of them are transaction logs or error logs.

However, there is a difference between forensic logs and business logs. Big Data is normally associated with the events, actions, and behaviors of users of the system. Examples include records of purchases, which are linked to a user profile and span time. We call these business logs. Data and business analysts would love to get hold of this data, run machine learning algorithms on it, and ultimately predict the outcome of a decision in order to improve the user experience.

Now back to the developer. How does Big Data help us? On our end, we can utilize forensic logs. Logs get more interesting and helpful when we can combine records from multiple sources. Imagine hooking in and correlating IIS logs, method logs, and performance counters together.
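Correlating records from multiple sources typically hinges on a shared request ID. Here is a minimal sketch (in Python for brevity; the field names and log shapes are invented for illustration, not Ancestry's actual schema):

```python
# Hypothetical sketch: join IIS-style access logs with method-level logs
# by a shared request ID so one request can be inspected end to end.

iis_log = [
    {"request_id": "r1", "path": "/api/person", "status": 200},
    {"request_id": "r2", "path": "/api/tree", "status": 500},
]
method_log = [
    {"request_id": "r1", "method": "PersonService.Get", "latency_ms": 12},
    {"request_id": "r2", "method": "TreeService.Load", "latency_ms": 950},
]

def correlate(access, methods):
    """Index method records by request ID, then attach them to access records."""
    by_id = {}
    for m in methods:
        by_id.setdefault(m["request_id"], []).append(m)
    return [dict(a, calls=by_id.get(a["request_id"], [])) for a in access]

combined = correlate(iis_log, method_log)
# Each combined record now shows the HTTP outcome alongside the internal calls
# that served it, e.g. the 500 above lines up with a 950 ms TreeService.Load.
```

At scale this join would be done by the logging infrastructure rather than in application code, but the shape of the correlation is the same.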

Big Data for Monitoring and Forensics

I would like to advocate that Big Data can and should be leveraged by web service developers to:

  1. Better understand the system and improve the performance of critical paths
  2. Investigate failure trends that might lead to errors or exacerbate current issues

Logs can include:

  1. Method calls (including the context of the call: user login, IP address, parameter values, return values, etc.)
  2. Execution time of each method
  3. Chain of calls (e.g. method names, server names), which can be used to trace where method calls originate
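A log record carrying this kind of context might be emitted as one structured JSON document per call. The following is a minimal sketch; every field name here is hypothetical, not a prescribed schema:

```python
import json
import time

def log_method_call(method, params, user, ip, result, started, chain):
    """Emit one structured log record capturing the call and its context.
    All field names are illustrative only."""
    record = {
        "timestamp": time.time(),
        "method": method,
        "params": params,           # beware of logging sensitive values
        "user": user,
        "ip": ip,
        "result": repr(result),
        "elapsed_ms": round((time.time() - started) * 1000, 2),
        "chain": chain,             # e.g. ["ExternalAPI", "PersonService"]
    }
    return json.dumps(record)

started = time.time()
line = log_method_call("PersonService.Get", {"id": 42}, "user123",
                       "10.0.0.5", {"name": "Ada"}, started, ["ExternalAPI"])
```

Because each record is self-describing JSON, downstream tools (forwarders, indexers, dashboards) can filter and aggregate on any field without a custom parser.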

With this data being logged for every single call, it is important that the logging system can hold and process a huge volume of data; Big Data has to be handled on a whole different scale. The screenshots below are charts from Kibana. Please refer here to find out how to set up data collection and dashboards using this suite of open source tools.

Example Usage

Based on the kind of monitoring required, the relevant information (e.g. context, method latency, class/method names) should be included in the Big Data logs.

Detecting Problematic Dependencies

Plotting the time spent in classes of incoming and outgoing components gives us visibility into the proportion of time spent in each layer of the service. The plot below revealed that the service was spending more and more time in a particular component, warranting an investigation.

[Chart: time spent in classes]

Discovering Faulty Queries

Logging all exceptions, together with the appropriate error messages and details, allows developers to determine the circumstances under which a method fails. The plot below shows that MySql exceptions started occurring at 17:30. Because the team included parameters in the logs, we were able to determine that invalid queries (typos and syntax errors) were being used.

[Chart: exceptions over time]
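The aggregation behind a plot like this is simple: bucket exceptions of a given type by time and count them. A sketch with made-up log entries:

```python
from collections import Counter

# Hypothetical exception log entries: (HH:MM timestamp, exception type, message).
entries = [
    ("17:29", "TimeoutException", "slow upstream"),
    ("17:30", "MySqlException", "syntax error near 'WHRE'"),
    ("17:30", "MySqlException", "unknown column 'usrname'"),
    ("17:31", "MySqlException", "syntax error near 'WHRE'"),
]

def failures_per_minute(log, exc_type):
    """Count occurrences of one exception type per minute bucket."""
    return Counter(t for t, e, _ in log if e == exc_type)

spikes = failures_per_minute(entries, "MySqlException")
# `spikes` reveals both when the exception first appeared and how often it
# recurs per minute, and the retained messages point at the faulty queries.
```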

Determining Traffic Patterns

Tapping into the IP addresses of incoming requests reveals very interesting traffic patterns. In the example below, the graph indicates a spike in traffic. On closer inspection, however, the spike spanned ALL countries. This suggests that the spike was not due to user behavior, which led us to investigate other possible causes (e.g., DoS attacks, simultaneous updates of mobile apps, errors in the logs). In this case, we found it was a false positive: repeated reads by log forwarders within the logging infrastructure.

[Chart: country traffic with indicator]
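One quick way to flag a suspicious spike like this is to check whether traffic rose in every country at once; organic user growth rarely does. A sketch with invented numbers and an arbitrary threshold:

```python
# Hypothetical per-country request counts for two equal time windows.
baseline = {"US": 1000, "UK": 300, "DE": 150}
spike    = {"US": 2100, "UK": 640, "DE": 310}

def spike_is_global(before, during, factor=1.8):
    """True when every country's traffic grew by at least `factor`.
    A uniform, simultaneous rise across all countries suggests a
    non-organic cause (DoS, log-forwarder replay, mass app update)."""
    return all(during[c] >= factor * before[c] for c in before)

suspicious = spike_is_global(baseline, spike)
```

The 1.8x factor is arbitrary; in practice you would tune it against normal day-over-day variance.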

Detecting Faulty Dependents (as opposed to dependencies)

Big Data log generation can be enhanced to include IDs that track the chain of service calls from clients through the various services in the system. The first column below indicates that traffic from the iOS mobile app passes through the external API gateway before reaching our service. The other columns indicate different flows, giving developers enough information to detect and isolate problems to different systems when needed.

[Chart: event flows]
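Chaining calls like this usually relies on a correlation ID that is accepted at the edge and forwarded on every downstream call. A sketch of that idea; the header name `X-Correlation-Id` is an assumption for illustration, not necessarily what Ancestry uses:

```python
import uuid

def handle_incoming(headers):
    """Reuse the caller's correlation ID, or mint one at the edge of
    the system if the request arrived without one."""
    return headers.get("X-Correlation-Id") or str(uuid.uuid4())

def outgoing_headers(cid, extra=None):
    """Attach the same ID to every downstream call so the whole chain
    (client -> gateway -> service -> dependency) shares one identifier."""
    h = dict(extra or {})
    h["X-Correlation-Id"] = cid
    return h

cid = handle_incoming({"X-Correlation-Id": "abc-123"})
downstream = outgoing_headers(cid)
# Every log record tagged with cid can later be grouped into one flow.
```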

Tracking Progression Through Various Services

Ancestry.com has implemented a Big Data framework across all services to support call tracking between services. This helps developers (who are knowledgeable about the underlying architecture) to debug whenever a scenario doesn't work as expected. The graph below depicts different methods being exercised across different services, where each color refers to a single scenario. Such data provides full visibility into the interactions among different services across the organization.

[Chart: test tracking]

Summary

Forensic logs can be harnessed with Big Data tools and frameworks to greatly improve the effectiveness of development teams. By combining various views (such as the examples above) into a single dashboard, we can give developers a health snapshot of the system at any time, whether to diagnose failures or to improve architectural designs.

By leveraging Big Data for forensic logging, we as developers are able to pinpoint faults and reproduce errors without conventional debugging tools. We have full visibility into the various processes in the system (assuming we have sufficient logs). Gone are the days when we needed to instrument code on LIVE boxes because an issue only occurred in the LIVE environment.

All of this work is done independently of the business analysts and is, in fact, crucial to the team's agility: reacting quickly to issues and continuously improving the system.

Do your developers use Big Data as part of the daily development and maintenance of web services? What would you add to increase visibility into the system and reduce bug-detection time?

Why Have a Browser Support Policy?
By Jeff Lord | Mon, 06 Jan 2014
http://blogs.ancestry.com/techroots/browser-support-policy/
With the growing number of web browsers and mobile devices being used to access content on the internet, it has become increasingly important for organizations to solidify a browser/device support policy. Internally, this type of policy can help with the development and testing of new features and pages by focusing time, effort, and resources on a select set of browsers and devices. Externally, users will have a clear understanding of expected functionality and the adjustments they can make to ensure the best experience possible when using the site.

With approximately 2.7 million subscribers and hundreds of thousands of unique visitors a day all using their preferred browsers and devices to access our site, Ancestry needed to define where our teams should focus and prioritize their time. To accomplish this, a committee of development, product, and QA representatives was organized and tasked to develop a browser support policy that accurately reflected the latest industry standards, as well as our particular users’ preferences.

As a result, the following tier system is based not only on the latest global web browser and mobile device usage statistics, but also specific analytics and percentages for our own unique users.

 

[Image: browser support tiers]

Tier 1 – Both Functionality and Visual Design
Browsers accounting for at least 10% of unique visitors for two consecutive months will be fully supported. This includes basic functionality as well as proper visual design behavior. These browsers will be tested during regular regression and whenever pages change. All bugs will be triaged and fixed in the indicated timeframe. Browsers in this category remain fully supported until they account for less than 10% of visitors for two consecutive months, at which point they receive Tier 2 support (until/unless their usage drops below 5% for two consecutive months).

Tier 2 – Functionality
Browsers accounting for 5-10% of unique visitors will receive Tier 2 support. This means visual elements on the site need not appear perfect, but all features must be functional. Basic testing is required, and major bugs will be triaged and fixed in the indicated timeframe. A browser whose traffic falls below 5% for two consecutive months will receive Tier 3 support.

Tier 3 – No Support
Browsers accounting for less than 5% of unique visitors for two consecutive months will not be individually supported. These browsers fall into two groups: uncommon browsers and out-of-date browsers. Since the majority of uncommon browsers tend to follow web standards, they will generally receive an adequate experience on Ancestry.com and therefore shouldn't be prompted to download a supported browser. Visitors using out-of-date browsers who can upgrade to a supported browser, however, should be prompted to do so.
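The tier thresholds above can be summarized as a small classification function. This sketch applies the 10% and 5% cutoffs over two consecutive months, but for brevity it ignores the hysteresis of moving between tiers (a browser only changes tier after two consecutive months on the other side of a threshold):

```python
def browser_tier(monthly_share):
    """Classify a browser from its share of unique visitors over two
    consecutive months, per the thresholds in the policy above.
    `monthly_share` is a (month1_pct, month2_pct) pair."""
    if all(pct >= 10 for pct in monthly_share):
        return 1  # full support: functionality and visual design
    if all(pct >= 5 for pct in monthly_share):
        return 2  # functionality only
    return 3      # no individual support
```

For example, a browser at 12% and 11% of visitors lands in Tier 1, one hovering at 7% and 6% in Tier 2, and one that dipped to 4% in either month falls to Tier 3.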

As for mobile devices, we have designed a majority of our pages to be responsive to the width of the browser. Pages that have been converted receive Tier 1 support, with all other pages receiving Tier 2 support.

The hope is that this policy will save time and effort internally, while providing customers and users with the best experience possible on the browsers and devices they use the most.

Featured Article: Migration to Continuous Delivery at Ancestry.com
By Seng Lin Shee | Sat, 07 Dec 2013
http://blogs.ancestry.com/techroots/featured-article-migration-to-continuous-delivery-at-ancestry-com/
Starting with the adoption of Agile development practices, Ancestry.com has progressed to a continuous delivery model that enables code release whenever the business requires it. Transitioning from large weekly or bi-weekly software rollouts to smaller, incremental updates has allowed Ancestry.com to increase responsiveness and deliver new features to customers more quickly. Ancestry.com has come a long way in developing a continuous delivery model and will continue to evolve to adapt to the fast-changing pace of the market.

The lessons learned from our efforts in building a continuous delivery model have been featured in TechTarget's SearchSoftwareQuality online magazine. You can view our photo story here.

CSS Woes
By Anders | Tue, 20 Aug 2013
http://blogs.ancestry.com/techroots/css-woes/
I'm not a front-end developer, but I often find myself writing, tweaking, and adjusting style sheets to make a particular element look just right, fix layout bugs, and deal with cross-browser issues.

Most often I will find someone else who has already done what I want to do and look at how they've styled a given element.

This is hard because any given element has hundreds of computed styles, so finding the crucial style differences between the element I'm styling and the element I want it to look like is a little like finding a needle in a haystack (particularly for someone who isn't a CSS wizard).

Thus, I wrote a tool to point out the differences for me: http://elementcomparer.aws.af.cm/
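At its core, a tool like this is just a diff over two computed-style maps. The sketch below uses hard-coded dictionaries in Python for illustration; in the browser, the maps would come from `window.getComputedStyle(element)`:

```python
def style_diff(a, b):
    """Return {property: (value_in_a, value_in_b)} for every CSS
    property whose computed value differs between the two elements."""
    return {
        prop: (a.get(prop), b.get(prop))
        for prop in sorted(set(a) | set(b))
        if a.get(prop) != b.get(prop)
    }

# Invented computed-style snippets for two buttons.
plain  = {"color": "#333", "font-size": "13px", "background": "none"}
orange = {"color": "#fff", "font-size": "16px", "background": "none"}

diff = style_diff(plain, orange)
# Only the needle, not the haystack: just the properties that differ.
```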

As an [overly obvious] example, if I were working on the front page of Ancestry.com and couldn't figure out how to make a couple of buttons look the same, I would click on:

[Image 004]

And then click on:

[Image 005]

And inspect the differences:

[Image 006]

In this case, I can tell that I just need to add a few classes to the first element, let’s try ‘orange’ and ‘lrg’:

[Image 007]

Problem solved. I wish it were always as easy as adjusting a couple of classes…

Clearly, this is a contrived example, but I use it all the time for more complex styling issues, and I’ve passed the tool along to my front-end developer friends here. I can only imagine the complex CSS styling issues they have to deal with.

There are likely bugs and browsers the tool doesn’t work in yet, but please give it a try and leave some feedback below.

Creating Random Data for Testing
By Anders | Wed, 12 Jun 2013
http://blogs.ancestry.com/techroots/creating-random-data-for-testing/
In my experience, tests that emulate real-world usage and use real-world data find more relevant bugs, convey intent more clearly, and exercise the system under test more thoroughly than tests that do not. Consider testing a cab service to assert that a given vehicle arrives at its destination:

cabService.SendVehicleToDestination(vehicle, destination);
Assert.AreEqual(vehicle.Location, destination);

Now, it shouldn't particularly matter what vehicle is sent, or where it is sent, but it's often valuable to provide methods with objects that look like the kind an actual consumer of the service would send. Enter Randomator, a tool I created to use when scaffolding test objects:

public Vehicle MakeRandomVehicle()
{
    return new Vehicle
    {
        Color = Randomator.RandomColor(),
        Year = Randomator.RandomNumber(2000, DateTime.Now.Year + 1),
        Make = _makes[Randomator.RandomNumber(_makes.Length)],
        Owner = string.Format("{0} {1}", Randomator.RandomFirstName(
                Randomator.Gender.Any), Randomator.RandomLastName())
    };
}

A sample run produces:

[Image: sample Vehicle output]

Put it all together and the test looks like:

// Arrange
CabService cabService = new CabService();
Vehicle vehicle = MakeRandomVehicle();
String destination = Randomator.RandomLocation();

// Act
cabService.SendVehicleToDestination(vehicle, destination);

// Assert
Assert.AreEqual(vehicle.Location, destination);

This is, of course, a rather contrived example. We don't have a cab service at Ancestry.com, fun as that would be, but as a family history company we do have a lot of person, event, location, and relationship objects for which this tool proves useful. Randomator has seen extensive use, enough so that I've ported it from C# to JavaScript and Java and made it available to the public under the MIT license. Check it out at: https://github.com/Ancestry/Testing-Utilities

Testing, Code Coverage, and Other Ways You Could Be Wasting Time
By Anders | Wed, 22 May 2013
http://blogs.ancestry.com/techroots/testing-code-coverage-and-other-time-wasters/
I’ll be the first to say that testing and code metrics can improve software quality and increase productivity, but an overzealous application of either could incur a heavy cost.

Tests are code, code is overhead, and while some overhead is necessary and even advisable, overhead is debt and should be minimized whenever possible.

There is no perfect product, bugs will be deployed to live systems (no matter how many tests and quality checks are in place), and for the most part, customers not only tolerate this, they expect it to one degree or another.  What will really win over clients is how fast bug fixes are delivered.

Time is often better spent refactoring existing code, and improving feedback and logging systems. Making code testable will make it far easier to diagnose problems and expedite repairs.  Tests should be put in place to prevent the same bug from surfacing more than once, but more time should be spent improving existing code quality than extending a safety net of tests to compensate for poor code quality.

I hear the term "code coverage" tossed around as if it were a panacea: "if we improve our code coverage, it will solve all our quality problems." Code coverage is a useful tool, but certainly not the most important measure of quality. Depending on the code, 30% coverage could be plenty, while in another case 70% might not be enough. And keep in mind that even if a particular piece of code is 100% covered, it may still need more testing. Code coverage is only one metric, and at best an incomplete indicator of test coverage. All too often, a rule that requires 80% code coverage encourages testing of things like:

class Foo
{
    public string Bar { get; set; }

    public Foo(string bar)
    {
        Bar = bar;
    }
}

 

[Test]
public void TestFoo()
{
    var foo = new Foo("bar");

    Assert.AreEqual("bar", foo.Bar);
}


This does not test the "Foo" class or any business logic. While it does increase code coverage, ultimately all that is being tested is the C# language itself, which is not a good use of time since you can trust the language to do its job correctly.

Testing, code coverage, static code analysis, and related tools and practices, while important, have lower value than improving code quality and testability. Ultimately, no amount of success in testing can compensate for failure in code quality.

Acceptance Testing at Ancestry.com
By Seng Lin Shee | Tue, 02 Apr 2013
http://blogs.ancestry.com/techroots/acceptance-testing-at-ancestry-com/
What Are Acceptance Tests?

Many developers are confused by the jargon used by test and software engineers when developing tests. Even test developers (TE/SET/SDET) are confused by these terms.

In general, test suites occur in the following varieties:

  • Unit tests
  • Integration tests
  • End-to-end tests

To add to the confusion, there are:

  • Functional tests
  • Acceptance tests

We understand that it is good practice to run a product against a set of acceptance test suites before releasing code, or before a feature can be considered complete. As our team practices test-driven development (TDD) and behavior-driven development (BDD), developers need to create an acceptance test first, and are often initially confused about what these tests should look like.

I refer you to Jonas Bandi's blog, which gives a good explanation of what an acceptance test is.

[Diagram: acceptance vs. integration tests]

Acceptance tests in relation to unit, integration, and end-to-end tests

To summarize, an acceptance test is a way to represent a test case: one that meets the acceptance criteria of a story. An acceptance test is basically the contract between the product owner and the developers when feature development starts in a development life cycle (be it the Sprint or Waterfall model). A story in this context refers to a feature set that product management wants delivered to the customer.

So, the question would be: what is the best approach to writing acceptance tests?

My answer would be: it depends…

To clarify, let me provide three scenarios. Unit tests, integration tests, and end-to-end tests can all be written as acceptance tests, depending on what the feature is.

  • A unit test may be appropriate if we are delivering something simple that doesn't require complex dependencies or strict behavior between interfaces.
  • Integration tests are commonly used when the acceptance criteria are restricted to interactions between well-known modules.
  • Most of the time, though, features are shipped based on their usability and the ability to perform a certain task or function. This involves user scenarios in which multiple feature sets work together to achieve the wanted result, and this is when an end-to-end test comes in handy. Such tests are normally the closest to real-world scenarios.

The key point here is that the feature specification should be known beforehand. If there is an unknown, it either has to be fleshed out, or the story needs to be sliced into smaller pieces and addressed independently.

Integration of Acceptance Tests into Development Cycle

Be it the Waterfall or Agile model, acceptance tests are an integral part of the TDD and BDD methodologies applied in a software development process.

The Ancestry API team uses Agile for its development process. To fully benefit from acceptance testing, the act of designing and writing the tests has to be part of the Sprint cycle. This means that everyone (stakeholders, dependent parties, developers, testers, etc.) has to participate in a grooming meeting to vet the requirements of a proposed feature.

The grooming/planning meeting for vetting the acceptance criteria, which will in turn be used to construct the acceptance test, provides an avenue for the customer/product owner to convey the requirements and expected scenarios to the development team. The important concept is that there should be two-way communication between the development team and the product owner (e.g. stakeholders, program managers). If any ambiguities arise during the conversation (related to implementation, dependencies, testability, etc.), they have to be fleshed out before a story can be considered READY for development to start.

Catching Problems in Specification Phase

The READINESS of the story is critical because unknown variables may impact the velocity of the team. Problems such as an incorrect architecture can throw off the entire sprint cycle and render the implementation unusable.

Our team has benefited greatly from this approach. In one instance, unknowns were identified in upstream dependencies, leading to the conclusion that the library methods did not provide enough functionality and data to implement the required scenarios. As a result, the story was deemed not ready and sent back to the drawing board. This saved the team plenty of time compared with not detecting the problem and wasting much effort designing the wrong implementation, one that would not have met the requirements.

Acceptance Criteria to Test Code

The development of the acceptance test is an integral part of the development process. One approach we have taken is to represent an acceptance test in a form readable by the product owner. In our case, we utilize Gherkin in .NET via the SpecFlow framework.

Benefits of using the SpecFlow framework:

  • Reusability of GIVEN, WHEN, and THEN statements
  • Isolation of responsibility in each statement makes execution and validation easier in the test
  • Regular expressions in statement definitions allow statements to be reused across scenarios
  • Good integration with Visual Studio (debugging, etc.)

An acceptance test would ideally be a direct translation of the acceptance criteria defined by the stakeholders of the feature. There is a melding of skill sets between stakeholders and developers during this stage. Stakeholders write acceptance criteria in Gherkin form, being aware of:

  • Initial states of the scenario (GIVEN statements)
  • Feature to implement (WHEN statements)
  • Expected outcome (THEN statements)
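A minimal, hypothetical feature written this way might look like the following; the feature and step wording are invented for illustration, not taken from an actual Ancestry story:

```gherkin
Feature: Person search
  As a subscriber, I want to search for an ancestor by name
  so that I can attach records to my family tree.

  Scenario: Search returns matching persons
    Given a subscriber is signed in
    And the index contains a person named "John Smith"
    When the subscriber searches for "John Smith"
    Then the results include a person named "John Smith"
```

Each Given/When/Then line then binds to a reusable step definition in SpecFlow, which is what makes the specification executable.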

What results is a specification that ensures:

  • Development team is building the feature correctly
  • Feature team is building the right feature
  • Product is testable – which allows automation
  • All dependencies and ambiguities have been fleshed out

The development team then rewrites these criteria in SpecFlow, automates the tests, and adds the new acceptance tests to the test suite. When all acceptance tests pass, the feature is considered complete in terms of meeting the stakeholders' requirements.

Is That The Only Testing Needed For a Feature?

Of course not. There are plenty of other tests that help ensure the quality of the product, a topic that is not the focus of this article. However, acceptance testing drives the behavior-driven design approach of the development process and is critical to ensuring a healthy communication channel between stakeholders and the development team.

Future Talks

I will be giving a presentation on Continuous Delivery at Ancestry.com on June 5, 2013 at Caesars Palace, Las Vegas, NV during the Better Software & AGILE Development Conference WEST.

[Image: Better Software & AGILE Development Conference WEST 2013]
