Overview

I worked with a group of designers to evaluate the performance of two versions of a website by analyzing user behavior. After creating the two prototypes, we formulated a number of hypotheses about their relative usability. Visitors were randomly presented with Version A or Version B, and we collected data about their sessions. We used this data to test and reflect on our hypotheses. The second part of the study involved eye tracking: we prepared an additional set of hypotheses and then conducted a number of eye tracking tests to identify how users navigate the two versions of the website. We reflected on our findings and made final comparisons based on the user tests.

From here you will be randomly taken to Version A or B of the website: https://peaceful-island-36741.herokuapp.com/

Team Members: David Walden, Lauren Hung, Julia Lemle and Jennifer Spatz

 

[Screenshots: the two versions of the website]

Part I: A/B Testing

8 Hypotheses

I. Click-through rate: proportion of sessions with at least one click

A. Null hypothesis: The click-through rate on Version A will be equal to the click-through rate on Version B.

B. Alternative hypothesis: The click-through rate on Version A will be greater than that on Version B, because Version A has more colorful imagery.

II. Time to click: average time it took a user to make the first click (this value only exists if there was a click)

A. Null hypothesis: The time to click will be equal for Version A and Version B.

B. Alternative hypothesis: The time to click will be shorter for Version A than for Version B, because Version B requires users to scroll for more information.

III. Dwell time: average time a user spent on the external page before returning to the landing page

A. Null hypothesis: The dwell time will be the same for Version A and Version B.

B. Alternative hypothesis: The dwell time will be longer for Version B than for Version A, because users will be reluctant to return to a page that takes longer to navigate.

IV. Return rate: proportion of sessions that left the landing page and returned

A. Null hypothesis: The return rate will be the same for Version A and Version B.

B. Alternative hypothesis: The return rate will be greater for Version A than for Version B, because users will be more likely to return to a page that takes less time to navigate.

 

Analysis

I. Click-through rate: proportion of sessions with at least one click

(number of unique clicks) / (number of unique sessions)

A. 59%

B. 28%

  • The appropriate statistical test for click-through rate is the chi-square test, because we are comparing categorical data (click vs. no click). p = 0.0354
  • The p-value falls below the 0.05 threshold, so we reject the null hypothesis and conclude that Version A outperforms Version B in click-through rate. This supports our alternative hypothesis. (A sketch of this test appears below.)
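As a rough illustration, this test can be run with SciPy. The session counts below are hypothetical placeholders chosen to match the reported proportions (59% vs. 28%); the original analysis used our actual session data, so the printed p-value will not match ours exactly.

```python
# A minimal sketch of the click-through-rate chi-square test using SciPy.
# The counts are hypothetical; only the proportions (59% A, 28% B) are ours.
from scipy.stats import chi2_contingency

# Rows: Version A, Version B. Columns: sessions with a click, sessions without.
observed = [
    [17, 12],  # Version A: 17/29 ≈ 59% click-through (hypothetical n = 29)
    [8, 21],   # Version B:  8/29 ≈ 28% click-through (hypothetical n = 29)
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")  # reject the null if p < 0.05
```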

II. Time to click: average time it took a user to make the first click (this value only exists if there was a click)

Average of [(click time) - (page load time)]

A. 14124 ms

B. 25000 ms

  • The appropriate statistical test for time to click is the t-test, because we are comparing means. p = 0.13
  • Version B's average time to click was longer, but p = 0.13 is above the conventional 0.05 threshold, so we cannot reject the null hypothesis; the observed difference is suggestive rather than statistically significant. (A sketch of this test appears below.)
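For reference, the comparison can be sketched with Welch's t-test in SciPy, which does not assume equal variances between the two groups. The per-session times below are invented for illustration; only the group means reflect our reported results.

```python
# A minimal sketch of the time-to-click t-test using SciPy. The per-session
# first-click times (ms) are hypothetical; only the group means (~14,124 ms
# for A, ~25,000 ms for B) reflect our reported results.
import numpy as np
from scipy.stats import ttest_ind

times_a = np.array([9000, 12500, 14000, 16000, 19120])   # Version A first clicks
times_b = np.array([15000, 21000, 26000, 29000, 34000])  # Version B first clicks

# equal_var=False selects Welch's t-test; alternative="less" would instead test
# the one-sided hypothesis that Version A's mean time to click is shorter.
t, p = ttest_ind(times_a, times_b, equal_var=False)
print(f"t = {t:.3f}, p = {p:.3f}")
```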

III. Dwell time: average time a user spent on the external page before returning to the landing page

Average of [(2nd page load time) - (click time)]

A. 5006 ms

B. 4230 ms

  • The appropriate statistical test for dwell time is the t-test, because we are comparing means. p = 0.74
  • This p-value indicates no significant difference between the dwell times of Version A and Version B. (A sketch of computing this metric from raw timestamps appears below.)
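To show how this metric falls out of the session logs, here is a small sketch that computes dwell time from click and return timestamps per the formula above. The event-log schema (records with session, event, and time fields) is a hypothetical example, not our actual logging format.

```python
# A sketch of deriving dwell time per the formula above:
# dwell = (2nd page load time) - (click time), averaged over sessions.
# The event records are invented; the schema is not our real logging format.
from collections import defaultdict

events = [
    {"session": "s1", "event": "click",  "time": 4200},
    {"session": "s1", "event": "return", "time": 9100},   # landing page reloads
    {"session": "s2", "event": "click",  "time": 6050},
    {"session": "s2", "event": "return", "time": 10400},
]

by_session = defaultdict(dict)
for e in events:
    by_session[e["session"]][e["event"]] = e["time"]

dwells = [s["return"] - s["click"]
          for s in by_session.values() if "click" in s and "return" in s]
print(f"average dwell time: {sum(dwells) / len(dwells):.0f} ms")
```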

IV. Return rate: proportion of sessions that left the landing page and returned

(number of returns) / (number of clicks)

A. 64%

B. 43%

  • The appropriate statistical test for return rate is the chi-square test, because we are comparing categorical data. p = 0.3
  • This p-value tells us that there is no significant difference between the return rates of Version A and Version B. A larger sample size would potentially yield more conclusive results. (A rough sample-size estimate appears below.)
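To put a rough number on that last point, statsmodels can estimate how many clicked sessions per version would be needed to detect a 64% vs. 43% gap at conventional settings. This is purely illustrative and was not part of the original analysis.

```python
# A rough sample-size sketch using statsmodels: how many clicked sessions per
# version would be needed to detect the observed return-rate gap (64% vs. 43%)
# at alpha = 0.05 with 80% power. Illustrative only.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

effect = proportion_effectsize(0.64, 0.43)  # Cohen's h for the two proportions
n = NormalIndPower().solve_power(effect_size=effect, alpha=0.05, power=0.8,
                                 alternative="two-sided")
print(f"~{n:.0f} clicked sessions needed per version")
```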

 

Part II: Eye Tracking

Group Hypothesis

Version B will have more distributed eye gaze fixations than Version A, because Version B requires scrolling to see all taxi options, while Version A has all options in the initial window.

 

Analysis

The eye tracking results seem to support our hypothesis that eye gaze fixations would be more distributed on Version B. Users tended to focus on the center of the page where most of the content is concentrated when navigating Version A, but their eyes would move up and down the page when scrolling through Version B. The time-to-click of Version A was much shorter than that of Version B, presumably because all the content was much more condensed.
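One way to make "more distributed" concrete is to measure the vertical spread of fixation coordinates. The sketch below uses invented fixation points, not our recorded gaze data, purely to show the idea.

```python
# A sketch of quantifying gaze distribution: the standard deviation of fixation
# y-coordinates (px) captures vertical spread down the page. These fixation
# points are invented for illustration, not our recorded eye tracking data.
import numpy as np

fixations_a = np.array([[640, 360], [610, 400], [660, 390], [630, 420]])   # clustered mid-page
fixations_b = np.array([[640, 200], [620, 650], [650, 1100], [630, 1500]]) # spread while scrolling

spread_a = fixations_a[:, 1].std()
spread_b = fixations_b[:, 1].std()
print(f"vertical spread: A = {spread_a:.0f} px, B = {spread_b:.0f} px")
```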

One thing that I didn’t expect to observe was that users tended to fixate their gaze at the top of the page for a significant period of time when viewing both versions of the website. They seemed to be reading the title and fine print before proceeding to navigate the rest of the website. I question whether a typical user would spend this much time studying the top of the page outside of a laboratory setting.

 

Part III: Comparison

Recommendations

Our company should use Version A of the website because it had a significantly higher click-through rate than Version B according to the A/B Testing data. Both the A/B Testing and the Eye Tracking data indicated that the time to click was shorter for Version A, which suggests that Version A is more effective at guiding users to taxi-related content. A sensible next step would be to make minor changes to Version A and validate them with additional rounds of A/B Testing and Eye Tracking.

 

A/B Testing Data vs. Eye Tracking Data

While both tests suggested that the time to click was shorter on Version A than on Version B, the A/B test gave a more rigorous quantitative analysis of the data, while the Eye Tracking test helped shed light on why the time to click was shorter. Version A displayed all the pertinent information on the initial screen, whereas Version B required the user to scroll down and scan to find the options; this observation only became apparent through the Eye Tracking analysis.

The A/B Testing provides a more objective evaluation of data relating to the usage of our websites. It gives us concrete statistical figures that can be used to optimize specific features of a website or application, but sometimes these numbers are too narrowly scoped to capture the overall user experience. A/B Testing is also a time-consuming process, and certain performance metrics might not be measurable with this approach. Eye Tracking is quicker to implement, and visual trends are immediately apparent just by glancing at the data. However, this technique is less precise than A/B Testing, and there is more room for testers to be influenced by personal biases.