SEO Step 4:

Data Analysis

After Identifying Site Bottlenecks, Delivering Solutions, and understanding Measurement Tools, we are now in a position to begin the deeper data analysis and give tangible meaning to the data presented.

To understand whether a metric is under- or overperforming, benchmarks must first be set. A benchmark is a carefully calculated value per metric that is used as a standard, or “what we achieve on a regular day without influence”. Benchmarks are key to understanding how successful a solution delivery is, as they give us a standard to work with. Benchmarks also ensure that we are comparing apples to apples when looking at data. For example, a deficient benchmark would be to use toy sales in the last week before Christmas to compare against full-year sales: the last week before Christmas has the highest volume of toy sales, and comparing the rest of the year to that benchmark would make it seem that the rest of the year “does not perform to that same level”, which would be an unnecessarily negative conclusion.

There are many ways to create benchmarks, depending on what we are measuring. For our purposes, we are looking for performance improvements for a site element, a webpage, or the entire site, measured through key business KPIs such as page views, clicks, average session duration, conversions, etc. Consequently, we need to look over a period of time to create an average or median benchmark per metric that reflects a business’s prior success. For businesses that have seen significant growth in a short period, it may be difficult to look at, for example, a year-long web report and take a rough average. So, we will break down benchmark creation into two methods: long-run historical benchmarking and forecasted growth rate benchmarking.

1. Long-run Historical Benchmarking

Long-run historical benchmarking is a method that uses a large amount of past web performance data to create benchmarks. This method tends to work for websites that have had web data capabilities for a reasonable period, websites that have seen steady (neither growing nor declining) web performance, or sites that do not directly drive the majority of business growth/sales. To create an annual performance benchmark (one of the simplest benchmarks), a company can compare a number of years’ data and create an average/median benchmark for web metrics. It is key for an annual benchmark to be compared against other years to account for any external variables that affected web performance. For example, if in one year a certain month received an unusual/unplanned number of page views or conversions, then looking at longer time frames ensures that such discrepancies do not skew the benchmark too heavily.

Long-run historical benchmarking is also very useful for creating month-by-month benchmarks. For example, if you wish to see what your benchmark page views are for December, then you will compare December 2019’s total page views with December 2018’s, December 2017’s, and so on for as many years of data as are available. This is the easiest method for creating web performance benchmarks because it does not require intensive calculations or models; these benchmarks can be created even just by looking at graphs.

But there are some caveats to this technique. First, the amount of usable data is highly constrained by the growth of the business’s website. If a business is focused on e-commerce, or its website is a major channel of business growth, then the business will make constant efforts to increase web performance through marketing. A flat benchmark built from yearly data will therefore not be accurate. So the first main constraint is the type of product/service that the business provides: this benchmarking model works best for B2B sites, landing pages designed exclusively to redirect customers to physical POS locations, etc.

The second caveat for this model is that if the first round of solution deliveries produces a positive change in web performance, then the first version of the benchmark will no longer be relevant, and another benchmark will need to be developed. But because the “new performance” values are relatively new, there is not enough data to measure across an entire year, or possibly even a month. Thus, if there is a significant change in performance after the creation of the first benchmark, new data will need to be collected and analyzed before a new benchmark can be created. The benefit of this model is that creating new benchmarks is a very easy and non-technical process.
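As a rough illustration of the December example in this section, here is a minimal Python sketch of a per-month benchmark. The page-view figures are made up purely for illustration, and the function name monthly_benchmark is a hypothetical helper, not part of any analytics tool.

```python
from statistics import mean, median

# Hypothetical monthly page-view totals keyed by (year, month).
# These numbers are invented for illustration only.
page_views = {
    (2017, 11): 7_900,
    (2018, 11): 8_100,
    (2019, 11): 8_000,
    (2017, 12): 10_200,
    (2018, 12): 9_800,
    (2019, 12): 10_500,
}


def monthly_benchmark(data, month, use_median=False):
    """Benchmark one calendar month across every year we have data for."""
    values = [views for (_, m), views in data.items() if m == month]
    if not values:
        raise ValueError(f"no data for month {month}")
    # The median is less sensitive to a single unusual year than the mean.
    return median(values) if use_median else mean(values)


december_mean = monthly_benchmark(page_views, 12)
december_median = monthly_benchmark(page_views, 12, use_median=True)
print(f"December benchmark: mean={december_mean:.0f}, median={december_median}")
```

Whether the mean or the median is the better choice likely depends on how many years of data exist and how unusual any single year was.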

2. Forecasted Growth Rate Benchmarking

Forecasted growth rate benchmarking is another method of creating benchmarks for web performance. It is vastly different from long-run historical benchmarking in that it does not require much prior data, it is a highly technical and quantitative calculation process, and it is suited to a very different set of sites. For example, companies with constant site performance growth, B2C businesses, and sites that drive the majority of business growth/sales will benefit more from using this model. Forecasted growth rate benchmarking uses external business KPIs (e.g. sales volume, revenue, engaged users, etc.) to create a forecasted growth rate for future site KPIs, which then acts as a benchmark.

The reasoning behind using external business KPIs as a rough proxy for website performance depends on the type of business (as stipulated above): if an e-commerce site is seeing an increase in sales volume, then we would expect higher web performance as well. Which specific KPI correlates most strongly with changes in website performance is a question for business intelligence analysis, and businesses will need to invest their own time in finding the most relevant KPI.
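One simple way to start that search, sketched here as an assumption rather than a prescribed method, is to compute the Pearson correlation between each candidate KPI and a web metric over the same periods. All figures and KPI names below are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Hypothetical monthly figures, invented for illustration only.
page_views = [1000, 1100, 1250, 1400, 1500, 1700]
candidate_kpis = {
    "revenue": [20, 23, 25, 27, 31, 34],
    "support_tickets": [5, 3, 6, 4, 5, 4],
}

scores = {name: pearson(series, page_views)
          for name, series in candidate_kpis.items()}
# Pick the KPI whose correlation is strongest in absolute terms.
best_kpi = max(scores, key=lambda name: abs(scores[name]))
print(best_kpi, round(scores[best_kpi], 3))
```

A high correlation on a handful of data points is only a starting hint; it does not show that the KPI drives web performance.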

Once a KPI is selected, we need to look at as much past data as exists for this KPI. For our study, we will use revenue as the business KPI. Follow the steps below to develop a model and calculate a KPI growth rate:

Acquire as much historical data as possible for the selected KPI

Draft a table in Microsoft Excel with all the organized datapoints

Use the “Create graph” function to create a scatter plot with time on the x-axis and the KPI on the y-axis

Insert a trendline with the following conditions:

  1. The trendline cannot be a “logarithmic” or a “moving average” function
  2. The trendline cannot be a polynomial function with an order greater than 2
  3. The R² value must be greater than 0.95
  4. The trendline equation must be viewable

Please note that although this process permits the use of exponential, polynomial, and a number of other types of functions, I highly recommend using only a linear or quadratic function, because the calculations will be significantly simpler and business KPI graphs tend not to look like very complicated functions.
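For readers who would rather script this than use Excel, the sketch below is one possible Python equivalent of the trendline step, using only the standard library. It fits a linear trendline first, falls back to a quadratic, and accepts a fit only if R² is above 0.95. The revenue figures and all function names are hypothetical, and the least-squares routine is a plain textbook implementation, not Excel's own.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit. Returns coefficients, lowest power first."""
    n = degree + 1
    # Build the normal equations (X^T X) c = X^T y.
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Solve by Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate a polynomial given as coefficients, lowest power first."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - evaluate(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when every y value is identical")
    return 1 - ss_res / ss_tot


def choose_trendline(xs, ys, min_r2=0.95):
    """Prefer a linear fit, then a quadratic; None if neither clears min_r2."""
    for degree in (1, 2):
        coeffs = fit_polynomial(xs, ys, degree)
        r2 = r_squared(xs, ys, coeffs)
        if r2 > min_r2:
            return degree, coeffs, r2
    return None


# Hypothetical monthly revenue, invented for illustration only.
months = [0, 1, 2, 3, 4, 5, 6, 7]
revenue = [100, 102.5, 103.8, 106.1, 108.2, 109.7, 112.3, 114.0]

result = choose_trendline(months, revenue)
print(result)
```

If choose_trendline returns None, neither a linear nor a quadratic trendline meets the R² condition, and the KPI may simply not be a good candidate for this method.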

With a trendline that closely mirrors the empirical data, we can use the trendline equation to compute a KPI growth rate by taking the function’s derivative. For example, if the function for my revenue model is: Revenue = 2*(Time) + 100, then: Revenue growth rate = 2. For functions with an order greater than one (such as a quadratic), you will notice that the independent variable (i.e. time) still appears in the KPI growth rate, and that is completely normal.
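The derivative step can be sketched in a few lines of Python. The linear model is the one from the example above; the quadratic model is a hypothetical one added only to show that time remains in the growth rate. Polynomials are stored as coefficient lists with the lowest power first, which is a convention chosen for this sketch.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as coefficients, lowest power first."""
    return [power * c for power, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate a polynomial given as coefficients, lowest power first."""
    return sum(c * x ** power for power, c in enumerate(coeffs))


# Revenue = 2*(Time) + 100, as in the example above.
linear_model = [100, 2]
linear_rate = derivative(linear_model)  # constant growth rate

# Hypothetical quadratic model: Revenue = 0.5*(Time)^2 + 2*(Time) + 100.
quadratic_model = [100, 2, 0.5]
quadratic_rate = derivative(quadratic_model)  # growth rate = 1.0*(Time) + 2

print(linear_rate)
print(quadratic_rate)
print(evaluate(quadratic_rate, 4))  # growth rate at Time = 4
```

For the quadratic model the growth rate has to be evaluated at a specific point in time, so the benchmark changes from period to period.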

With this KPI growth rate, we have developed a benchmark that your solution deliveries should outperform. There are some major caveats to this model. First, assuming that a business KPI and web performance are correlated is a very large assumption, and there is very little research to show that it can be made soundly. Second, this model only works for companies that are B2C and depend on online sales to drive growth.

Now that we understand the two foremost methods of creating benchmarks, we can try to draw meaning from the solution delivery data. If we can see a clear positive change in performance after the solutions were implemented, then it can be concluded that the solution delivery works. But in many cases, there are two difficulties in analyzing website performance data.

First, it is very rare to see a clear change in site performance simply by optimizing a website’s front and back end. For a user to click into a website, they must intrinsically be interested in what the website has to offer. So, even with a highly optimized site, a developer cannot influence the amount of incoming traffic from search engines like Google.

Second, it is very difficult to see clear changes in performance if the website has very low page views in the first place. For example, if a webpage has a benchmark of 5 views a day, and 2 days after the introduction of a few website solutions there are 15 page views, that is 300% of the benchmark. Arguably, that is a very significant increase in performance. But with such low page views, 15 views could easily have been an anomaly or an error. On the other hand, if the benchmark were 5,000 page views a week, and the next week there were 15,000 page views, then clearly something is working well.
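One way to avoid reading too much into a single spike is to require that the uplift holds across several periods before calling it a real change. The sketch below is only an illustration of that idea: the function name is_sustained and the thresholds (at least 4 periods, at least 80% of them above the benchmark) are assumptions chosen for the example, not established rules.

```python
def is_sustained(values, benchmark, min_periods=4, min_share=0.8):
    """Return True if enough post-launch periods beat the benchmark.

    values      : metric per period (day, week, ...) after the change
    benchmark   : benchmark for the same period length
    min_periods : assumed minimum number of periods before judging at all
    min_share   : assumed share of periods that must exceed the benchmark
    """
    if len(values) < min_periods:
        return False  # not enough data yet to call the change sustained
    above = sum(1 for v in values if v > benchmark)
    return above / len(values) >= min_share


# Hypothetical daily views after launch: one spike, then back to normal.
print(is_sustained([15, 4, 6, 5], benchmark=5))

# Hypothetical weekly views after launch: consistently well above benchmark.
print(is_sustained([15_000, 14_200, 15_800, 14_900], benchmark=5_000))
```

A single day of 15 views does not pass this check, while four consecutive weeks near 15,000 views do.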

The key to understanding the data is to ensure that any changes in performance are sustained and consistent over a reasonable period of time. If you have any questions on any of the concepts or ideas referred to in these articles, please feel free to reach out to Siddharth Gupta at