The sites in Sweden with the most unnecessary payload

13 June 2013

How big should a web page be today? Checking HTTP Archive (one of Steve Souders's fantastic hobby projects that keeps track of how the world's web sites are built), the weight of the average page is about 1.4 megabytes. That is quite large, and more interesting is how much of that size is unnecessary payload. We (me and my friend Per-Anders Rangsjö) decided we should investigate that and do it for the biggest Swedish sites.

First, why is the payload size interesting? Well, the data that is sent in each response can add latency to your application (both the cost of the bytes themselves and penalties for crossing IP packet boundaries). That's why it is important not to send any information that is unnecessary and could have been removed if the site had been built following standard web performance best practices.

Following the rules makes the site faster for the end user, cheaper for the business and better for the environment! There are many other web performance best practices that will help your site, but today we will only focus on the amount of data sent.

The tools

Ok, how shall we measure these kinds of things, and how do we automate it? I think I have an idea ...

The nutty professor's way of testing

... maybe we can start by checking if the images are the right size, comparing the images' real dimensions with their size in the browser. We could use HTML5 (via PhantomJS) and then crop the images using ImageMagick. Crunch them with ImageOptim and/or JPEGmini. Then we would know if the images are compressed and use the right width & height! With some curl magic we could see if the pages are compressed (gzipped), maybe use Zopfli to crunch text assets and Yahoo's (old) YUI Compressor to minify them? What do you say?
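To give a feel for the "check if text assets are compressed" part of this plan, here is a minimal sketch in Python. It estimates how much a text asset would shrink with gzip; if a server sends the raw size instead, the difference is wasted payload. The repetitive HTML snippet is a made-up placeholder, not from any real site (real pages often compress by 70-80% for text, but that varies).

```python
import gzip

def gzip_savings(text: str):
    """Return (raw_bytes, gzipped_bytes) for a text asset.

    A rough stand-in for the curl/Zopfli checks described above:
    if the server ships the raw size, the difference is waste.
    """
    raw = text.encode("utf-8")
    compressed = gzip.compress(raw)
    return len(raw), len(compressed)

# Hypothetical, highly repetitive markup (assumption for illustration).
page = "<div class='teaser'><p>Hello</p></div>\n" * 500
raw_size, gz_size = gzip_savings(page)
print(raw_size, gz_size)  # compressed size is a small fraction of raw
```

The same idea applies to CSS and JavaScript; in practice you would fetch the page with and without an `Accept-Encoding: gzip` header and compare the transferred sizes.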

The secret weapon

Hey, stop, that is reinventing the wheel! You could do this more easily with a tool that already exists. Going through different alternatives, Google Page Speed and its API will do the trick! With some hacking, we can automate the tests of the pages. The code is here on Github if you want to test other sites or develop it further.
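The automation boils down to: run each URL through the Page Speed service, then pull the potential savings out of the result per rule. Here is a sketch of that second step. The response shape below is a simplified assumption for illustration, not the exact Google API format, and `www.example.se` is a placeholder; the real service is queried over HTTP with the page URL and an API key.

```python
# Extract per-rule savings from a Page Speed-style result.
# NOTE: the dict layout here is an assumption, not the real API schema.
def savings_per_rule(result: dict) -> dict:
    """Map rule name -> kilobytes that could be saved."""
    return {
        rule["name"]: rule["savings_kb"]
        for rule in result.get("rules", [])
        if rule.get("savings_kb", 0) > 0
    }

sample = {
    "url": "http://www.example.se/",  # placeholder site
    "rules": [
        {"name": "Enable Gzip Compression", "savings_kb": 886.6},
        {"name": "Minify HTML", "savings_kb": 4.0},
        {"name": "Specify a cache validator", "savings_kb": 0},
    ],
}
print(savings_per_rule(sample))
```

Rules with zero savings are dropped, so what remains is exactly the unnecessary payload per rule that the tables below report.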

The sites

Which sites shall we test? In Sweden, the largest ad-driven sites report their number of unique users & page views to an organization called KIA Index (in this article, when I write "the largest sites" I mean those found in KIA Index). Let's use that list, take the 100 largest sites and test their start pages. Knowing the number of unique users and page views is cool; maybe we can use them for some calculations.

The rules

We will use the Google Page Speed API to check how much you can save by following these rules:

- Optimize Images
- Enable Gzip Compression
- Serve Scaled Images
- Minify JavaScript
- Minify CSS
- Minify HTML

You can read more about how to reduce payload here.

The test

The test was run on May 8 with the statistics from KIA Index week 17. We tested the 100 biggest sites, but KIA Index groups sites that are under the same ownership, so the total became 211 sites.

Savings per rule

It looks like this per rule:

Rule                     Median (kb)   Max (kb)
Optimize Images          87.2          864.4
Enable Gzip Compression  56.1          886.6
Serve Scaled Images      55.5          830.4
Minify JavaScript        13.35         422.5
Minify CSS               4.8           49.0
Minify HTML              2.8           228.0

What do the numbers mean? Well, the median site could save 87 kb by optimizing its images. How much is 87 kb? Check out this image, it is 86 kb. That is the amount of extra data that is sent. And there is a site out there that could bring down its image weight by 864 kb! Wow, staggering numbers; that is a huge amount of extra data for one page view!

From an end-user perspective it is also really important to use progressive images. We haven't tested that for the Swedish sites, and it would be really interesting to know how commonly it is used on Swedish sites compared to the rest of the world. You can read more about how often it is used on the largest sites in the world in this article by Ann Robson and this one by Patrick Meenan.

However, the next figure is also really surprising! There is a site in the Swedish top 100 that can save 886 kb by using Gzip! That is the largest Swedish newspaper on the west coast. And the median value is 56.1 kb. This is a really easy win, because all you have to do is turn on compression on your server.

Minifying JavaScript, CSS and HTML can also save data on several sites. The median wins here are not as big, but there are a couple of sites that really could gain by doing it.

Serving images at the correct scaled size is important, and it is getting more and more relevant with responsive web sites. For one site, it would actually save over 800 kb.

There really is a lot to do there for us performance geeks, great!

The site that sends the most amount of unnecessary data

Which site in Sweden has the largest unnecessary payload? Well, it depends. In total kilobytes (meaning how much data you can remove), this is the list:

Size (kb): 1361.3, 1342.9, 1330.7, 1262.6, 1223.7, 1132.5, 1126.6, 1124.9, 1086.4, 1080.1

All of them send over 1 megabyte of extra data that you can and should remove! The worst one's total weight is 4528 kb, and it could save 19.3 kb by minifying JavaScript, 4 kb by minifying HTML, 3.1 kb by minifying CSS, 105 kb by optimizing images, 342.8 kb by serving scaled images and 886.6 kb by enabling Gzip!

If you check percentage (that is, what percentage of the total weight could be removed), the list looks like this:

%: 45.56, 44.34, 44.29, 37.19, 37.06, 36.33, 35.68, 33.54, 32.73, 32.3

Wow, the top 3 on the list send almost 45% of their total weight completely unnecessarily! No comment.

How much is wasted in Sweden?

We also made a rough estimate of how many bytes are actually sent unnecessarily per week from the largest sites. We used the unique-browser numbers from KIA, tested the start pages and multiplied. This does not give completely accurate figures (a perfect calculation would have used unique browsers and the number of requests, checked the cache times of objects, and counted all items that have no cache time at all). Our calculation is also a bit biased because we only tested the start pages. On the other hand, most start pages get the most hits, so the start page gives the most bang for the buck when optimized, i.e., it should be the most optimized page on the site. In any case, we found that a total of 8.8 TB per week could be saved for the largest sites in Sweden.
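The estimate above is a simple multiply-and-sum, which can be sketched like this. The per-site figures below are made-up placeholders, not the real KIA Index numbers, and the sketch shares the caveats mentioned above (it ignores caching and repeat requests).

```python
# Back-of-the-envelope waste estimate: wasted kb per start-page view
# times weekly unique browsers, summed over all sites.
def weekly_waste_tb(sites) -> float:
    """sites: list of (wasted_kb_per_view, unique_browsers_per_week)."""
    total_kb = sum(kb * browsers for kb, browsers in sites)
    return total_kb / 1024 ** 3  # kb -> TB

# Hypothetical numbers for illustration only.
sites = [
    (1361.3, 1_200_000),  # a heavy site with many visitors
    (54.5, 5_000_000),    # a well-optimized site with even more visitors
]
print(round(weekly_waste_tb(sites), 2))
```

Note how the well-optimized site still contributes meaningfully to the total because of its traffic, which is the same effect described for Aftonbladet below.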

What can be saved per site?

We also checked how much you can save per site.

Size that can be saved per week in traffic (gb): 806.3, 728.4, 520.1, 400.9, 382.7, 232.7, 232.7, 200.8, 154.2, 151.4

The site at the top of the list (Sweden's largest morning newspaper) could save the most if they optimized their site. It is also interesting that Aftonbladet (the largest site & newspaper in Sweden) is super optimized (they can only save 54.5 kb per page view by optimizing their images) in comparison to other sites. Still, they can save a lot in total, because they have the most page views in Sweden.

The site that sends the least amount of unnecessary data

Which site sends the least unnecessary data? Measuring kilobytes does not feel quite right, since a small site will automatically have low values. Let's check how it looks in percent, that is, how small a portion of the site's total weight would disappear if you optimized it.

%: 0.35, 0.38, 0.67, 1.31, 1.51, 1.74, 1.75, 1.84, 1.90, 1.93

Really good work by the top three on the list!

Sweden's heaviest home page

We got some extra fun details while doing this investigation. Here is the list of the sites in Sweden that have the largest start pages.

Size (mb): 9.8, 9.7, 7.6, 6.2, 6.0, 5.9, 5.8, 5.4, 5.3, 5.1

And you remember that the average page weight from HTTP Archive is 1.4 megabytes? Being above that average is bad; 10 megabytes is really bad. A site could still be fast with that amount of data if it is built the right way, but sending it over a 3G connection is just a killer.

Yes, that's the situation for the largest sites in Sweden today. There are many sites that can remove a lot of unnecessary data!

Written by: Peter Hedenskog