Running A Page Speed Test: Monitoring vs. Measuring

Geoff Graham
2023-08-10
This article is sponsored by DebugBear.
There is no shortage of ways to measure the speed of a webpage. The tooling is out there to produce a report covering everything from the time it takes to establish a server connection to the time it takes for the full page to render. In fact, there’s great tooling right under the hood of most browsers in DevTools that can do many of the things a tried-and-true service like WebPageTest offers, complete with recommendations for improving specific metrics.
Figure: Lighthouse results.
I don’t know about you, but it often feels like I’m missing something when measuring page speed performance. Even with all of the available tools at my disposal, I still find myself reaching for several of them. Certain tools are designed for certain metrics with certain assumptions that produce certain results. So, what I have is a hodgepodge of reports that needs to be collected, combined, and crunched before I have a clear picture of what’s going on.
Figure: Not the best way to get a high-level view of performance.
The folks at DebugBear understand this situation all too well, and they were kind enough to give me an account to poke around their site speed and Core Web Vitals reporting features. I’ve had time to work with DebugBear and thought I’d give you a peek at it with some notes on my experience using it to monitor performance. If you’re like me, it’s hard to invest in a tool, particularly a paid one, before seeing how it actually works and fits into my work.

Monitoring vs. Measuring
Before we actually log in and look at reports, I think it’s worth getting a little semantic. The key word here is “monitoring” performance. After using DebugBear, I began realizing that what I’ve been doing all along is “measuring” performance. And the difference between “monitoring” and “measuring” is big.
When I’m measuring performance, I’m only getting a snapshot at a particular time and place. There’s no context about page speed performance before or after that snapshot because it stands alone. Think of it like a single data point on a line chart: there are no surrounding points to compare my results to, which keeps me asking, Is this a good result or a bad result? That’s the “thing” I’ve been missing in my performance efforts.

There are ways around that, of course. I could capture that data and feed it into a spreadsheet so that I have a record of performance results over time, one that can be used to spot where performance is improving and, conversely, where it is failing (something along the lines of the sketch below). That seems like a lot of work, even if it adds value. The other issue is that the data I’m getting back is based on lab simulations where I can add throttling and choose the device and network connection that are used, among other simulated conditions.
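To give you an idea of what that manual record-keeping might look like, here is a minimal sketch that queries the public PageSpeed Insights API and appends a row to a CSV file. This is my own illustration, not anything DebugBear provides; the URL and file path are placeholders, and the response fields are the ones I’d expect the API to return.

```typescript
// A minimal sketch of the "spreadsheet" approach: run a test on a schedule and
// append the results to a CSV. Uses the public PageSpeed Insights API (not a
// DebugBear API); assumes Node 18+ for global fetch. URL and path are placeholders.
import { appendFileSync } from 'node:fs';

const PSI = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed';

async function logRun(url: string, csvPath = 'page-speed-log.csv') {
  const res = await fetch(`${PSI}?url=${encodeURIComponent(url)}&strategy=mobile`);
  if (!res.ok) throw new Error(`PSI request failed: ${res.status}`);
  const data = await res.json();

  // Performance score (0–1) and LCP in milliseconds from the Lighthouse run.
  const score = data.lighthouseResult?.categories?.performance?.score;
  const lcpMs = data.lighthouseResult?.audits?.['largest-contentful-paint']?.numericValue;

  appendFileSync(csvPath, `${new Date().toISOString()},${url},${score},${Math.round(lcpMs ?? 0)}\n`);
}

await logRun('https://example.com/');
```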
On that note, it’s worth calling out that there are multiple flavors of network throttling. One is powered by Lighthouse, which observes data by testing on a fast connection and estimates the amount of time it takes to load on different connections. This is the type of network throttling you will find in PageSpeed Insights, and it is the default method in Lighthouse. DebugBear explains this nicely in its blog:

“Simulated throttling provides low variability and makes tests quick and cheap to run. However, it can also lead to inaccuracies as Lighthouse doesn’t fully replicate all browser features and network behaviors.”
In contrast, tools like DebugBear and WebPageTest use more realistic throttling that accurately reflects network round trips on a higher-latency connection.
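If you want to see the difference between these two approaches for yourself, Lighthouse exposes both when you run it programmatically. Here is a minimal sketch of my own (not DebugBear’s setup), assuming Node 18+ and the lighthouse and chrome-launcher npm packages:

```typescript
// A minimal sketch comparing Lighthouse's two throttling modes programmatically.
// Assumes the `lighthouse` and `chrome-launcher` npm packages; example.com is a placeholder.
import lighthouse from 'lighthouse';
import * as chromeLauncher from 'chrome-launcher';

async function run(url: string, throttlingMethod: 'simulate' | 'devtools') {
  const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
  try {
    const result = await lighthouse(url, {
      port: chrome.port,
      onlyCategories: ['performance'],
      // 'simulate': load fast, then estimate slower conditions (PageSpeed Insights default).
      // 'devtools': throttling is applied while the page actually loads.
      throttlingMethod,
    });
    const lcp = result?.lhr.audits['largest-contentful-paint'].numericValue;
    console.log(`${throttlingMethod}: LCP ≈ ${Math.round(lcp ?? 0)} ms`);
  } finally {
    await chrome.kill();
  }
}

await run('https://example.com/', 'simulate');
await run('https://example.com/', 'devtools');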
Real usage data would be better, of course. And we can get that with real-user monitoring (RUM), where a snippet of code on my site collects real data from real users on real network conditions, sends it to a server, and has it parsed for reporting.

That’s where a tool like DebugBear makes a lot of sense. It measures performance on an automated schedule (no more manual runs, though you can still do that with their free tool) and monitors the results by keeping an eye on the history (no more isolated data points). And in both cases, I know I’m working with high-quality, realistic data.

From there, DebugBear notifies me when it spots an outlier in the results so I am always in the know.
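Before moving on, here is a concrete picture of what the kind of RUM snippet mentioned above typically does, sketched with the open-source web-vitals library. It is not DebugBear’s actual snippet, and the /rum-collect endpoint is a made-up placeholder:

```typescript
// A hypothetical RUM snippet built on the open-source `web-vitals` package;
// the collection endpoint below is a placeholder, not a real DebugBear URL.
import { onLCP, onCLS, onINP, onTTFB, type Metric } from 'web-vitals';

function sendToAnalytics(metric: Metric) {
  const body = JSON.stringify({
    name: metric.name,   // e.g. "LCP", "CLS", "INP", "TTFB"
    value: metric.value, // measured under the visitor's real conditions
    id: metric.id,       // unique per page load, useful for aggregation
    page: location.pathname,
  });
  // sendBeacon survives page unloads; fall back to fetch with keepalive.
  if (!navigator.sendBeacon('/rum-collect', body)) {
    fetch('/rum-collect', { method: 'POST', body, keepalive: true });
  }
}

onLCP(sendToAnalytics);
onCLS(sendToAnalytics);
onINP(sendToAnalytics);
onTTFB(sendToAnalytics);
```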
The DebugBear Dashboard

This is probably what you want to see first, right? All I had to do to set up performance monitoring for a page was provide DebugBear with a URL. Data flowed in immediately, with subsequent automated tests running every four hours, a frequency that is configurable.
Once that was in place, DebugBear produced a dashboard of results. And it kept doing that over time.
Figure: The DebugBear dashboard.
You can probably look at that screenshot and see the immediate value of this high-level view of page performance. You get big score numbers, mini charts for a variety of Web Vitals metrics, and a filmstrip of the page rendering with annotations identifying where those metrics sit in the process, among other great pieces of information.
But I’d like to call out a few especially nice affordances that have made my performance efforts easier and, more importantly, more insightful.
Working With Page Speed Data
I’ve learned along the way that there are actually multiple kinds of data used to inform testing assumptions.
One type is called lab data. It, in turn, has its own subset of data types. One is observed data, where CPU and network throttling conditions are applied to the test environment before opening the page (“applied throttling,” as it were). Another is simulated data, which describes the Lighthouse method mentioned earlier, where tests are done on a high-powered CPU with a high-speed network connection, and Lighthouse then estimates how “fast” a page would load on lower-powered devices. Observed data is the high-quality type of lab data used by tools like DebugBear and WebPageTest. Simulated data, on the other hand, might be convenient and fast, but it can also be inaccurate.

A second type of data is called real-user data. This is high-quality data from actual website visitors, for example, based on Google’s Chrome User Experience (CrUX) Report. The report, released in 2017, provides network data from sessions collected from real Chrome users. This is high-quality data, for sure, but it comes with its own set of limitations. For example, the data is limited to Chrome users who are logged into their Google account, so it’s not completely representative of all users. Plus, the data is aggregated over 28 days, which means it may not be the freshest data.

Alongside the CrUX report, we also have the RUM approach to data that we discussed earlier. It’s another type of real-user monitoring that takes real traffic from your site and sends the information over for extremely accurate results.

So, having both a “real user” score and a “lab” score in DebugBear is sort of like having my cake and eating it.
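As a side note, CrUX field data is publicly queryable if you ever want to sanity-check a “real user” score yourself. Here is a minimal sketch against the CrUX API; the API key is a placeholder you would supply, and the field names are the ones I’d expect the API to return:

```typescript
// A minimal sketch of pulling CrUX field data directly (assumes Node 18+ for
// global fetch and your own Google API key; not how DebugBear fetches its data).
const CRUX_ENDPOINT = 'https://chromeuxreport.googleapis.com/v1/records:queryRecord';

async function getCruxP75(origin: string, apiKey: string) {
  const res = await fetch(`${CRUX_ENDPOINT}?key=${apiKey}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ origin, formFactor: 'PHONE' }),
  });
  if (!res.ok) throw new Error(`CrUX API error: ${res.status}`);
  const { record } = await res.json();
  // 75th-percentile values, aggregated over the trailing 28-day collection window.
  return {
    lcp: record.metrics.largest_contentful_paint?.percentiles.p75,
    cls: record.metrics.cumulative_layout_shift?.percentiles.p75,
    inp: record.metrics.interaction_to_next_paint?.percentiles.p75,
  };
}

getCruxP75('https://www.smashingmagazine.com', 'YOUR_API_KEY').then(console.log);
```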
This way, I can establish a “baseline” set of conditions for DebugBear to use in my automated reports and view them alongside actual user data while keeping a historical record of the results.
Comparing Tests
Notice how I can dig into the data by opening up any test at a specific point in time and comparing it to other tests at different points in time.
The fact that I can add any experiment on any page, and as many of them as I need, is just plain awesome. It’s especially great for our team here at Smashing Magazine because different articles use different assets that affect performance, and the ability to compare the same article at different points in time or compare it to other pages is incredibly helpful to see exactly what is weighing down a specific page.
DebugBear’s comparison feature goes beyond mini charts by providing larger charts that evaluate more things than I can possibly print for you here.
Running Page Test Experiments
Sometimes I have an idea to optimize page speed but find I need to deploy the changes to production first so that a reporting tool can re-evaluate the page for me to compare the results. It would be a lot cooler to know whether those changes are effective before hitting production.

That’s what you can do with DebugBear’s Experiments feature: tweak the code of the page being measured and run a test you can compare to other live results.
Figure: See the “Prettify Code” option? 😍
This is the kind of thing I would definitely expect from a paid service. It really differentiates DebugBear from something like a standard Lighthouse report, giving me more control as well as tools to help me gain deeper insights into my work.
Everything In One Place
Having all of my reports in a central one-stop shop is worth the price of admission alone. I can’t stand the clutter of having multiple windows open to get the information I need. With DebugBear, I have everything that a mish-mash of DevTools, WebPageTest, and other tools provides, but in one interface that is as clean as it gets. There’s no hunting around trying to remember which window has my TTFB score for one experiment or which has the filmstrip of another experiment I need.

But what you might not expect is a set of actionable recommendations to improve page speed performance right within reach.