
Antony Belov

Web Performance: Fast, Responsive, Stable in 2025

Web Performance is one of those topics that’s constantly discussed as crucial for the modern web. It’s often framed as a key factor for user experience (UX), conversions, SEO, and overall success on the Internet. Every year kicks off with an article titled “New Challenges and Why Fast Websites Will Win in 20xx.”

Yet when it comes to implementation, performance tasks in the backlog often don’t get the priority they deserve. This happens for two main reasons, which take the form of two key questions:

1. Is it really that important? - Users aren’t complaining; everything loads quickly for me, too. Plus, our internal research hasn’t shown a clear correlation between performance metrics and search rankings. What even is a CLS of 0.4, and why should I care?

2. Is it worth the effort? - There are too many metrics; they’re confusing, and new ones keep popping up all the time. On top of that, the tools are complex and expensive. Wouldn’t it be better to invest resources in product features instead?

This guide-style article explores how to answer these questions in 2025 and how to build a balanced approach to web performance. As always, the balance is found between engineering decisions and business expectations.

We’re Not an Online Store

Research on web performance has been conducted by Internet giants like Amazon, Google, and Walmart. Their findings have become industry clichés: every 100ms delay costs 1% in conversions, longer render times lead to lost queries, and half of all users abandon a page by the third second. Sounds alarming, yet somehow… it doesn’t feel that way.

Why? Because these studies seem distant, tailored for massive e-commerce platforms. Their conclusions don’t really apply to our niche fishing store with a specialized audience. Or maybe we run our own social network (where people only tell the truth, even in ads), which is nothing like an online store. Or perhaps our app is a web-based Photoshop alternative - of course, it takes time to load. The real challenge is figuring out how performance optimizations impact a specific business.

If measuring ROI (Return on Investment) is difficult, one thing is clear: in a competitive market, losing any potential advantage is never a good idea. So, a better question might be: how are our direct competitors performing? Maybe users aren’t complaining simply because they’re leaving for a better alternative. And at this point, it doesn’t matter whether we’re talking about a photo editor or a “truthful” social network: the same principle applies.

Quick & Dirty Competitive Benchmarking

Comparing your business to competitors online is a big and important task. Even within the narrow scope of performance metrics, there are dedicated services with ready-made solutions, like SpeedCurve. But for a quick overview, these tools can be expensive and complex.

So where can we get useful data without spending a fortune? Fortunately, Google has already done the hard work: meet Chrome UX Report (CrUX). This is a large, publicly available dataset that tracks how real Chrome users interact with popular websites. And it includes all the essential performance metrics you need to get started.

Google provides several CrUX Tools to access this data (and, of course, third-party companies monetize CrUX-based insights as well). These tools allow you to analyze performance by domain (origin) or specific pages, filter by region and device type, and track performance trends over different time periods.

The simplest way to start benchmarking is CrUX Dashboard. No registration, no hassle, just enter a domain, and you’ll get a ready-made dashboard with monthly performance data.

CrUX Dashboard for Jan 2025

Here, we come across three key metrics under the Core Web Vitals heading. For now, we’re only interested in the distribution of measurements: what percentage falls into the green (Good) category, and how much is orange (Needs Improvement) or even red (Poor).

The next step is straightforward: take your competitors’ domains (defining who your actual competitors are is beyond the scope of this article) and generate similar dashboards. Then compare the share of non-green results (even a simple Excel sheet will do) and act accordingly:

IF you’re a manager, THEN assign a task to “turn the charts green”
IF you’re an engineer, THEN prepare a presentation titled “They’re green! We’re red!”

Striving not to fall behind competitors is already a strong motivator. If this reasoning gains consensus, the process starts rolling.

However, at this point, it’s crucial to pay attention to the red zones. Regardless of how competitors are performing, anything that falls into the Poor category should be interpreted as “users are unhappy with performance.” There may be no complaints simply because reaching out to support takes extra time. It’s easier to put up with it this time or just close the tab already.

Not a One-Time Effort

If the first benchmarking results looked good or if some optimizations have already been made, it’s important to remember: things can change quickly. For a quick retrospective, the CrUX Vis tool allows you to visualize historical data for the past 25 weeks and track performance trends over time.

CrUX Vis for Jan 2025, an experimental tool for visualizing CrUX History data

Performance metrics tend to deteriorate over time if no action is taken. The reasons are simple:

Growing complexity and size of the application
Accumulating technical debt and reliance on third-party resources
External factors such as changes in browsers, networks, and client devices

That’s why Web Performance isn’t something you fix once and forget: it requires constant attention, and in production that means continuous monitoring.

Task Definition

So, we’ve established that Web Performance improvements are important. Now it’s time to set up the task, but where do we start? For now, we have one clear requirement: we don’t want to spend excessive resources or turn performance metrics into a tech cult. With that in mind, let’s outline the key priorities:

First, we need a clear set of metrics and their threshold values. The metrics should be easy to understand, not just for engineers, and the list itself shouldn’t be too long.

Second, it’s crucial to determine how and where to collect these metrics. Ideally, data should come as close to real users as possible, using something convenient and cost-effective. It would also be great to have a way to test performance in advance rather than waiting for issues to surface in production.

Third, the end result should be a structured process: from data collection and analysis to alerts and backlog tickets for fixing issues. The process should be transparent and accessible to everyone.

Vital Metrics

At first glance, it might seem like there’s a whole truckload of performance metrics. And that’s absolutely true, and not without reason.

Some performance metrics for 2025

Before diving in, it's important to understand who defines what matters when it comes to performance. There are only two key sources of requirements:

1. UX - Every application aims to deliver a great (and sometimes unique) user experience. A well-crafted UX directly impacts conversions, and performance is an integral part of it.

2. Search Bot - Since we’re dealing with web applications, search engine recommendations can’t be ignored. Even though ranking algorithms are opaque (and sometimes questionable), it’s still risky to underestimate their influence.

Ideally, the first step in performance optimization should be consulting UX and SEO specialists within your company. They’re sure to have valuable insights to guide the process.

Back in 2020, Google recognized the problem of too many performance metrics, most of them highly technical. This led to the Web Vitals initiative, aimed at creating a unified set of recommendations with a strong emphasis on user experience. Here, both requirement sources come together: Google talks about UX while also tweaking its ranking algorithms accordingly.

In short, Web Vitals introduced three key metrics, each covering a different aspect of user experience:

Largest Contentful Paint (LCP) – measures loading performance
Interaction to Next Paint (INP) – measures interactivity
Cumulative Layout Shift (CLS) – measures visual stability

Google also provides threshold recommendations, and these three metrics and their thresholds are exactly what we see on CrUX dashboards.

The list was initially expected to evolve over time. Nevertheless, in the past four years the only major change has been INP replacing FID (First Input Delay), since measuring just the first user interaction was clearly not enough for assessing interactivity. The CLS calculation algorithm has also been refined for better accuracy.

In addition to the Core Web Vitals, Google also highlights other vital performance metrics:

Time to First Byte (TTFB) - measures loading, specifically when the server starts sending data
First Contentful Paint (FCP) - measures loading, indicating when the first piece of content appears on the screen

These metrics are recorded during every user session on a page. Some metrics (LCP, TTFB, FCP) are measured during the page load, from the moment the request is made to when the main content is displayed. Others (INP, CLS) are tracked throughout the page’s lifecycle, up until the tab is closed.

Google recommends measuring performance at the 75th percentile (as it does in CrUX) to ensure that at least 75% of users aren’t suffering from poor performance.
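For a concrete picture of the arithmetic, here is a minimal sketch of a nearest-rank percentile over a batch of collected samples (the LCP values are made up for illustration):

```typescript
// Returns the p-th percentile of a list of metric samples (nearest-rank method).
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

// Example: LCP samples in milliseconds collected from user sessions (made-up values).
const lcpSamples = [900, 1200, 1500, 1800, 2100, 2600, 3400, 4200];
console.log(`LCP p75: ${percentile(lcpSamples, 75)} ms`); // -> 2600 ms
```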

A quick note on browser support. Despite efforts to standardize Web Vitals and their widespread recognition, full support is currently available only in Chromium-based browsers (Chrome, Edge, Opera). Is this a problem when Chromium accounts for nearly 80% of web traffic, and the share keeps growing every year? Not really. Having reliable, easy-to-use metrics for 80% of users is already a major win. At the same time, it’s still important to keep in mind the potential negative experience of the remaining 20%. And that’s just the global picture. The browser distribution for a specific website’s audience might be different, meaning additional metrics may be necessary to get a more accurate performance assessment.

Web Vitals support in browsers 2025

One way or another, there aren’t that many near-standard metrics that truly reflect user experience or, more precisely, the experience of about 80% of users. Given this, it makes sense to focus on these vital metrics and move on to monitoring.

Field Monitoring

There are different ways to implement monitoring:

1. Collecting data from real user devices
2. Simulating sessions and requests from various locations and devices
3. Analyzing the application in a local development environment

Monitoring user experience directly on actual users sounds like the most logical approach (we’ll get back to the other methods later). This is known as Real User Monitoring (RUM). It offers several advantages: large data volumes, real-world conditions, geographic distribution, and later on - the ability to correlate performance with conversions.

Beyond these inherent advantages, a monitoring solution should also provide:

Web Vitals metrics (LCP, INP, CLS, TTFB, FCP) - that part is clear.
Grouping of pages (or sessions) by URL, device type, location, and preferably custom tags - to get a detailed and segmented view.
Metric data with no more than a one-day delay and at least six months of history - to set up alerts and track trends over time.
The ability to configure percentile measurements and various aggregations.

You can start with CrUX, which isn’t just available for entire domains but also for individual pages (CrUX History API) and groups of pages (Google Search Console). Nevertheless, for larger projects, CrUX tools quickly reveal their limitations in the context of monitoring:

Chrome-only, and even then, only a fraction of user sessions are recorded.
Reports are limited to public and sufficiently popular pages, making it unusable for pages behind a login.
Pages are grouped by URL, stripping fragments and query parameters: this isn’t always convenient and may cause issues for SPAs.
Data updates occur monthly or with significant delays.
Aggregated data only, meaning it’s already grouped and processed before we see it.
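That said, the data behind these tools can also be pulled programmatically for quick scripted checks. A minimal sketch of querying the CrUX API (assuming an API key created in Google Cloud; the origin is just an example):

```typescript
// Query the CrUX API for origin-level Web Vitals (histograms plus p75 values).
const API_KEY = process.env.CRUX_API_KEY; // assumed to be set in the environment

async function queryCrux(origin: string) {
  const response = await fetch(
    `https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=${API_KEY}`,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ origin, formFactor: 'PHONE' }),
    }
  );
  const data = await response.json();
  // Each metric comes with a good / needs-improvement / poor histogram and a p75 value.
  console.log(data.record.metrics.largest_contentful_paint.percentiles.p75);
}

queryCrux('https://www.example.com');
```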

An alternative approach is to use third-party solutions, though these are usually paid: options include Catchpoint RUM, SpeedCurve RUM, Datadog RUM, or even Sentry RUM (especially if it's already being used for error tracking). These tools all provide Web Vitals metrics and generally meet the previously mentioned requirements more effectively. If you’re considering choosing and purchasing a solution, it’s also worth evaluating the following:

Does it integrate well with your company’s existing infrastructure? e.g., Prometheus, Grafana, GitLab CI, Slack, Opsgenie, etc.
How robust is the API? Can you access everything that’s available in the UI?
Are the dashboards useful? Do you actually like the interface?
Is it easy to configure budgets and alerts for key metrics?

If buying a solution isn’t an option or none of the available tools seem like the right fit, you can take full control of performance metrics yourself. Google offers the open-source web-vitals library, which can be integrated into website pages to collect performance data for the current session and send it to a specified endpoint (where you’ll need to handle aggregation and storage) or directly to Google Analytics (do you need a product analyst for that? Not really).
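For a rough idea of what such an integration might look like, here is a minimal sketch using the web-vitals package; the /perf-metrics endpoint is a placeholder for whatever collection backend you set up:

```typescript
import { onCLS, onFCP, onINP, onLCP, onTTFB, type Metric } from 'web-vitals';

// Send each finalized metric to a collection endpoint for aggregation and storage.
function sendToAnalytics(metric: Metric) {
  const body = JSON.stringify({
    name: metric.name,     // e.g. 'LCP', 'INP', 'CLS'
    value: metric.value,   // the measured value
    rating: metric.rating, // 'good' | 'needs-improvement' | 'poor'
    id: metric.id,         // unique per page load, useful for deduplication
  });
  // sendBeacon survives page unload; fall back to fetch with keepalive.
  if (!navigator.sendBeacon?.('/perf-metrics', body)) {
    fetch('/perf-metrics', { method: 'POST', body, keepalive: true });
  }
}

onCLS(sendToAnalytics);
onFCP(sendToAnalytics);
onINP(sendToAnalytics);
onLCP(sendToAnalytics);
onTTFB(sendToAnalytics);
```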

Third-party solutions often provide their own JS scripts or SDKs, which work similarly to Google’s library, relying on the same browser APIs (because there aren’t any others), but are usually easier to set up. So, one way or another, some kind of integration and configuration will be required.

You could build your own solution

So, what’s the right choice: develop something in-house or integrate an existing tool? The best advice is to leverage the monitoring tools your company is already using. If Prometheus and Grafana are widely used, it’s worth exploring whether performance metrics can be exported and visualized there using an external service. If that’s not feasible, then it might be time to consider building your own system, or at least parts of it.

User Happiness Index

Even though the number of essential metrics has been minimized, over time, all these abbreviations and numbers still start to blur together, and the issue fails to gain traction. If only there were a single metric that captured how much users suffer from poor performance.

And there is! Well, at least a standard - Apdex (Application Performance Index), which measures how satisfied users are with an application’s performance.

The concept is simple. It categorizes requests (or sessions) into three groups:

Satisfied - the user is happy with performance
Tolerating - performance is acceptable
Frustrated - the user is unhappy with performance

The index is then calculated using a formula, returning a value between 0 and 1:

Apdex = (Satisfied + Tolerating / 2) / Total samples

Originally, Apdex was designed around response time measurements, suggesting a target threshold T for Satisfied requests and 4T for Frustrated ones. This doesn’t apply directly to CLS, for example. Luckily, Google’s Web Vitals already provide recommended thresholds for good and poor values.

Let’s define Satisfied sessions as those where all Web Vitals metrics are in the green zone, and Frustrated sessions as those where at least one metric is red. Everything else falls into Tolerating. With this approach, Apdex calculations fit perfectly.
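A rough sketch of this classification and the resulting index, using Google's published Core Web Vitals thresholds (the session shape is simplified to just the three values):

```typescript
// Google's published Core Web Vitals thresholds: [good upper bound, poor lower bound].
const THRESHOLDS = {
  LCP: [2500, 4000], // ms
  INP: [200, 500],   // ms
  CLS: [0.1, 0.25],  // unitless
} as const;

type Session = { LCP: number; INP: number; CLS: number };
type Bucket = 'satisfied' | 'tolerating' | 'frustrated';

function classify(session: Session): Bucket {
  const ratings = (Object.keys(THRESHOLDS) as (keyof Session)[]).map((name) => {
    const [good, poor] = THRESHOLDS[name];
    return session[name] <= good ? 'good' : session[name] <= poor ? 'ni' : 'poor';
  });
  if (ratings.includes('poor')) return 'frustrated';          // at least one metric in the red
  if (ratings.every((r) => r === 'good')) return 'satisfied'; // everything in the green
  return 'tolerating';
}

function apdex(sessions: Session[]): number {
  const buckets = sessions.map(classify);
  const satisfied = buckets.filter((b) => b === 'satisfied').length;
  const tolerating = buckets.filter((b) => b === 'tolerating').length;
  return (satisfied + tolerating / 2) / sessions.length;
}
```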

Using Apdex for Web Vitals based on Google’s thresholds

This algorithm is easy to implement manually, but it’s also widely supported by third-party services like Datadog Apdex or New Relic Apdex. Some even call it exactly what it is - User Happiness.

Apdex is also useful as an SLI (Service Level Indicator) within the SRE framework, which operates on SLI, SLO, and SLA principles. For example, an Apdex score of 0.8 can be interpreted as 80% of users having a satisfactory or tolerable experience. An SLO (Objective) could be something like: Apdex > 0.75 for 99% of the time. An SLA (Agreement) might then define compensation for clients if Apdex drops significantly below the SLO.

Having a single aggregated metric linked to user happiness makes decision-making more visual and intuitive. The availability of such a metric could be one of the criteria when selecting a monitoring solution.

Testing in the Lab

User-based monitoring has its drawbacks. First, it happens in production, meaning performance regressions might already be affecting users - something we’d rather catch earlier. Second, the happiness index and key metrics only provide indicators, not detailed problem diagnostics.

Synthetic testing is essential for both preventative action and in-depth performance audits that provide valuable optimization insights.

A go-to tool for this is Lighthouse, which is built into Chrome DevTools and has proven highly effective. There's also PageSpeed Insights, which is built on top of Lighthouse and is more accessible for non-engineers, plus it enriches results with CrUX data. Another option is WebPageTest - a more advanced and comprehensive tool, though slightly more complex to use.

Lighthouse provides a detailed performance report, which includes:

metrics,
a list of issues and their impact on metrics,
optimization recommendations (e.g., “Reduce image sizes”),
page load timelines and visualizations,
logs and traces for deeper analysis.

Another great feature is that the report can be downloaded as a JSON file and later opened in Lighthouse Report Viewer, making it a useful artifact for the backlog.

It’s important to remember that Lighthouse measures lab data, not real-world user data. Since there’s no actual user involved, it doesn’t track interaction-based metrics. On the other hand, it does provide:

Total Blocking Time (TBT) - Used instead of INP, it measures main thread blocking time. High values suggest a likelihood of slow user interactions.
Speed Index (SI) - Measures how quickly the page’s content visually appears. Unlike single-point metrics like LCP or FCP, it considers the gradual rendering of content, making it a good indicator of perceived load speed.

Lighthouse also provides an overall Performance score, a single number from 0 to 100 that reflects how bad (or good) things are. This score is calculated based on key performance metrics, and its components and weights occasionally change: for example, TTI (Time to Interactive) was deprecated and removed. Having one aggregated score for a quick performance assessment in tests is just as convenient as using the happiness index in RUM.

Lighthouse Performance scoring for 2025, TTI is no longer part of the calculation

Lighthouse analyzes individual pages, not an entire website. And regardless of the tool used, running tests for every page on a large site is time-consuming and impractical. A more efficient approach is to test the top-N pages that represent the most critical functions or sections of the site.

Since Lighthouse is open-source, it can be run via CLI or programmatically using Node.js. This makes it easy to integrate into existing CI pipelines, turning it into a part of the release cycle.
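As an illustration, here is a minimal sketch of a programmatic run, following the pattern from the Lighthouse documentation, that fails a CI step when the Performance score drops below a budget (the staging URL and the 0.8 threshold are placeholders):

```typescript
import lighthouse from 'lighthouse';
import * as chromeLauncher from 'chrome-launcher';

// Run a performance-only Lighthouse audit and enforce a score budget.
async function audit(url: string, minScore = 0.8): Promise<void> {
  const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
  const result = await lighthouse(url, {
    port: chrome.port,
    output: 'json',
    onlyCategories: ['performance'],
  });
  await chrome.kill();

  const score = result?.lhr.categories.performance.score ?? 0;
  console.log(`Performance score for ${url}: ${Math.round(score * 100)}`);
  if (score < minScore) {
    process.exit(1); // a non-zero exit code fails the pipeline stage
  }
}

audit('https://staging.example.com/');
```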

Pre-production Lighthouse testing in CI

In addition to running tests on staging, synthetic monitoring can also be set up in production by scheduling tests that simulate devices and user behavior.

Third-party services like SpeedCurve Synthetic Monitoring or Calibre Performance Audit offer their own synthetic testing solutions. That said, they often rely on Lighthouse reports and audits as well. These services may provide their own SDKs for easy setup and test execution, CI integrations, and CLI tools (e.g., speedcurve-cli). When selecting a service, it’s worth considering, beyond integrations, API richness, and UX convenience:

Support for user scenarios, such as authentication and interaction with key UI elements.
Custom test configurations, including device type, network speed, geolocation (for running tests from different regions), and browser settings.
Automation support, allowing tests to be triggered on a schedule, via CI, through an API, or from the command line.

RUM and synthetic testing complement each other perfectly. Using both approaches together helps identify real user performance issues, pinpoint their causes, and get actionable recommendations for improvements.

Worth noting: Some services combine both RUM and synthetic monitoring - for example, Catchpoint, SpeedCurve, or Akamai - making them versatile all-in-one solutions.

Bringing It All Together

In summary, the monitoring metrics are condensed into a single RUM indicator and a Performance score for synthetic tests. The tools are streamlined: either two free options or one universal (but paid) solution. And there are two key alert sources:

Apdex is tracked - Alerts about unhappy users
Performance score is calculated - Alerts from failing CI tests

Ideally, these alerts should reinforce each other: a poor real-world performance index should be accompanied by non-green synthetic test results in CI.

If Apdex shows an unsatisfactory value, the next steps are:

1. Check Web Vitals from real user data to pinpoint the issue: loading speed, interactivity, or visual stability.
2. Reproduce the issue with synthetic tests: some Web Vitals can be measured in lab conditions (loading metrics and CLS), while others (INP) have predictors (TBT).
3. Diagnose the problem using the available tools, analyzing possible causes and investigating secondary metrics for deeper insights.

Working with metrics: left to right

If a problem can’t be reproduced in lab tests, it’s almost certainly due to significant differences in environments. In tests, conditions are highly controlled: far from real-world scenarios, where devices, internet speeds, locations, caches, and background processes all vary. Differences are inevitable, yet they can be minimized by regularly analyzing users and trying to replicate their conditions and behavior in tests.

If Apdex remains stable and positive, but Performance scoring in tests raises concerns, differences in environments could still be the cause. Even so, if an issue is consistently reproducible in tests, why not address it proactively, rather than waiting for it to impact users?

Ultimately, metrics can also be described through their consumers:

Product managers and analysts track the user happiness index
SEO and UX specialists work with Core Web Vitals in their research
Developers and QA investigate and resolve performance issues

Metric consumers: bottom-up (or top-down)

Who receives which alerts? This ties into escalation (or de-escalation), as well as the level of detail different roles need. Some product managers might be interested in specific performance metrics, just as developers might participate in defining threshold values for an excellent UX.

Building a Working Process

No matter how clear the metrics are or how useful and friendly the monitoring and testing tools might be, the system won’t run itself; over time, it may be abandoned and turn into technical debt. What’s needed is an effective, continuous, and predictable process.

Step One - Defining Requirements

The primary metric used is Apdex, collected from real user sessions. The goal is to ensure that the average observed value does not fall below:

A) 0.75 - aligns with Google’s 75th percentile recommendation.
B) An estimated value based on Web Vitals data from CrUX for competitor websites.

Most importantly, this requirement must be agreed upon by both business and engineers. Everyone should clearly understand what this single metric represents and why it matters.

Step Two - Defining Responsibilities

There needs to be a clear definition of roles, tasks, and deadlines in case performance metrics fall below the agreed thresholds. Examples of possible agreements:

IF Apdex drops by 10% over a month, THEN the development team commits to an optimization task in the next quarter.
IF Apdex drops by 30% in a week, THEN the development team takes on a performance fix in the next sprint.
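Agreements like these are easy to encode so that escalation is triggered mechanically rather than by someone noticing a chart. A toy sketch of the two rules above (where the Apdex values come from, and where the alerts go, is left out):

```typescript
type Escalation = 'none' | 'next-quarter-task' | 'next-sprint-fix';

// Relative drop between a baseline Apdex and the current value, e.g. 0.80 -> 0.72 is a 10% drop.
const drop = (baseline: number, current: number) => (baseline - current) / baseline;

function escalate(weekAgo: number, monthAgo: number, current: number): Escalation {
  if (drop(weekAgo, current) >= 0.3) return 'next-sprint-fix';    // 30% drop in a week
  if (drop(monthAgo, current) >= 0.1) return 'next-quarter-task'; // 10% drop over a month
  return 'none';
}
```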

It doesn’t really matter whether this is framed as a KPI, SLA, or something else. What matters is that without clear ownership, task prioritization, and allocated resources, the metric will eventually just turn red.

Step Three - Reviewing Requirements

From time to time - say, once a year - it’s necessary to revisit the performance targets.

First, competitors are also improving their performance. To avoid falling behind, it’s worth periodically checking their CrUX data at the domain level or even setting up synthetic tests on key pages.

Second, Google’s 75th percentile recommendation is just a starting point: why not push for the 85th percentile? That’s another 10% of happy users!

Finally, beyond competitors and percentile thresholds, it’s crucial to continuously analyze correlations between performance metrics and conversion rates: understanding how satisfied users turn into paying users. Performance targets should be adjusted based on new insights and hypotheses.

Process blocks

New metrics may emerge under the hood of Apdex and Performance Score, along with updated recommendations, best practices, and better, more cost-effective tools. Keeping an eye on these developments - even occasionally, at least from the engineering side - is essential.

Ideal Tools, Metrics, and UX

The process - however rough - is up and running! Now, it’s time to refine it to perfection.

Lighthouse is an excellent starting point for analysis. Its recommendations help optimize the Critical Rendering Path, images, fonts, caching, browser thread workload, server response times, and API latency. But Lighthouse doesn’t provide deep insights into rendering issues or server-side problems. Choosing the “ideal” tool depends on the specifics of the application and the technical standards it needs to meet, both of which may take time to define and will likely evolve.

The same applies to metrics. Core Web Vitals were designed by Google engineers to highlight key aspects of user experience, and they do a great job of surfacing major issues. Even so, sometimes they’re not enough. Depending on UX decisions, technical constraints, or a unique target audience, custom metrics may be required. This is where browser APIs (such as the elementtiming attribute) and third-party tools (like Bloomberg’s Container Timing API) come to the rescue. Developing the “ideal” set of UX metrics is neither fast nor easy, yet it can significantly improve key product metrics.
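For example, the Element Timing API (Chromium-only, like much of the rest) makes it possible to record when a specifically marked element is rendered. A minimal sketch, assuming a hero image annotated with a hypothetical elementtiming="hero-image" label:

```typescript
// In the markup: <img src="hero.jpg" elementtiming="hero-image">
// Minimal local typing for the experimental Element Timing entries.
type ElementTimingEntry = PerformanceEntry & { identifier: string; renderTime: number };

const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries() as ElementTimingEntry[]) {
    if (entry.identifier === 'hero-image') {
      // renderTime is when the marked element was painted, relative to navigation start.
      console.log(`Hero image rendered at ${Math.round(entry.renderTime)} ms`);
    }
  }
});
observer.observe({ type: 'element', buffered: true });
```

The resulting value can then be shipped to the same RUM endpoint as the Web Vitals metrics.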

Having performance metrics is great for UX, but it’s even better when they become part of UX itself. Performance requirements should be embedded in company-wide UI development guidelines, especially if there’s a shared design system (e.g., Semrush Design System). These guidelines should define when to show a spinner or progress bar, how to implement skeleton loading without layout shifts, which interactive elements need custom tracking, and how lazy-loaded images should behave. When performance is considered from the start, it’s a step toward the “ideal” UX.

Users need to see that something is happening

To clarify the quotes around “ideal” - truly ideal solutions only exist in a perfect world. Software development is, by nature, a constant search for trade-offs. That’s why it’s better to follow the Pareto principle and remember: a working not-perfect process is always better than a perfect not-working one.

Takeaways

Top 5 key (and vital) insights to wrap things up:

1. Web Performance is about user experience, not just technical metrics. Well-optimized performance can even become a competitive advantage.
2. Since it’s about users, monitoring should be built on real user data in real conditions.
3. Synthetic tests help prevent issues and provide detailed insights for optimizations.
4. There aren’t that many metrics, and the tools are readily available.
5. Monitoring is a continuous process, and both business and engineers have a stake in it.

Postscript: Web Performance really is that important and worth the effort.