DORA Metrics Explained Simply

Steve Froehlich
Sep 24, 2023

What are the DORA metrics, and what values indicate a high-performing team?

What is it

The DevOps Research and Assessment (DORA) organization applied what it considers a scientifically rigorous methodology to measure the performance of software engineering teams. Its core findings are the DORA metrics: five quantifiable measures of performance (initially four; a fifth, reliability, was added in 2021). Those metrics are listed below.

The Metrics

The metrics can be grouped into two categories of measures:

  1. Go fast (deployment frequency and lead time for change)
  2. Don’t break things (change failure rate, time to restore service, and reliability)

Deployment Frequency

What is it: How often changes are deployed to production.

Interpretation: A deployment frequency of once per day means that the team is deploying one code change every day to production.

Note that deployed does not necessarily mean live for users. A change can be deployed to production behind a feature toggle, so even though the code is in production it is not yet available to users. Making the change or feature available to users is called a release, and releasing often just means enabling the feature toggle. This distinction between deployment and release is important to keep in mind when interpreting this metric.
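A minimal sketch of that distinction in Python (the flag name and checkout functions are invented for illustration): the new code is deployed but dark, and flipping the flag is the release.

```python
# Sketch only: illustrative flag name and functions, not a real system.
FEATURE_FLAGS = {
    # The new checkout code is deployed to production,
    # but the flag keeps it hidden from users until release.
    "new_checkout_flow": False,
}

def is_enabled(flag: str) -> bool:
    # Unknown flags default to off.
    return FEATURE_FLAGS.get(flag, False)

def legacy_checkout(cart: list) -> str:
    return f"legacy checkout of {len(cart)} items"

def new_checkout(cart: list) -> str:
    return f"new checkout of {len(cart)} items"

def checkout(cart: list) -> str:
    # Both code paths are deployed; the flag picks one at runtime.
    if is_enabled("new_checkout_flow"):
        return new_checkout(cart)
    return legacy_checkout(cart)

print(checkout(["book", "pen"]))  # -> legacy checkout of 2 items
```

Flipping new_checkout_flow to True releases the feature without another deployment, which is why deployment frequency and release cadence can differ.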

Performance Levels (as of the 2022 State of DevOps report)

High: On-demand (multiple deploys per day)

Medium: Between once per week and once per month

Low: Between once per month and once every six months

Lead Time for Change

What is it: How long it takes for a code change to reach production. DORA measures this from when the code is committed to when it is successfully deployed.

Interpretation: A lead time for change of 5 days means it takes the team (on average over some time period) 5 days from when a change is committed to when it is running in production.
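As a rough illustration with made-up timestamps, lead time per change is just deploy time minus commit time, averaged over the changes:

```python
from datetime import datetime
from statistics import mean

# Hypothetical (commit_time, deploy_time) pairs for three changes.
changes = [
    (datetime(2023, 9, 1, 9, 0), datetime(2023, 9, 4, 15, 0)),
    (datetime(2023, 9, 5, 10, 0), datetime(2023, 9, 11, 10, 0)),
    (datetime(2023, 9, 12, 8, 0), datetime(2023, 9, 15, 17, 0)),
]

# Lead time per change: commit-to-production, expressed in days.
lead_times_days = [
    (deployed - committed).total_seconds() / 86400
    for committed, deployed in changes
]

print(f"average lead time: {mean(lead_times_days):.1f} days")  # -> 4.2 days
```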

Performance Levels

Elite: <= 1 day

High: 1 day to 1 week

Medium: 1 week to 1 month

Low: 1 to 6 months

Change Fail Rate (CFR)

What is it: A measure of how frequently changes fail. Specifically, it is the number of changes that fail divided by the total number of changes.

Change Fail Rate (%) = (Failed Changes / Total Changes) * 100

Example: A team deploys to production twice; one deploy causes an error that needs to be fixed, the other does not. The CFR is 1/2 = 50%.
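The same arithmetic as a short Python sketch, with made-up deploy outcomes:

```python
# Two hypothetical deploys: one failed, one clean.
deploys = [
    {"id": 1, "failed": True},   # caused an error, needed a fix
    {"id": 2, "failed": False},  # deployed cleanly
]

# CFR = failed changes / total changes, expressed as a percentage.
failed = sum(1 for d in deploys if d["failed"])
cfr = failed / len(deploys) * 100

print(f"change fail rate: {cfr:.0f}%")  # -> 50%
```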

Performance Levels

High: 0–15%

Medium: 16%-30%

Low: 46%-60%

Time to Restore Service

Note: this is often abbreviated as MTTR (mean time to restore).

What is it: The time it takes to fix the service after an outage has occurred.

Example: A team performs a successful deployment. 5 minutes later the service goes down (or its error rate climbs high enough to be considered a production incident). After the incident is raised, the team rolls back the change by deploying the last working version, which takes 30 minutes. The time to restore service for this incident is therefore 30 minutes. MTTR is usually reported as an average over a period of time.
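A tiny sketch of that averaging, using invented restore times:

```python
from statistics import mean

# Hypothetical time-to-restore values (minutes) for a month of incidents.
restore_minutes = [30, 12, 95]

# MTTR is the mean time to restore across incidents in the period.
mttr = mean(restore_minutes)
print(f"MTTR: {mttr:.0f} minutes")  # -> 46 minutes
```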

Performance Levels

High: < 1 day

Medium: between 1 day and 1 week

Low: > 1 week

Reliability

Are you meeting your users’ reliability expectations? This metric is less well defined than the others; it could mean setting and holding thresholds around:

  • Latency: responsiveness and speed. Example: respond within 100 milliseconds.
  • Throughput: can your service handle your customers’ load and usage patterns? Example: 100,000 requests per minute.
  • Error rate: is your error rate low enough that customers don’t care? Zero errors is probably not attainable over a long period of time. Example: an error rate ≥ 1% generates an alert and is considered a breach of user expectations.
  • Availability: is your service up and working at a level that satisfies your customers’ expectations? Example: the average error rate over the last month stays below 1%.

The above thresholds are examples suited to a microservice-based web app; teams use many other reliability metrics. Metrics and thresholds for a data lake, for example, would look quite different.
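As a minimal sketch, reusing the example thresholds above with made-up monitoring numbers, a reliability check might look like:

```python
# Hypothetical monthly monitoring data.
requests_total = 4_200_000
requests_errored = 31_500
latency_p95_ms = 82

error_rate_pct = requests_errored / requests_total * 100

# Thresholds taken from the examples in this section.
meets_error_budget = error_rate_pct < 1.0   # alert fires at >= 1%
meets_latency_goal = latency_p95_ms <= 100  # 100 ms target

print(f"error rate: {error_rate_pct:.2f}% (ok: {meets_error_budget})")
print(f"p95 latency: {latency_p95_ms} ms (ok: {meets_latency_goal})")
```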

Footnote

The evidence for the DORA metrics, and how they predict company performance, was compiled and published in the State of DevOps report by the DORA organization, now part of Google. A new report comes out each year; as of this writing, the 2022 State of DevOps report is the latest. More detail on these findings is published in the book Accelerate.
