Engineering at Samsara

What 100 billion miles taught a fuel model

May 12, 2026

Praveen Murugesan

VP Engineering

Spiking fuel prices are a hit for consumers, but they’re an existential threat for commercial fleets. Whether you drive a gas-powered or electric vehicle, chances are your groceries, packages, and services are getting to you on a gas-powered commercial vehicle.

Essential services (your electrician, the person delivering the package to your doorstep, the school bus picking up your kids) predominantly run on diesel. That’s why fuel makes up a whopping 21–24% of operating costs for commercial fleets, not to mention the second-order environmental costs. To put it in perspective, commercial vehicles account for around a quarter of global road emissions but just 1% of vehicles. Look no further than the abrupt shutdown of Spirit Airlines to understand how dire rising fuel costs can be for commercial operators in transportation.

We can’t fix gas prices. But we figured we could build something that isolates how much of a driver’s fuel burn is within their control. Getting there was harder than we thought.

Latent space: signal vs noise

Our first approach was to compare Driver A’s miles per gallon with Driver B’s. Seems simple. But once we shipped and started listening to drivers and customers, it became clear this approach was limited. For starters, roads vary. A driver in the Swiss Alps and a driver on the plains outside of Amsterdam face completely different conditions. Things like altitude, temperature, lane configuration, and traffic density all influence fuel consumption. What’s more, drivers have no control over these real-world variables.

While our initial eco-driving scoring system baked in metrics like cruise control and excessive throttling, these are contextual. Think about it: you wouldn’t use cruise control in a dense urban environment, and most excessive throttling happens on city streets.

That’s when we realized the problem: we had to separate driver behavior from real-world context.

Our alpha: training data scale

This is exactly the kind of problem ML is good at: learning what normal looks like under different conditions. Solving it required comprehensive training data on roads, including road type, time of day, and vehicle type. We broke down our 25 trillion annual data points spanning 100 billion miles into 5-minute driving segments. That window gave us data granular enough to capture the difference between a mountain stretch and a city block.
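To make the segmentation step concrete, here is a minimal sketch of bucketing raw samples into 5-minute windows. The sample layout (timestamp in seconds, miles, gallons) and the function name `segment_samples` are assumptions for illustration, not our production schema.

```python
from collections import defaultdict

SEGMENT_SECONDS = 5 * 60  # 5-minute windows, as described above


def segment_samples(samples):
    """Group (timestamp_s, miles, gallons) samples into 5-minute buckets.

    The tuple layout is a hypothetical stand-in for real telematics records.
    """
    buckets = defaultdict(lambda: {"miles": 0.0, "gallons": 0.0})
    for timestamp_s, miles, gallons in samples:
        # Integer division maps every sample to the window it falls in.
        key = int(timestamp_s // SEGMENT_SECONDS)
        buckets[key]["miles"] += miles
        buckets[key]["gallons"] += gallons
    # Return segments in chronological order.
    return [
        {"segment": key, "miles": totals["miles"], "gallons": totals["gallons"]}
        for key, totals in sorted(buckets.items())
    ]


if __name__ == "__main__":
    trip = [(0, 1.0, 0.1), (299, 1.0, 0.1), (300, 2.0, 0.5)]
    for seg in segment_samples(trip):
        print(seg)
```

Each segment can then carry its own miles-per-gallon figure, so a climb and a flat stretch in the same trip are no longer averaged together.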

To get there, we worked with our firmware and hardware teams to rewrite our eco-driving telematics stack. Previously, we batched data into hourly accumulations. But in an hour, a driver might move from highway to city streets to stop-and-go traffic. That mixes completely different conditions into a single data point. That’s too coarse to isolate the exact second of inefficient driving behaviors like excessive throttling. Increasing the granularity of data collection at the edge unlocked more precise ML.

For each 5-minute segment, we trained the model on ~50 variables, things like:

Road context: speed limits, lane counts, road type, urban vs. rural classification
GPS and kinematics: segment speed, traffic ratio (actual speed ÷ speed limit), uphill/downhill grade, altitude
Environment: temperature from vehicle sensors
Vehicle type: expected fuel efficiency by make, model, year, and fuel type
Time encoding: hour of day, day of week — because a Tuesday morning commute and a Friday night long-haul have different baseline expectations
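Two of the variables above are easy to sketch. The traffic ratio follows the definition in the list (actual speed ÷ speed limit). The cyclic sine/cosine encoding of the hour is an assumption: it is one common way to encode time of day so that 23:00 and 00:00 end up close together, and the post does not say which encoding the model uses. The function names are hypothetical.

```python
import math


def traffic_ratio(segment_speed_mph, speed_limit_mph):
    """Actual segment speed divided by the posted speed limit."""
    if speed_limit_mph <= 0:
        raise ValueError("speed limit must be positive")
    return segment_speed_mph / speed_limit_mph


def encode_hour(hour):
    """Map an hour of day onto a circle (assumed encoding, not confirmed)."""
    angle = 2 * math.pi * (hour % 24) / 24
    return math.sin(angle), math.cos(angle)


if __name__ == "__main__":
    # A segment averaging 30 mph on a 60 mph road suggests heavy traffic.
    print(traffic_ratio(30.0, 60.0))
    print(encode_hour(23), encode_hour(0))
```

A ratio well below 1 hints at congestion the driver did not choose, which is exactly the kind of context the model needs in order to avoid penalizing them for it.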

From there, we benchmarked driver segments against other travelers on that road to develop an understanding of fuel efficiency that decouples driver behavior from real-world constraints.
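The benchmarking idea can be illustrated with a deliberately simplified stand-in: group segments by a context key, take the median miles per gallon in each context as the baseline, and score each driver by how their segments compare to that baseline. The real model learns its baseline from ~50 variables; the `context` key, the median baseline, and the function name here are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import median


def relative_efficiency(segments):
    """Score drivers against peers in the same context.

    segments: list of dicts with "driver", "context", and "mpg" keys
    (a hypothetical layout). Returns {driver: mean ratio to peer median},
    where 1.0 means "typical for those conditions".
    """
    mpg_by_context = defaultdict(list)
    for seg in segments:
        mpg_by_context[seg["context"]].append(seg["mpg"])
    # Baseline: what "normal" looks like in each context.
    baseline = {ctx: median(values) for ctx, values in mpg_by_context.items()}

    ratios_by_driver = defaultdict(list)
    for seg in segments:
        ratios_by_driver[seg["driver"]].append(seg["mpg"] / baseline[seg["context"]])
    return {d: sum(r) / len(r) for d, r in ratios_by_driver.items()}


if __name__ == "__main__":
    data = [
        {"driver": "A", "context": "mountain", "mpg": 5.0},
        {"driver": "B", "context": "mountain", "mpg": 4.0},
        {"driver": "C", "context": "mountain", "mpg": 6.0},
        {"driver": "A", "context": "flat", "mpg": 10.0},
        {"driver": "B", "context": "flat", "mpg": 10.0},
        {"driver": "C", "context": "flat", "mpg": 10.0},
    ]
    print(relative_efficiency(data))
```

Note that a raw comparison would rank every mountain segment below every flat one. Comparing within a context removes that bias, so only the gap to peers facing the same conditions counts.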

This week, we shipped our advanced driver efficiency ML model for liquid-fuel vehicles. It identifies a bottom tier of drivers (e.g., the bottom 10%) and gives fleet operators a targeted coaching list with AI-powered coaching tools, so they can act with confidence on what drivers can actually control.
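As a rough sketch of the last step, picking a bottom tier from per-driver scores might look like the following. The score dictionary, the `percent` parameter, and the rule of always returning at least one driver are assumptions for illustration, not the shipped product logic.

```python
def coaching_list(scores, percent=10):
    """Return the lowest-scoring drivers, e.g. the bottom 10%.

    scores: {driver_id: efficiency_score}, where higher is better
    (a hypothetical layout).
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    if not scores:
        return []
    # Integer arithmetic avoids floating-point surprises in the cutoff.
    count = max(1, len(scores) * percent // 100)
    ranked = sorted(scores, key=lambda driver: scores[driver])
    return ranked[:count]


if __name__ == "__main__":
    fleet = {"d%d" % i: float(i) for i in range(20)}
    print(coaching_list(fleet, percent=10))
```

Because the scores are already adjusted for road, vehicle, and traffic context, the drivers on this list are the ones with the most fuel burn that is actually within their control.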

Kudos to James Berglund for crafting the model!

Fuel efficiency is a system problem

While fuel prices aren’t something fleets can control, they can determine how efficiently they use that fuel. Driver behavior is one tool, and a useful starting point for fleets looking to get a handle on costs.

Ultimately, fuel efficiency is a system problem. It extends into predicting maintenance issues that degrade efficiency, optimizing routes for real-world constraints, and deploying mixed EV and diesel fleets more intelligently. More to come.

See you in the field.
