S&D Control Center

Balancing machine learning automation with human intervention for operations teams managing €60 million a year.

Lead Product Designer, 2024

A screenshot of the calendar view.

Summary

  • Stuart’s supply and demand tooling had grown into 13 separate solutions across countries, and a critical outage put major client relationships at risk.
  • I led the design work to fix this, proposing an “automate and override” model that resolved a long-running tension between operations wanting manual control and engineering wanting full automation.
  • We shipped a single global platform where ML-generated rate suggestions are visible and easily adjustable, with real-time spend forecasting so teams can make confident decisions.
  • The majority of rates are now set automatically; operations teams step in only for exceptions.

The problem

Stuart uses incentives to balance courier supply with customer demand. When there aren’t enough couriers in an area, it increases payment rates to attract more. Area Multipliers are the main tool to manage this. They control €60 million a year in payments across Europe.
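The mechanics are simple to illustrate. In the sketch below, the function name and the euro figures are hypothetical, not Stuart’s actual rates; the point is only that a multiplier scales the base payment for deliveries in an area.

```python
# Illustrative only: hypothetical base rate and multiplier showing how
# an area multiplier scales a courier's per-delivery payment.

def apply_multiplier(base_rate_eur: float, multiplier: float) -> float:
    """Effective per-delivery rate after the area multiplier is applied."""
    return round(base_rate_eur * multiplier, 2)

# A €5.00 base rate with a 1.3x multiplier during a demand spike:
assert apply_multiplier(5.00, 1.3) == 6.5
```

Small adjustments to a multiplier, applied across hundreds of areas, move large amounts of spend, which is why the tooling around them matters so much.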

In 2022, the tooling broke. A script running from someone’s laptop failed. Orders were late and Stuart breached key SLAs, putting major enterprise client relationships at risk.

When we looked into it, the problems went deeper than one broken script. Each country had built its own solutions, and no one fully understood the patchwork that had grown organically.

Operations wanted manual control because models couldn’t always handle their complexity. Engineering wanted scalability because manual processes kept breaking. The company overall wanted more automation.

As Lead Product Designer, I was brought in to help work out what to do.

Understanding what we had

Before designing anything, I needed to understand the whole system. It had never been mapped, and no one held a complete picture.

I spent time with the Head of Global Operations documenting what existed, how tools connected, who used them, what problems they solved, and where things broke.

We found 13 separate solutions doing similar work. Some were proper applications. Others were spreadsheets with scripts. Each had been built to solve a local problem.

At the same time, the same argument kept surfacing in most conversations:

Operations: “We need to control this manually. Models can’t handle our complexity.”
Engineering: “We need to automate everything.”

As long as the problem was framed as a choice between the two, nothing moved fast enough.

Automate and override

I proposed something different. Instead of asking whether we should automate or keep manual control, let’s ask: how do we design for both working together?

Four principles from this model guided our decisions throughout the project.

How this shaped the design

Making the model visible

The ALMO model generates weekly rate suggestions using historical demand patterns, weather forecasts, local events, and budget constraints.

I designed the calendar to show these suggestions as a starting point. The suggestions are accepted by default, but operations teams can adjust them or replace them entirely.

The model does the volume work. People handle the exceptions and the things the model can’t see.
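The core of “automate and override” can be sketched as a data model. The names below are assumptions, not the production schema; the key idea is that the ML suggestion is stored alongside any human override, so overrides stay visible and reversible rather than silently replacing the model’s output.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RateSlot:
    """One area/day cell in the calendar (illustrative sketch)."""
    area: str
    day: str
    suggested: float                   # ALMO model output, accepted by default
    override: Optional[float] = None   # set only when operations intervenes

    @property
    def effective(self) -> float:
        # The human override wins when present; otherwise the suggestion applies.
        return self.override if self.override is not None else self.suggested

slot = RateSlot(area="Central London", day="Saturday", suggested=1.2)
assert slot.effective == 1.2   # model suggestion applies by default
slot.override = 1.5
assert slot.effective == 1.5   # override is explicit and reversible
```

Keeping both values around also means the interface can always show how far a human adjustment deviates from what the model proposed.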

Showing what will happen

Teams kept asking for one thing: show us what changes will cost before we publish them.

Overspend hurts margins. Underspend means late deliveries and breached SLAs. They needed to balance both risks before committing.

I designed real-time spend forecasting. As you edit rates, you see total spend updating, with a comparison to the budget target for that area.
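Conceptually, the forecast is a running sum over the edited slots. The fields and numbers below are hypothetical, but they show the calculation the interface recomputes as rates change: expected deliveries times effective rate, compared against the area’s budget target.

```python
# A minimal sketch of the spend forecast (hypothetical fields and numbers):
# recompute projected spend on every rate edit and compare it to budget.

def forecast_spend(slots: list[dict]) -> float:
    """Projected spend: expected deliveries x effective rate, summed over slots."""
    return sum(s["expected_deliveries"] * s["rate_eur"] for s in slots)

slots = [
    {"expected_deliveries": 400, "rate_eur": 5.20},
    {"expected_deliveries": 250, "rate_eur": 6.50},
]
budget_target = 4000.00
spend = forecast_spend(slots)        # 400 x 5.20 + 250 x 6.50 = €3,705
assert abs(spend - 3705.0) < 1e-9
assert spend <= budget_target        # within target for this area
```

Surfacing this delta continuously, rather than after publishing, is what let teams weigh overspend against the risk of late deliveries before committing.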

Bulk actions and granular control

Operations teams manage hundreds of delivery areas each week. They needed to work in bulk to be efficient, but they also needed to adjust individual areas when something unusual was happening.

The interface supports both: bulk edits across many areas at once, and granular adjustments to individual areas.

The technical constraint was that each change triggers API calls. With 100 areas across 7 days and multiple time slots, that’s potentially thousands of calls.

I worked with Engineering to batch requests and design the interface around a bulk-first-then-refine workflow. We showed processing time so people could see what was happening.
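The batching idea itself is straightforward. In this sketch the batch size and field names are assumptions; the point is that grouping edits into fixed-size batches turns thousands of per-cell API calls into a handful of requests.

```python
# Illustrative sketch: group individual rate changes into fixed-size
# batches instead of issuing one API call per edited cell.

def chunk(changes: list, batch_size: int = 100) -> list[list]:
    """Split a flat list of rate changes into batches for the API."""
    return [changes[i:i + batch_size] for i in range(0, len(changes), batch_size)]

# 100 areas x 7 days x 3 time slots = 2,100 individual changes...
changes = [{"area": a, "day": d, "slot": s}
           for a in range(100) for d in range(7) for s in range(3)]
batches = chunk(changes)
# ...become 21 batched requests instead of 2,100 calls.
assert len(changes) == 2100 and len(batches) == 21
```

Showing processing time in the interface complemented this: batched requests take a moment, and visible progress kept the delay from feeling like a failure.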

Results

  • ML model now sets the majority of rates automatically
  • Operations teams only intervene for exceptions, not routine work
  • From 13 separate solutions to 1 globally adopted platform
  • 0 new outages caused by the S&D tooling

Explored but didn’t ship

Once the platform was working, I explored having a staging view and letting operations teams describe changes in natural language.

“Boost Central London by 0.3 on Saturday evening.”

The AI would generate a staging view where the user can review and refine.

In 2024, it was too early. We estimated the effort as far larger than it would be today, given recent advances in AI models.

The system we shipped let the model do the volume work while people handled the exceptions. It solved the problem, and we could move on to other issues.
