Most Mid-Market Companies Still Can't Tell You If Their AI Pilots Worked
I had coffee last week with a CFO at a mid-market manufacturer — about $400 million in revenue, 1,200 staff. They’d run three AI pilots over the past eighteen months. When I asked what the return was, she paused for an uncomfortably long time before saying, “We think it’s positive.”
That’s not good enough. And she knows it.
The measurement gap is real
Here’s the pattern I keep seeing. A mid-market company gets excited about AI. They stand up a pilot — maybe it’s demand forecasting, maybe it’s automating invoice processing, maybe it’s a customer service chatbot. The pilot runs for three to six months. Everyone agrees it “seems to be working.” But when the board asks for hard numbers, the team scrambles.
A McKinsey survey from late 2025 found that while 72% of organisations had adopted AI in some form, fewer than 30% could quantify the financial impact. For mid-market companies with less sophisticated data infrastructure, that number drops even lower.
The problem isn’t that AI doesn’t deliver value. It usually does. The problem is that nobody set up the measurement framework before the pilot started.
Why mid-market gets hit hardest
Enterprise companies have dedicated value realisation teams. They’ve got dashboards, they’ve got programme management offices, they’ve got consultants crawling all over the place tracking every metric.
Startups don’t bother with formal ROI — they’re moving fast and the founder can feel whether something’s working.
Mid-market sits in the awkward middle. Big enough that the board wants rigour. Small enough that there’s no dedicated team to provide it. The AI champion is usually someone with a day job — a Head of Operations or IT Director who took on the AI portfolio because they were enthusiastic. They’re brilliant at getting pilots off the ground but they weren’t trained in benefits tracking.
The three mistakes I see repeatedly
1. Measuring the wrong things. A logistics company I advised was tracking “model accuracy” on their route optimisation AI. The data science team was thrilled — 94% accuracy! But nobody had connected that to fuel savings, delivery times, or driver overtime. The metric that matters to the business was buried three layers deep.
2. No baseline. You can’t measure improvement if you don’t know where you started. I’ve lost count of how many companies launched an AI pilot without first documenting the current state. What does the process cost today? How long does it take? What’s the error rate? If you don’t capture that before you flip the switch, you’re guessing.
3. Ignoring the hidden costs. The pilot might save $200K in processing time, but if building it took $150K in cloud compute and $80K in consultant fees, that's $230K against $200K saved: you're underwater before you even price the 400 hours of internal staff time. Total cost of ownership matters, and it's routinely underestimated.
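The arithmetic above is worth writing down once so it can't be fudged. A minimal sketch, using the article's illustrative figures; the function name and the $100/hour loaded staff rate are my assumptions, not a standard formula:

```python
# Sketch: net-return check for an AI pilot after total cost of ownership.
# Figures are the illustrative ones from the article; the loaded hourly
# rate is an assumption for illustration.

def pilot_net_return(gross_savings, cloud_cost, consultant_fees,
                     internal_hours, loaded_hourly_rate):
    """Gross savings minus the full cost of building and running the pilot."""
    total_cost = (cloud_cost + consultant_fees
                  + internal_hours * loaded_hourly_rate)
    return gross_savings - total_cost

# $200K saved, $150K compute, $80K consultants, 400 internal hours at $100/hr
net = pilot_net_return(200_000, 150_000, 80_000, 400, 100)
print(net)  # negative: the pilot is underwater once TCO is counted
```

Running the numbers this way forces the team to name every cost line before claiming a win.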
What actually works
The companies getting this right share a few traits.
They define success criteria before the pilot begins. Not vague ones like “improve efficiency” — specific, measurable targets. “Reduce average invoice processing time from 12 minutes to under 4 minutes” gives you something real to track.
They run controlled comparisons. One team I worked with split their customer service queue: half went through the AI system, half stayed manual. Same period, same customer types. The comparison was clean and the results were undeniable.
They account for adoption. A brilliant AI tool that nobody uses has zero ROI. The best measurement frameworks include usage metrics alongside outcome metrics. Are people actually using the thing? How often? For what?
One mid-market retailer I know brought in Team400 specifically to help them build a measurement framework around their demand forecasting pilot. Their internal team had built the model but couldn’t articulate the value to the board. Within six weeks they had a clear picture: the AI was reducing overstock by 18% and improving availability by 7%, translating to roughly $1.2 million annually. That’s the kind of clarity that gets continued funding.
The ROI framework that mid-market needs
I’ve been recommending a straightforward four-layer approach:
Layer 1: Direct financial impact. Revenue gained, costs saved, capital freed. This is the number the board cares about most.
Layer 2: Operational improvement. Speed, accuracy, throughput. These are the leading indicators that drive Layer 1.
Layer 3: Strategic value. Capabilities gained, competitive positioning, risk reduction. Harder to quantify but essential for the full picture.
Layer 4: Learning value. What did the organisation learn about AI delivery? What capabilities did the team build? This is especially important for first and second pilots, where the real value might be organisational learning rather than immediate financial return.
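For teams that want to operationalise the four layers, they map naturally onto a one-page scorecard per pilot. A minimal sketch; the layer names come from the framework above, but the structure and every example entry are my assumptions:

```python
# Sketch: the four-layer framework as a per-pilot scorecard.
# Layer names follow the article; the example values are invented.
from dataclasses import dataclass, field

@dataclass
class PilotScorecard:
    name: str
    financial: dict = field(default_factory=dict)    # Layer 1: direct financial impact
    operational: dict = field(default_factory=dict)  # Layer 2: leading indicators
    strategic: list = field(default_factory=list)    # Layer 3: harder-to-quantify value
    learning: list = field(default_factory=list)     # Layer 4: organisational learning

card = PilotScorecard(
    name="Invoice automation pilot",
    financial={"annual_cost_saved_usd": 200_000},
    operational={"avg_processing_minutes": {"before": 12, "after": 4}},
    strategic=["reduced key-person risk in accounts payable"],
    learning=["team can now scope document-extraction work in-house"],
)
print(card.financial["annual_cost_saved_usd"])
```

Keeping all four layers on the same card stops Layer 1 from crowding out the learning value that often justifies a first or second pilot.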
Stop flying blind
If you’re a mid-market executive reading this and you can’t clearly articulate the ROI of your AI initiatives, you’re not alone — but you do need to fix it. The window for “we’re just experimenting” is closing. Boards want numbers. The companies that can demonstrate clear returns will get more funding for AI. The ones that can’t will see budgets cut.
Measurement isn’t glamorous. It doesn’t make for exciting LinkedIn posts. But it’s the difference between AI programmes that scale and AI programmes that quietly die after the pilot.
Start measuring properly. Your future AI roadmap depends on it.