SAP’s AI got a 95% score, until consultants knew it was AI


According to VentureBeat, SAP ran a quiet internal experiment in which five consultant teams were asked to validate answers to over 1,000 business requirements completed by its AI copilot, Joule for Consultants. Four teams were told the analysis was done by junior interns, and they rated the work about 95% accurate. The fifth team was told the answers came from AI, and they rejected almost everything outright. Only when that fifth team was forced to validate each answer one by one did they discover the AI was, in fact, highly accurate: again, about 95%. Guillermo B. Vazquez Mendez, chief architect at SAP America Inc., says the lesson is to communicate carefully with senior consultants when introducing AI.


The human bias problem

This is a fascinating and, honestly, pretty predictable result. The bias against AI output is real, even when the work is objectively good. It’s one thing to intellectually accept that AI is a tool, but it’s another to trust its output when your reputation is on the line. Consultants with decades of experience have built careers on their judgment, so an AI just feels like an unproven rookie. The kicker? The AI was surfacing detailed insights the humans initially dismissed. That’s the real danger here: letting bias blind you to good information, regardless of its source.

Shifting from tech to business

Here’s the thing: SAP’s argument isn’t that AI is perfect. It’s that AI changes the *economics* of consulting time. Vazquez says that, historically, consultants spent 80% of their time on technical execution: understanding systems, data flows, and processes. Customers, meanwhile, spend 80% of their time on their actual business. That’s a huge mismatch. The promise of a tool like Joule is to flip that equation. If AI handles the technical grunt work and documentation dredging, the expensive human consultant can shift that 80% of effort toward business strategy and outcomes. That’s a compelling value proposition, if you can get past the initial skepticism.

The prompt engineering reality

Vazquez admits we’re still in the “toddler” stages of AI. Right now, getting good results from copilots like Joule heavily depends on prompt engineering. Consultants have to learn to frame requests precisely—like telling the AI to act as a senior CTO specializing in finance. It’s a new skill. But it’s also becoming a great equalizer. New hires and interns, who are often more tech-savvy, can use the AI to operate independently and ask better, more targeted questions of senior mentors. That creates a positive feedback loop where the seniors see the juniors being effective with the tool and are more inclined to adopt it themselves. It’s a clever, organic adoption strategy.
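
To make that role-framing idea concrete, here is a minimal sketch of how a consultant might assemble such a prompt before handing it to a copilot. Everything in it, the `build_prompt` helper and the sample requirement, is an illustrative assumption for this article, not Joule’s actual interface or any SAP API.

```python
# Illustrative sketch of role-framed prompting. This helper is hypothetical;
# it is not part of Joule or any SAP API.

def build_prompt(role: str, domain: str, requirement: str) -> str:
    """Frame a business-requirement question with an explicit persona and scope."""
    return (
        f"Act as a {role} specializing in {domain}.\n"
        "Assess whether the following business requirement can be met, and "
        "explain which process steps and configuration it depends on:\n\n"
        f"{requirement}"
    )

if __name__ == "__main__":
    # Hypothetical requirement text, loosely in the spirit of the 1,000+
    # requirements the consultant teams were asked to validate.
    prompt = build_prompt(
        role="senior CTO",
        domain="finance",
        requirement=(
            "Intercompany invoices must post automatically once both "
            "entities approve the related purchase order."
        ),
    )
    print(prompt)  # The consultant would send this text to the copilot.
```

The point is less the code than the framing: an explicit persona, a bounded domain, and a concrete task give the copilot something specific to answer and give the reviewer something specific to validate.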

The agentic AI future

The endgame here isn’t just a better chatbot. SAP is looking toward what it calls “the consultant of 2030,” enabled by agentic AI. The company’s advantage, it claims, is its repository of over 3,500 mapped business processes and the fact that its systems support trillions in commerce daily. That’s the training ground. The vision is for AI to move beyond answering prompts to interpreting entire business processes, knowing where to inject human judgment and where an AI agent can execute autonomously. That’s a big leap from today’s prompt-driven world. But it starts with overcoming that very human instinct to reject the machine’s work before even looking at it. SAP’s experiment shows that might be the biggest hurdle of all.
