TL;DR
Building agentic AI applications with a problem-first approach means defining the specific problem, its constraints, its success criteria, and its required human oversight points before choosing a framework, model, or orchestration pattern. Organizations that do this report 40% higher deployment success rates than technology-first teams (Globe Streak, Nov 2025).
The most common cause of enterprise agentic AI production failures is not inadequate AI models — it is treating the model as the starting point. Without a defined problem, an AI agent has no bounded scope, no calibrated confidence threshold, and no designed human escalation path. The result is what one analyst calls “a stochastic loop that lacks a defined exit condition” (Miles K., Medium, Feb 2026).
In regulated European enterprises, the problem-first approach is not a methodology preference; for high-risk systems it is a compliance requirement. EU AI Act Article 9 requires a risk management system that runs continuously across the lifecycle of a high-risk AI system, which means it has to be built into every deployment stage. The five questions in this framework produce the problem definition that Article 9 risk management, Article 14 human oversight design, and GDPR Article 30 record-keeping (the scope of the audit trail) all depend on.
| # | Question | What a good answer looks like | Why it matters for agent design |
|---|---|---|---|
| 1 | What specific process fails, and how does the failure manifest? | A named workflow step, a measurable outcome gap, a specific friction point. Not “we want to automate” but “claims triage takes 14 days because staff manually classify 2,400 submissions per week.” | Defines the agent’s task precisely. Without this, the agent’s scope cannot be bounded, and unbounded scope is a leading cause of production failures. |
| 2 | What constraints govern the solution? (compliance, data, human, time) | Named regulatory frameworks (GDPR Art. 30, DORA, EU AI Act Art. 14), data residency requirements, mandatory human review steps, SLA windows. Constraints are not obstacles — they are design inputs. | Constraints define what the agent cannot do. EU AI Act Art. 9 requires ongoing risk management built into every deployment stage — the problem definition is where that risk management starts. |
| 3 | What does a successful outcome look like — and how will you measure it? | A specific metric with a baseline: “14-day average processing time reduced to under 4 days” or “2,400 manual reviews per week reduced by 70% while maintaining <0.5% error rate.” | Without a success metric defined before deployment, the agent’s confidence threshold cannot be calibrated — and you cannot know whether the agent is performing or slowly degrading. |
| 4 | Where does human judgment remain essential? | Specific decision points: claims above €50,000, eligibility determinations where the applicant disputes the decision, cases involving incomplete data. Not “humans review everything” — a specific threshold. | This question produces the human-in-the-loop design. EU AI Act Art. 14 requires human oversight mechanisms for high-risk AI. The answer to this question is the governance design, not a policy statement. |
| 5 | What existing systems does the solution need to integrate with, and what needs to be true about them? | Named systems with integration method: “SAP S/4HANA via OData for order data; legacy fleet system via SOAP for dispatch records.” Also data quality requirements, API availability, and whether access is real-time or batch. | An agent operating on incomplete data produces confident wrong answers. The integration architecture is not a technical detail to resolve after the agent is designed; it is a prerequisite for knowing whether the problem is even solvable agentically. |
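
The answers to questions 3 and 4 can be written down as explicit, testable values before any framework is chosen. The following is a minimal Python sketch under stated assumptions, not a reference implementation: the class names, field names, and the 0.90 confidence floor are hypothetical, while the €50,000 threshold, the 2,400 weekly reviews, the 70% reduction, and the 0.5% error ceiling are taken from the example answers in the table.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record; field names are illustrative only."""
    amount_eur: float
    disputed: bool
    data_complete: bool
    agent_confidence: float  # 0.0 to 1.0, as reported by the agent


@dataclass(frozen=True)
class EscalationPolicy:
    """Question 4 expressed as thresholds instead of a policy statement."""
    max_auto_amount_eur: float = 50_000.0  # threshold from the table example
    min_confidence: float = 0.90           # assumed value; calibrate per use case


def escalation_reasons(claim: Claim, policy: EscalationPolicy) -> list[str]:
    """Return every reason a human must review the claim (empty list: none)."""
    reasons = []
    if claim.amount_eur > policy.max_auto_amount_eur:
        reasons.append("amount_above_threshold")
    if claim.disputed:
        reasons.append("applicant_disputes_decision")
    if not claim.data_complete:
        reasons.append("incomplete_data")
    if claim.agent_confidence < policy.min_confidence:
        reasons.append("low_confidence")
    return reasons


@dataclass(frozen=True)
class SuccessCriteria:
    """Question 3: a baseline plus the targets measured against it."""
    baseline_manual_reviews_per_week: int = 2_400
    target_reduction: float = 0.70   # 70% fewer manual reviews
    max_error_rate: float = 0.005    # error rate must stay below 0.5%

    def max_manual_reviews_per_week(self) -> int:
        # 2,400 * (1 - 0.70) = 720 reviews per week
        return round(self.baseline_manual_reviews_per_week
                     * (1 - self.target_reduction))

    def is_met(self, manual_reviews_per_week: int, error_rate: float) -> bool:
        return (manual_reviews_per_week <= self.max_manual_reviews_per_week()
                and error_rate < self.max_error_rate)
```

Returning the list of reasons, not just a yes/no flag, means each escalation carries its own justification, which is the kind of record an oversight or audit process can use. A claim of €75,000 that the applicant disputes would come back with both `amount_above_threshold` and `applicant_disputes_decision`.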