How To Run An Agentic AI Engagement
Use this guide when you need to run an internal initiative or client engagement, manage decision gates, and produce delivery evidence. If you are new to the kit, start with the tutorial.
Purpose
This process helps organizations convert agentic AI ideas into governed, secure, measurable Microsoft-based business solutions. It is intentionally outcome-first: business value, data readiness, operating accountability, responsible AI, and lifecycle controls are defined before build decisions become permanent.
The process can be used for Microsoft 365 Copilot extensions, Copilot Studio agents, Microsoft Foundry agents, Power Platform and Dynamics 365 AI capabilities, Azure AI workloads, and custom Microsoft-hosted agent systems.
Engagement Principles
- Start with measurable business outcomes, not a preferred tool.
- Treat "not an agent" as a valid decision when deterministic automation, workflow, RAG, analytics, or a prebuilt SaaS capability is enough.
- Prefer Microsoft SaaS and extension points where they meet the requirement; move to Copilot Studio, Foundry, or custom architecture only when the need justifies the added complexity.
- Require every agent to have an owner, identity, registry entry, access boundary, control evidence, telemetry, cost model, and retirement path.
- Design human approvals, fallback, escalation, and deterministic workflows for high-risk or regulated actions.
- Promote prompts, agents, connectors, tools, models, and data dependencies through controlled environments using ALM practices.
Phase Gates
Use the gates to stop drift before the work moves to a more expensive stage.
Gate 0: Engage
Decide whether the sponsoring organization or team is ready to run the assessment.
Minimum evidence:
- Named sponsor, business owner, workload owner, and security or compliance contact.
- Target business or operating context and workshop schedule.
Gate 1: Select
Decide which use cases are worth piloting.
Minimum evidence:
- Outcome map and KPI baseline.
- Use-case inventory, "not an agent" decision log, prioritization score, and pilot shortlist.
Gate 2: Shape
Decide what solution pattern should be piloted.
Minimum evidence:
- Data readiness assessment and grounding strategy.
- Retrieval, API, or MCP decision register.
- Platform selection record and target architecture.
Gate 3: Govern
Decide whether the pilot can be built within acceptable controls.
Minimum evidence:
- Agent owner, registry metadata, and RACI.
- Responsible AI assessment, threat model, RBAC, DLP, guardrail plan, and audit trail design.
Gate 4: Validate
Decide whether the pilot proved business value and control effectiveness.
Minimum evidence:
- Prototype, golden test set, red-team results, and measurements of task completion, quality, safety, cost, latency, and user feedback.
- Promotion decision.
Gate 5: Scale
Decide whether to scale, redesign, pause, or retire the solution.
Minimum evidence:
- Rollout plan, adoption telemetry, observability dashboard, cost controls, support model, and lifecycle review cadence.
Ten-Step Process
1. Frame Business Outcomes
Anchor the work to measurable value before discussing tools.
Ask stakeholders:
- Which outcomes matter most: cost, speed, quality, revenue, customer experience, employee experience, or risk reduction?
- What KPI baseline exists today?
- Who owns the value case?
Do the work:
- Confirm executive intent and affected business capabilities.
- Document the KPI baseline, success criteria, measurement method, and time horizon.
- Create the AI opportunity brief and outcome map.
Exit when each candidate outcome has an owner, baseline, target, measurement method, and review date.
2. Identify And Filter Use Cases
Separate agentic opportunities from simpler automation, static Q&A, analytics, or standard SaaS capabilities.
Ask stakeholders:
- Which workflows need reasoning, tool use, or adaptive decisions?
- Which workflows are deterministic automation, static Q&A, analytics, or standard product capability?
Do the work:
- Inventory ideas and identify workflow steps.
- Classify the required intelligence and log "not an agent" decisions.
- Separate productivity, action, automation, knowledge/RAG, model/analytics, and custom agent patterns.
Exit when every idea has a classification, rationale, and recommended next path.
3. Prioritize The Portfolio
Select pilots with the best combination of value, feasibility, desirability, and control readiness.
Ask stakeholders:
- What is the business impact, technical feasibility, and user desirability of each use case?
- What pilot result triggers scale, redesign, pause, or stop?
Do the work:
- Score use cases and assess adoption readiness.
- Identify risk and control burden.
- Define pilot scope and go/no-go gates.
Exit when one to three pilots have clear hypotheses, metrics, scope boundaries, and decision gates.
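The scoring step above can be sketched as a simple weighted model. The dimension names, weights, and candidate scores below are illustrative assumptions, not prescribed values; replace them with whatever your stakeholders agree in Workshop 1.

```python
# Illustrative weighted scoring for the use-case portfolio (Step 3).
# Dimensions and weights are assumptions; align them with stakeholder input.
WEIGHTS = {
    "business_impact": 0.35,
    "technical_feasibility": 0.25,
    "user_desirability": 0.20,
    "control_readiness": 0.20,
}

def priority_score(scores: dict[str, float]) -> float:
    """Weighted average of 1-5 stakeholder scores per dimension."""
    return round(sum(WEIGHTS[d] * scores[d] for d in WEIGHTS), 2)

# Hypothetical candidates scored 1-5 on each dimension.
candidates = {
    "invoice-triage-agent": {"business_impact": 5, "technical_feasibility": 4,
                             "user_desirability": 4, "control_readiness": 3},
    "hr-policy-qna": {"business_impact": 3, "technical_feasibility": 5,
                      "user_desirability": 4, "control_readiness": 5},
}

# Rank the portfolio; the top one to three entries become the pilot shortlist.
shortlist = sorted(candidates, key=lambda c: priority_score(candidates[c]),
                   reverse=True)
```

Keeping the weights explicit makes the "not an agent" and pause/stop debates concrete: stakeholders argue about the weights once, then the ranking follows mechanically.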
4. Assess Data And Grounding
Prove that the agent can use trusted information safely.
Ask stakeholders:
- What data is authoritative, accurate, current, clean, permissioned, available, and compliant?
- Does the agent need search, API access, MCP, connectors, or a mix?
Do the work:
- Map systems of record and access paths.
- Assess data quality, permissions, retention, residency, and compliance constraints.
- Choose grounding methods and document lineage.
Exit when authoritative sources, access paths, data gaps, permissions, retention, residency, and compliance constraints are documented.
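The per-source assessment can be recorded as a checklist keyed to the same attributes the stakeholder question lists. The source name and pass/fail values below are hypothetical; the point is that every failed attribute becomes a logged data gap.

```python
# Illustrative data-readiness check for one grounding source (Step 4).
# The attribute list mirrors the assessment questions in this step.
READINESS_ATTRIBUTES = [
    "authoritative", "accurate", "current", "clean",
    "permissioned", "available", "compliant",
]

def readiness_gaps(source: dict[str, bool]) -> list[str]:
    """Return the attributes a source fails, to log as data gaps."""
    return [a for a in READINESS_ATTRIBUTES if not source.get(a, False)]

# Hypothetical source: a policy library that is trusted but stale.
policy_library = {
    "authoritative": True, "accurate": True, "current": False,
    "clean": True, "permissioned": True, "available": True,
    "compliant": True,
}

gaps = readiness_gaps(policy_library)
```

A source with an empty gap list is a candidate grounding source; anything else goes into the remediation backlog before the pilot depends on it.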
5. Select The Solution Pattern
Choose the simplest Microsoft-aligned pattern that meets the need.
Ask stakeholders:
- Can a SaaS or prebuilt agent meet the need?
- Should the solution extend Microsoft 365 Copilot, use Copilot Studio, use Foundry, or become a custom architecture?
- Does the pilot need one agent or multiple agents?
Do the work:
- Apply build, buy, and extend logic.
- Compare Microsoft platform options.
- Define the orchestration pattern and draft the target architecture.
Exit when the selected pattern has rationale, tradeoffs, assumptions, cost implications, and architecture constraints.
6. Define Governance And Operating Model
Establish accountability before build.
Ask stakeholders:
- Who owns each agent, its funding, its risks, and its lifecycle?
- How are agents registered, identified, monitored, paused, retired, and reviewed?
- What are the roles of the platform team, workload team, AI CoE, security, compliance, privacy, risk, and operations?
Do the work:
- Define the RACI, agent lifecycle, registry metadata, intake process, and approval process.
- Assign funding, policy ownership, and AI CoE backlog responsibilities.
Exit when each agent has accountable ownership, registry requirements, lifecycle controls, and decision rights.
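The registry requirement can be made concrete with a minimal entry schema. The field names and example values below are assumptions for the sketch; map them to your organization's registry, RACI, and lifecycle conventions.

```python
from dataclasses import dataclass, field

# Illustrative agent registry entry (Step 6). Field names are assumptions;
# align them with your registry and lifecycle conventions.
@dataclass
class AgentRegistryEntry:
    agent_id: str
    owner: str                  # accountable business owner
    funding_owner: str
    lifecycle_stage: str        # e.g. "pilot", "production", "retired"
    access_boundary: str        # identity / RBAC scope the agent runs under
    review_cadence_days: int = 90
    controls: list[str] = field(default_factory=list)

# Hypothetical entry created at intake, before any build work starts.
entry = AgentRegistryEntry(
    agent_id="invoice-triage-agent",
    owner="finance-ops",
    funding_owner="cfo-office",
    lifecycle_stage="pilot",
    access_boundary="finance-readers-group",
    controls=["dlp", "audit-logging", "human-approval-over-threshold"],
)
```

Because every field is mandatory except the defaults, an agent cannot enter the registry without an owner, a funding owner, and an access boundary, which is exactly the accountability the gate demands.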
7. Design The Agent System
Define boundaries, behavior, and tool use.
Ask stakeholders:
- What are the agent's scope, non-goals, tools, knowledge sources, memory rules, fallback paths, approvals, and escalation routes?
- Which actions are prohibited?
Do the work:
- Write the agent charter.
- Define orchestration, tools, actions, memory, retention, instruction architecture, fallback, human checkpoints, and prompt library standards.
Exit when scope, non-goals, tool permissions, approval points, fallbacks, and escalation routes are explicit and testable.
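"Explicit and testable" boundaries can be expressed as allow lists with a fail-closed default. The action names and approval rules below are hypothetical; the structure is the point: prohibited actions are denied, sensitive actions escalate to a human checkpoint, and anything outside the charter is denied by default.

```python
# Illustrative enforcement of charter boundaries (Step 7). Action names
# and approval rules are assumptions for the sketch.
CHARTER = {
    "allowed_actions": {"lookup_invoice", "draft_reply", "post_payment"},
    "prohibited_actions": {"delete_record", "change_vendor_bank_details"},
    "requires_human_approval": {"post_payment"},
}

def authorize(action: str) -> str:
    """Return 'allow', 'escalate' (human checkpoint), or 'deny'."""
    if action in CHARTER["prohibited_actions"]:
        return "deny"
    if action in CHARTER["requires_human_approval"]:
        return "escalate"
    if action in CHARTER["allowed_actions"]:
        return "allow"
    return "deny"  # fail closed: anything not in the charter is denied
```

The same table doubles as test input: each charter rule becomes a test case, which satisfies the exit criterion that boundaries be testable, not just documented.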
8. Design Security, Responsible AI, And Compliance
Make risk controls provable.
Ask stakeholders:
- Which prompt injection, data leakage, access, residency, model, audit, and tool-use risks apply?
- Which controls prove fairness, safety, transparency, privacy, security, and accountability?
Do the work:
- Threat model the agent and data flows.
- Map RBAC, DLP, content safety, abuse monitoring, audit logging, data residency, responsible AI evidence, and incident procedures.
- Assign evidence owners and residual risk decision paths.
Exit when material risks have preventive or detective controls, evidence owners, test cases, and residual risk decisions.
9. Build, Test, And Validate
Prove the pilot works before scaling.
Ask stakeholders:
- Which test cases prove quality, safety, compliance, task completion, cost, latency, and user value?
- How are prompts, models, agents, connectors, tools, and data promoted across environments?
Do the work:
- Build the prototype and create the golden test set.
- Run functional, safety, security, red-team, cost, latency, and adoption tests.
- Define ALM and environment promotion.
Exit when the pilot meets agreed thresholds or receives a documented scale, redesign, pause, or stop decision.
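The threshold check at this gate can be automated so the scale/redesign recommendation is reproducible. The metric names and limits below are assumptions; use the thresholds agreed in Step 3.

```python
# Illustrative go/no-go evaluation against agreed pilot thresholds (Step 9).
# Metric names and limits are assumptions; use the values set in Step 3.
THRESHOLDS = {
    "task_completion_rate": ("min", 0.85),
    "safety_pass_rate": ("min", 0.99),
    "cost_per_task_usd": ("max", 0.50),
    "p95_latency_seconds": ("max", 8.0),
}

def gate_decision(results: dict[str, float]) -> tuple[str, list[str]]:
    """Return ('scale' or 'redesign', list of failed metrics)."""
    failed = []
    for metric, (kind, limit) in THRESHOLDS.items():
        value = results[metric]
        ok = value >= limit if kind == "min" else value <= limit
        if not ok:
            failed.append(metric)
    return ("scale" if not failed else "redesign", failed)

# Hypothetical pilot results from the golden test set and telemetry.
decision, failed = gate_decision({
    "task_completion_rate": 0.91,
    "safety_pass_rate": 0.995,
    "cost_per_task_usd": 0.42,
    "p95_latency_seconds": 6.3,
})
```

A "redesign" result returns the specific failed metrics, which gives the pause/stop discussion something concrete to act on instead of a general sense that the pilot underperformed.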
10. Roll Out, Operate, And Improve
Turn the pilot into a managed business asset.
Ask stakeholders:
- Where should the agent live in daily work?
- Which telemetry tracks value, quality, safety, cost, latency, and adoption?
- How often are agents reviewed or retired?
Do the work:
- Plan phased rollout and embed the agent in Teams, Microsoft 365, Dynamics 365, Power Apps, or portals.
- Define support, monitor telemetry, collect feedback, control cost, and schedule lifecycle reviews.
Exit when the agent is live with adoption support, telemetry, operations ownership, cost guardrails, feedback loop, and review cadence.
Workshop Mapping
Use the workshops to move accountable stakeholders through the steps in coherent decision blocks.
- Workshop 1, strategy and use-case selection, covers steps 1-3. Stakeholders agree what outcomes matter, which ideas are agentic, and which pilots deserve investment.
- Workshop 2, data and architecture, covers steps 4-5. Stakeholders agree how the pilot will be grounded and which Microsoft solution pattern should be used.
- Workshop 3, governance and risk, covers steps 6-8. Stakeholders agree how the agent will be owned, governed, secured, audited, paused, and retired.
- Workshop 4, pilot design, covers steps 9-10. Stakeholders agree what will be built, how it will be validated, how rollout will happen, and what scale decision will be made.
- Implementation follow-up covers build, validation, operation, and scale. The implementation team prototypes, tests, proves value, hands over operations, and manages lifecycle reviews.
Delivery Cadence
Adjust the timeline to fit stakeholder availability, but keep the order of decisions intact.
- Week 0: Complete pre-work and stakeholder interviews. Produce the initiative scope, participant list, and initial data and system inventory.
- Week 1: Run Workshop 1. Produce the outcome map, use-case inventory, and pilot shortlist.
- Week 2: Run Workshop 2. Produce data readiness findings, grounding strategy, platform decision, and draft architecture.
- Week 3: Run Workshop 3. Produce the governance model, risk/control register, and agent charter draft.
- Week 4: Run Workshop 4. Produce the pilot validation plan, rollout plan, observability approach, and operations approach.
- Weeks 5-8: Prototype and validate. Produce the working pilot, test results, red-team results, and cost, latency, and value telemetry.
- Week 9: Make the scale decision. Recommend scale, redesign, pause, or stop.
- Week 10 and beyond: Operate and improve. Maintain the backlog, review telemetry, run lifecycle reviews, and extract reusable patterns.
Definition Of Done
An engagement is complete when accountable stakeholders can answer and evidence the following:
- What business outcome is being improved and how it is measured.
- Why the selected use case requires an agent or why it does not.
- Which authoritative data, tools, APIs, connectors, and Microsoft services are used.
- Who owns the agent, its funding, its risks, and its lifecycle.
- What the agent may do, what it may not do, and when humans must approve or intervene.
- Which responsible AI, security, compliance, data protection, and audit controls apply.
- How quality, task completion, safety, cost, latency, and adoption are tested.
- How prompts, connectors, agents, tools, models, and data dependencies move across environments.
- Where the agent lives in daily work and how users are trained.
- How the agent is monitored, improved, paused, scaled, or retired.
Microsoft Source Alignment
This process reflects the AB-100 focus on planning, designing, deploying, responsible AI, security, ALM, ROI, monitoring, and agentic-first architecture. It also follows the Cloud Adoption Framework pattern of business planning, technology planning, governance/security, standardized build process, and managed operation.