What it is, and what it isn't.
Agentic IT collapses the cost, security, compliance, and reliability work of running Azure and Microsoft 365 into a single chat. The agent reads your inventory, knows your conventions, formulates a plan, and offers to run the fix once you approve. This page is the pitch — and the boundary lines.
Your phone buzzes. Azure spend jumped 14% over the weekend. The dashboard is open. So is Defender. So is the cost analyser. So is the architect's Notion page about which subscriptions are even yours.
You can spend ninety minutes piecing together what changed across seven tools — or you can open one chat and say: "What spiked our Azure spend over the weekend, and what should I do about it?"
The agent reads the inventory, runs the numbers, identifies the three new VMs in rg-prod-eu with no owner tag, drafts a remediation plan, and shows you the approval card before it touches a thing. Total time: under two minutes.
That's the platform. The rest of this page explains what makes that interaction possible — and where it stops.
The pain we solve
Mid-size IT teams running on Azure / Microsoft 365 / vBox-cloud carry permanent operational debt. Three shapes of pain, by name:
The orphans nobody owns
Disks attached to deleted VMs. Public IPs from a six-month-old test. Storage accounts older than the engineer who created them. The bill keeps charging; nobody knows what's safe to remove.
Quiet 5–18% of your monthly cloud bill
The wiki doesn't match prod
The naming convention says rg-{env}-{region}-{app}. Reality says half of it. The tagging policy says owner is required. Reality says it's empty in 1,400 resources. Drift compounds; nobody has a free afternoon to crawl it.
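A drift check of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the find_drift helper, the shape of the resource records, and the characters allowed in each name segment are assumptions made for the example. Only the rg-{env}-{region}-{app} pattern and the required owner tag come from the text above.

```python
import re

# Assumed reading of rg-{env}-{region}-{app}: lowercase alphanumeric segments,
# with hyphens allowed only inside the trailing app segment.
RG_NAME = re.compile(r"^rg-[a-z0-9]+-[a-z0-9]+-[a-z0-9][a-z0-9-]*$")


def find_drift(resources):
    """Return (resource id, problem) pairs for naming and tagging drift."""
    findings = []
    for res in resources:
        if not RG_NAME.match(res["resource_group"]):
            findings.append((res["id"], "resource group name breaks convention"))
        # A missing tag, a null tag, and a blank tag are all treated as empty.
        owner = (res.get("tags") or {}).get("owner") or ""
        if not owner.strip():
            findings.append((res["id"], "owner tag missing or empty"))
    return findings


# Hypothetical inventory records, invented for the example.
inventory = [
    {"id": "vm-1", "resource_group": "rg-prod-eu-billing", "tags": {"owner": "ops"}},
    {"id": "vm-2", "resource_group": "rg-prod-eu", "tags": {}},
    {"id": "disk-9", "resource_group": "test_rg", "tags": {"owner": " "}},
]

for res_id, problem in find_drift(inventory):
    print(f"{res_id}: {problem}")
```

Under this reading of the convention, a group named rg-prod-eu is itself flagged because it has no app segment, which is the kind of half-followed rule the paragraph describes.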
The senior just left
The runbook for "what to do when Defender flags a public storage account" lived in their head. The OneNote page has the first three steps. Step four is a question mark. Onboarding a replacement takes weeks of re-discovery.
Weeks of capacity, repeated per role change
Today, vs. with Agentic IT
The same workflow — investigate a cost spike — measured in tools, copy-pastes, and minutes:
Investigate a 14% Azure cost spike
- Open Azure Cost Management. Filter to last 7 days. Spot a peak in subscription prod-eu.
- Open Azure Resource Graph. Query for resources created last weekend. Copy-paste 12 IDs into a notepad.
- Open the architect's Notion page to remember which RG should own those resources.
- Switch to Defender for Cloud. Cross-check whether any of these 12 resources have alerts.
- Open az CLI. Run az resource show on each ID to find the owner tag; half are blank.
- Open Slack. Ask the team channel: "did anyone provision X over the weekend?"
- Wait. Decide. Document. Maybe write a follow-up ticket. Hope you don't forget.
The same investigation with Agentic IT
- Open the chat. Type: "What spiked our Azure spend over the weekend, and what should I do?"
- The orchestrator routes to the cost specialist. It runs get_cost_breakdown, list_orphaned_disks, and resource_inventory in parallel, all under one approval-gated tool surface.
- Within 30 seconds: a streamed answer with three new VMs in rg-prod-eu, no owner tags, $42/day each. A drafted remediation: tag, downsize, or delete.
- Click ✓ on the approval card to apply tags. Click ✗ on the delete suggestion until you've checked with the team.
- The whole turn is logged in Langfuse with cost, tokens, and tools used. The chat lives in the project. Tomorrow's standup has a link.
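The parallel read step in the list above can be sketched with Python's asyncio. The three tool names come from the text; their bodies, return shapes, and the investigate_spike helper are stand-ins invented for illustration, and real tools would call Azure APIs instead of returning canned data.

```python
import asyncio


# Hypothetical stand-ins for the read-only tools named above.
# The return shapes are assumptions made for this sketch.
async def get_cost_breakdown():
    await asyncio.sleep(0.01)
    return {"rg-prod-eu": 126.0}  # USD per day, i.e. 3 VMs x $42


async def list_orphaned_disks():
    await asyncio.sleep(0.01)
    return []


async def resource_inventory():
    await asyncio.sleep(0.01)
    return [
        {"name": f"vm-{i}", "resource_group": "rg-prod-eu", "tags": {}, "usd_per_day": 42.0}
        for i in range(1, 4)
    ]


async def investigate_spike():
    # Read-only tools run concurrently; gather returns results in call order.
    costs, disks, inventory = await asyncio.gather(
        get_cost_breakdown(), list_orphaned_disks(), resource_inventory()
    )
    untagged = [r for r in inventory if not r["tags"].get("owner")]
    return {
        "untagged_vms": [r["name"] for r in untagged],
        "usd_per_day": sum(r["usd_per_day"] for r in untagged),
        "orphaned_disks": disks,
        "cost_by_group": costs,
    }


report = asyncio.run(investigate_spike())
print(report["untagged_vms"], report["usd_per_day"])
```

Only reads are fanned out here. Anything that changes a resource would still wait for the approval card.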
What you can actually do
Five outcomes the platform delivers today, mapped to the tools and agents behind them:
Find the money
Identify orphaned resources, idle VMs, mis-sized disks, and over-provisioned everything. Estimate savings before you act.
- list_orphaned_disks · list_orphaned_nics
- list_idle_vms · get_cost_breakdown
- estimate_savings · price-calculator MCP
Tighten the perimeter
Find open ports, public storage, expired Key Vault items, weak SQL configurations, missing NSG associations. Prioritise by Defender score impact.
- list_open_ports · list_public_storage
- list_keyvault_issues · list_defender_alerts
- get_secure_score · list_unassociated_nsgs
Stay in compliance
Tag drift, RBAC drift, missing resource locks, policy violations across management groups. Surface them before audit week.
- list_missing_tags · list_policy_violations
- get_rbac_issues · list_resource_locks
- list_management_groups
Sleep at night
Backup coverage, site-recovery health, monitoring gaps, alert quality. Catch the broken backup before the disaster, not after.
- get_backup_coverage · list_unprotected_vms
- get_site_recovery_status · get_alert_coverage
- list_missing_diagnostic_settings
Go faster, every day
Schedule recurring scans, save invocable skills (slash commands), share projects, sync to Git or SharePoint, hand off via persistent project memory.
- cron + one-shot scheduled jobs
- invocable skills · slash commands
- GitHub / SharePoint two-way sync
The platform, in numbers
The numbers are the design's load-bearing claims. Eleven MCP integrations means the agent can talk to vBox, Jira, Outlook, Teams, SharePoint, Azure CLI, web search, document tools, and the schedule store through one uniform protocol. Eight specialists fan out in parallel when a question warrants depth. Thirty-eight entities cover the whole domain — chats, projects, artifacts, memory, schedules, integrations, audit. One hundred percent is the only acceptable number for "destructive operations gated" — every write tool routes through an approval card. Five seconds is the watchdog SLA on the Stop button. Zero vendor lock-in: the LLM transport is OpenRouter; switch any model per tenant, per user, per turn.
Who it's for
IT operator
Daily ops, troubleshooting, hygiene. The chat replaces the "open six tools, copy IDs around, decide" loop with a conversation that knows your inventory.
Cloud architect
Cost / security / reliability reviews. The suggestion engine pre-stages findings; the project memory tracks decisions across reviews.
Tenant admin
Manages agents, tools, model modes, credit caps, custom MCP connections, sharing. Configurable everything; observable everything via Langfuse.
Developer
Uses Claude Code / OpenCode integrations and the terminal for IDE-style work. Artifacts sync to Git automatically.
What it is not
Knowing the boundaries is half the pitch. The platform is deliberate about what it doesn't try to be:
- Not an APM: No tracing of your customer requests. We trace our own LLM calls; that's it.
- Not a SIEM: We surface Defender alerts, we don't aggregate logs from your fleet. Use Sentinel / Splunk for that.
- Not a ticketing system: The Jira MCP integration writes to your tracker; the platform doesn't try to replace it.
- Not a portal replacement: The Azure portal still exists. Agentic IT sits next to it, reading the same APIs, presenting the workflow as a conversation.
- Not a dashboard tool: No widgets, no time-series graphs. The output of a session is a chat thread, an artifact, or an executed change.
- Not auto-pilot: Destructive operations require explicit user approval. The agent never writes to your cloud silently. → §07·Approvals
The three principles that shape everything
If you remember nothing else, remember these:
The chat is the IDE
Artifacts, terminal sessions, schedules, approvals — every workflow can be initiated, observed, and resolved inside the chat thread. The WebSocket carries token streams, tool calls, approval gates, artifact saves, and routing events as separate event types so the UI can render them inline. → §02·Request lifecycle
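One way to picture separate event types on a single socket is a tagged JSON frame that the client dispatches on. The sketch below is an assumption-heavy illustration: the wire names (token, tool_call, approval_request, artifact_saved, routing), the payload fields, and the apply_tags tool are hypothetical, and the real protocol is described in §02.

```python
import json

# Hypothetical wire names for the five event kinds listed above.
EVENT_TYPES = {"token", "tool_call", "approval_request", "artifact_saved", "routing"}


def render(raw_message, transcript):
    """Decode one frame and append a UI-ready line to the transcript."""
    event = json.loads(raw_message)
    kind = event.get("type")
    if kind not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {kind!r}")
    if kind == "token":
        # Tokens extend the current assistant message instead of adding a line.
        if transcript and transcript[-1].startswith("assistant: "):
            transcript[-1] += event["text"]
        else:
            transcript.append("assistant: " + event["text"])
    elif kind == "tool_call":
        transcript.append(f"[tool] {event['name']}")
    elif kind == "approval_request":
        transcript.append(f"[approve?] {event['tool']}")
    elif kind == "artifact_saved":
        transcript.append(f"[artifact] {event['path']}")
    else:  # routing
        transcript.append(f"[routed to] {event['agent']}")
    return transcript


frames = [
    '{"type": "routing", "agent": "cost"}',
    '{"type": "tool_call", "name": "get_cost_breakdown"}',
    '{"type": "token", "text": "Three new "}',
    '{"type": "token", "text": "VMs found."}',
    '{"type": "approval_request", "tool": "apply_tags"}',
]
transcript = []
for frame in frames:
    render(frame, transcript)
print("\n".join(transcript))
```

Keeping the type tag on every frame is what lets the UI place an approval card or a tool call inline between streamed tokens.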
Approvals before destruction
A platform that can delete cloud resources must never do so without an explicit user click. Every write tool routes through IApprovalManager and lands in the audit trail (tool_input JSON, tool_result JSON, approval status). The headless path auto-approves only because there is no user — and tenant admins can deny destructive tools entirely. → §07·Approvals
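The gate described above can be sketched as a small wrapper around every write tool. This is an illustration under stated assumptions, not the platform's IApprovalManager: the ApprovalGate class, the status strings, the deny-list mechanism, and the delete_vm tool are hypothetical. Only the three audit fields (tool input, tool result, approval status) and the headless and tenant-deny behaviours come from the text.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ApprovalGate:
    """Illustrative gate; not the platform's IApprovalManager."""
    denied_tools: set = field(default_factory=set)  # tenant-level deny list
    audit_trail: list = field(default_factory=list)

    def run(self, tool_name, tool_fn, tool_input, ask_user=None):
        if tool_name in self.denied_tools:
            status = "denied_by_tenant"
        elif ask_user is None:
            status = "auto_approved"  # headless path: no user to ask
        else:
            status = "approved" if ask_user(tool_name, tool_input) else "rejected"

        approved = status in ("approved", "auto_approved")
        result = tool_fn(**tool_input) if approved else None
        # Every attempt is recorded, whether or not the tool ran.
        self.audit_trail.append({
            "tool": tool_name,
            "tool_input": json.dumps(tool_input),
            "tool_result": json.dumps(result),
            "approval_status": status,
        })
        return result


def delete_vm(name):  # hypothetical write tool; deletes nothing here
    return {"deleted": name}


gate = ApprovalGate(denied_tools={"delete_subscription"})
rejected = gate.run("delete_vm", delete_vm, {"name": "vm-1"}, ask_user=lambda t, i: False)
approved = gate.run("delete_vm", delete_vm, {"name": "vm-2"}, ask_user=lambda t, i: True)
print(rejected, approved, [e["approval_status"] for e in gate.audit_trail])
```

In this sketch the tenant deny list is checked before anything else, so a denied tool stays blocked even on the headless path.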
The user owns the model and the integrations
Tenant admins choose which LLM modes appear; users override the model per mode; users connect their own MCP servers. The platform never assumes a single global LLM or a fixed integration list. The auto-router transparently downgrades to the cheapest capable model when a budget is hit — never blocks. → §07·Credits & auto-router
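The downgrade rule can be sketched as a selection over a model catalogue. Everything concrete here is an assumption for illustration: the Model record, the placeholder model names, the integer capability scale, the flat per-turn price, and the route function. The only behaviour taken from the text is that a hit budget leads to the cheapest capable model and never to a refusal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str             # placeholder names, not real model identifiers
    capability: int       # higher means more capable (assumed scale)
    cost_per_turn: float  # assumed flat price in credits


CATALOG = [
    Model("small", capability=1, cost_per_turn=0.2),
    Model("medium", capability=2, cost_per_turn=1.0),
    Model("large", capability=3, cost_per_turn=5.0),
]


def route(preferred, required_capability, credits_left):
    """Return the model to use for this turn; always returns a model."""
    capable = [m for m in CATALOG if m.capability >= required_capability]
    if not capable:
        # Nothing meets the bar: fall back to the most capable model available.
        capable = [max(CATALOG, key=lambda m: m.capability)]
    chosen = next((m for m in capable if m.name == preferred), None)
    if chosen is not None and chosen.cost_per_turn <= credits_left:
        return chosen
    # Budget hit (or preferred model unsuitable): downgrade, do not block.
    return min(capable, key=lambda m: m.cost_per_turn)


print(route("large", 2, 10.0).name)  # budget allows the preferred model
print(route("large", 2, 0.5).name)   # budget hit: cheapest model with capability >= 2
print(route("large", 1, 0.5).name)   # lower bar: cheapest model overall
```

Note that the second call still returns a model whose price exceeds the remaining credits. How the real platform accounts for that overage is not specified in this section.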
Open the chat. Ask the question. Approve the action. Never lose the audit trail. Never get blocked by cost. Never wait for a senior engineer's runbook page to load. That's the entire pitch.
For six concrete end-to-end examples — cost spike triage, security review, subscription onboarding, backup audit, a scheduled scan, and a live incident — read §01.2 · Real use cases →.