Orchestrator, loop, specialists.
The brain of the platform. A planner LLM picks the route per user turn; the chosen runner executes the tool loop; synthesis combines specialist outputs. All output streams over a single channel that fans out to all attached WebSockets.
SmartOrchestrator
backend/src/AgenticIT.Agent/Orchestration/SmartOrchestrator.cs — 48 KB
The orchestrator is the public entry point of the agent layer. RunAsync(OrchestratorRequest) is what every caller invokes — the WebSocket handler for live chat, the headless runner for scheduled jobs, the proactive scanner for unattended tasks.
Inside RunAsync:
- Creates a Langfuse trace via
_traceFactory.Create(...)— sessionId, userId, tenantId, mode, displayName, redaction flag, parent trace ID, tags. - Opens
_accessor.BeginScope(trace). Every downstreamIAnthropicClientcall automatically becomes a Generation under this trace, with no per-call wiring required (see §03.8). - Attaches routing metadata when
req.RoutingDecisionis non-null — requested model, routing decision, retry attempt — for cost-attribution downstream (PBI 60387). - Builds project context via
ProjectRepository.BuildProjectContextAsync()— assembles a Markdown block of project metadata + artifact manifest. - Loads
tenant.DomainKnowledgeandtenant.CriticalRules. These override every other instruction in the system prompt; they are the tenant's last word. - Calls a planner LLM. The response is parsed by
RouteParserinto a routing decision:directorspecialist. - Dispatches:
direct→SingleAgentLoop.RunAsync(...)specialist→SpecialistRunner.RunAsync(...)for N agents in parallel, then synthesis
Specialists are expensive — fanning out 4–6 agents in parallel multiplies token cost. The planner is one cheap, fast LLM call that decides whether the user query actually warrants the fan-out, or whether one generalist will do.
SingleAgentLoop
backend/src/AgenticIT.Agent/Loop/SingleAgentLoop.cs — 86 KB
The hot loop: build prompt → call LLM → execute tools → repeat until done. Most of the file is the bookkeeping needed to keep this stable under streaming, tool errors, history corruption, budget exhaustion, and user interruption.
Prompt assembly
InstructionModuleLoader.BuildSystemPrompt()assembles: date anchor → user identity → core-advisor → tool-selection → optional MCP/project skills → tenant.DomainKnowledge → tenant.CriticalRules → user timezone.- The
core-advisormodule must literally contain the string "Agentic IT" (with space) — it is the canonical product name."AgenticIT"compressed without space is a bug.
Tool merging
ToolRegistry.GetEffectiveTools(projectId, requested)filters by project subscription scope (e.g. artifact tools only whenprojectId != null)._toolSetBuilder.BuildAsync()merges built-in tools with user-connected MCP servers' tools.- Tenant
TenantToolAccessEntityoverrides apply:Hiddentools are removed entirely.
History health
- History healer.
HistoryHealer.RemoveOrphanToolResults()dropstool_resultblocks whose precedingtool_usewas edited away — Anthropic API returns 400 on orphans. See §03.9·Agent loop internals →. DeduplicateHistory()— removes accidental dupes from concurrent saves.
The loop body
- Stream-call the LLM via
TracingAnthropicClient. Tokens emit as they arrive —ChatChannel.SendJsonAsync({type:"token",text}). - For each
tool_useblock:- If destructive,
ApprovalGate.RequestAsync(...)blocks. The channel emits anapproval_requiredevent; the user clicks ✓ in chat;ApprovalGate.Resolve()unblocks. ToolExecutor.ExecuteAsync(...)dispatches: MCP → cached lookup (e.g.list_subscriptions) → az CLI → built-in handler.- The result becomes a
tool_resultblock in history.
- If destructive,
- Circuit breaker.
ToolCircuitBreakertrips if 3 tool calls fail consecutively, terminating the loop. See §03.9·Agent loop internals →. - Loop continues until LLM emits
stop_reason: "end_turn"oreffectiveMaxToolTurnsis reached.
Budget & watchdog
ToolCallLimitResolverdecides the cap: system default (MAX_TOOL_TURNS, 50) → tenant override (TenantEntity.MaxToolTurns) → "Long running" UI flag bumps to 100 for that turn.- Pause-summary & result truncation. When the budget or context limit is hit the loop emits a pause-summary and truncates oversized tool results — full mechanics at §03.9·Agent loop internals →.
StopWatchdogguarantees that a user-clicked Stop terminates within 5 seconds even if the LLM is mid-stream — kills the cancellation token and force-closes the loop.
SpecialistRunner
backend/src/AgenticIT.Agent/Orchestration/SpecialistRunner.cs — 23 KB
Built-in specialists are: cost, security, compliance, reliability, observability, m365, identity, migration. Their configurations live in SpecialistRegistry.BuiltInMap; tenants can override system prompts, tool subsets, colours, and roles via AgentOverrideRepository.
Per specialist, the runner:
- Builds a per-agent system prompt: date anchor + user identity + agent's prompt + optional MCP skills (vBox/Jira if listed) + DomainKnowledge + CriticalRules.
- Streams the LLM call under a
specialist:<name>trace span. - Returns a result dictionary back to the orchestrator.
The orchestrator's synthesis step then combines specialist outputs into a single user-facing response, again via an LLM call traced as synthesis.
ChatChannel — fan-out broadcast
backend/src/AgenticIT.Agent/Channel/ChatChannel.cs — 118 lines
- One channel per chatId
- Held by
ChatChannelRegistry; survives WebSocket detach. - SemaphoreSlim guard
- Only one writer at a time; serialises the byte stream.
- Buffer
- Cap 2000 messages. If the WebSocket is detached, output buffers;
AttachAsyncreplays everything since the last attach. - Headless reads
GetBufferedMessagesAsync()snapshots the buffer for headless / proactive runs that have no live WebSocket.
ChatWebSocketHandler
backend/src/AgenticIT.Api/WebSockets/ChatWebSocketHandler.cs — 53 KB
The single ASP.NET handler that owns the WebSocket lifecycle.
- Validate JWT from the
Sec-WebSocket-Protocol: bearer,<token>subprotocol viaWebSocketExtensions.GetAuthSubprotocol. - Confirm the user is a member of the project (
ProjectMemberEntity). - Attach the WebSocket to the chat's channel; replay any buffered output.
- Hoist the receive task — only one
ReceiveAsyncmay be pending at a time. - Launch
SmartOrchestrator.RunAsyncas fire-and-forgetTask.Run, with a fresh DI scope.
Approval flow
backend/src/AgenticIT.Agent/Approval/ApprovalGate.cs · IApprovalManager.cs
- Create()
- Registers a pending approval keyed by toolCallId; returns a
Taskthat completes when resolved. - Channel emits
{type:"approval_required", id, tool, input}— frontend renders an ApprovalCard.- Resolve()
- Frontend sends
{type:"approval", id, decision}. The handler routes toIApprovalManager.Resolve(); the gate's task completes; the loop proceeds. - Audit
- Tool name, input, result, and approval status are written to
AuditLogEntity. - Headless mode
- The headless job runner installs
HeadlessApprovalManagerwith a 1-hour timeout that auto-approves all destructive ops — there is no live user.
Concurrency invariants you must not break
- Only one
ReceiveAsyncin flight on a WebSocket at any time. - Fire-and-forget
Task.Run⇒ create a fresh DI scope; never capture an outer scoped service. ChatChannel.SendJsonAsyncserialises writes — don't bypass it by writing to the WebSocket directly.- OpenRouter URL is relative without a leading slash:
"messages", not"/v1/messages".