AgentCraftLab System Architecture Guide
This document is intended for developers who want to understand or extend AgentCraftLab, covering core architecture, execution flow, and extensibility mechanisms.
1. Solution Overview
AgentCraftLab is a fully open-source AI Agent orchestration platform with a modular architecture.
| Project | Purpose |
|---|---|
AgentCraftLab.Api | Pure backend API (AG-UI + REST, Minimal API endpoints, port 5200) |
AgentCraftLab.Web | React frontend (React Flow + CopilotKit + shadcn/ui, port 5173) |
AgentCraftLab.Search | Standalone search engine library (FTS5 + vector + RRF hybrid search) |
AgentCraftLab.Engine | Core engine (strategies + nodes + tools + Middleware + Hooks) |
AgentCraftLab.Autonomous | ReAct loop + Sub-agent collaboration + 12 meta-tools + safety mechanisms |
AgentCraftLab.Autonomous.Flow | Flow structured execution (LLM planning -> 7 node types -> Crystallize) |
AgentCraftLab.Script | Multi-language sandbox engine (Jint JS + Roslyn C#, IScriptEngine / IScriptEngineFactory interfaces) |
AgentCraftLab.Ocr | OCR engine (Tesseract, IOcrEngine interface) |
AgentCraftLab.Data | Data layer abstraction (15 Store interfaces + Documents + CredentialProtector, zero dependencies) |
AgentCraftLab.Data.Sqlite | SQLite Provider (EF Core, default) |
AgentCraftLab.Data.MongoDB | MongoDB Provider (MongoDB.Driver) |
AgentCraftLab.Data.PostgreSQL | PostgreSQL Provider (EF Core + Npgsql) |
AgentCraftLab.Data.SqlServer | SQL Server Provider (EF Core + Microsoft.Data.SqlClient) |
Tech Stack: .NET 10 + LangVersion 13.0, using Microsoft.Agents.AI series APIs (Semantic Kernel is prohibited).
Feature Placement Decision: Core features go in Engine; search/extraction/chunking goes in Search; new DB providers go in extensions/data/.
2. Database Provider Switching
The system selects the database provider based on the Database:Provider configuration at startup:
Database:Provider = ?
|
+-- "sqlite" --> SQLite (default, zero configuration)
+-- "mongodb" --> MongoDB
+-- "postgresql" --> PostgreSQL
+-- "sqlserver" --> SQL ServerAll Store interfaces (IWorkflowStore, ICredentialStore, etc.) have implementations for each provider. The DI container registers the appropriate implementation at startup based on configuration.
3. Workflow Execution Three-Layer Architecture
Workflow execution is the core path of the system, divided into three layers:
WorkflowExecutionService.ExecuteAsync(request) <-- Lean orchestrator (~180 lines)
|
+-> ParseAndValidatePayload <-- Validate JSON payload
+-> Hook(OnInput) <-- Input interception
+-> WorkflowPreprocessor.PrepareAsync <-- Layer 2: Node classification + RAG + AgentContext
| |
| +-> Classify nodes (executable / data / meta)
| +-> Parse RAG nodes, execute ingest
| +-> AgentContextBuilder builds context for each agent
|
+-> WorkflowStrategyResolver.Resolve <-- Layer 3: Strategy selection
+-> IWorkflowStrategy.ExecuteAsync <-- Strategy execution
+-> Hook(OnComplete / OnError) <-- Completion/error callbacks
+-> yield IAsyncEnumerable<ExecutionEvent> <-- Streaming outputLayer Responsibilities
| Layer | Class | Responsibility |
|---|---|---|
| Orchestration Layer | WorkflowExecutionService | Pipeline assembly, error handling, Hooks invocation |
| Preprocessing Layer | WorkflowPreprocessor | Node classification, RAG indexing, AgentContext construction |
| Strategy Layer | IWorkflowStrategy | Concrete execution logic (determines execution order by topology) |
4. Five Execution Strategies and Auto-Detection
Strategy Overview
| Strategy | Description | Use Case |
|---|---|---|
| Single | Single agent direct execution | Only one agent node |
| Sequential | Execute sequentially by topological sort | Linear pipeline |
| Concurrent | Multiple agents execute in parallel | Independent agents, no dependencies |
| Handoff | Control transfer between agents | Any agent has multiple outgoing edges |
| Imperative | Step-by-step imperative execution (supports branching/looping) | Contains control flow nodes such as condition/loop/code |
Auto-Detection Logic
NodeTypeRegistry.HasAnyRequiringImperative() == true ?
|-- Yes --> Imperative strategy
|-- No --> Any agent has multiple outgoing edges?
|-- Yes --> Handoff strategy
|-- No --> Agent count == 1?
|-- Yes --> Single strategy
|-- No --> Sequential strategyWorkflowStrategyResolver.Resolve() encapsulates this logic. To extend with a new strategy: implement the IWorkflowStrategy interface + add a case in Resolve.
5. Node System
NodeTypeRegistry -- Single Source of Truth
All node metadata is centralized in NodeTypeRegistry. Each node type defines three flags:
| Node | IsExecutable | RequiresImperative | IsAgentLike | Description |
|---|---|---|---|---|
agent | Y | Y | Local LLM Agent (ChatClientAgent + tools) | |
a2a-agent | Y | Y | Y | Remote A2A Agent (URL + format) |
autonomous | Y | Y | Y | ReAct loop (interface-decoupled) |
condition | Y | Y | Conditional branching | |
loop | Y | Y | Loop | |
router | Y | Y | Multi-route classification | |
human | Y | Y | Pause waiting for user input | |
code | Y | Y | Deterministic transformation (9 modes + JS/C# dual-language sandbox) | |
iteration | Y | Y | foreach loop (SplitMode + MaxItems 50) | |
parallel | Y | Y | fan-out/fan-in parallelism | |
http-request | Y | Y | Deterministic HTTP call | |
start / end | Meta nodes (IsMeta) | |||
rag | Data node (IsDataNode) |
NodeExecutorRegistry
Each executable node type maps to an executor handler. The execution engine dispatches via NodeExecutorRegistry lookup.
Steps to Add a New Node
- Add a constant string in the
NodeTypesclass - Add a metadata definition entry in
NodeTypeRegistry - Add the corresponding handler in
NodeExecutorRegistry - (Frontend) Add node rendering definition in JS
NODE_REGISTRY
6. Agent Tool Resolution -- Four-Layer Tool Sources
AgentContextBuilder.ResolveToolsAsync() is responsible for merging all tool sources:
+------------------------------------------------------+
| AgentContextBuilder |
| |
| Layer 1: Tool Catalog (Built-in Tools) |
| - Static tools registered by ToolRegistryService |
| - web_search, file_read, calculator, etc. |
| |
| Layer 2: MCP Servers (Dynamic External Tools) |
| - Connect to external Tool Servers via MCP protocol|
| - Dynamically enumerate available tools |
| |
| Layer 3: A2A Agents (Agent-to-Agent Invocation) |
| - Remote Agents exposed as tools |
| - Supports Google / Microsoft formats |
| |
| Layer 4: HTTP APIs + OCR + Script |
| - API calls from http-request nodes |
| - OCR (IOcrEngine, Tesseract) |
| - Script (IScriptEngineFactory, Jint JS + Roslyn C# sandbox)|
| |
| --> Merged into a unified AITool[] for ChatClientAgent|
+------------------------------------------------------+Each agent node's tools field references a list of tool IDs. ResolveToolsAsync retrieves the corresponding tool instances from the four-layer sources by ID.
7. Middleware Pipeline
Middleware uses DelegatingChatClient (Decorator pattern), wrapped in order by AgentContextBuilder.ApplyMiddleware():
Outer Inner
| |
v v
GuardRails --> PII --> RateLimit --> Retry --> Logging --> ChatClientRAG is mounted independently of this pipeline (via RagChatClient).
7.1 GuardRails — Enterprise-Grade Content Safety
Decoupled through the IGuardRailsPolicy interface, with a default implementation DefaultGuardRailsPolicy that can be replaced with an ML classifier, Azure Content Safety, or NVIDIA NeMo Guardrails.
| Feature | Description |
|---|---|
| Keyword + Regex Rules | text.Contains() (CJK-safe) + RegexOptions.Compiled |
| Three-Level Actions | Block (reject with denial message), Warn (warn but allow), Log (silent logging) |
| Prompt Injection Detection | 9 built-in patterns (Chinese and English), opt-in activation |
| Topic Restriction | Restrict Agent to discuss only whitelisted topics |
| Full Message Scanning | Scans all User messages by default (not just the last one), preventing multi-turn attacks |
| Output Scanning | Optional LLM response scanning (buffered scanning in streaming mode) |
| Audit Logging | [GUARD] Direction=Input, Action=Block, Rule="hack", Match="hack" |
| Frontend Configuration | Blocked/Warn Terms, Regex Rules, Allowed Topics, Injection Detection, Custom Block Response |
7.2 PII Protection — Enterprise-Grade Personal Data Detection and Anonymization
Decoupled through IPiiDetector + IPiiTokenVault interfaces, replaceable with ONNX NER models, Microsoft Presidio, or Azure AI Language.
| Feature | Description |
|---|---|
| 35 Regex Rules × 6 Locales | Global / TW / JP / KR / US / UK, covering GDPR/HIPAA/PCI-DSS |
| 7 Checksum Validations | Luhn (credit card), mod97 (IBAN), Taiwan ID/tax ID, JP My Number, KR RRN, UK NHS |
| Context-Aware Weighting | Scans surrounding keywords to boost confidence, reducing false positives |
| Reversible Tokenization | Type-based tokens like [EMAIL_1], [PHONE_1], auto-restored after LLM response |
| Irreversible Mode | Fixed *** replacement (backward compatible) |
| Bidirectional Scanning | Input anonymize + Output detokenize |
| Audit Logging | [PII] Direction=Input, Entities=[Global.Email:1, TW.Phone:1], Count=2 (never logs raw PII) |
| Frontend Configuration | Mode (reversible/irreversible), Locale multi-select, Confidence Threshold, Scan Output |
7.3 RateLimit — Token Bucket Rate Limiting
Uses System.Threading.RateLimiting.TokenBucketRateLimiter (default 5 requests per second). Queue capacity 10, FIFO ordering. AcquireAsync has a 30-second timeout protection.
7.4 Retry — Exponential Backoff Retry
Default maximum 3 retries with exponential backoff (500ms → 1s → 2s). IsTransient uses HttpRequestException.StatusCode pattern matching (429/502/503/504) + TaskCanceledException + TimeoutException. Logs LogError when retries are exhausted. In streaming mode, only failures before the first chunk are retried.
7.5 Logging — Structured Logging
Logs input (truncated to 100 characters) + duration. Both non-streaming and streaming modes have try-catch exception logging (including duration).
7.6 RAG — Retrieval-Augmented Generation
Mounted independently (not in the ApplyMiddleware pipeline). Supports temporary indexes + parallel multi-knowledge-base search (Task.WhenAll). Search failures degrade gracefully (non-blocking, logged as Warning).
Design Highlights
- Interface Decoupling: GuardRails (
IGuardRailsPolicy) and PII (IPiiDetector+IPiiTokenVault) are both abstracted through interfaces, allowing replacement with ML/cloud services without modifying the Middleware - DI Smart Reuse:
ApplyMiddlewarepreferentially uses DI singletons, only creating new instances when the frontend specifies custom rules - Dual Constructors: Each enhanced Middleware has both a new version (DI interface injection) and a legacy version (config dictionary) constructor, fully backward compatible
- Defensive Programming: RateLimit has timeout protection, Retry has StatusCode pattern matching, Logging records exceptions, RAG degrades gracefully
- Enterprise Compliance: PII audit logs comply with GDPR Art.30 (never logs raw PII); GuardRails supports Prompt Injection detection and Topic restriction
Extension Method: Inherit from DelegatingChatClient to implement new Middleware, and add the corresponding case in ApplyMiddleware(). Or implement IGuardRailsPolicy / IPiiDetector to replace the detection engine.
8. Workflow Hooks
Hooks provide 6 lifecycle insertion points, triggered at different stages of workflow execution:
User Input
|
v
OnInput ---------> Can intercept/transform input
|
PreExecute ------> Before workflow starts
|
+-- Node Loop --+
| |
| PreAgent | --> Before each agent executes
| PostAgent | --> After each agent executes
| |
+--------------+
|
OnComplete ------> Successfully completed
OnError ---------> Execution failedHook Types
| Type | Mechanism |
|---|---|
code | Executed via TransformHelper (supports 9 transformation modes) |
webhook | HTTP POST to specified URL |
BlockPattern supports regex interception -- if input matches the pattern, execution is immediately rejected.
9. Credentials Backend Encrypted Storage
API keys are handled entirely on the backend; the frontend never touches plaintext:
React /settings page
|
| POST /api/credentials { provider, apiKey }
v
ICredentialStore.SaveAsync()
|
| DPAPI encryption (Windows Data Protection API)
v
Database storage (ciphertext)
--- At Runtime ---
WorkflowExecutionService
|
| ResolveCredentialsAsync()
v
ICredentialStore.GetDecryptedCredentialsAsync()
|
| DPAPI decryption
v
Plaintext API Key --> Injected into ChatClientThe frontend uses a saved flag to determine whether a key has been configured; localStorage does not store any plaintext.
10. Chat Attachment Upload Pipeline
CopilotKit natively supports only image uploads and the implementation is incomplete, so a standalone upload pipeline is used:
User selects file
|
| POST /api/upload (multipart/form-data)
v
Backend temporary storage (1-hour TTL) --> Returns { fileId }
|
| CopilotKit forwardedProps.fileId
v
AG-UI endpoint receives fileId
|
| GetAndRemove(fileId) --> Retrieves file, removes temporary storage
v
WorkflowPreprocessor processing
|
+-- Document type --> RAG ingest (extraction + chunking + indexing)
+-- ZIP file --> Decompress + batch processing
+-- Image/other --> Multimodal DataContent (directly injected into LLM)The frontend uses StableChatInput (module-scope definition) + chatInputFileRef to ensure stable component identity, preventing CopilotChat from rebuilding the Input component.
11. Autonomous Agent -- ReAct + Flow Dual Mode
Three-Layer Execution Funnel
+----------------------------------------------------------+
| Engine Workflow (Human-Designed) |
| - Developers manually drag-and-drop nodes, define flows |
| - Fully deterministic |
+----------------------------------------------------------+
|
v
+----------------------------------------------------------+
| Flow (AI Planning + Structured Execution + Crystallize) |
| - IGoalExecutor interface |
| - LLM generates FlowPlan --> 7 node types structured |
| execution |
| - Crystallizes into editable Workflow after completion |
+----------------------------------------------------------+
|
v
+----------------------------------------------------------+
| ReAct (Fully Autonomous) |
| - ReactExecutor (~540 lines) |
| - Observe -> Think -> Act loop |
| - 12 meta-tools + Sub-agent collaboration |
+----------------------------------------------------------+ReAct Mode Core
Strategy Object Decomposition:
| Interface | Responsibility |
|---|---|
IBudgetPolicy | Token/step budget control |
IHumanInteractionHandler | Human-machine interaction (ask_user) |
IHistoryManager | Conversation history management |
IReflectionEngine | Self-reflection (Reflexion) |
IToolDelegationStrategy | Tool selection and delegation |
Dual Model Architecture: TaskPlanner uses a strong model (gpt-4o) for planning, ReactExecutor uses a weak model (gpt-4o-mini) for execution, reducing costs.
12 meta-tools (MetaToolFactory):
- Sub-agent management: create / ask / spawn / collect / stop / send / list
- Shared state: shared_state
- Human interaction: ask_user
- Quality control: peer_review / challenge
Safety Levels: P0 Risk Approval -> P1 Transparency -> P2 Self-Reflection -> S1~S8 Isolation Protection
Flow Mode Core
Decoupled from ReAct through the IGoalExecutor interface. DI switching:
// ReAct mode
services.AddAutonomousAgent();
// Flow mode
services.AddAutonomousFlowAgent();Funnel Bridging: ReactTraceConverter converts spawn/collect traces into FlowPlan JSON, stores them in ExecutionMemoryService, and injects them as a Reference Plan during the next Flow planning session.
Crystallize: After execution completes, converts the ExecutionTrace into Studio buildFromAiSpec JSON, stored in Data/flow-outputs/, which can be directly loaded into Workflow Studio for editing.
12. CraftSearch Search Engine
AgentCraftLab.Search is a standalone class library with no dependencies on Engine or Autonomous.
Core Interfaces
ISearchEngine --> Search entry point (query + options)
IDocumentExtractor --> Document content extraction (PDF/DOCX/HTML/TXT...)
ITextChunker --> Text chunking (fixed size + overlap)Three Search Modes
+------------------+ +------------------+ +------------------+
| FullText | | Vector | | Hybrid |
| (FTS5) | | (SIMD Cosine) | | (RRF k=60) |
| | | | | |
| SQLite FTS5 | | 1536-dim vector | | FullText rank |
| Tokenizer + | | Cosine | | + Vector rank |
| BM25 | | similarity | | RRF fusion |
| | | SIMD accelerated| | |
+------------------+ +------------------+ +------------------+RRF (Reciprocal Rank Fusion): Fuses full-text and vector rankings with k=60, balancing exact keyword matching and semantic similarity.
Provider Implementations:
SqliteSearchEngine-- Production (open source)InMemorySearchEngine-- Unit testing
13. RAG Pipeline
RAG functionality is built on top of CraftSearch, providing a complete retrieval-augmented generation pipeline.
Ingest Flow
File upload
|
v
IDocumentExtractor.ExtractAsync() --> Extract text (multi-format support)
|
v
ITextChunker.ChunkAsync() --> Chunk (fixed size + overlap window)
|
v
Embedding (1536-dim) --> Vectorization
|
v
ISearchEngine.IndexAsync() --> Build indexQuery Flow
User question
|
v
RagService.SearchAsync() --> Hybrid search (FTS5 + Vector + RRF)
|
v
RagChatClient (DelegatingChatClient) --> Inject search results into system message
|
v
LLM response (based on retrieved context)indexName Conventions
| Format | Purpose |
|---|---|
{userId}_rag_{guid} | Temporary index (single upload) |
{userId}_kb_{id} | Knowledge base index (persistent) |
14. CopilotKit Frontend Architecture
System Overview
+-------------------+ +-------------------+ +-------------------+
| React Frontend | | CopilotKit | | .NET API Backend |
| (port 5173) | | Runtime | | (port 5200) |
| | | (port 4000) | | |
| React Flow | --> | Node.js | --> | Minimal API |
| CopilotKit SDK | | server.mjs | | AG-UI Endpoints |
| shadcn/ui | | AG-UI Protocol | | WorkflowEngine |
| i18n (en/zh-TW) | | Relay | | CraftSearch |
+-------------------+ +-------------------+ +-------------------+
Vite dev Middle Layer Backend ServicesAG-UI Protocol
CopilotKit Runtime serves as a middle layer, converting the frontend CopilotKit format to the AG-UI (Agent-UI) protocol for communication with the .NET backend. The backend streams events back to the frontend via IAsyncEnumerable<ExecutionEvent>.
Frontend Main Modules
| Module | Description |
|---|---|
| Workflow Studio | React Flow canvas, drag-and-drop workflow construction |
| Chat Panel | CopilotChat integration, supports attachment uploads |
| Settings | Personal settings, Credentials, default model |
| Skill Manager | Skill management (built-in + custom) |
| Service Tester | Dual-panel + Chat, 5 protocol testing modes |
| KB Manager | Knowledge base file upload + SSE progress streaming |
Key Design Decisions
- Standalone Upload Pipeline: CopilotKit native upload has many limitations, replaced with standalone
POST /api/upload - ErrorBoundary: Global error boundary, React component crashes do not cause white screens
- i18n: Supports en + zh-TW, divided into three namespaces: common / studio / chat
- StableChatInput: Module-scope definition, prevents input field state loss caused by CopilotChat rebuilding
Appendix: Extensibility Quick Reference
| Extension Item | Steps |
|---|---|
| New Execution Strategy | Implement IWorkflowStrategy + add case in WorkflowStrategyResolver.Resolve() |
| New Node Type | NodeTypes constant + NodeTypeRegistry metadata + NodeExecutorRegistry handler + JS NODE_REGISTRY |
| New Built-in Tool | Add method in ToolImplementations.cs + ToolRegistryService.Register() |
| New Middleware | Inherit from DelegatingChatClient + add case in ApplyMiddleware() |
| New Flow Node | FlowNodeRunner case + FlowPlannerPrompt + FlowPlanValidator + WorkflowCrystallizer |
| Replace Script Engine | Implement IScriptEngine + DI replacement (Jint JS / Roslyn C# / Python) |
| Add Script Language | ScriptEngineFactory.Register("language", engine) + add option to frontend CodeForm SCRIPT_LANGUAGES |
| Replace OCR Engine | Implement IOcrEngine + DI replacement |
| New Tool Module | Follow the AddXxx() + UseXxxTools() pattern from AgentCraftLab.Ocr / AgentCraftLab.Script |
| Replace Autonomous Strategy | Implement the corresponding interface (IBudgetPolicy, etc.) + DI Replace registration |