ToGMAL Architecture
System Overview
┌─────────────────────────────────────────────────────────────────┐
│                         Claude Desktop                          │
│                      (or other MCP Client)                      │
└────────────────────────────┬────────────────────────────────────┘
                             │ stdio/MCP Protocol
                             │
┌────────────────────────────▼────────────────────────────────────┐
│                        ToGMAL MCP Server                        │
│                         (togmal_mcp.py)                         │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  MCP Tools Layer                                          │  │
│  │  - togmal_analyze_prompt                                  │  │
│  │  - togmal_analyze_response                                │  │
│  │  - togmal_submit_evidence                                 │  │
│  │  - togmal_get_taxonomy                                    │  │
│  │  - togmal_get_statistics                                  │  │
│  └───────────────────┬───────────────────────────────────────┘  │
│                      │                                          │
│  ┌───────────────────▼───────────────────────────────────────┐  │
│  │  Detection Heuristics                                     │  │
│  │  ┌─────────────────────────────────────────────────────┐  │  │
│  │  │ Math/Physics Speculation Detector                   │  │  │
│  │  │ - Pattern: "theory of everything"                   │  │  │
│  │  │ - Pattern: "new equation"                           │  │  │
│  │  │ - Pattern: excessive notation                       │  │  │
│  │  └─────────────────────────────────────────────────────┘  │  │
│  │  ┌─────────────────────────────────────────────────────┐  │  │
│  │  │ Ungrounded Medical Advice Detector                  │  │  │
│  │  │ - Pattern: "you probably have"                      │  │  │
│  │  │ - Pattern: "take Xmg"                               │  │  │
│  │  │ - Check: has_sources                                │  │  │
│  │  └─────────────────────────────────────────────────────┘  │  │
│  │  ┌─────────────────────────────────────────────────────┐  │  │
│  │  │ Dangerous File Operations Detector                  │  │  │
│  │  │ - Pattern: "rm -rf"                                 │  │  │
│  │  │ - Pattern: recursive deletion                       │  │  │
│  │  │ - Check: has_safeguards                             │  │  │
│  │  └─────────────────────────────────────────────────────┘  │  │
│  │  ┌─────────────────────────────────────────────────────┐  │  │
│  │  │ Vibe Coding Overreach Detector                      │  │  │
│  │  │ - Pattern: "complete app"                           │  │  │
│  │  │ - Pattern: large line counts                        │  │  │
│  │  │ - Check: has_planning                               │  │  │
│  │  └─────────────────────────────────────────────────────┘  │  │
│  │  ┌─────────────────────────────────────────────────────┐  │  │
│  │  │ Unsupported Claims Detector                         │  │  │
│  │  │ - Pattern: "always/never"                           │  │  │
│  │  │ - Pattern: statistics without source                │  │  │
│  │  │ - Check: has_hedging                                │  │  │
│  │  └─────────────────────────────────────────────────────┘  │  │
│  └───────────────────┬───────────────────────────────────────┘  │
│                      │                                          │
│  ┌───────────────────▼───────────────────────────────────────┐  │
│  │  Risk Assessment & Interventions                          │  │
│  │  - Calculate weighted risk score                          │  │
│  │  - Map to risk levels (LOW → CRITICAL)                    │  │
│  │  - Recommend interventions                                │  │
│  └───────────────────┬───────────────────────────────────────┘  │
│                      │                                          │
│  ┌───────────────────▼───────────────────────────────────────┐  │
│  │  Taxonomy Database                                        │  │
│  │  - In-memory storage (extendable to persistent)           │  │
│  │  - Evidence entries with metadata                         │  │
│  │  - Filtering and pagination                               │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
Data Flow - Prompt Analysis
User Prompt
    │
    ├─────────────────────────────────────────────┐
    │                                             │
    ▼                                             │
togmal_analyze_prompt                             │
    │                                             │
    ├──► Math/Physics Detector ──► Result 1       │
    │                                             │
    ├──► Medical Advice Detector ──► Result 2     │
    │                                             │
    ├──► File Ops Detector ──► Result 3           │
    │                                             │
    ├──► Vibe Coding Detector ──► Result 4        │
    │                                             │
    └──► Unsupported Claims Detector ──► Result 5 │
                                                  │
    ┌─────────────────────────────────────────────┘
    │
    ▼
Risk Calculation
    │
    ├─► Weight results
    ├─► Calculate score
    └─► Map to risk level
    │
    ▼
Intervention Recommendation
    │
    ├─► Step breakdown?
    ├─► Human-in-loop?
    ├─► Web search?
    └─► Simplified scope?
    │
    ▼
Format Response (Markdown/JSON)
    │
    └──► Return to Client
Detection Pipeline
        Input Text
             │
             ▼
┌─────────────────────────┐
│ Preprocessing           │
│ - Lowercase             │
│ - Strip whitespace      │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ Pattern Matching        │
│ - Regex patterns        │
│ - Keyword detection     │
│ - Structural analysis   │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ Confidence Scoring      │
│ - Count matches         │
│ - Weight by type        │
│ - Normalize to [0,1]    │
└────────────┬────────────┘
             │
             ▼
┌─────────────────────────┐
│ Context Checks          │
│ - has_sources?          │
│ - has_hedging?          │
│ - has_safeguards?       │
└────────────┬────────────┘
             │
             ▼
     Detection Result
     {
       detected: bool,
       categories: list,
       confidence: float,
       metadata: dict
     }
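The four pipeline stages can be sketched as a single detector function. This is a minimal illustration, not the code in togmal_mcp.py: the pattern names, the regexes, `SAFEGUARD_PATTERN`, and the choice to weight every pattern equally are all assumptions made for the example.

```python
import re
from typing import Any, Dict, List

# Hypothetical patterns for illustration; the real detector's regexes may differ.
FILE_OP_PATTERNS: Dict[str, str] = {
    "rm_rf": r"\brm\s+-rf\b",
    "recursive_delete": r"\bdelete\s+(all|everything)\b",
}
# Hypothetical context check for has_safeguards.
SAFEGUARD_PATTERN = r"\b(backup|dry[- ]run|confirm)\b"


def detect_file_operations(text: str) -> Dict[str, Any]:
    # 1. Preprocessing: lowercase and strip surrounding whitespace.
    cleaned = text.lower().strip()

    # 2. Pattern matching: collect the name of every pattern that hits.
    matched: List[str] = [
        name
        for name, pattern in FILE_OP_PATTERNS.items()
        if re.search(pattern, cleaned)
    ]

    # 3. Confidence scoring: fraction of patterns matched, so already in [0, 1].
    #    (Assumption: equal weights; the diagram mentions weighting by type.)
    confidence = len(matched) / len(FILE_OP_PATTERNS)

    # 4. Context check: does the text mention any safeguard?
    has_safeguards = re.search(SAFEGUARD_PATTERN, cleaned) is not None

    return {
        "detected": bool(matched),
        "categories": matched,
        "confidence": confidence,
        "metadata": {"has_safeguards": has_safeguards},
    }
```

Calling `detect_file_operations("Please run rm -rf /tmp/build")` matches one of the two patterns, so the result has `detected` set, a confidence of 0.5, and `has_safeguards` false.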
Risk Calculation Algorithm
For each detection category:
  Math/Physics:
    risk += confidence × 0.5
  Medical Advice:
    risk += confidence × 1.5  # High weight
  File Operations:
    risk += confidence × 2.0  # Highest weight: critical actions
  Vibe Coding:
    risk += confidence × 0.4
  Unsupported Claims:
    risk += confidence × 0.3
Total Risk Score:
  ≥ 1.5 → CRITICAL
  ≥ 1.0 → HIGH
  ≥ 0.5 → MODERATE
  < 0.5 → LOW
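The weighted sum and threshold mapping above can be written in a few lines of Python. The weights and thresholds are copied from the listing. The dictionary keys and the function names `risk_score` and `risk_level` are illustrative assumptions, not names taken from the server.

```python
from typing import Dict

# Weights from the listing above; the key names are hypothetical.
WEIGHTS: Dict[str, float] = {
    "math_physics": 0.5,
    "medical_advice": 1.5,
    "file_operations": 2.0,
    "vibe_coding": 0.4,
    "unsupported_claims": 0.3,
}

# Thresholds checked from highest to lowest; anything below 0.5 is LOW.
THRESHOLDS = [(1.5, "CRITICAL"), (1.0, "HIGH"), (0.5, "MODERATE")]


def risk_score(confidences: Dict[str, float]) -> float:
    """Sum confidence x weight over the categories that were detected."""
    return sum(WEIGHTS[name] * conf for name, conf in confidences.items())


def risk_level(score: float) -> str:
    """Map a total risk score to its level."""
    for threshold, level in THRESHOLDS:
        if score >= threshold:
            return level
    return "LOW"
```

As a worked example, a medical-advice confidence of 0.6 together with an unsupported-claims confidence of 0.5 gives 0.6 × 1.5 + 0.5 × 0.3 = 1.05, which maps to HIGH. A file-operations confidence of 0.8 alone gives 1.6, which is already CRITICAL.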
Intervention Decision Tree
                  Detection Results
                          │
        ┌─────────────────┼─────────────────┐
        │                 │                 │
        ▼                 ▼                 ▼
  Math/Physics?    Medical Advice?  File Operations?
        │                 │                 │
        ├─► Yes           ├─► Yes           ├─► Yes
        │   │             │   │             │   │
        │   ├─► Step      │   ├─► Human     │   ├─► Human
        │   │   Breakdown │   │   in Loop   │   │   in Loop
        │   │             │   │             │   │
        │   └─► Web       │   └─► Web       │   └─► Step
        │       Search    │       Search    │       Breakdown
        │                 │                 │
        └─► No            └─► No            └─► No
            │                 │                 │
            ▼                 ▼                 ▼
        Continue          Continue          Continue
                    ┌───────────┐
                    │  Combine  │
                    │  Results  │
                    └─────┬─────┘
                          │
                          ▼
                  Intervention List
                   (deduplicated)
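The tree reduces to a lookup followed by an order-preserving deduplication. The sketch below reads the category-to-intervention mapping straight off the diagram. The identifier strings and the shape of the `detected` argument are assumptions made for illustration.

```python
from typing import Dict, List

# Mapping read off the decision tree; the identifier strings are illustrative.
INTERVENTIONS: Dict[str, List[str]] = {
    "math_physics": ["step_breakdown", "web_search"],
    "medical_advice": ["human_in_loop", "web_search"],
    "file_operations": ["human_in_loop", "step_breakdown"],
}


def recommend_interventions(detected: Dict[str, bool]) -> List[str]:
    """Combine the interventions of every detected category, without repeats."""
    combined: List[str] = []
    for category, interventions in INTERVENTIONS.items():
        if detected.get(category, False):
            combined.extend(interventions)
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(combined))
```

When both the math/physics and medical branches fire, "web_search" is recommended by each but appears only once in the returned list.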
Taxonomy Database Schema
TAXONOMY_DB = {
    "category_name": [
        {
            "id": "abc123def456",
            "category": "math_physics_speculation",
            "prompt": "User's prompt text...",
            "response": "LLM's response text...",
            "description": "Why problematic...",
            "severity": "high",
            "timestamp": "2025-10-18T00:00:00",
            "prompt_hash": "a1b2c3d4"
        },
        { ... more entries ... }
    ],
    "another_category": [ ... ]
}
Indices:
- By category (dict key)
- By severity (filter)
- By timestamp (sort)
- By hash (deduplication)
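A short sketch shows how the four access paths could work against the in-memory dict. It is written under assumptions: the schema does not say how `id` or `prompt_hash` are generated, so the truncated UUID and truncated SHA-256 below are stand-ins, and `add_entry` and `query` are hypothetical helper names.

```python
import hashlib
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional

TAXONOMY_DB: Dict[str, List[Dict[str, Any]]] = {}


def add_entry(category: str, prompt: str, response: str,
              description: str, severity: str) -> bool:
    """Store an entry unless the same prompt is already filed under the category."""
    # Assumption: an 8-character SHA-256 prefix stands in for the real hash.
    prompt_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:8]
    entries = TAXONOMY_DB.setdefault(category, [])   # index by category
    if any(e["prompt_hash"] == prompt_hash for e in entries):
        return False                                  # deduplication by hash
    entries.append({
        "id": uuid.uuid4().hex[:12],
        "category": category,
        "prompt": prompt,
        "response": response,
        "description": description,
        "severity": severity,
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "prompt_hash": prompt_hash,
    })
    return True


def query(category: str, severity: Optional[str] = None,
          limit: int = 10, offset: int = 0) -> List[Dict[str, Any]]:
    """Filter by severity, sort newest first, then paginate."""
    entries = TAXONOMY_DB.get(category, [])
    if severity is not None:
        entries = [e for e in entries if e["severity"] == severity]
    # ISO 8601 timestamps in one timezone sort correctly as plain strings.
    entries = sorted(entries, key=lambda e: e["timestamp"], reverse=True)
    return entries[offset:offset + limit]
```

Submitting the same prompt twice under one category leaves a single entry, and `query("math_physics_speculation", severity="high")` returns only the high-severity entries for that key.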
Component Responsibilities
MCP Tools Layer
Responsibilities:
- Input validation (Pydantic models)
- Parameter extraction
- Tool orchestration
- Response formatting
- Character limit enforcement
Does NOT:
- Perform detection logic
- Calculate risk scores
- Store data directly
Detection Heuristics Layer
Responsibilities:
- Pattern matching
- Confidence scoring
- Context analysis
- Detection result generation
Does NOT:
- Make intervention decisions
- Format responses
- Handle I/O
Risk Assessment Layer
Responsibilities:
- Aggregate detection results
- Calculate weighted risk scores
- Map scores to risk levels
- Generate intervention recommendations
Does NOT:
- Perform detection
- Format responses
- Store data
Taxonomy Database
Responsibilities:
- Store evidence entries
- Support filtering/pagination
- Provide statistics
- Maintain capacity limits
Does NOT:
- Perform analysis
- Make decisions
- Format responses
Extension Points
Adding New Detection Categories
from enum import Enum
from typing import Any, Dict

# 1. Add enum value
class CategoryType(str, Enum):
    NEW_CATEGORY = "new_category"

# 2. Create detector function
def detect_new_category(text: str) -> Dict[str, Any]:
    patterns = { ... }
    # Detection logic
    # The values below are type placeholders, not literal return values.
    return {
        'detected': bool,
        'categories': list,
        'confidence': float
    }

# 3. Update analysis functions
def analyze_prompt(params):
    results['new_category'] = detect_new_category(params.prompt)
    # ... rest of logic

# 4. Update risk calculation
def calculate_risk_level(results):
    # WEIGHT is the per-category weight chosen for the new detector.
    if results['new_category']['detected']:
        risk_score += results['new_category']['confidence'] * WEIGHT

# 5. Add intervention logic
def recommend_interventions(results):
    if results['new_category']['detected']:
        interventions.append({ ... })
Adding Persistent Storage
# 1. Define storage backend
class TaxonomyStorage:
    def __init__(self, backend: str): ...
    # Methods are async so that tool functions can await them.
    async def save(self, category, entry): ...
    async def load(self, category, filters): ...
    async def get_stats(self): ...

# 2. Replace in-memory dict
storage = TaxonomyStorage(backend="sqlite")  # or "postgres", "mongodb"

# 3. Update tool functions
@mcp.tool()
async def submit_evidence(params):
    # Instead of: TAXONOMY_DB[category].append(entry)
    await storage.save(params.category, entry)
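For the SQLite case, one possible backend is sketched below using only the standard library. The class name `SQLiteTaxonomyStorage`, the table layout, and the single `severity` filter (in place of a general `filters` argument) are assumptions. The methods are synchronous for brevity, so an async tool would need to run them off the event loop, for example with `asyncio.to_thread`.

```python
import json
import sqlite3
from typing import Any, Dict, List, Optional


class SQLiteTaxonomyStorage:
    """Hypothetical SQLite backend exposing save/load/get_stats."""

    def __init__(self, path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS evidence ("
            "id TEXT PRIMARY KEY, category TEXT NOT NULL, "
            "severity TEXT NOT NULL, payload TEXT NOT NULL)"
        )

    def save(self, category: str, entry: Dict[str, Any]) -> None:
        # The connection context manager commits on success, rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO evidence VALUES (?, ?, ?, ?)",
                (entry["id"], category, entry["severity"], json.dumps(entry)),
            )

    def load(self, category: str,
             severity: Optional[str] = None) -> List[Dict[str, Any]]:
        sql = "SELECT payload FROM evidence WHERE category = ?"
        args: List[str] = [category]
        if severity is not None:
            sql += " AND severity = ?"
            args.append(severity)
        return [json.loads(row[0]) for row in self.conn.execute(sql, args)]

    def get_stats(self) -> Dict[str, int]:
        # Number of stored entries per category.
        rows = self.conn.execute(
            "SELECT category, COUNT(*) FROM evidence GROUP BY category"
        )
        return dict(rows.fetchall())
```

Storing the whole entry as a JSON payload keeps the table schema stable if entry fields change, at the cost of only being able to filter in SQL on the columns that were split out.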
Adding ML Models
# 1. Define model interface
class AnomalyDetector:
    def fit(self, X): ...
    def predict(self, x) -> float: ...

# 2. Train from taxonomy
detector = AnomalyDetector()
training_data = get_training_data_from_taxonomy()
detector.fit(training_data)

# 3. Use in detection
def detect_with_ml(text: str) -> float:
    features = extract_features(text)
    anomaly_score = detector.predict(features)
    return anomaly_score
Performance Characteristics
Time Complexity
- Pattern Matching: O(n) where n = text length
- All Detectors: O(n) (a fixed number of detectors, each a linear pass)
- Risk Calculation: O(1) (fixed number of categories)
- Taxonomy Query: O(m·log m) where m = matching entries
- Overall: O(n + m·log m)
Space Complexity
- Server Base: ~50 MB
- Per Request: ~1 KB (temporary)
- Per Taxonomy Entry: ~1 KB
- Total with 1000 entries: ~51 MB
Latency
- Single Detection: ~10-50 ms
- All Detections: ~50-100 ms
- Format Response: ~1-10 ms
- Total Per Request: ~100-150 ms
Security Considerations
Input Validation
User Input
    │
    ▼
Pydantic Model
    │
    ├─► Type checking
    ├─► Length limits
    ├─► Pattern validation
    └─► Field constraints
    │
    ▼
Valid Input
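The server performs these checks with Pydantic models. To keep the illustration dependency-free, the sketch below uses a standard-library dataclass as a stand-in that applies the same four kinds of check by hand. The field names, the 10,000-character limit, and the allowed formats are assumptions, not values taken from the server.

```python
import re
from dataclasses import dataclass

MAX_PROMPT_LENGTH = 10_000  # hypothetical limit
FORMAT_PATTERN = re.compile(r"^(markdown|json)$")


@dataclass(frozen=True)
class AnalyzePromptInput:
    """Stand-in for a Pydantic input model (illustrative only)."""

    prompt: str
    response_format: str = "markdown"

    def __post_init__(self) -> None:
        # Type checking
        if not isinstance(self.prompt, str):
            raise TypeError("prompt must be a string")
        # Field constraint: must contain non-whitespace text
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        # Length limit
        if len(self.prompt) > MAX_PROMPT_LENGTH:
            raise ValueError("prompt exceeds the length limit")
        # Pattern validation
        if not FORMAT_PATTERN.match(self.response_format):
            raise ValueError("response_format must be 'markdown' or 'json'")
```

Constructing `AnalyzePromptInput(prompt="Is this safe?")` succeeds with the default format, while a blank prompt or an unknown format raises before any detector runs.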
Privacy Protection
┌────────────────────────────────────┐
│ NO External API Calls              │
│ NO Data Transmission               │
│ NO Logging Sensitive Info          │
│ YES Local Processing Only          │
│ YES User Consent Required          │
│ YES Data Stays on Device           │
└────────────────────────────────────┘
Human-in-the-Loop
Sensitive Operation Detected
    │
    ▼
Request User Confirmation
    │
    ├─► Yes → Proceed
    │
    └─► No → Cancel
Scalability Path
Current: Single Instance
Client → stdio → ToGMAL Server → Response
Future: HTTP Transport
Multiple Clients → HTTP → ToGMAL Server → Response
                                ↓
                         Shared Database
Advanced: Distributed
Clients → Load Balancer → ToGMAL Servers (N)
                                  ↓
                           Shared Database
                                  ↓
                           ML Model Cache
Monitoring Points
┌─────────────────────────────────────┐
│ Metrics to Track                    │
├─────────────────────────────────────┤
│ - Tool call frequency               │
│ - Detection rates by category       │
│ - Risk level distribution           │
│ - Intervention effectiveness        │
│ - False positive rate               │
│ - Response latency                  │
│ - Taxonomy growth rate              │
│ - User feedback submissions         │
└─────────────────────────────────────┘
This architecture supports:
- ✅ Privacy-preserving analysis
- ✅ Low-latency detection
- ✅ Extensible design
- ✅ Production readiness
- ✅ Future ML integration