Spaces:

JustTheStatsHuman
/

Togmal-demo

Configuration error

App Files Files Community

Togmal-demo / CLUSTERING_TO_DYNAMIC_TOOLS_STRATEGY.md

HeTalksInMaths

Initial commit: ToGMAL Prompt Difficulty Analyzer with real MMLU data

f9b1ad5 24 days ago

preview code

raw

history blame

21.9 kB

HuggingFace Clustering → ToGMAL Dynamic Tools Integration Strategy

Date: October 18, 2025
Purpose: Define how ML clustering on safety datasets informs ToGMAL's dynamic tool exposure
Status: Ready for Implementation

Executive Summary

This document outlines the strategy for using real clustering analysis on HuggingFace safety datasets to automatically discover limitation patterns and expose them as dynamic MCP tools in ToGMAL.

The Core Flow:

[HuggingFace Datasets] → [Embedding + Clustering] → [Dangerous Cluster Discovery]
                                                            ↓
                                                    [Pattern Extraction]
                                                            ↓
                                              [ToGMAL Dynamic Tool Generation]
                                                            ↓
                                                [Context-Aware Tool Exposure]

1. Current State Analysis

What You Have (Existing Implementation)

A. Research Pipeline (`research_pipeline.py`)

✅ Working: Fetches 10 dataset sources
✅ Working: TF-IDF feature extraction
✅ Working: K-Means, DBSCAN clustering
✅ Working: Dangerous cluster identification (>70% harmful threshold)
✅ Working: Silhouette scoring (current: 0.25-0.26)

Current Results:

2-3 clusters identified
Dangerous clusters: 71-100% harmful content
Successfully differentiates harmful from benign

B. Dynamic Tools (`togmal/context_analyzer.py`, `togmal/ml_tools.py`)

✅ Working: Context analyzer with keyword matching
✅ Working: ML tools cache (./data/ml_discovered_tools.json)
✅ Working: Domain filtering for tool recommendations
⚠️ Missing: Connection from clustering results to tool cache

What Files (2-4) Propose

C. Enhanced Dataset Fetcher (`research-datasets-fetcher.py`)

🆕 Proposed: Professional domain-specific datasets
🆕 Proposed: Real HuggingFace integration via datasets library
🆕 Proposed: Aqumen/ToGMAL data integration endpoints
🆕 Proposed: 10 professional domains with specific datasets

D. Enhanced Clustering Trainer (`research-training-clustering.py`)

🆕 Proposed: Sentence transformers for better embeddings
🆕 Proposed: Cluster quality analysis (purity, pattern description)
🆕 Proposed: Detection rule generation from clusters
🆕 Proposed: Visualization and model comparison

2. The Missing Link: Clustering → Dynamic Tools

Current Gap

Your existing research_pipeline.py does clustering but:

❌ Doesn't use sentence transformers (uses TF-IDF)
❌ Doesn't export results in format for ml_tools.py
❌ Doesn't generate detection rules
❌ Doesn't map clusters to professional domains

Proposed Solution

Create a new integration layer that:

Runs enhanced clustering with sentence transformers
Analyzes dangerous clusters for patterns
Generates detection heuristics from cluster characteristics
Exports to ML tools cache in correct format
Triggers ToGMAL reload to expose new tools

3. Professional Domain Clustering Strategy

The 10 Professional Domains

Based on files (4) proposals, focus on domains where LLMs demonstrably struggle:

Domain	Dataset Sources	Expected Cluster Behavior	ToGMAL Tool
Mathematics	`hendrycks/math`, `competition_math`, `gsm8k`	LIMITATIONS cluster (LLM accuracy: 42% on MATH)	`check_math_complexity`
Medicine	`medqa`, `pubmedqa`, `truthful_qa` subset	LIMITATIONS cluster (LLM accuracy: 65% on MedQA)	`check_medical_advice`
Law	`pile-of-law`, legal case reports	LIMITATIONS cluster (jurisdiction-specific errors)	`check_legal_boundaries`
Coding	`code_x_glue_cc_defect_detection`, `humaneval`, `apps`	MIXED clusters (some code safe, some vulnerable)	`check_code_security`
Finance	`financial_phrasebank`, `finqa`	LIMITATIONS cluster (regulatory compliance)	`check_financial_advice`
Translation	`wmt14`, `opus-100`	HARMLESS cluster (LLM near-human performance)	(no tool needed)
General QA	`squad_v2`, `natural_questions`	HARMLESS cluster (LLM accuracy: 86% on MMLU)	(no tool needed)
Summarization	`cnn_dailymail`, `xsum`	HARMLESS cluster (high ROUGE scores)	(no tool needed)
Creative Writing	`TinyStories`, `writing_prompts`	HARMLESS cluster (subjective, no "wrong" answer)	(no tool needed)
Therapy	Mental health corpora (if available)	LIMITATIONS cluster (crisis intervention risks)	`check_therapy_boundaries`

Clustering Hypothesis

LIMITATIONS Cluster:

Contains: Math, medicine, law, finance, coding bugs, therapy
Characteristics: High reasoning complexity, domain expertise required, factual correctness critical
Cluster purity: >70% harmful/failure examples
Silhouette score: Aim for >0.4 (currently 0.25)

HARMLESS Cluster:

Contains: Translation, summarization, general QA, creative writing
Characteristics: Pattern matching, well-represented in training data, less critical if wrong
Cluster purity: >70% safe/successful examples

MIXED Cluster:

Contains: General coding, factual QA, educational content
Needs further subdivision or context-dependent handling

4. Implementation Plan: Enhanced Clustering Pipeline

Phase 1: Upgrade Clustering (Week 1-2)

Step 1.1: Install Dependencies

cd /Users/hetalksinmaths/togmal
source .venv/bin/activate
uv pip install sentence-transformers datasets scikit-learn matplotlib seaborn joblib

Step 1.2: Enhance `research_pipeline.py`

Add sentence transformers instead of TF-IDF:

# Add to research_pipeline.py
from sentence_transformers import SentenceTransformer

class FeatureExtractor:
    """Use sentence transformers for semantic embeddings"""
    
    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)
        self.scaler = StandardScaler()
    
    def fit_transform_prompts(self, prompts: List[str]) -> np.ndarray:
        """Extract semantic embeddings"""
        embeddings = self.model.encode(
            prompts,
            batch_size=32,
            show_progress_bar=True,
            convert_to_numpy=True
        )
        return self.scaler.fit_transform(embeddings)

Why sentence transformers?

Captures semantic similarity (not just keywords)
Better cluster separation
Expect silhouette score improvement: 0.25 → 0.4+

Step 1.3: Add Professional Domain Datasets

Update DatasetFetcher to use HuggingFace datasets library:

from datasets import load_dataset

async def _fetch_huggingface_real(self, config: DatasetConfig) -> List[DatasetEntry]:
    """Actual HuggingFace integration"""
    dataset = load_dataset(
        config.source_id,
        split=config.split,
        trust_remote_code=True
    )
    
    entries = []
    for item in dataset:
        entries.append(DatasetEntry(
            id="",
            source=config.name,
            type=config.cluster_category,
            prompt=item.get(config.text_column, ""),
            category=config.domains[0] if config.domains else "unknown",
            is_harmful=(config.cluster_category == "limitations"),
            metadata={"dataset": config.source_id}
        ))
    
    return entries

Priority datasets to fetch first:

Mathematics (LIMITATIONS)
- hendrycks/math - 12,500 competition-level problems
- Use for detecting math complexity
Medicine (LIMITATIONS)
- medqa - Medical licensing exam questions
- Use for detecting medical advice boundaries
Coding (MIXED)
- code_x_glue_cc_defect_detection - Buggy vs clean code
- Use for detecting security vulnerabilities
General QA (HARMLESS)
- squad_v2 - Reading comprehension
- Use as baseline "safe" cluster

Phase 2: Extract Patterns from Clusters (Week 3)

Step 2.1: Add Cluster Analysis

Enhance AnomalyClusteringModel._identify_dangerous_clusters:

def _identify_dangerous_clusters(
    self, cluster_labels: np.ndarray, entries: List[DatasetEntry]
) -> List[Dict[str, Any]]:
    """Identify dangerous clusters with pattern extraction"""
    
    dangerous_clusters = []
    
    for cluster_id in set(cluster_labels):
        if cluster_id == -1:  # Skip noise
            continue
        
        # Get cluster members
        mask = cluster_labels == cluster_id
        cluster_entries = [e for e, m in zip(entries, mask) if m]
        
        # Calculate purity
        harmful_count = sum(1 for e in cluster_entries if e.is_harmful)
        purity = harmful_count / len(cluster_entries)
        
        if purity < 0.7:  # Not dangerous enough
            continue
        
        # Extract pattern
        pattern = self._extract_pattern_from_cluster(cluster_entries)
        
        dangerous_clusters.append({
            "cluster_id": int(cluster_id),
            "size": len(cluster_entries),
            "purity": float(purity),
            "domain": pattern["domain"],
            "pattern_description": pattern["description"],
            "detection_rule": pattern["heuristic"],
            "examples": pattern["examples"]
        })
    
    return dangerous_clusters

Step 2.2: Pattern Extraction Logic

Add pattern extraction method:

def _extract_pattern_from_cluster(
    self, entries: List[DatasetEntry]
) -> Dict[str, Any]:
    """Extract actionable pattern from cluster members"""
    
    # Determine primary domain
    domain_counts = Counter(e.category for e in entries)
    primary_domain = domain_counts.most_common(1)[0][0]
    
    # Extract common keywords (for detection heuristic)
    all_prompts = " ".join(e.prompt for e in entries if e.prompt)
    words = re.findall(r'\b[a-z]{4,}\b', all_prompts.lower())
    top_keywords = [w for w, c in Counter(words).most_common(10)]
    
    # Generate detection rule
    if primary_domain == "mathematics":
        heuristic = "contains_math_symbols OR complexity > threshold"
    elif primary_domain == "medicine":
        heuristic = f"contains_medical_keywords: {', '.join(top_keywords[:5])}"
    else:
        heuristic = f"keyword_match: {', '.join(top_keywords[:5])}"
    
    # Get representative examples
    examples = [e.prompt for e in entries[:5] if e.prompt]
    
    # Generate description
    description = f"{primary_domain.title()} limitation pattern (cluster purity: {purity:.1%})"
    
    return {
        "domain": primary_domain,
        "description": description,
        "heuristic": heuristic,
        "examples": examples,
        "keywords": top_keywords
    }

Phase 3: Export to ML Tools Cache (Week 3-4)

Step 3.1: Update Pipeline to Export

Add export method to ResearchPipeline:

def export_to_togmal_ml_tools(self, training_results: Dict[str, Any]):
    """Export dangerous clusters as ToGMAL dynamic tools"""
    
    patterns = []
    
    for model_type, result in training_results.items():
        for cluster in result.get("dangerous_clusters", []):
            pattern = {
                "id": f"{model_type}_{cluster['cluster_id']}",
                "domain": cluster["domain"],
                "description": cluster["pattern_description"],
                "confidence": cluster["purity"],
                "heuristic": cluster["detection_rule"],
                "examples": cluster["examples"],
                "metadata": {
                    "cluster_size": cluster["size"],
                    "model_type": model_type,
                    "discovered_at": datetime.now().isoformat()
                }
            }
            patterns.append(pattern)
    
    # Save to ML tools cache (format expected by ml_tools.py)
    ml_tools_cache = {
        "updated_at": datetime.now().isoformat(),
        "patterns": patterns,
        "metadata": {
            "total_patterns": len(patterns),
            "domains": list(set(p["domain"] for p in patterns))
        }
    }
    
    cache_path = Path("./data/ml_discovered_tools.json")
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    
    with open(cache_path, 'w') as f:
        json.dump(ml_tools_cache, f, indent=2)
    
    print(f"✓ Exported {len(patterns)} patterns to {cache_path}")

Step 3.2: Update `togmal_mcp.py` to Use Patterns

Modify existing togmal_list_tools_dynamic to load ML patterns:

@mcp.tool()
async def togmal_list_tools_dynamic(
    conversation_history: Optional[List[Dict[str, str]]] = None,
    user_context: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Returns dynamically recommended tools based on conversation context
    
    ENHANCED: Now includes ML-discovered limitation patterns
    """
    # Existing domain detection
    domains = await analyze_conversation_context(conversation_history, user_context)
    
    # Load ML-discovered tools (NEW)
    ml_tools = await get_ml_discovered_tools(
        relevant_domains=domains,
        min_confidence=0.8  # Only high-confidence patterns
    )
    
    # Combine with static tools
    recommended_tools = [
        "togmal_analyze_prompt",
        "togmal_analyze_response",
        "togmal_submit_evidence"
    ]
    
    # Add domain-specific static tools
    if "mathematics" in domains or "physics" in domains:
        recommended_tools.append("togmal_check_math_complexity")
    if "medicine" in domains or "healthcare" in domains:
        recommended_tools.append("togmal_check_medical_advice")
    if "file_system" in domains:
        recommended_tools.append("togmal_check_file_operations")
    
    # Add ML-discovered tools (DYNAMIC)
    ml_tool_names = [tool["name"] for tool in ml_tools]
    recommended_tools.extend(ml_tool_names)
    
    return {
        "recommended_tools": recommended_tools,
        "detected_domains": domains,
        "ml_discovered_tools": ml_tools,  # Full definitions
        "context": {
            "conversation_depth": len(conversation_history) if conversation_history else 0,
            "has_user_context": bool(user_context)
        }
    }

5. Expected Improvements

Clustering Quality

Current (TF-IDF + K-Means):

Silhouette score: 0.25-0.26
Clusters: 2-3
Dangerous clusters: Identified, but low separation

Expected (Sentence Transformers + K-Means/DBSCAN):

Silhouette score: 0.4-0.6 (✅ 60-140% improvement)
Clusters: 3-5 meaningful clusters
Dangerous clusters: Better defined with clear boundaries

Why?

Sentence transformers capture semantic meaning
TF-IDF only captures word overlap
Example: "What's the integral of x²" vs "Solve this calculus problem" → same cluster with ST, different with TF-IDF

Dynamic Tool Exposure

Before:

5 static tools always available
Manual keyword matching for domain detection

After:

5 static tools + N ML-discovered tools (N = # dangerous clusters)
Automatic tool exposure based on real clustering
Example: Cluster discovers "complex math word problems" → new tool check_math_word_problem_complexity

Coverage of Professional Domains

Before:

Generic "math", "medical", "file operations"
No fine-grained domain understanding

After:

10 professional domains with dataset-backed clustering
Sub-domain detection (e.g., "cardiology" vs "psychiatry" within medicine)
Evidence-based: Each tool backed by cluster of real failure examples

6. Integration with Aqumen (Future)

Bidirectional Feedback Loop

[ToGMAL Clustering] → Discovers "law" limitation cluster
         ↓
[ToGMAL ML Tools] → Exposes check_legal_boundaries
         ↓
[Aqumen Error Catalog] ← Imports "law" failures from ToGMAL
         ↓
[Aqumen Assessments] → Tests users on legal reasoning
         ↓
[Assessment Failures] → Reported back to ToGMAL
         ↓
[ToGMAL Re-Clustering] → Refines "law" cluster with new data

Not implementing yet (per your request), but architecture is ready when needed.

7. Action Items (Next 2 Weeks)

Week 1: Enhanced Clustering

Day 1-2: Setup

Install dependencies: sentence-transformers, datasets, visualization libs
Copy research-datasets-fetcher.py and research-training-clustering.py to workspace
Integrate with existing research_pipeline.py

Day 3-5: Dataset Fetching

Implement real HuggingFace dataset loading
Fetch 4 priority datasets:
- hendrycks/math (mathematics)
- medqa (medicine)
- code_x_glue_cc_defect_detection (coding)
- squad_v2 (general QA as baseline)
Verify dataset cache works

Day 6-7: Clustering with Sentence Transformers

Replace TF-IDF with sentence transformers in FeatureExtractor
Run clustering on fetched datasets
Verify silhouette score improvement (target: >0.4)

Week 2: Pattern Extraction & Tool Generation

Day 8-10: Pattern Extraction

Implement _extract_pattern_from_cluster method
Generate detection heuristics from clusters
Visualize clusters (PCA 2D projection)

Day 11-12: Export to ML Tools

Implement export_to_togmal_ml_tools in pipeline
Run full pipeline and generate ml_discovered_tools.json
Verify format matches what ml_tools.py expects

Day 13-14: Testing & Validation

Test togmal_list_tools_dynamic with ML tools
Verify context analyzer correctly triggers ML tools
Run end-to-end test: conversation → domain detection → ML tool exposure

8. Success Metrics

Technical Metrics

Metric	Current	Target	How to Measure
Silhouette Score	0.25-0.26	>0.4	sklearn.metrics.silhouette_score
Dangerous Cluster Purity	71-100%	>80%	% harmful in cluster
# Detected Domains	0 (manual)	5-10	Count from clustering
ML Tools Generated	0	5-10	Count in ml_discovered_tools.json
Tool Precision	N/A	>85%	Manual review of triggered tools

Functional Metrics

Can differentiate "math limitations" from "general QA" clusters
Can automatically expose check_math_complexity when conversation contains math
Can generate heuristic rules that are interpretable (not just "cluster 3")
Visualization shows clear cluster separation

9. Risks & Mitigations

Risk	Impact	Mitigation
Sentence transformer slower than TF-IDF	High	Cache embeddings, use batch processing
Silhouette score doesn't improve	High	Try different embedding models (mpnet, distilbert)
HuggingFace datasets too large	Medium	Sample datasets (max 5000 entries each)
Clusters don't align with domains	High	Add domain labels to training data, use semi-supervised clustering
ML tools not useful in practice	Medium	Start with high confidence threshold (0.8+), iterate

10. File Structure After Implementation

/Users/hetalksinmaths/togmal/
├── research_pipeline.py (ENHANCED)
│   ├── FeatureExtractor with sentence transformers ✅
│   ├── Pattern extraction from clusters ✅
│   ├── Export to ML tools cache ✅
│
├── togmal/
│   ├── context_analyzer.py (EXISTING - works as-is)
│   ├── ml_tools.py (EXISTING - works as-is)
│   └── config.py (EXISTING)
│
├── data/
│   ├── datasets/ (NEW)
│   │   ├── combined_dataset.csv
│   │   └── [domain]_[dataset].csv
│   │
│   ├── cache/ (EXISTING)
│   │   └── [source].json
│   │
│   └── ml_discovered_tools.json (GENERATED by pipeline)
│
├── models/ (NEW)
│   ├── clustering/
│   │   ├── kmeans_model.pkl
│   │   ├── embeddings_cache.npy
│   │   └── training_results.json
│   └── visualization/
│       └── clusters_2d.png
│
└── CLUSTERING_TO_DYNAMIC_TOOLS_STRATEGY.md (THIS FILE)

11. Next Steps After This Implementation

Phase 4: Aqumen Integration (When Ready)

Export ToGMAL clustering results to Aqumen error catalogs
Import Aqumen assessment failures back into ToGMAL
Re-train clustering with combined data

Phase 5: Continuous Improvement

Weekly automated re-training on new data
A/B testing of ML tools vs static tools
User feedback loop to improve heuristics

Phase 6: Grant Preparation

Publish clustering results as research artifact
Use improved metrics (silhouette 0.4+) in grant proposal
Demonstrate concrete improvements over baseline

Conclusion

What This Gets You:

✅ Real clustering on professional domain datasets
✅ Better separation between limitations and harmless clusters
✅ Automatic tool generation from clustering results
✅ Evidence-backed limitation detection (not just heuristics)
✅ Scalable architecture ready for Aqumen integration

What This Doesn't Do (Yet):

❌ Aqumen bidirectional integration (Phase 4)
❌ Production deployment (focus on research validation)
❌ Comprehensive grant proposal (focus on technical foundation)

Recommended Focus:

Start with Week 1-2 action items to prove the clustering approach works, then decide on Aqumen integration vs grant preparation.

Ready to proceed? Let me know if you want me to:

Start implementing the enhanced clustering pipeline
Create a test harness for validating clusters
Build the export-to-ML-tools integration
Something else?