--- title: LinkScout Backend emoji: 🔍 colorFrom: yellow colorTo: red sdk: docker pinned: false --- # LinkScout - Smart Analysis. Simple Answers. **The Ultimate AI-Powered Misinformation Detection Extension** LinkScout combines the best of both worlds - powerful AI analysis from Groq with pre-trained machine learning models to provide comprehensive fact-checking and misinformation detection. ## 🚀 Features ### Dual AI Analysis System - **Groq AI Agent**: Advanced natural language understanding and reasoning - **Pre-trained Models**: RoBERTa, Emotion Analysis, NER, Hate Speech Detection, Clickbait Detection, Bias Detection ### Revolutionary Detection (8 Phases) 1. **Linguistic Fingerprint Analysis**: Detects manipulation patterns in text 2. **Claim-by-Claim Verification**: Verifies individual claims against databases 3. **Source Credibility Analysis**: Rates source reliability 4. **Entity Verification**: Validates people, organizations, places 5. **Propaganda Detection**: Identifies propaganda techniques 6. **Contradiction Detection**: Finds logical inconsistencies 7. **Network Analysis**: Detects bot/astroturfing patterns 8. **Reinforcement Learning**: Learns from user feedback to improve accuracy ### User Interface Features - **Smart Paragraph Highlighting**: Color-coded suspicious content detection - **Sidebar Analysis Report**: Comprehensive results without blocking the page - **Real-time Google Search Integration**: Verifies claims with recent sources - **Interactive Results Display**: Organized tabs for overview, details, and sources - **One-Click Analysis**: Analyze entire pages or paste text/URLs ### Technical Capabilities - **Chunk-based Analysis**: Analyzes content paragraph-by-paragraph for precision - **Multi-language Support**: English, Hindi, Marathi, and 15+ Indian languages - **Image Analysis**: Detects AI-generated/manipulated images - **Offline Database**: Fast local verification of known false claims - **Context-Aware Scoring**: Adjusts detection based on content type and category ## 📦 Installation ### Prerequisites - Python 3.8+ - Node.js (optional, for development) - Google Chrome or Microsoft Edge browser ### Backend Setup 1. **Install Python Dependencies**: ```powershell cd d:\mis_2\LinkScout pip install -r requirements_mis.txt pip install flask flask-cors requests beautifulsoup4 torch transformers pillow ``` 2. **Download AI Models** (if not already cached): ```powershell # Models will auto-download to D:\huggingface_cache # Requires ~5GB disk space python -c "from transformers import AutoTokenizer; AutoTokenizer.from_pretrained('hamzab/roberta-fake-news-classification', cache_dir=r'D:\huggingface_cache')" ``` 3. **Configure Google Search** (optional): - Get Google Custom Search API key from https://developers.google.com/custom-search - Update `google_config.json` with your API key and CSE ID 4. **Start the Server**: ```powershell python combined_server.py ``` Server will start at `http://localhost:5000` ### Extension Installation 1. **Open Chrome/Edge** 2. **Navigate to Extensions**: `chrome://extensions` or `edge://extensions` 3. **Enable Developer Mode**: Toggle in top-right corner 4. **Load Unpacked**: Click button and select `d:\mis_2\LinkScout\extension` folder 5. **Pin Extension**: Click puzzle icon and pin LinkScout for easy access ## 🎯 Usage ### Method 1: Analyze Current Page 1. Navigate to any news article or webpage 2. Click the LinkScout extension icon 3. Click **"Scan Page"** 4. View results in popup and check highlighted suspicious content on page ### Method 2: Paste Text or URL 1. Click the LinkScout extension icon 2. Paste text or URL in the input box 3. Click **"Analyze"** 4. Review comprehensive analysis results ### Method 3: Highlight Suspicious Content 1. After scanning a page, click **"Highlight"** button 2. Suspicious paragraphs will be color-coded: - 🔴 **Red**: High risk (>70% suspicious) - 🟡 **Yellow**: Medium risk (40-70% suspicious) - 🔵 **Blue**: Low risk (<40% suspicious) 3. Click **"Clear"** to remove highlights ### Method 4: View Detailed Report - Analysis results appear in a sidebar on the right - Shows percentage score, verdict, summary, and flagged content - Includes Google search results for fact-checking ## 🔧 Configuration ### Server Configuration Edit `combined_server.py`: ```python # Groq API Key (for AI analysis) GROQ_API_KEY = 'your_groq_api_key_here' # Change port if needed app.run(host='0.0.0.0', port=5000, debug=False) ``` ### Extension Configuration Edit `extension/content.js`: ```javascript const CONFIG = { API_ENDPOINT: 'http://localhost:5000/api/v1/analyze-chunks', REQUEST_TIMEOUT: 180000, // 3 minutes AUTO_SCAN_DELAY: 3000 }; ``` ## 📊 How It Works ### Analysis Pipeline 1. **Content Extraction** - Extracts all paragraphs, headings, and article text - Filters out navigation, ads, and boilerplate 2. **Multi-Model Analysis** - RoBERTa: Fake news probability - Emotion Model: Sentiment and emotional manipulation - NER: Entity extraction and verification - Hate Speech: Toxic content detection - Clickbait: Sensationalism detection - Bias: Political/ideological bias detection 3. **Revolutionary Detection** - Linguistic patterns (sentence structure, word choice) - Claim extraction and database verification - Source credibility scoring - Entity validation (real people/organizations) - Propaganda technique identification - Logical contradiction detection - Bot/astroturfing pattern analysis 4. **Google Research** - Searches recent sources for claims - Compares against credible news outlets - Provides links for manual verification 5. **Scoring & Verdict** - Combines all signals into final score (0-100%) - Determines verdict: FAKE, SUSPICIOUS, or REAL - Generates human-readable explanation 6. **Reinforcement Learning** - Learns from user feedback - Improves accuracy over time - Adapts to new misinformation patterns ## 🎓 Understanding Results ### Misinformation Percentage - **0-30%**: Low Risk - Mostly Credible - **30-60%**: Medium Risk - Verify Claims - **60-100%**: High Risk - Likely Misinformation ### Verdict Types - **REAL**: Content appears authentic and fact-checked - **SUSPICIOUS**: Mixed signals, requires verification - **FAKE**: Strong indicators of misinformation ### Confidence Indicators - High confidence: Multiple models agree + external verification - Medium confidence: Some conflicting signals - Low confidence: Limited data or unclear content ## 🐛 Troubleshooting ### Server Won't Start - Check if port 5000 is available: `netstat -ano | findstr :5000` - Ensure Python dependencies are installed - Check for errors in terminal output ### Extension Not Working - Verify server is running at http://localhost:5000 - Check browser console for errors (F12 → Console) - Try reloading the extension - Ensure you're on a valid webpage (not chrome:// pages) ### Models Not Loading - Check disk space (requires ~5GB) - Verify D:\huggingface_cache directory exists and is writable - Run download script manually if needed ### Slow Analysis - Large articles (>100 paragraphs) take 1-2 minutes - Check CPU/GPU usage - Consider reducing `REQUEST_TIMEOUT` for faster (less accurate) results ## 🤝 Contributing This project combines features from two advanced misinformation detection systems. To contribute: 1. Keep backend functionality intact - both systems are working correctly 2. Test thoroughly before committing changes 3. Maintain clean, organized frontend code 4. Update documentation for new features ## 📝 Credits **LinkScout** combines: - **MIS Extension**: Groq AI agentic analysis, RL, image detection, revolutionary detection phases - **MIS_2 Extension**: Pre-trained models, chunk analysis, Google search, sidebar UI Created by combining the best features of both systems into one powerful tool. ## 🔒 Privacy & Security - All analysis is performed locally or through your own API keys - No data is collected or stored by LinkScout - Google Search API (if configured) follows Google's privacy policy - Groq API usage follows Groq's terms of service ## 📄 License For educational and research purposes. Please respect API usage limits and terms of service. --- **LinkScout - Smart Analysis. Simple Answers.** 🔍✨