---
colorFrom: green
colorTo: green
---
# ⚔️ Game of Thrones Character Similarity Explorer
Explore semantic relationships between **Game of Thrones characters** using text embeddings. This project uses **SBERT** and **TF-IDF embeddings** to represent each character as a vector in a high-dimensional space, enabling similarity analysis, visualization, and recommendations.
---
## 🚀 Features
- **Semantic Similarity Search:** Find the most similar characters using **cosine similarity** on embeddings.
- **Hybrid Embeddings:** Combine SBERT (semantic) and TF-IDF (frequency-based) embeddings for improved results.
- **Dimensionality Reduction Visualizations:**
  - **2D & 3D t-SNE:** Non-linear reduction highlighting local neighborhoods and clusters.
  - **2D & 3D PCA:** Linear reduction emphasizing global variance and main directions.
- **Interactive Plots:** Explore embeddings with **Plotly 2D/3D scatter plots**, hoverable labels, and downloadable plots.
- **Downloadable Results:** Export similarity results as **CSV** for further analysis.
---
## 🧠 How It Works
### Data Preprocessing
Character names are normalized, and embeddings are precomputed for SBERT and TF-IDF.
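
The README does not spell out how names are normalized. As a minimal sketch, assuming normalization means stripping accents, collapsing whitespace, and lowercasing, a hypothetical `normalize_name` helper (not taken from the app's code) could look like this:

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Return a canonical lookup key for a character name (illustrative helper)."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Collapse runs of whitespace, trim the ends, and lowercase the result.
    return re.sub(r"\s+", " ", stripped).strip().lower()


print(normalize_name("  Jon   SNOW "))
```

Applying the same normalization before both embedding tables are built would keep lookup keys consistent between the SBERT and TF-IDF models.
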
### Similarity Computation
- **SBERT embeddings** capture contextual and semantic relationships.
- **TF-IDF embeddings** capture frequency-based textual features.
- **Hybrid similarity** is a weighted combination of both embeddings.
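
The README does not say whether the weighting is applied to the vectors or to the similarity scores. The sketch below takes one plausible reading: compute cosine similarity separately for each embedding table, then blend the two score matrices. The function names and the `alpha` weight are assumptions made for illustration, not the app's actual API.

```python
import numpy as np


def cosine_similarity_matrix(X: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Clip the norms so an all-zero row does not cause a division by zero.
    unit = X / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def hybrid_similarity(sbert_emb: np.ndarray, tfidf_emb: np.ndarray,
                      alpha: float = 0.5) -> np.ndarray:
    """Blend SBERT and TF-IDF similarities; alpha is the (assumed) SBERT weight."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    sim_sbert = cosine_similarity_matrix(sbert_emb)
    sim_tfidf = cosine_similarity_matrix(tfidf_emb)
    return alpha * sim_sbert + (1.0 - alpha) * sim_tfidf
```

Blending scores rather than concatenating vectors sidesteps the fact that the two embeddings usually have different dimensionalities and scales. Setting `alpha=1.0` or `alpha=0.0` recovers the pure SBERT or pure TF-IDF similarity.
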
### Recommendation
The app computes **top-N similar characters** based on the chosen embedding model and similarity type.
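
As a rough illustration of the top-N step, assuming a precomputed square similarity matrix whose rows and columns follow the order of a `names` list, a hypothetical `top_n_similar` helper might rank characters like this:

```python
import numpy as np


def top_n_similar(names, sim, query, n=5):
    """Return the n characters most similar to `query` as (name, score) pairs."""
    idx = names.index(query)  # raises ValueError for an unknown character
    scores = np.asarray(sim, dtype=float)[idx].copy()
    scores[idx] = -np.inf  # exclude the query character itself
    n = min(n, len(names) - 1)
    # Sort descending by similarity and keep the first n indices.
    order = np.argsort(scores)[::-1][:n]
    return [(names[i], float(scores[i])) for i in order]


names = ["Jon Snow", "Arya Stark", "Tyrion Lannister"]
sim = [[1.0, 0.9, 0.2],
       [0.9, 1.0, 0.5],
       [0.2, 0.5, 1.0]]  # made-up scores, for demonstration only
print(top_n_similar(names, sim, "Jon Snow", n=2))
```

The similarity values above are invented for the example and do not come from the app.
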
### Visualization
- Reduce embeddings to **2D or 3D space** with t-SNE or PCA.
- Interactive plots allow exploration of latent structures and clusters.
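
The reduction step can be sketched with scikit-learn's `PCA` and `TSNE` classes. The `reduce_embeddings` wrapper and its perplexity heuristic are assumptions for illustration; the app's actual settings are not documented here. The returned coordinates could then be passed to a Plotly scatter plot.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def reduce_embeddings(X, method="pca", n_components=2, random_state=0):
    """Project embeddings to 2D or 3D with PCA (linear) or t-SNE (non-linear)."""
    X = np.asarray(X, dtype=float)
    if n_components not in (2, 3):
        raise ValueError("n_components must be 2 or 3")
    if method == "pca":
        reducer = PCA(n_components=n_components, random_state=random_state)
    elif method == "tsne":
        # t-SNE requires perplexity < n_samples; shrink it for small datasets.
        perplexity = min(30.0, (X.shape[0] - 1) / 3)
        reducer = TSNE(n_components=n_components, perplexity=perplexity,
                       random_state=random_state)
    else:
        raise ValueError("method must be 'pca' or 'tsne'")
    return reducer.fit_transform(X)


# Random stand-in data: 30 "characters" with 16-dimensional embeddings.
demo = np.random.default_rng(0).normal(size=(30, 16))
coords_2d = reduce_embeddings(demo, method="pca", n_components=2)
coords_3d = reduce_embeddings(demo, method="tsne", n_components=3)
```
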
---
## 🛠️ Technologies Used
- **Python** & **Streamlit** for the web interface
- **Pandas**, **NumPy**, **Joblib** for data processing
- **Scikit-learn** for TF-IDF vectorization, cosine similarity, t-SNE, and PCA
- **Plotly** for interactive 2D/3D visualizations
- **Pillow** for image handling
- **SBERT & TF-IDF** for semantic and frequency-based embeddings
---
## ⚙️ How to Use
1. Select a **character** from the sidebar.
2. Choose an **embedding model**: SBERT, TF-IDF, or Hybrid.
3. Adjust the number of **similar characters** to display.
4. Click **Find Similar Characters** to get results with images and similarity scores.
5. Explore **Dimensionality Reduction** to visualize embeddings in 2D/3D space.
6. Download **CSV results** or **plots** for offline use.
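
For readers who want to reproduce the CSV export outside the app, here is a minimal sketch using only the standard library. The `results_to_csv` helper and the column names are assumptions; in a Streamlit app the resulting string could be handed to `st.download_button`, though how the app actually does this is not shown in this README.

```python
import csv
import io


def results_to_csv(results):
    """Serialize (character, similarity) pairs to CSV text (illustrative helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["character", "similarity"])  # assumed column names
    for name, score in results:
        # Round scores to four decimals for a compact, readable file.
        writer.writerow([name, f"{score:.4f}"])
    return buffer.getvalue()


# Made-up scores, for demonstration only.
csv_text = results_to_csv([("Arya Stark", 0.91), ("Sansa Stark", 0.874)])
print(csv_text)
```
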
---
## 📊 Applications
- Character similarity analysis for literature and fandom studies
- Semantic search engines
- Recommendation systems
- Knowledge graph construction and clustering of entities
---
## 🔗 Live Demo
[Game of Thrones Character Similarity Explorer](https://huggingface.co/spaces/Daksh0505/Game-Of-Thrones-Character-Similarity)
---
## 📄 License
MIT License – feel free to use and adapt for research or educational purposes.