---
colorFrom: green
colorTo: green
---

# Game of Thrones Character Similarity Explorer

Explore semantic relationships between **Game of Thrones characters** using text embeddings. This project uses **SBERT** and **TF-IDF embeddings** to map characters into high-dimensional vector spaces, enabling similarity analysis, visualization, and recommendations.

---

## Features

- **Semantic Similarity Search:** Find the most similar characters using **cosine similarity** on embeddings.
- **Hybrid Embeddings:** Combine SBERT (semantic) and TF-IDF (frequency-based) embeddings in a single weighted score.
- **Dimensionality Reduction Visualizations:**
  - **2D & 3D t-SNE:** Non-linear reduction that highlights local neighborhoods and clusters.
  - **2D & 3D PCA:** Linear reduction that emphasizes global variance along the principal directions.
- **Interactive Plots:** Explore embeddings in **Plotly 2D/3D scatter plots** with hoverable labels and downloadable figures.
- **Downloadable Results:** Export similarity results as **CSV** for further analysis.

---

## How It Works

### Data Preprocessing

Character names are normalized, and both SBERT and TF-IDF embeddings are precomputed.
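
The README does not say how names are normalized. As a minimal sketch, assuming normalization means Unicode cleanup, whitespace collapsing, and case folding, a helper could look like the following. The name `normalize_name` and the sample inputs are hypothetical, not taken from the app.

```python
import re
import unicodedata


def normalize_name(raw: str) -> str:
    """Hypothetical normalizer: Unicode-normalize, collapse whitespace, case-fold."""
    # NFKC maps compatibility characters (e.g. non-breaking spaces) to plain forms.
    text = unicodedata.normalize("NFKC", raw)
    # Collapse any run of whitespace into one space and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    # casefold() is a more aggressive lower() intended for caseless matching.
    return text.casefold()


# Illustrative inputs only: three raw spellings that reduce to two distinct keys.
names = ["  Jon   SNOW ", "Daenerys\u00a0Targaryen", "jon snow"]
normalized = sorted({normalize_name(n) for n in names})
print(normalized)
```

Normalizing before lookup means that differently spaced or cased spellings resolve to the same key in the precomputed embedding tables.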

### Similarity Computation

- **SBERT embeddings** capture contextual and semantic relationships.
- **TF-IDF embeddings** capture frequency-based textual features.
- **Hybrid similarity** is a weighted combination of both embeddings.
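
The bullets above can be sketched with NumPy. This is one plausible reading, not the app's confirmed implementation: it assumes the hybrid score is a weighted sum of the two per-model cosine similarities. The weight `alpha`, the function names, and the toy vectors are all assumptions made for illustration.

```python
import numpy as np


def cosine_matrix(emb: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `emb`."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    # Clip to avoid division by zero for all-zero rows.
    unit = emb / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def hybrid_similarity(sbert_emb: np.ndarray, tfidf_emb: np.ndarray,
                      alpha: float = 0.5) -> np.ndarray:
    """Assumed blend: alpha * SBERT cosine + (1 - alpha) * TF-IDF cosine."""
    return alpha * cosine_matrix(sbert_emb) + (1.0 - alpha) * cosine_matrix(tfidf_emb)


# Toy stand-ins for precomputed embeddings (3 characters, different dimensions).
sbert_emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
tfidf_emb = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

sim = hybrid_similarity(sbert_emb, tfidf_emb, alpha=0.5)
print(sim.round(3))
```

Blending scores rather than raw vectors sidesteps the fact that SBERT and TF-IDF vectors usually have different dimensions. It is also equivalent to taking the cosine similarity of the two unit-normalized vectors concatenated with weights `sqrt(alpha)` and `sqrt(1 - alpha)`.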

### Recommendation

The app computes the **top-N similar characters** based on the chosen embedding model and similarity type.
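
A top-N lookup over a precomputed similarity matrix might look like the sketch below. The function `top_n_similar` is hypothetical, and the similarity values are made-up toy numbers rather than output from the app.

```python
import numpy as np


def top_n_similar(sim: np.ndarray, names: list[str], query: str, n: int = 3):
    """Return the n most similar names to `query`, excluding the query itself."""
    idx = names.index(query)
    scores = sim[idx].copy()
    scores[idx] = -np.inf  # a character is never its own recommendation
    order = np.argsort(scores)[::-1][:n]  # indices sorted by descending score
    return [(names[i], float(scores[i])) for i in order]


names = ["Jon Snow", "Arya Stark", "Tyrion Lannister", "Cersei Lannister"]
# Illustrative symmetric similarity matrix (not real results).
sim = np.array([
    [1.0, 0.8, 0.3, 0.1],
    [0.8, 1.0, 0.4, 0.2],
    [0.3, 0.4, 1.0, 0.7],
    [0.1, 0.2, 0.7, 1.0],
])

results = top_n_similar(sim, names, "Jon Snow", n=2)
print(results)
```

Masking the query row entry with `-inf` before sorting keeps the selected character out of its own result list without special-casing the slice.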

### Visualization

- Embeddings are reduced to **2D or 3D space** with t-SNE or PCA.
- Interactive plots allow exploration of latent structures and clusters.
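
To make the linear case concrete, here is a NumPy-only sketch of the projection that PCA performs. It is a stand-in written for illustration: the app lists scikit-learn for PCA and t-SNE, and this block does not reproduce that code. The function name `pca_project` and the random data are assumptions.

```python
import numpy as np


def pca_project(emb: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project rows of `emb` onto their top principal directions via SVD."""
    centered = emb - emb.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Random stand-in for 20 character embeddings with 8 dimensions each.
rng = np.random.default_rng(0)
emb = rng.normal(size=(20, 8))

coords_2d = pca_project(emb, n_components=2)  # input for a 2D scatter plot
coords_3d = pca_project(emb, n_components=3)  # input for a 3D scatter plot
print(coords_2d.shape, coords_3d.shape)
```

The resulting coordinate arrays are what a 2D or 3D scatter plot would consume, with one point per character. t-SNE has no comparable closed form, so it is omitted from this sketch.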

---

## Technologies Used

- **Python** & **Streamlit** for the web interface
- **Pandas**, **NumPy**, **Joblib** for data processing
- **Scikit-learn** for TF-IDF embeddings, similarity, t-SNE, and PCA
- **Plotly** for interactive 2D/3D visualizations
- **Pillow** for image handling
- **SBERT & TF-IDF** for semantic and frequency-based embeddings

---

## How to Use

1. Select a **character** from the sidebar.
2. Choose an **embedding model**: SBERT, TF-IDF, or Hybrid.
3. Adjust the number of **similar characters** to display.
4. Click **Find Similar Characters** to get results with images and similarity scores.
5. Explore **Dimensionality Reduction** to visualize embeddings in 2D/3D space.
6. Download **CSV results** or **plots** for offline use.
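
For the CSV export in step 6, a minimal standard-library sketch of turning a result list into CSV text is shown below. The column names, the four-decimal rounding, and the helper `results_to_csv` are assumptions, and the sample rows are illustrative rather than real output.

```python
import csv
import io


def results_to_csv(results: list[tuple[str, float]]) -> str:
    """Serialize (character, similarity) pairs to CSV text with a rank column."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(["rank", "character", "similarity"])  # hypothetical header
    for rank, (name, score) in enumerate(results, start=1):
        writer.writerow([rank, name, f"{score:.4f}"])
    return buffer.getvalue()


# Illustrative rows only.
csv_text = results_to_csv([("Arya Stark", 0.8), ("Tyrion Lannister", 0.3)])
print(csv_text)
```

The returned string can be written to a file or passed to whatever download widget the interface provides.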

---

## Applications

- Character similarity analysis for literature and fandom studies
- Semantic search engines
- Recommendation systems
- Knowledge graph construction and clustering of entities

---

## Live Demo

[Game of Thrones Character Similarity Explorer](https://huggingface.co/spaces/Daksh0505/Game-Of-Thrones-Character-Similarity)

---

## License

MIT License. Feel free to use and adapt for research or educational purposes.