Game of Thrones Character Similarity Explorer
Explore semantic relationships between Game of Thrones characters using text embeddings. This project uses SBERT and TF-IDF embeddings to map characters into high-dimensional vector spaces, enabling similarity analysis, visualization, and recommendations.
Features
- Semantic Similarity Search: Find the most similar characters using cosine similarity on embeddings.
- Hybrid Embeddings: Combine SBERT (semantic) and TF-IDF (frequency-based) embeddings for improved results.
- Dimensionality Reduction Visualizations:
  - 2D & 3D t-SNE: Non-linear reduction highlighting local neighborhoods and clusters.
  - 2D & 3D PCA: Linear reduction emphasizing global variance and main directions.
- Interactive Plots: Explore embeddings with Plotly 2D/3D scatter plots, hoverable labels, and downloadable plots.
- Downloadable Results: Export similarity results as CSV for further analysis.
How It Works
Data Preprocessing
Character names are normalized, and embeddings are precomputed for SBERT and TF-IDF.
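The exact normalization rules are not stated here, so the following is only a minimal sketch of what name normalization might look like. The function name `normalize_name` and the rules it applies (accent stripping, lowercasing, collapsing punctuation and whitespace) are assumptions for illustration, not the app's actual code.

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Return a canonical lookup key for a character name (hypothetical rules)."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase, turn every run of non-alphanumerics into one space, trim the ends.
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


print(normalize_name("  Daenerys  Targaryen "))  # daenerys targaryen
print(normalize_name("Jon_Snow"))                # jon snow
```

Normalizing once, before the embeddings are computed, means the same key can be used to look up a character's embedding row, image, and similarity results.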
Similarity Computation
- SBERT embeddings capture contextual and semantic relationships.
- TF-IDF embeddings capture frequency-based textual features.
- Hybrid similarity is a weighted combination of both embeddings.
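One way to read the hybrid step is as a weighted sum of the two cosine-similarity matrices, which sidesteps the fact that SBERT and TF-IDF vectors have different dimensions. The sketch below assumes that interpretation; the weight `alpha`, the function names, and the random toy matrices are all hypothetical and stand in for the precomputed embeddings.

```python
import numpy as np


def cosine_similarity_matrix(X: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)  # guard against all-zero rows
    return unit @ unit.T


def hybrid_similarity(sbert: np.ndarray, tfidf: np.ndarray, alpha: float = 0.7) -> np.ndarray:
    """alpha * SBERT similarity + (1 - alpha) * TF-IDF similarity (assumed scheme)."""
    return (alpha * cosine_similarity_matrix(sbert)
            + (1.0 - alpha) * cosine_similarity_matrix(tfidf))


# Toy stand-ins for precomputed embeddings: 4 characters, different dimensions.
rng = np.random.default_rng(0)
sbert = rng.normal(size=(4, 8))
tfidf = rng.random(size=(4, 20))

sim = hybrid_similarity(sbert, tfidf, alpha=0.7)
print(sim.shape)  # (4, 4)
```

Setting `alpha=1.0` reduces this to pure SBERT similarity and `alpha=0.0` to pure TF-IDF similarity, so a single function can serve all three model choices.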
Recommendation
The app computes top-N similar characters based on the chosen embedding model and similarity type.
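A top-N lookup over a precomputed similarity matrix can be sketched as below. The helper `top_n_similar` is a hypothetical name, and the similarity values are invented toy numbers, not results from the app.

```python
import numpy as np


def top_n_similar(query: str, names: list, sim: np.ndarray, n: int = 3) -> list:
    """Return the n (name, score) pairs most similar to `query`, best first."""
    idx = names.index(query)
    scores = sim[idx].astype(float).copy()
    scores[idx] = -np.inf  # never recommend the query character itself
    order = np.argsort(scores)[::-1][:n]
    return [(names[i], float(scores[i])) for i in order]


names = ["Jon Snow", "Arya Stark", "Tyrion Lannister", "Cersei Lannister"]
# Invented similarity matrix for illustration only.
sim = np.array([
    [1.00, 0.80, 0.30, 0.10],
    [0.80, 1.00, 0.40, 0.20],
    [0.30, 0.40, 1.00, 0.60],
    [0.10, 0.20, 0.60, 1.00],
])

print(top_n_similar("Jon Snow", names, sim, n=2))
# [('Arya Stark', 0.8), ('Tyrion Lannister', 0.3)]
```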
Visualization
- Reduce embeddings to 2D or 3D space with t-SNE or PCA.
- Interactive plots allow exploration of latent structures and clusters.
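The reduction step could look roughly like the following sketch, which uses scikit-learn's `PCA` and `TSNE`. The wrapper `reduce_embeddings`, the perplexity heuristic, and the random input matrix are assumptions for illustration; the app's actual parameters are not stated here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def reduce_embeddings(X: np.ndarray, method: str = "pca",
                      n_components: int = 2, random_state: int = 0) -> np.ndarray:
    """Project embeddings to 2 or 3 dimensions with PCA or t-SNE."""
    if method == "pca":
        reducer = PCA(n_components=n_components, random_state=random_state)
    elif method == "tsne":
        # t-SNE requires perplexity < number of samples; keep it small for small casts.
        perplexity = min(30.0, (X.shape[0] - 1) / 3)
        reducer = TSNE(n_components=n_components, perplexity=perplexity,
                       init="pca", random_state=random_state)
    else:
        raise ValueError(f"unknown method: {method!r}")
    return reducer.fit_transform(X)


# Random stand-in for 30 character embeddings of dimension 16.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 16))

coords_2d = reduce_embeddings(X, method="pca", n_components=2)
coords_3d = reduce_embeddings(X, method="tsne", n_components=3)
print(coords_2d.shape, coords_3d.shape)  # (30, 2) (30, 3)
```

The resulting coordinate columns can then be handed to a 2D or 3D scatter plot, with character names as hover labels.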
Technologies Used
- Python & Streamlit for the web interface
- Pandas, NumPy, Joblib for data processing
- Scikit-learn for embeddings, similarity, t-SNE, and PCA
- Plotly for interactive 2D/3D visualizations
- Pillow for image handling
- SBERT & TF-IDF for semantic and frequency-based embeddings
How to Use
- Select a character from the sidebar.
- Choose an embedding model: SBERT, TF-IDF, or Hybrid.
- Adjust the number of similar characters to display.
- Click Find Similar Characters to get results with images and similarity scores.
- Explore Dimensionality Reduction to visualize embeddings in 2D/3D space.
- Download CSV results or plots for offline use.
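For readers who want to reproduce the CSV export outside the app, here is a minimal sketch using only the standard library. The column names and the helper `results_to_csv` are assumptions; the app itself lists Pandas for data handling and may format its export differently.

```python
import csv
import io


def results_to_csv(results: list) -> str:
    """Serialize ranked (name, score) pairs to CSV text (hypothetical layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["rank", "character", "similarity"])
    for rank, (name, score) in enumerate(results, start=1):
        writer.writerow([rank, name, f"{score:.4f}"])
    return buffer.getvalue()


# Invented example scores, for illustration only.
csv_text = results_to_csv([("Arya Stark", 0.8), ("Tyrion Lannister", 0.3)])
print(csv_text)
# rank,character,similarity
# 1,Arya Stark,0.8000
# 2,Tyrion Lannister,0.3000
```

In a Streamlit app, text like this can be passed as the `data` argument of `st.download_button` to offer it as a file download.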
Applications
- Character similarity analysis for literature and fandom studies
- Semantic search engines
- Recommendation systems
- Knowledge graph construction and clustering of entities
Live Demo
Game of Thrones Character Similarity Explorer
License
MIT License. Feel free to use and adapt for research or educational purposes.