

⚔️ Game of Thrones Character Similarity Explorer

Explore semantic relationships between Game of Thrones characters using text embeddings. This project represents each character as an SBERT vector and a TF-IDF vector, then uses those vectors for similarity analysis, visualization, and recommendations.


🚀 Features

  • Semantic Similarity Search: Find the most similar characters using cosine similarity on embeddings.
  • Hybrid Embeddings: Combine SBERT (semantic) and TF-IDF (frequency-based) embeddings for improved results.
  • Dimensionality Reduction Visualizations:
    • 2D & 3D t-SNE: Non-linear reduction highlighting local neighborhoods and clusters.
    • 2D & 3D PCA: Linear reduction emphasizing global variance and main directions.
  • Interactive Plots: Explore embeddings with Plotly 2D/3D scatter plots, hoverable labels, and downloadable plots.
  • Downloadable Results: Export similarity results as CSV for further analysis.

🧠 How It Works

Data Preprocessing

Character names are normalized, and embeddings are precomputed for SBERT and TF-IDF.
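
The source does not say how names are normalized, so the following is only a minimal sketch of what such a step might look like. The function name `normalize_name` and its rules (strip accents, lowercase, drop punctuation, collapse whitespace) are assumptions for illustration.

```python
import re
import unicodedata


def normalize_name(name: str) -> str:
    """Hypothetical normalizer; the app's actual rules may differ."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase, and turn anything that is not a letter, digit, or space into a space.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", stripped.lower())
    # Collapse runs of whitespace and trim both ends.
    return re.sub(r"\s+", " ", cleaned).strip()
```

Normalizing names this way lets the same key look up a character in both the SBERT and the TF-IDF embedding tables.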

Similarity Computation

  • SBERT embeddings capture contextual and semantic relationships.
  • TF-IDF embeddings capture frequency-based textual features.
  • Hybrid similarity is a weighted combination of both embeddings.
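
The README does not state the weights, or whether the combination happens on the vectors or on the similarity scores. As one possible reading, the sketch below blends the two cosine similarities with a hypothetical weight `alpha`; the function names and the default of 0.5 are assumptions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat an all-zero vector as similar to nothing.
    return dot / (norm_a * norm_b)


def hybrid_similarity(
    sbert_a: Sequence[float],
    sbert_b: Sequence[float],
    tfidf_a: Sequence[float],
    tfidf_b: Sequence[float],
    alpha: float = 0.5,  # Assumed weight on the SBERT score, not from the source.
) -> float:
    """Weighted blend of SBERT and TF-IDF cosine similarities."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    semantic = cosine_similarity(sbert_a, sbert_b)
    lexical = cosine_similarity(tfidf_a, tfidf_b)
    return alpha * semantic + (1.0 - alpha) * lexical
```

Blending scores rather than concatenating vectors keeps the two embedding spaces independent, so their different dimensionalities and scales do not need to be reconciled.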

Recommendation

The app computes top-N similar characters based on the chosen embedding model and similarity type.
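
A minimal sketch of the top-N step, assuming the similarity of every character to the query has already been computed. The function name `top_n_similar` and the dictionary layout are hypothetical, and the scores in the usage lines are made up purely for illustration.

```python
import heapq
from typing import Dict, List, Tuple


def top_n_similar(query: str, scores: Dict[str, float], n: int = 5) -> List[Tuple[str, float]]:
    """Return the n highest-scoring characters, excluding the query itself.

    `scores` maps each character name to its similarity with `query`.
    """
    candidates = ((name, score) for name, score in scores.items() if name != query)
    return heapq.nlargest(n, candidates, key=lambda item: item[1])


# Made-up scores, for illustration only.
example_scores = {"jon snow": 1.0, "arya stark": 0.82, "sansa stark": 0.77, "tyrion lannister": 0.40}
print(top_n_similar("jon snow", example_scores, n=2))
```

Excluding the query matters because a character's similarity to itself is always the maximum and would otherwise occupy the first slot.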

Visualization

  • Reduce embeddings to 2D or 3D space with t-SNE or PCA.
  • Interactive plots allow exploration of latent structures and clusters.
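
To make the PCA option concrete, here is a NumPy-only sketch of projecting embeddings onto their leading principal components. The project lists scikit-learn for PCA, so this is not the app's code; `pca_project` is a hypothetical helper that shows the underlying idea.

```python
import numpy as np


def pca_project(embeddings: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project the rows of `embeddings` onto their top principal components."""
    # Center each feature so the components describe variance, not the mean.
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

The returned array has one row per character and `n_components` columns, which is the shape a 2D or 3D scatter plot expects. t-SNE has no comparably short closed form, so it is not sketched here.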

🛠️ Technologies Used

  • Python & Streamlit for the web interface
  • Pandas, NumPy, Joblib for data processing
  • Scikit-learn for TF-IDF vectorization, cosine similarity, t-SNE, and PCA
  • Plotly for interactive 2D/3D visualizations
  • Pillow for image handling
  • SBERT & TF-IDF for semantic and frequency-based embeddings

⚙️ How to Use

  1. Select a character from the sidebar.
  2. Choose an embedding model: SBERT, TF-IDF, or Hybrid.
  3. Adjust the number of similar characters to display.
  4. Click Find Similar Characters to get results with images and similarity scores.
  5. Explore Dimensionality Reduction to visualize embeddings in 2D/3D space.
  6. Download CSV results or plots for offline use.
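
For readers who want to reproduce the CSV export from step 6 outside the app, the sketch below serializes a list of (character, score) pairs using only the standard library. The column names and the four-decimal formatting are assumptions; the app's actual file layout is not documented here.

```python
import csv
import io
from typing import Iterable, Tuple


def results_to_csv(results: Iterable[Tuple[str, float]]) -> str:
    """Serialize (character, similarity) pairs to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["character", "similarity"])  # Assumed header names.
    for name, score in results:
        writer.writerow([name, f"{score:.4f}"])
    return buffer.getvalue()
```

In a Streamlit app, a string like this could be handed to `st.download_button` as its `data` argument; whether this project does so is an assumption.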

📊 Applications

  • Character similarity analysis for literature and fandom studies
  • Semantic search engines
  • Recommendation systems
  • Knowledge graph construction and clustering of entities

🔗 Live Demo

Game of Thrones Character Similarity Explorer


📄 License

MIT License – feel free to use and adapt for research or educational purposes.