Resume Matcher – TF-IDF Model

This repository provides the TF-IDF baseline model for resume and job description matching.
It includes pre-trained vectorizers and a pre-computed job matrix, allowing you to run fast keyword-based similarity searches without needing to rebuild the model from scratch.


Repository Contents

  • applicant/
    • job_matrix.npz – Sparse matrix of job description embeddings.
    • job_vectorizer.pkl – TF-IDF vectorizer trained on job data.
  • recruiter/
    • combined_vectorizer.pkl – Shared TF-IDF vectorizer trained on both resumes and job descriptions.

How to Use

Applicant Use Case

An applicant can input a single resume and receive the top-k most relevant job titles.
Example:

import joblib, numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Load vectorizer and job matrix
vectorizer = joblib.load("recruiter/combined_vectorizer.pkl")
job_matrix = np.load("applicant/job_matrix.npz")["arr_0"]

# Encode a resume
resume_text = "Skilled in Python, data analysis, and machine learning."
resume_vec = vectorizer.transform([resume_text])

# Compute similarities
scores = cosine_similarity(resume_vec, job_matrix)
top_k_indices = scores[0].argsort()[-5:][::-1]
print("Top job indices:", top_k_indices)

Recruiter Use Case

A recruiter can upload a job description along with a batch of resumes. The system will encode all resumes and rank them by similarity to the given job description.

import joblib, numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Load the shared vectorizer
vectorizer = joblib.load("recruiter/combined_vectorizer.pkl")

# Load a batch of resumes (for demo purposes)
resumes = [
    "Experienced software engineer skilled in Java and Spring Boot.",
    "Data scientist with Python, TensorFlow, and NLP experience.",
    "Frontend developer with React, TypeScript, and UI/UX background."
]

# Encode resumes
resume_matrix = vectorizer.transform(resumes)

# Example job description
job_desc = "Looking for a data scientist with experience in Python and NLP."
job_vec = vectorizer.transform([job_desc])

# Compute similarity between job and all resumes
scores = cosine_similarity(job_vec, resume_matrix).flatten()

# Rank resumes by relevance
ranked_indices = scores.argsort()[::-1]
for idx in ranked_indices:
    print(f"Resume: {resumes[idx]} | Score: {scores[idx]:.4f}")

License

This project is distributed under the Apache License 2.0. You are free to use, modify, and integrate the model in commercial or research projects, provided attribution is maintained.


Notes

  • Use this repository when you want a lightweight, fast baseline for matching.
  • For deeper semantic understanding beyond keywords, see the BERT-based version.
Downloads last month
-
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Spaces using Om-Shandilya/resume-matcher-tfidf 2