Resume Matcher β TF-IDF Model
This repository provides the TF-IDF baseline model for resume and job description matching.
It includes pre-trained vectorizers and a pre-computed job matrix, allowing you to run fast keyword-based similarity searches without needing to rebuild the model from scratch.
Repository Contents
- applicant/
job_matrix.npz
β Sparse matrix of job description embeddings.job_vectorizer.pkl
β TF-IDF vectorizer trained on job data.
- recruiter/
combined_vectorizer.pkl
β Shared TF-IDF vectorizer trained on both resumes and job descriptions.
How to Use
Applicant Use Case
An applicant can input a single resume and receive the top-k most relevant job titles.
Example:
import joblib, numpy as np
from sklearn.metrics.pairwise import cosine_similarity
# Load vectorizer and job matrix
vectorizer = joblib.load("recruiter/combined_vectorizer.pkl")
job_matrix = np.load("applicant/job_matrix.npz")["arr_0"]
# Encode a resume
resume_text = "Skilled in Python, data analysis, and machine learning."
resume_vec = vectorizer.transform([resume_text])
# Compute similarities
scores = cosine_similarity(resume_vec, job_matrix)
top_k_indices = scores[0].argsort()[-5:][::-1]
print("Top job indices:", top_k_indices)
Recruiter Use Case
A recruiter can upload a job description along with a batch of resumes. The system will encode all resumes and rank them by similarity to the given job description.
import joblib, numpy as np
from sklearn.metrics.pairwise import cosine_similarity
# Load the shared vectorizer
vectorizer = joblib.load("recruiter/combined_vectorizer.pkl")
# Load a batch of resumes (for demo purposes)
resumes = [
"Experienced software engineer skilled in Java and Spring Boot.",
"Data scientist with Python, TensorFlow, and NLP experience.",
"Frontend developer with React, TypeScript, and UI/UX background."
]
# Encode resumes
resume_matrix = vectorizer.transform(resumes)
# Example job description
job_desc = "Looking for a data scientist with experience in Python and NLP."
job_vec = vectorizer.transform([job_desc])
# Compute similarity between job and all resumes
scores = cosine_similarity(job_vec, resume_matrix).flatten()
# Rank resumes by relevance
ranked_indices = scores.argsort()[::-1]
for idx in ranked_indices:
print(f"Resume: {resumes[idx]} | Score: {scores[idx]:.4f}")
License
This project is distributed under the Apache License 2.0. You are free to use, modify, and integrate the model in commercial or research projects, provided attribution is maintained.
Notes
- Use this repository when you want a lightweight, fast baseline for matching.
- For deeper semantic understanding beyond keywords, see the BERT-based version.
- Downloads last month
- -