Resume Matcher – TF-IDF Model

This repository provides the TF-IDF baseline model for resume and job description matching.
It includes pre-trained vectorizers and a pre-computed job matrix, allowing you to run fast keyword-based similarity searches without needing to rebuild the model from scratch.

Repository Contents

applicant/
- job_matrix.npz – Sparse matrix of TF-IDF vectors, one row per job description.
- job_vectorizer.pkl – TF-IDF vectorizer trained on job data.
recruiter/
- combined_vectorizer.pkl – Shared TF-IDF vectorizer trained on both resumes and job descriptions.
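
For orientation, here is a minimal sketch of how artifacts of this kind could be produced and reloaded. It is not the script used to build this repository: the two job descriptions are hypothetical, and it assumes the matrix is stored with scipy.sparse.save_npz and the vectorizer with joblib.dump.

```python
import os
import tempfile

import joblib
from scipy.sparse import load_npz, save_npz
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical job descriptions; the real training data is not part of this repository.
job_texts = [
    "Data scientist with Python and NLP experience",
    "Backend engineer working with Java and Spring Boot",
]

# Fit the vectorizer and encode the jobs in one step (SciPy sparse matrix, one row per job).
vectorizer = TfidfVectorizer()
job_matrix = vectorizer.fit_transform(job_texts)

with tempfile.TemporaryDirectory() as tmp:
    vec_path = os.path.join(tmp, "job_vectorizer.pkl")
    mat_path = os.path.join(tmp, "job_matrix.npz")

    # Persist both artifacts (assumed storage format, see the note above).
    joblib.dump(vectorizer, vec_path)
    save_npz(mat_path, job_matrix)

    # Reload them the way a consumer of the repository would.
    loaded_vectorizer = joblib.load(vec_path)
    loaded_matrix = load_npz(mat_path)

# The matrix width must equal the vectorizer's vocabulary size.
n_features = len(loaded_vectorizer.vocabulary_)
print(loaded_matrix.shape, n_features)
```

A matrix and a vectorizer only work together when they share the same vocabulary, which is why the sketch compares the matrix width against the vocabulary size after reloading.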

How to Use

Applicant Use Case

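An applicant can input a single resume and retrieve the top-k most similar job descriptions.
The example prints the row indices of the best matches in the job matrix; mapping them to job titles requires the job listing the matrix was built from.
Example: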

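import joblib
from scipy.sparse import load_npz
from sklearn.metrics.pairwise import cosine_similarity

# Load the job vectorizer and the pre-computed job matrix stored alongside it.
# The vectorizer must be the one the matrix was built with, or the feature
# dimensions will not match.
vectorizer = joblib.load("applicant/job_vectorizer.pkl")
# Assumes the sparse matrix was saved with scipy.sparse.save_npz
job_matrix = load_npz("applicant/job_matrix.npz")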

# Encode a resume
resume_text = "Skilled in Python, data analysis, and machine learning."
resume_vec = vectorizer.transform([resume_text])

# Compute similarities
scores = cosine_similarity(resume_vec, job_matrix)
top_k_indices = scores[0].argsort()[-5:][::-1]
print("Top job indices:", top_k_indices)
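
The indices printed above are only positions in the job matrix. The following is a minimal sketch of turning them into readable results, assuming a hypothetical job_titles list aligned row-for-row with the matrix (no such file is listed in this repository). The toy matrix and the helper name top_k_jobs are illustrative stand-ins, not part of the published artifacts.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.metrics.pairwise import cosine_similarity


def top_k_jobs(resume_vec, job_matrix, job_titles, k=5):
    """Return (title, score) pairs for the k most similar job rows."""
    # The resume must be encoded by the same vectorizer as the job matrix.
    if resume_vec.shape[1] != job_matrix.shape[1]:
        raise ValueError(
            f"feature mismatch: {resume_vec.shape[1]} vs {job_matrix.shape[1]}"
        )
    scores = cosine_similarity(resume_vec, job_matrix)[0]
    k = min(k, len(scores))
    top = np.argsort(scores)[::-1][:k]  # highest score first
    return [(job_titles[i], float(scores[i])) for i in top]


# Toy stand-ins for the real artifacts: 3 jobs, 4 features.
job_matrix = csr_matrix(np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]))
job_titles = ["Java Developer", "Data Scientist", "ML Engineer"]  # hypothetical
resume_vec = csr_matrix(np.array([[0.0, 1.0, 1.0, 0.0]]))

results = top_k_jobs(resume_vec, job_matrix, job_titles, k=2)
for title, score in results:
    print(f"{title}: {score:.4f}")
```

The shape check fails early with a clear message when a resume is encoded with a vectorizer other than the one the matrix was built with.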

Recruiter Use Case

A recruiter can upload a job description along with a batch of resumes. The system will encode all resumes and rank them by similarity to the given job description.

import joblib
from sklearn.metrics.pairwise import cosine_similarity

# Load the shared vectorizer
vectorizer = joblib.load("recruiter/combined_vectorizer.pkl")

# Load a batch of resumes (for demo purposes)
resumes = [
    "Experienced software engineer skilled in Java and Spring Boot.",
    "Data scientist with Python, TensorFlow, and NLP experience.",
    "Frontend developer with React, TypeScript, and UI/UX background."
]

# Encode resumes
resume_matrix = vectorizer.transform(resumes)

# Example job description
job_desc = "Looking for a data scientist with experience in Python and NLP."
job_vec = vectorizer.transform([job_desc])

# Compute similarity between job and all resumes
scores = cosine_similarity(job_vec, resume_matrix).flatten()

# Rank resumes by relevance
ranked_indices = scores.argsort()[::-1]
for idx in ranked_indices:
    print(f"Resume: {resumes[idx]} | Score: {scores[idx]:.4f}")
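
The ranking steps above can be wrapped in a small reusable function. This is a sketch only: rank_resumes and its min_score parameter are hypothetical names, and a TfidfVectorizer fitted on the three demo resumes stands in for recruiter/combined_vectorizer.pkl so the block runs without the repository files.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def rank_resumes(job_desc, resumes, vectorizer, min_score=0.0):
    """Return (resume, score) pairs sorted by descending similarity."""
    resume_matrix = vectorizer.transform(resumes)
    job_vec = vectorizer.transform([job_desc])
    scores = cosine_similarity(job_vec, resume_matrix).flatten()
    order = scores.argsort()[::-1]  # best match first
    # Drop resumes whose score falls below the requested threshold.
    return [(resumes[i], float(scores[i])) for i in order if scores[i] >= min_score]


resumes = [
    "Experienced software engineer skilled in Java and Spring Boot.",
    "Data scientist with Python, TensorFlow, and NLP experience.",
    "Frontend developer with React, TypeScript, and UI/UX background.",
]

# Stand-in for joblib.load("recruiter/combined_vectorizer.pkl"),
# fitted on the demo texts only.
vectorizer = TfidfVectorizer().fit(resumes)

job_desc = "Looking for a data scientist with experience in Python and NLP."
ranked = rank_resumes(job_desc, resumes, vectorizer)
for resume, score in ranked:
    print(f"{score:.4f} | {resume}")
```

With the real shared vectorizer loaded in place of the stand-in, the same function applies unchanged; raising min_score filters out resumes that share little vocabulary with the job description.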

License

This project is distributed under the Apache License 2.0. You are free to use, modify, and integrate the model in commercial or research projects, provided attribution is maintained.

Notes

Use this repository when you want a lightweight, fast baseline for matching.
For deeper semantic understanding beyond keywords, see the BERT-based version.
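
To illustrate the keyword limitation, here is a small sketch with made-up texts and a freshly fitted TfidfVectorizer (not the vectorizers shipped in this repository). Two texts that describe similar work in different words share no terms, so their cosine similarity is zero.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

texts = [
    "software engineer building backend services",     # resume
    "programmer developing server side applications",  # similar role, no shared words
    "software engineer for backend services",          # similar role, shared words
]

matrix = TfidfVectorizer().fit_transform(texts)

# Similarity of the resume to each of the two job texts.
no_overlap = cosine_similarity(matrix[0], matrix[1])[0, 0]
overlap = cosine_similarity(matrix[0], matrix[2])[0, 0]

print(f"no shared words: {no_overlap:.4f}")
print(f"shared words:    {overlap:.4f}")
```

TF-IDF scores depend entirely on overlapping vocabulary, which is the gap an embedding-based model is meant to close.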

Om-Shandilya/resume-matcher-tfidf