I'm attempting to use a 7B LLM, Llama in this case, with an embedding head stuck on the end instead of the lm_head. I used an LLM to rank a ton of randomly selected pairs of papers based on whether they have good connections, then trained the embedding head on triplets mined from those ranked pairs.
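Roughly, the setup looks something like the sketch below. This is simplified and the specifics (model name, last-token pooling, projection size, triplet margin) are just placeholders for what I'm actually running, not the exact code:

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class LlamaEmbedder(nn.Module):
    """Llama backbone with a small projection head in place of the lm_head."""
    def __init__(self, base_model="meta-llama/Llama-2-7b-hf", embed_dim=1024):
        super().__init__()
        # AutoModel loads the bare transformer (no lm_head)
        self.backbone = AutoModel.from_pretrained(base_model, torch_dtype=torch.bfloat16)
        hidden = self.backbone.config.hidden_size  # 4096 for the 7B model
        self.head = nn.Linear(hidden, embed_dim, dtype=torch.bfloat16)

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        # pool with the hidden state of the last non-padding token (assumes right padding)
        last_idx = attention_mask.sum(dim=1) - 1
        pooled = out.last_hidden_state[torch.arange(input_ids.size(0)), last_idx]
        return nn.functional.normalize(self.head(pooled), dim=-1)

# triplet loss over (anchor, positive, negative) abstracts mined from the ranked pairs
loss_fn = nn.TripletMarginLoss(margin=0.3)
```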
The idea is for the embedding head to learn to align features from paper abstracts that complement each other.
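At inference it's basically just cosine similarity between the learned embeddings of two abstracts, something like this (uses the class from the sketch above; `abstract_a` / `abstract_b` are placeholder strings):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tok.pad_token = tok.eos_token   # Llama has no pad token by default
tok.padding_side = "right"      # pooling above assumes right padding
model = LlamaEmbedder().eval()

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        return model(batch["input_ids"], batch["attention_mask"])

a, b = embed([abstract_a, abstract_b])
score = (a @ b).item()  # embeddings are normalized, so this is cosine similarity
```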
This is the first version and yeah, I'm not overly impressed. I'm seeing results that sometimes vibe with the concept, but I think the ranking criteria for the dataset were a bit loose. I'm going to build a new dataset with stricter, more nuanced criteria and train a second version of the model on that.