
metadata

base_model: meta-llama/Llama-3.2-1B-Instruct
datasets:
  - fineinstructions/template_instantiator_training
tags:
  - datadreamer
  - datadreamer-0.46.0
  - synthetic
  - text-generation
pipeline_tag: text-generation

This model takes an instruction template in the FineTemplates format and a document, and returns an instantiated instruction and answer pair.

The output will be a JSON object.

Simple Usage Example

import json
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('fineinstructions/template_instantiator', revision=None)
tokenizer.padding_side = 'left'
model = AutoModelForCausalLM.from_pretrained('fineinstructions/template_instantiator', revision=None)
pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, pad_token_id=tokenizer.pad_token_id, return_full_text=False)

# Run inference to instantiate the instruction template and generate an answer
inputs = [json.dumps({
  "instruction_template": "...",
  "document": "..."
}, indent=2)]
prompts = [tokenizer.apply_chat_template([{'role': 'user', 'content': i}], tokenize=False, add_generation_prompt=True) for i in inputs]
generations = pipe(prompts, max_length=131072, truncation=True, temperature=None, top_p=None, do_sample=False)
output = generations[0][0]['generated_text']
print(output)

##### Output:
# {
# ..
# }
#
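
Since the output is a JSON object returned as text, it can be convenient to parse it before using it downstream. The following is a minimal sketch, not part of the model's API: `parse_model_output` is a hypothetical helper, and the sample string is a made-up placeholder rather than the model's real output schema. It assumes the generated text may occasionally carry extra characters around the JSON object and falls back to the outermost `{...}` span in that case.

```python
import json


def parse_model_output(text):
    """Parse generated text into a dict; return None if no JSON object is found.

    Hypothetical helper for illustration; the model card does not define it.
    """
    text = text.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span in case of surrounding text.
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            return None
        try:
            parsed = json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            return None
    # The model card says the output is a JSON object, so reject other types.
    return parsed if isinstance(parsed, dict) else None


# Placeholder input only; the real keys are whatever the model emits.
sample = '{"key": "value"}'
result = parse_model_output(sample)
print(result)
```

With the usage example above, the same helper could be called as `parse_model_output(output)`, checking for `None` before reading any fields.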

This model was trained on a synthetic dataset using DataDreamer 🤖💤. The synthetic dataset card and model card can be found here. The training arguments can be found here.