Enjoying the Model

by wjbmattingly - opened 7 days ago

7 days ago

Hello, thank you for making this model open-source! Do you have a notebook for fine-tuning by chance? I'm looking to replace a small transformer model for parsing names.

from ollama import chat
from pydantic import BaseModel
from typing import Optional, List


class Person(BaseModel):
    first_name: Optional[str] = None
    last_name: Optional[str] = None
    middle_names: Optional[List[str]] = None
    middle_initials: Optional[List[str]] = None
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    titles: Optional[List[str]] = None
    extra_info: Optional[List[str]] = None


reasoning_trace = """
Problem: Parse the following name into a JSON format according to the following schema.
William White (1748-1836)

"""

response = chat(
  messages=[
    {
        "role": "system",
        "content": f"You are a helpful assistant that understands and translates text to JSON format according to the following schema. {Person.model_json_schema()}"
    },
    {
      'role': 'user',
      'content': reasoning_trace,
    }
  ],
  model='Osmosis/Osmosis-Structure-0.6B',
  format=Person.model_json_schema(),
)

answer = Person.model_validate_json(response.message.content)
print(answer)

The responses from the model are a bit all over the place (understandably so, as this is probably not what it was trained on). I'm hoping to fine-tune it, though, and end up with a smaller, better model for this and other similar tasks.

Sifal

7 days ago

Seems like the model prefers longer outputs. In the few experiments I ran, it also seems to hallucinate some information, even though the snippet is short.

from ollama import chat
from pydantic import BaseModel
from typing import Optional, List


class Person(BaseModel):
    first_name: Optional[str] = None
    last_name: Optional[str] = None
    middle_names: Optional[List[str]] = None
    middle_initials: Optional[List[str]] = None
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    titles: Optional[List[str]] = None
    extra_info: Optional[List[str]] = None


user_task = """
A person is named William Henry White he was born in year 1748 and lived to be a great doctor.
He was a very tall man, he died in year 1823 from drinking too much water
"""

response = chat(
  messages=[
    {
        "role": "system",
        "content": f"You are a helpful assistant that understands and translates text to JSON format according to the following schema. {Person.model_json_schema()}"
    },
    {
      'role': 'user',
      'content': user_task,
    }
  ],
  model='Osmosis/Osmosis-Structure-0.6B',
  format=Person.model_json_schema(),
)

answer = Person.model_validate_json(response.message.content)
print(answer)
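
One rough way to spot the hallucinated fields mentioned above is to check whether each extracted value appears verbatim in the input text. This is only a sketch: `ungrounded_fields` is a hypothetical helper, the `parsed` dictionary is made-up output rather than an actual model response, and a plain substring match will miss legitimate values that the model normalized or abbreviated.

```python
def ungrounded_fields(parsed, source_text):
    """Return the names of fields whose values do not appear verbatim in source_text."""
    haystack = source_text.lower()
    missing = []
    for field, value in parsed.items():
        if value is None:
            continue  # unset optional fields cannot be hallucinated
        values = value if isinstance(value, list) else [value]
        # Flag the field if any of its values is absent from the input text.
        if any(str(v).lower() not in haystack for v in values):
            missing.append(field)
    return missing


source_text = """
A person is named William Henry White he was born in year 1748 and lived to be a great doctor.
He was a very tall man, he died in year 1823 from drinking too much water
"""

# Made-up parse for illustration; "Professor" does not occur in the input.
parsed = {
    "first_name": "William",
    "middle_names": ["Henry"],
    "last_name": "White",
    "birth_year": 1748,
    "death_year": 1823,
    "titles": ["Professor"],
}

flagged = ungrounded_fields(parsed, source_text)
print(flagged)
```

With the snippet above, the same check could be run as `ungrounded_fields(answer.model_dump(), user_task)`.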

wjbmattingly

7 days ago

This is a fascinating observation! Thank you, @Sifal

wjbmattingly

7 days ago

For those who come here: few-shot prompting helps the model a lot.
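
A minimal sketch of what a few-shot setup could look like with the chat message format used earlier in this thread. The example names, the `FEW_SHOT_EXAMPLES` list, and the `build_few_shot_messages` helper are all hypothetical illustrations, not the exact prompt that was used.

```python
import json

# Hypothetical few-shot pairs: (input text, expected parse). Illustrative only.
FEW_SHOT_EXAMPLES = [
    (
        "John Quincy Adams (1767-1848)",
        {
            "first_name": "John",
            "middle_names": ["Quincy"],
            "last_name": "Adams",
            "birth_year": 1767,
            "death_year": 1848,
        },
    ),
    (
        "Dr. Jane Doe",
        {"first_name": "Jane", "last_name": "Doe", "titles": ["Dr."]},
    ),
]


def build_few_shot_messages(system_prompt, examples, query):
    """Return a message list: system prompt, example turns, then the real query."""
    messages = [{"role": "system", "content": system_prompt}]
    for text, parsed in examples:
        # Each example is a user turn followed by the assistant's ideal JSON answer.
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": json.dumps(parsed)})
    messages.append({"role": "user", "content": query})
    return messages


messages = build_few_shot_messages(
    "You are a helpful assistant that translates text to JSON according to the given schema.",
    FEW_SHOT_EXAMPLES,
    "William White (1748-1836)",
)
```

The resulting `messages` list can be passed to `chat(messages=messages, model=..., format=Person.model_json_schema())` in place of the two-message list in the snippets above.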
