AI & ML interests

None defined yet.

Recent Activity

leonardlin  updated a dataset about 2 months ago
augmxnt/ultra-orca-boros-en-ja-v1
leonardlin  updated a model 3 months ago
augmxnt/shisa-gamma-7b-v1
leonardlin  updated a dataset 12 months ago
augmxnt/deccp

augmxnt's activity

leonardlin posted an update 2 days ago
I'm excited to announce the official release of our Shisa V2 405B model:
shisa-ai/shisa-v2-llama3.1-405b

It's the strongest model ever trained in Japan, and it even goes toe-to-toe with GPT-4o and DeepSeek-V3 on JA MT-Bench.

For all the details, be sure to check out the post and overview report here: https://shisa.ai/posts/shisa-v2-405b/
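
For anyone who wants to run it themselves, here's a minimal loading sketch using transformers (the repo id is from the post; the dtype/device settings are assumptions, and a 405B model needs a multi-GPU node to fit):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shisa-ai/shisa-v2-llama3.1-405b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" (via accelerate) shards the weights across all visible GPUs;
# at 405B parameters this still requires a multi-GPU node.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Please introduce yourself."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))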
leonardlin posted an update 13 days ago
BTW, in case anyone wants to kick the tires and test their Japanese, I have our Shisa V2 405B model up and running temporarily: https://chat.shisa.ai/
leonardlin posted an update about 2 months ago
Happy to announce the release of Shisa V2, the latest generation of our bilingual Japanese-English language models. After hundreds of ablations and months of work, we're releasing some of the strongest open Japanese models at 7B, 8B, 12B, 14B, 32B, and 70B! Full announcement here: https://shisa.ai/posts/shisa-v2/, or visit the Shisa V2 HF collection: shisa-ai/shisa-v2-67fc98ecaf940ad6c49f5689
leonardlin posted an update 12 months ago
My weekend project ended up being some testing between torchtune, axolotl, and unsloth. I *think* it's a 1:1 comparison of what LoRA fine-tuning performance looks like across the different hardware I have in my dev boxes (4090, 3090, 7900 XTX, W7900), with a few other interesting tidbits.

Tonight I wrote up a WandB report (the panel editor is super broken in Firefox 😔) that sums up some of the more interesting bits from the results: https://wandb.ai/augmxnt/train-bench/reports/torchtune-vs-axolotl-vs-unsloth-Trainer-Comparison--Vmlldzo4MzU3NTAx
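
For reference, here's a minimal sketch of the kind of LoRA setup being benchmarked, assuming the Hugging Face peft library; the base model, target modules, and hyperparameters are illustrative, not the exact configs from the report:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Illustrative hyperparameters; the real benchmark configs are in the WandB report.
lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")  # hypothetical base model
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable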
leonardlin posted an update about 1 year ago
Interesting, I've just seen my first HF spam on one of my new model uploads: shisa-ai/shisa-v1-llama3-70b - someone has an SEO spam page as an HF Space attached to the model!?! Wild. Who do I report this to?
leonardlin posted an update about 1 year ago
For those with an interest in JA language models, this Llama 3 70B test ablation looks like the strongest publicly released, commercially usable, open model currently available. A lot of caveats, I know, but it also matches gpt-3.5-turbo-0125's JA performance, which is worth noting, and it is tuned *exclusively* with the old shisa-v1 dataset (so its chart position will be very short-lived).

shisa-ai/shisa-v1-llama3-70b

augmxnt/ultra-orca-boros-en-ja-v1
leonardlin posted an update about 1 year ago
llm-jp-eval is currently one of the most widely used benchmarks for Japanese LLMs and makes up half of WandB's comprehensive Nejumi LLM Leaderboard scoring. I was seeing some weirdness in the results I was getting and ended up down a bit of a rabbit hole. Here's my article on evaling llm-jp-eval: https://huggingface.co/blog/leonardlin/llm-jp-eval-eval

I've set up a fork of Lightblue's Shaberi testing framework, which uses LLM-as-a-Judge style benchmarks, as something probably more representative of real-world LLM strength in Japanese. Here's how the new base model ablations are looking: [results chart]
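
For context, here's a minimal sketch of the LLM-as-a-Judge pattern that Shaberi-style frameworks use, assuming an OpenAI-compatible judge endpoint; the judge model, prompt, and 1-10 scale are illustrative, not Shaberi's exact rubric:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge_answer(question: str, answer: str) -> int:
    """Ask a strong judge model to score a candidate answer on a 1-10 scale."""
    prompt = (
        "Rate the following Japanese answer for correctness, fluency, and "
        "helpfulness on a scale of 1-10. Reply with only the number.\n\n"
        f"Question: {question}\nAnswer: {answer}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o",  # hypothetical judge model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return int(resp.choices[0].message.content.strip())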
leonardlin posted an update about 1 year ago
I've been doing some evals and tuning, and this chat template repo maintained by @chujiezheng is great: https://github.com/chujiezheng/chat_templates

Here's also a simple script for checking what the output looks like:
from transformers import AutoTokenizer

# The tokenizer carries the model's chat template (a Jinja string).
tokenizer = AutoTokenizer.from_pretrained("augmxnt/shisa-7b-v1")
messages = [
    {'role': 'user', 'content': 'This is the first user input.'},
    {'role': 'assistant', 'content': 'This is the first assistant response.'},
    {'role': 'user', 'content': 'This is the second user input.'},
]

# Print the raw Jinja template...
print()
print('Chat Template:')
print(tokenizer.chat_template)
print()
print('---')
print()

# ...then the fully formatted conversation (tokenize=False keeps it human-readable).
print(tokenizer.apply_chat_template(messages, tokenize=False))
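
Swap in any other repo id to compare how different models wrap the same conversation; since the output is the untokenized prompt string, mismatched role markers or special tokens are immediately visible.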
leonardlin updated a Space over 1 year ago