Stockmark-2-100B-Instruct-beta is a 100-billion-parameter large language model built from scratch, with a particular focus on Japanese. It was pre-trained on approximately 1.5 trillion tokens of data, consisting of 60% English, 30% Japanese, and 10% code. Following pretraining, the model underwent post-training with synthetic data in Japanese to enhance its ability to follow instructions. This synthetic data was generated using Qwen2.5-32B-Instruct.
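As a rough illustration of what that mix implies, the approximate per-source token counts can be estimated directly from the stated percentages (a back-of-envelope sketch only; the exact corpus composition is not detailed here):

total_tokens = 1.5e12  # ~1.5 trillion pretraining tokens
mix = {"English": 0.60, "Japanese": 0.30, "code": 0.10}
for source, share in mix.items():
    # English: ~0.90T, Japanese: ~0.45T, code: ~0.15T
    print(f"{source}: ~{total_tokens * share / 1e12:.2f}T tokens")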
As a beta release, Stockmark-2-100B-Instruct-beta is still undergoing improvement and evaluation. Feedback and insights from users will help refine future versions.
See our blog for details.
This project is supported by GENIAC.
The following example shows how to run chat-style inference with the Hugging Face transformers library:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the model (bfloat16 weights, sharded across available GPUs).
tokenizer = AutoTokenizer.from_pretrained("stockmark/Stockmark-2-100B-Instruct-beta")
model = AutoModelForCausalLM.from_pretrained(
    "stockmark/Stockmark-2-100B-Instruct-beta", device_map="auto", torch_dtype=torch.bfloat16
)

# Build a chat-formatted prompt from a single user instruction.
instruction = "自然言語処理とは?"  # "What is natural language processing?"
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": instruction}], add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a response with sampling.
with torch.inference_mode():
    tokens = model.generate(
        input_ids,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        repetition_penalty=1.05,
    )

output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
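For higher-throughput inference, the same checkpoint could in principle be served with vLLM. The sketch below is an illustrative example, not part of the official instructions; it assumes vLLM supports this model's architecture, and the tensor_parallel_size value is a placeholder to be matched to your GPU count.

from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stockmark/Stockmark-2-100B-Instruct-beta")
llm = LLM(model="stockmark/Stockmark-2-100B-Instruct-beta", tensor_parallel_size=8)

# Build the chat-formatted prompt as plain text, then sample with the same settings as above.
instruction = "自然言語処理とは?"  # "What is natural language processing?"
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": instruction}], add_generation_prompt=True, tokenize=False
)
params = SamplingParams(temperature=0.7, top_p=0.95, repetition_penalty=1.05, max_tokens=512)
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)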
Author: Takahiro Omi