Overview

This is a fine-tuned version of the LLaMA-3.3-70B-Instruct model focused on Turkish language processing, using LoRA (Low-Rank Adaptation) for parameter-efficient adaptation to specialized tasks. The model is trained for causal language modeling and excels at text generation, comprehension, and structured reasoning in Turkish.
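LoRA keeps the 70B base weights frozen and trains only small low-rank update matrices, which is what makes fine-tuning at this scale tractable. Below is a minimal sketch of how such an adapter is typically attached with the peft library; the adapter repository id is a hypothetical placeholder, and this particular checkpoint may instead ship with the adapter already merged into the base weights.

from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the frozen base model (sharded across available devices).
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.3-70B-Instruct", device_map="auto"
)

# Attach the low-rank adapter weights on top of the frozen base.
# NOTE: "newmindai/turkish-lora-adapter" is a hypothetical adapter id.
model = PeftModel.from_pretrained(base, "newmindai/turkish-lora-adapter")

# Optionally fold the low-rank updates back into the base weights,
# removing the adapter overhead at inference time.
model = model.merge_and_unload()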

Model Description

  • Model Type: LLaMA (Large Language Model Meta AI)
  • Version: 3.3-70B-Instruct
  • Number of Parameters: 70 Billion
  • Pretraining Task: Causal Language Modeling
  • Training Objective: To fine-tune the base LLaMA model for improved Turkish language understanding with LoRA.
  • Tokenizer: A tokenizer adapted to Turkish, enabling efficient tokenization of the language (see the tokenization sketch after this list).
  • Training Data: A specialized Turkish corpus tailored for tasks such as reasoning, comprehension, and structured output generation.
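A quick way to see how the tokenizer segments Turkish is to inspect the token strings it produces; a minimal sketch, assuming the repository id from the usage example below:

from transformers import AutoTokenizer

# Llama 3 tokenizers are tiktoken-based, so AutoTokenizer is the safe loader.
tokenizer = AutoTokenizer.from_pretrained("newmindai/Llama-3.3-70B-Instruct-Instruct-V3")

text = "Yapay zekâ, doğal dil işlemeyi dönüştürüyor."  # "AI is transforming natural language processing."
print(tokenizer.tokenize(text))     # the subword pieces the model actually sees
print(len(tokenizer.encode(text)))  # an efficient Turkish tokenizer keeps this count low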

Intended Use

This model is designed for tasks requiring Turkish text generation and comprehension, such as:

  • Text Generation: Generating coherent, contextually relevant text in Turkish.
  • Question Answering: Answering questions posed in Turkish, drawing on the model's training for comprehension and structured output generation.
  • Text Summarization: Summarizing complex Turkish texts into concise outputs.
  • Dialogue Systems: Enabling interactive dialogue systems that converse in Turkish (see the chat-formatted sketch after this list).
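Because this is an Instruct model, dialogue prompts should be built with the chat template rather than passed as raw strings. A minimal sketch, assuming the repository id from the usage example below:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "newmindai/Llama-3.3-70B-Instruct-Instruct-V3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "Sen yardımcı bir Türkçe asistansın."},     # "You are a helpful Turkish assistant."
    {"role": "user", "content": "İstanbul hakkında kısa bir paragraf yaz."},  # "Write a short paragraph about Istanbul."
]

# apply_chat_template inserts the Llama 3 role headers the model was tuned on.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))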

Model Parameters

  • Max Position Embeddings: 131,072
  • Number of Attention Heads: 64
  • Number of Hidden Layers: 80
  • Vocab Size: 128,256
  • Pretraining TP (tensor parallelism): 1
  • RoPE Scaling Factors:
    • High Frequency Factor: 4.0
    • Low Frequency Factor: 1.0
  • Tie Word Embeddings: False
  • RMS Norm EPS: 1e-05
  • Training Arguments:
    • Epochs: 3
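The configuration values above come from the model's config.json and can be verified without downloading the full weights; a minimal sketch using AutoConfig:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("newmindai/Llama-3.3-70B-Instruct-Instruct-V3")

print(config.max_position_embeddings)  # 131072
print(config.num_attention_heads)      # 64
print(config.num_hidden_layers)        # 80
print(config.vocab_size)               # 128256
print(config.rope_scaling)             # dict with high_freq_factor / low_freq_factor
print(config.rms_norm_eps)             # 1e-05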

Training Results

  • Epoch: 3.0
  • Total FLOPs: 224,757,703 GFLOPs (≈ 224.76 PFLOPs)
  • Train Loss: 0.0995
  • Train Runtime: 6 days, 4:59:46.09
  • Train Samples per Second: 3.417
  • Train Steps per Second: 0.027
  • Training Loss Figure: training loss curve over the run (image)

Evaluation Results

The model was evaluated on a dataset containing 67,882 examples. The evaluation results are as follows:

  • Eval Loss: 0.108
  • Eval Runtime: 2:39:22.40
  • Eval Samples per Second: 7.099
  • Eval Steps per Second: 0.887
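These throughput numbers are internally consistent: 67,882 examples at 7.099 samples per second works out to roughly 9,562 seconds, which matches the reported runtime. A quick check:

samples = 67_882
samples_per_second = 7.099

seconds = samples / samples_per_second      # ≈ 9562 s
hours, rem = divmod(int(seconds), 3600)
minutes, secs = divmod(rem, 60)
print(f"{hours}:{minutes:02d}:{secs:02d}")  # 2:39:22, matching Eval Runtime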

Usage Example

To use the model for Turkish text generation, load it with the transformers library:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Llama 3 checkpoints use a tiktoken-based tokenizer, so load both the model
# and the tokenizer through the Auto classes rather than LlamaTokenizer.
model_id = "newmindai/Llama-3.3-70B-Instruct-Instruct-V3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # full fp32 weights would need ~280 GB
    device_map="auto",           # requires the accelerate package
)

input_text = "Merhaba, nasılsınız?"  # "Hello, how are you?"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Pass the attention mask along with the input ids and cap new tokens.
outputs = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
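At 70 billion parameters, even the bfloat16 weights occupy roughly 140 GB, so running on a single GPU generally requires quantization. Below is a minimal 4-bit loading sketch with bitsandbytes; this is an assumed workable setup, not a configuration published by the model authors:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization cuts weight memory to roughly a quarter of bf16.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "newmindai/Llama-3.3-70B-Instruct-Instruct-V3",
    quantization_config=quant_config,
    device_map="auto",  # spreads layers across available GPUs and CPU
)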