
Megatron-Corpus-14B-Exp

Megatron-Corpus-14B-Exp is based on the Qwen 2.5 14B architecture and is designed to strengthen the reasoning capabilities of 14B-parameter models. It has been fine-tuned on a synthetic dataset derived from a math corpus to improve its chain-of-thought (CoT) reasoning and logical problem-solving. The model shows improvements in context understanding, structured data processing, and long-context comprehension, which makes it well suited to complex reasoning tasks, instruction following, and text generation.

Key Improvements

  1. Advanced Reasoning & Logic: Optimized for multi-step problem-solving, logical deduction, and contextual analysis.
  2. Fine-Tuned Instruction Following: Generates precise responses, structured outputs (e.g., JSON), and long-form text (up to 8K tokens).
  3. Greater Adaptability: Excels in role-playing, multi-turn dialogues, and diverse system prompts.
  4. Long-Context Support: Handles up to 128K tokens and generates up to 8K tokens per output.
  5. Multilingual Proficiency: Supports over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, and more.
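
The context and output limits listed above can be turned into a simple budget check before calling generate. The following is a minimal sketch, not part of the model's API: the helper name max_new_tokens_for is hypothetical, and the constants assume that "128K" and "8K" mean 128 * 1024 and 8 * 1024 tokens, which should be checked against the model's configuration.

```python
# Assumed limits, taken from the figures quoted in this card.
CONTEXT_WINDOW = 128 * 1024  # "128K" total tokens (assumption: 1K = 1024)
MAX_OUTPUT = 8 * 1024        # "8K" tokens per generation (assumption: 1K = 1024)


def max_new_tokens_for(prompt_tokens: int,
                       requested: int = MAX_OUTPUT,
                       context_window: int = CONTEXT_WINDOW,
                       max_output: int = MAX_OUTPUT) -> int:
    """Return a max_new_tokens value that respects both limits (hypothetical helper)."""
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = context_window - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    # The tightest of: what was asked for, the output cap, and the room left.
    return min(requested, max_output, remaining)


print(max_new_tokens_for(1_000))       # 8192
print(max_new_tokens_for(130_000))     # 1072
print(max_new_tokens_for(1_000, 512))  # 512
```

The prompt length can be obtained from the tokenized input (for example, the length of the input_ids row) and the result passed as max_new_tokens.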

Quickstart with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Megatron-Corpus-14B-Exp"

# Load the weights, letting Transformers choose the dtype and device placement.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a chat-style prompt with a system message and a user message.
prompt = "Explain the concept of logical reasoning in AI."
messages = [
    {"role": "system", "content": "You are an expert AI assistant specialized in reasoning and logic."},
    {"role": "user", "content": prompt}
]
# Render the messages with the model's chat template, ending with the assistant turn marker.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so that only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

# Decode the first (and only) sequence in the batch.
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

Intended Use

  • Advanced Logical & Analytical Reasoning: Designed for problem-solving, multi-step deductions, and cognitive reasoning tasks.
  • Mathematical & Scientific Computation: Supports theorem proving, complex calculations, and scientific knowledge retrieval.
  • Code Generation & Debugging: Generates optimized code, detects errors, and improves programming workflows.
  • Structured Data Analysis: Processes tables, JSON, and structured formats for data-centric applications.
  • Multilingual Reasoning & Translation: High proficiency across 29+ languages for international applications.
  • Extended Text Generation: Capable of generating research papers, instructional guides, and in-depth reports.
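
For the structured-data use case above, a model reply often wraps JSON in surrounding prose, so the caller usually needs to pull the JSON out before using it. The sketch below shows one way to do that with the standard library only. The function name extract_first_json and the sample reply string are illustrative assumptions, not output from this model.

```python
import json
from typing import Any, Optional


def extract_first_json(text: str) -> Optional[Any]:
    """Return the first JSON object or array embedded in text, or None."""
    decoder = json.JSONDecoder()
    for start, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this position; keep scanning
        return value
    return None


# Hypothetical reply text, used only to exercise the helper.
reply = 'Here is the result:\n{"answer": 42, "steps": ["a", "b"]}\nDone.'
print(extract_first_json(reply))
```

Returning None rather than raising keeps the caller in control of how to handle a reply that contains no valid JSON, for example by re-prompting.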

Limitations

  1. High Computational Requirements: Due to its 14B parameters and 128K context support, it requires powerful GPUs or TPUs for efficient inference.
  2. Language-Specific Variability: Performance may differ across supported languages, especially for low-resource languages.
  3. Potential Error Accumulation: Long-form text generation can introduce inconsistencies over extended outputs.
  4. Limited Real-World Awareness: Knowledge is restricted to training data and may not reflect recent world events.
  5. Prompt Sensitivity: The quality of responses depends on the specificity and clarity of the input prompt.
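
To put the computational requirements above into numbers, the memory needed just to hold the weights can be estimated as parameter count times bytes per parameter. This is a rough sketch under stated assumptions: it uses the nominal 14e9 parameter count implied by the model name, and it covers weights only. The KV cache and activations add to this and grow with context length.

```python
PARAMS = 14e9  # nominal "14B" parameter count (assumption; the exact count may differ)

# Storage size per parameter for common weight formats.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gib(params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gib(PARAMS, dtype):.1f} GiB")
```

Under these assumptions, bfloat16 weights come to roughly 26 GiB, which is more than a single 24 GB GPU can hold without quantization or offloading.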