SheikhCoder v1.3b 🕌

A culturally-aware code completion model built on top of Microsoft's Phi-2, fine-tuned with Bengali tech content and MDX-based cultural intelligence.

Model Description

SheikhCoder is a specialized code completion model that combines the efficiency of Phi-2 with cultural awareness, particularly for Bengali developers. It supports both English and Bengali inputs, and provides contextually appropriate code suggestions.

Key Features

  • 🧠 2.7B parameters (Phi-2 base)
  • 📏 2048 token context window (see the truncation sketch after this list)
  • 🎨 MDX-native cultural intelligence
  • 🔍 Bengali language support
  • ⚡ 4-bit quantization support
  • 🚀 Optimized for VS Code/Codespaces
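
Prompts longer than the 2048-token window need to be truncated before generation. The snippet below is a minimal sketch of doing this on the tokenizer side (the repeated long_prompt string is only a stand-in for a real source file):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("likhonsheikh/sheikh-coder-v1-3b")

# Stand-in for a long prompt, e.g. an entire source file.
long_prompt = "def fibonacci(n):\n    # Return the n-th Fibonacci number\n" * 300

# Truncate from the left so the most recent code stays visible to the model,
# and cap the input at the 2048-token context window.
tokenizer.truncation_side = "left"
inputs = tokenizer(long_prompt, return_tensors="pt", truncation=True, max_length=2048)
print(inputs["input_ids"].shape)  # at most 2048 tokens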

Use Cases

  1. Code Completion with Cultural Context
  2. Technical Documentation in Bengali
  3. Culturally-Aware Code Comments
  4. MDX-Based Documentation Generation

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model
model = AutoModelForCausalLM.from_pretrained("likhonsheikh/sheikh-coder-v1-3b", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("likhonsheikh/sheikh-coder-v1-3b")

# Example usage
code = """
def calculate_zakat(amount):
    # Calculate Islamic Zakat (2.5% of wealth)
"""

inputs = tokenizer(code, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
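
Bengali prompts work the same way; per the limitations below, Bengali is best used in comments and documentation rather than identifiers. The following sketch reuses the model and tokenizer loaded above; the function name and Bengali comment are purely illustrative:

# Completion from a Bengali comment (reuses `model` and `tokenizer` from above)
bengali_code = """
def jogfol(numbers):
    # তালিকার সংখ্যাগুলোর যোগফল হিসাব করুন (sum the numbers in the list)
"""

inputs = tokenizer(bengali_code, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))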

Model Details

  • Base Model: Microsoft Phi-2
  • Training Data: Stack Dedup v1.2 + Bengali Tech Content
  • Parameters: 2.7B
  • Context Length: 2048 tokens
  • License: MIT (following Phi-2)
  • Limitations: See section below

Performance and Limitations

  • Best suited for code completion and documentation tasks
  • May require fine-tuning for specific domains
  • Bengali support is primarily for comments and documentation
  • Resource requirements:
    • RAM: 8GB minimum
    • GPU: Optional, but recommended for faster inference (see the device-selection sketch after this list)
    • Disk: ~5GB
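
A GPU is optional. The sketch below picks CUDA when available and falls back to CPU, using float16 on GPU to reduce memory; these dtype choices are illustrative defaults, not tested settings:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use a GPU if one is available, otherwise run on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(
    "likhonsheikh/sheikh-coder-v1-3b",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    trust_remote_code=True,
).to(device)
tokenizer = AutoTokenizer.from_pretrained("likhonsheikh/sheikh-coder-v1-3b")

inputs = tokenizer("def hello_world():", return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))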

Benchmarks (self-reported)

Code Completion (Python):
- Accuracy: 85%
- Cultural Context Score: 90%
- Response Time: <100ms

Documentation Generation:
- BLEU Score: 0.75
- Cultural Relevance: 0.85

Installation

# With pip
pip install torch transformers

# Optional: for 4-bit quantization
pip install bitsandbytes
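
To actually load the model in 4-bit, a BitsAndBytesConfig can be passed at load time. This is a minimal sketch with illustrative settings; it needs a CUDA GPU, and device_map="auto" additionally requires the accelerate package:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit settings; adjust to your hardware.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "likhonsheikh/sheikh-coder-v1-3b",
    quantization_config=bnb_config,
    device_map="auto",   # requires the accelerate package
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("likhonsheikh/sheikh-coder-v1-3b")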

Contributing

We welcome contributions! Please check our contribution guidelines and feel free to submit pull requests.

Citation

@software{sheikh_coder_2025,
  author = {Likhon Sheikh},
  title = {SheikhCoder: A Culturally-Aware Code Completion Model},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/likhonsheikh/sheikh-coder-v1-3b}
}

License

This model is released under the MIT License, following the licensing of its base model, Phi-2.

Contact

