---
license: other
base_model: Qwen/Qwen3-4B
tags:
- qwen3
- instruction-tuning
- chat
- transformers
- pytorch
- safetensors
language:
- en
- ja
library_name: transformers
pipeline_tag: text-generation
---

# Qwen3-4B-SFT-TEST2

## Model Description

Qwen3-4B-SFT-TEST2 is a language model fine-tuned for improved performance on natural language understanding and generation tasks.

## Model Details

- **Model Name**: Qwen3-4B-SFT-TEST2
- **Base Model**: Qwen/Qwen3-4B
- **Architecture**: Qwen3ForCausalLM
- **Parameters**: ~4B
- **Model Type**: qwen3
- **Total Size**: 3.4 GB
- **Upload Date**: 2025-08-18

### Model Architecture

- **Hidden Size**: 2560
- **Number of Layers**: 36
- **Attention Heads**: 32
- **Vocabulary Size**: 151936
- **Max Position Embeddings**: 40960

These values can be checked against the repository's `config.json`; see the verification sketch at the end of this card.

## Files

This repository contains:

- **SafeTensors weights**: optimized for fast loading and reduced memory usage
- **Tokenizer**: included for text processing

## Usage

### Loading the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "team-suzuki/Qwen3-4B-SFT-TEST2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)
```

### Text Generation

```python
# Prepare input and move it to the model's device
text = "Hello, how are you?"
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate a response
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode the output
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Chat Format (if applicable)

```python
# For instruction-tuned models
messages = [
    {"role": "user", "content": "What is the capital of Japan?"}
]

# Use the tokenizer's chat template if one is defined
if tokenizer.chat_template is not None:
    input_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)
else:
    # Fall back to a plain prompt format
    input_ids = tokenizer(
        "User: What is the capital of Japan?\nAssistant:",
        return_tensors="pt"
    ).input_ids.to(model.device)

# Generate a response
outputs = model.generate(
    input_ids,
    max_new_tokens=100,
    temperature=0.7,
    do_sample=True
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Details

- **Training Data**: [Specify the training dataset if known]
- **Fine-tuning Method**: [Specify the fine-tuning approach]
- **Training Framework**: PyTorch + Transformers
- **Hardware**: [Specify if known]

## Evaluation

[Add evaluation results if available.]

## Limitations and Biases

- This model may exhibit biases present in its training data.
- Performance may vary across domains and languages.
- Always verify outputs for accuracy and appropriateness.

## Ethical Considerations

- Use responsibly and in accordance with applicable laws and regulations.
- Be aware of potential biases and limitations.
- Consider the impact of generated content.

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{qwen3_4b_sft_test2,
  title={Qwen3-4B-SFT-TEST2: A Fine-tuned Language Model},
  author={[Your Name/Organization]},
  year={2025},
  url={https://huggingface.co/team-suzuki/Qwen3-4B-SFT-TEST2}
}
```

## License

This model is released under a custom ("other") license. Please see the license file in this repository for details.

## Contact

For questions or issues, please [open an issue](https://huggingface.co/team-suzuki/Qwen3-4B-SFT-TEST2/discussions) on this repository.
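## Appendix: Verifying the Configuration

As a quick sanity check for the architecture values listed above, the model configuration can be inspected without downloading the weights. This is a minimal sketch, assuming only the standard `transformers` `AutoConfig` API and the usual Qwen3 configuration field names.

```python
from transformers import AutoConfig

# Fetch only config.json; no model weights are downloaded
config = AutoConfig.from_pretrained("team-suzuki/Qwen3-4B-SFT-TEST2")

# Each attribute should match the Model Architecture section above
print(config.hidden_size)              # expected: 2560
print(config.num_hidden_layers)        # expected: 36
print(config.num_attention_heads)      # expected: 32
print(config.vocab_size)               # expected: 151936
print(config.max_position_embeddings)  # expected: 40960
```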
---

*This model card was automatically generated. Please update with specific details about your model.*