PJEDeveloper/Mistral-7B-Instruct-v0.3-4bit-20250716_003938

@@ -1,199 +1,273 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+tags:
+- mistral
+- instruct
+- quantization
+- 4bit
+- bitsandbytes
+- causal-lm
 ---
+# 4-bit Quantized Model: Mistral-7B-Instruct-v0.3
+This is a 4-bit quantized variant of [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3), optimized to reduce memory footprint and accelerate inference while maintaining high output similarity.
+## Overview
+Mistral-7B-Instruct-v0.3 is an instruction fine-tuned model derived from Mistral-7B-v0.3, featuring:
+- An extended 32,768-token vocabulary.
+- Support for the v3 tokenizer.
+- Built-in function calling capabilities.
+This quantized checkpoint was produced with [BitsAndBytes](https://github.com/bitsandbytes-foundation/bitsandbytes) and evaluated using standard text similarity metrics.
+---
+## Model Architecture
+| Attribute               | Value                          |
+|-------------------------|--------------------------------|
+| **Model class**         | MistralForCausalLM |
+| **Number of parameters**| 3,758,362,624 (as counted with packed 4-bit weights; the base model has about 7.25B) |
+| **Hidden size**         | 4096 |
+| **Number of layers**    | 32 |
+| **Attention heads**     | 32 |
+| **Vocabulary size**     | 32768 |
+| **Compute dtype**       | torch.bfloat16 |
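The parameter count in the table is roughly half of the base model's because bitsandbytes packs two 4-bit values into each `uint8` storage element, so a parameter counter sees half as many elements in every quantized linear layer. The sketch below reproduces the figure from the architecture. The intermediate size (14336), the number of key/value heads (8), and the assumption that the embeddings, `lm_head`, and norm weights stay unquantized are not stated in this card; they are taken from the usual Mistral-7B configuration and bitsandbytes defaults.

```python
# Sketch: reproduce the reported parameter count from the architecture.
# Values marked "assumed" are not stated in this card.
HIDDEN = 4096          # from the table
LAYERS = 32            # from the table
HEADS = 32             # from the table
VOCAB = 32768          # from the table
KV_HEADS = 8           # assumed (grouped-query attention)
INTERMEDIATE = 14336   # assumed (MLP width)

head_dim = HIDDEN // HEADS
kv_dim = KV_HEADS * head_dim

# Linear weights per decoder layer (no bias terms assumed).
attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim   # q/o and k/v projections
mlp = 3 * HIDDEN * INTERMEDIATE                    # gate, up, down projections
linear_total = LAYERS * (attn + mlp)

# Weights assumed to stay in higher precision.
norms = LAYERS * 2 * HIDDEN + HIDDEN               # two norms per layer + final norm
embeddings = 2 * VOCAB * HIDDEN                    # input embeddings + lm_head

full_count = linear_total + norms + embeddings
# Two 4-bit weights share one uint8 element, halving the element count.
packed_count = linear_total // 2 + norms + embeddings

print(f"full precision: {full_count:,}")
print(f"packed 4-bit:   {packed_count:,}")
```

Under these assumptions the packed count matches the value in the table, which suggests the figure reflects storage elements rather than logical weights.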
+---
+## Quantization Configuration
+The following configuration dictionary was used during quantization:
+```json
+{
+  "quant_method": "bitsandbytes",
+  "_load_in_8bit": false,
+  "_load_in_4bit": true,
+  "llm_int8_threshold": 6.0,
+  "llm_int8_skip_modules": null,
+  "llm_int8_enable_fp32_cpu_offload": false,
+  "llm_int8_has_fp16_weight": false,
+  "bnb_4bit_quant_type": "fp4",
+  "bnb_4bit_use_double_quant": false,
+  "bnb_4bit_compute_dtype": "bfloat16",
+  "bnb_4bit_quant_storage": "uint8",
+  "load_in_4bit": true,
+  "load_in_8bit": false
+}
+```
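For readers who want to quantize the base model themselves, the 4-bit settings above map onto `BitsAndBytesConfig` in `transformers`. The sketch below is an assumption about how such a configuration could be built, not the script used for this checkpoint. `QUANT_SETTINGS` and `build_quant_config` are illustrative names, and calling the builder requires `torch`, `transformers`, and `bitsandbytes` to be installed.

```python
# Sketch: rebuild a comparable quantization config (illustrative names).
QUANT_SETTINGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "bfloat16",
}


def build_quant_config(settings=QUANT_SETTINGS):
    """Return a BitsAndBytesConfig mirroring the settings listed above."""
    # Imported lazily so the settings can be inspected without a GPU stack.
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(settings)
    # Convert the dtype name ("bfloat16") to the torch dtype object.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

The result would be passed as `quantization_config=build_quant_config()` to `AutoModelForCausalLM.from_pretrained` when loading the base model. Loading this already-quantized checkpoint should not need it, since the settings are stored with the model.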
+---
+## Intended Use
+- Research and experimentation with instruction-following tasks.
+- Demonstrations of quantized model capabilities in resource-constrained environments.
+- Prototyping workflows requiring extended vocabulary and function calling support (v3 tokenizer).
+## Limitations
+- May reproduce biases and factual inaccuracies present in the original model.
+- This instruct variant does not include any moderation or safety guardrails by default.
+- Quantization can reduce generation diversity and precision.
+- Not intended for production without thorough evaluation and alignment testing.
+## Usage
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+model_id = "PJEDeveloper/Mistral-7B-Instruct-v0.3-4bit-20250716_003938"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+# device_map="auto" places the quantized weights on the available device(s)
+model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+prompt = "Explain the concept of reinforcement learning."
+# Move the inputs to the model's device instead of hard-coding "cuda"
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=256)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
+## Function Calling
+For function calling workflows, please see the [Transformers Function Calling Guide](https://huggingface.co/docs/transformers/main/en/model_doc/mistral3#overview) and the original [Mistral examples](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3).
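As a rough illustration of what a structured function call looks like, the sketch below describes a tool in the JSON-schema style commonly used for function calling and validates a call payload against it. The function name `create_event_checklist` and its `items` argument come from the evaluation prompt in this card. The `{"name": ..., "arguments": ...}` payload shape, the tool description text, and the helper `parse_checklist_call` are assumptions for illustration, not the exact format emitted by this model.

```python
import json

# Hypothetical tool description; only the function name and the "items"
# argument come from the evaluation prompt.
CHECKLIST_TOOL = {
    "type": "function",
    "function": {
        "name": "create_event_checklist",
        "description": "Create a checklist for an event.",
        "parameters": {
            "type": "object",
            "properties": {
                "items": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["items"],
        },
    },
}


def parse_checklist_call(raw: str) -> list[str]:
    """Parse a JSON tool call and return its checklist items.

    Raises ValueError if the payload is not a well-formed call to
    create_event_checklist with a list of strings.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    if call.get("name") != CHECKLIST_TOOL["function"]["name"]:
        raise ValueError("unexpected function name")
    arguments = call.get("arguments")
    items = arguments.get("items") if isinstance(arguments, dict) else None
    if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
        raise ValueError("'items' must be a list of strings")
    return items


example = (
    '{"name": "create_event_checklist", '
    '"arguments": {"items": ["Order brass goggles", "Send invitations"]}}'
)
print(parse_checklist_call(example))
```

A check like this makes it easy to tell a structured call apart from free-form text such as a Python function definition, which is a different thing from a tool call.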
+---
+## Prompt Used for Evaluation
+```text
+You are an expert assistant helping a user plan a themed event.
+Please complete the following tasks:
+Event Summary: Write a concise paragraph (3-5 sentences) describing a steampunk-themed anniversary party in a unique style, incorporating at least two uncommon English words from the extended vocabulary (for example, pulchritudinous, vicissitude, susurration, ephemeral, etc.).
+Guest List JSON: Create a JSON array listing five fictional guests, each with the following fields:
+"name": A full name.
+"role": e.g., "Inventor", "Airship Captain".
+"special_item": A unique steampunk artifact.
+Checklist Function Call: Format the following checklist as a function call named create_event_checklist with a single argument items, which should be a list of strings. The checklist items are:
+"Order brass goggles"
+"Prepare mechanical owl centerpiece"
+"Send invitations"
+"Arrange clockwork music ensemble"
+"Book airship docking slot"
+Respond in three sections labeled Event Summary, Guest List JSON, and Checklist Function Call.
+Make sure all JSON is properly formatted and the function call syntax is valid.
+```
+## Reference Output (Full-weight Model)
+```text
+Event Summary:
+Amidst the ephemeral whimsy of a gaslit evening, the anniversary of our esteemed society shall be celebrated in a grand Steampunk extravaganza. The Victorian-esque manor, adorned with intricate brass machinery and gears, shall echo with the susurration of time's relentless march, as we revel in the pulchritudinous splendor of our pulchritudinous past.
+Guest List JSON:
+json
+[
+  {
+    "name": "Professor Archibald F. Brassbolt III",
+    "role": "Inventor",
+    "special_item": "Aether-infused Time-Turner"
+  },
+  {
+    "name": "Captain Amelia Cogsworth",
+    "role": "Airship Captain",
+    "special_item": "The Leviathan - a clockwork airship"
+  },
+  {
+    "name": "Lady Isabella Gearsby",
+    "role": "Society Matron",
+    "special_item": "A diamond-encrusted Steam-Pendulum"
+  },
+  {
+    "name": "Sir Percival Automaton",
+    "role": "Mechanical Knight",
+    "special_item": "A steam-powered automaton suit"
+  },
+  {
+    "name": "Dr. Eleanor Clockwork",
+    "role": "Mad Scientist",
+    "special_item": "A living, breathing steam-powered rose"
+  }
+]
+Checklist Function Call:
+python
+def create_event_checklist(items):
+    return "\n".join(items)
+checklist = create_event_checklist(["Order brass goggles", "Prepare mechanical owl centerpiece", "Send invitations", "Arrange clockwork music ensemble", "Book airship docking slot"])
+With this checklist, we can ensure that our Steampunk anniversary party will be an unforgettable event, filled with the marvels of a bygone era.
+```
+## Quantized Model Output
+````text
+---
+Event Summary:
+In the heart of the industrial ether, where the air is thick with the scent of brass and steam, a grand celebration of marital vicissitude shall unfold. The ephemeral spectacle of a steampunk anniversary party shall be a veritable cornucopia of pulchritudinous contraptions, where the susurration of gears and the hiss of steam shall serenade the guests in a symphony of mechanical harmony.
+Guest List JSON:
+```
+[
+  {
+    "name": "Professor Abigail Edison-Smythe",
+    "role": "Inventor",
+    "special_item": "Aether-infused Tesla Coil"
+  },
+  {
+    "name": "Captain Amelia Brassbright",
+    "role": "Airship Captain",
+    "special_item": "Steampowered Parasol"
+  },
+  {
+    "name": "Lord Percival Cogsworth",
+    "role": "Steampunk Poet",
+    "special_item": "Mechanical Quill and Inkwell"
+  },
+  {
+    "name": "Miss Isabella Gearheart",
+    "role": "Steampunk Fashion Designer",
+    "special_item": "Steam-powered Dress with Built-in Fan"
+  },
+  {
+    "name": "Sir Archibald Clockwork",
+    "role": "Clockwork Mechanic",
+    "special_item": "Mechanical Hand with Built-in Compass"
+  }
+]
+```
+Checklist Function Call:
+```
+def create_event_checklist(items):
+    return "\n".join(items)
+create_event_checklist(["Order brass goggles", "Prepare mechanical owl centerpiece", "Send invitations", "Arrange clockwork music ensemble", "Book airship docking slot"])
+```
+````
+## Evaluation Metrics
+| Metric            | Value   |
+|-------------------|---------|
+| ROUGE-L F1        | 0.4581 |
+| BLEU              | 0.2442 |
+| Cosine Similarity | 0.9141 |
+| BERTScore F1      | 0.6955 |
+- Higher scores on all four metrics indicate closer alignment with the reference output from the full-weight model.
+Interpretation:
+The quantized model output exhibits moderate similarity to the full-weight model.
+Warning: The quantized output has 3 sentences, while the reference has 6. This may indicate structural divergence.
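To make the ROUGE-L figure more concrete, here is a minimal sketch of ROUGE-L F1 based on the longest common subsequence (LCS) of tokens. The card does not say which implementation or tokenization produced the reported value, so lowercasing and whitespace splitting are assumptions here; standard packages apply their own tokenization and will give different numbers.

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1 over lowercased, whitespace-split tokens (assumed tokenization)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)   # share of candidate tokens in the LCS
    recall = lcs / len(ref)       # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1("send the invitations today", "send invitations today")
print(round(score, 4))
```

Because the LCS rewards shared word order rather than exact phrasing, two outputs that follow the same three-section layout can score moderately even when their wording differs.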
+## Generation Settings
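+This model produces its best results with the settings below. Because `do_sample=False` selects greedy decoding, `temperature` and `top_p` have no effect unless `do_sample` is set to `True`:
+```python
+# Keyword arguments for model.generate(...)
+max_new_tokens=1024,
+do_sample=False,   # greedy decoding
+temperature=0.3,   # ignored while do_sample=False
+top_p=0.9,         # ignored while do_sample=False
+pad_token_id=tokenizer.eos_token_id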
+```
+## Model Files Metadata
+| Filename           | Size (bytes)   | SHA-256                                      |
+|--------------------|----------------|----------------------------------------------|
+| `quant_config.txt` | 446 | `f7a08f6dc4b46a4803dce152c536ceed2ee802755840db11231fb5a895b2e022` |
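The SHA-256 value in the table can be used to check a downloaded copy of the file. The sketch below uses only the Python standard library; the helper names are illustrative, and the local path `quant_config.txt` is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    expected = "f7a08f6dc4b46a4803dce152c536ceed2ee802755840db11231fb5a895b2e022"
    target = Path("quant_config.txt")  # assumed local download location
    if target.exists():
        print(matches_published_hash(target, expected))
```

Reading in chunks keeps memory use flat, so the same helper also works for multi-gigabyte weight files.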
+---
+## Notes
+- Produced on 2025-07-16T00:43:52.476070.
+- Quantized automatically using BitsAndBytes.
+- Base model: mistralai/Mistral-7B-Instruct-v0.3 with extended 32,768-token vocabulary and function calling capabilities.
+- Intended primarily for research and experimentation.
+## Citation
+- [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
+- [Mistral 7B Announcement](https://mistral.ai/news/announcing-mistral-7b)
+## License
+This model is distributed under the Apache 2.0 license, consistent with the original Mistral-7B-Instruct-v0.3.
+## Model Card Authors
+This quantized model was prepared by PJEDeveloper.