Update README.md
Since Hugging Face has not published a standalone PyTorch `SmolLM2_360M_model.py` to load, fine-tune, and run inference with the released model weights and config at https://huggingface.co/HuggingFaceTB/SmolLM2-360M/, I have attempted to construct a PyTorch `model.py` that can load the published weights and config and at least run in inference mode. Once a functioning PyTorch `model.py` is built, it may be possible to export a TorchScript version of the SmolLM2 model that can run on non-Python hardware such as MPUs, RISC machines, and smartphones, i.e. on edge devices. The `SmolLM2_360M_model.py` runs but is unable to load the safetensors data. Here is the encountered error:

C:\Users\User\OneDrive\Desktop\SmolLM2>python SmolLM2_360M_model_debugging.py
Warning: SentencePiece not found, using rudimentary BPE tokenizer. Install SentencePiece for better performance.

Is there a python script for inspecting the safetensors file?
Why does the model.safetensors file "not contain tensor lm_head.weight"?
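
The two questions above can be answered together. A safetensors file begins with an 8-byte little-endian header length followed by a JSON header describing every tensor, so it can be inspected with the Python standard library alone. Below is a minimal sketch (the `model.safetensors` path is whatever you downloaded from the repo). As for the missing `lm_head.weight`: if the model's `config.json` sets `tie_word_embeddings: true`, as small LLMs commonly do, the checkpoint stores only the input embedding matrix and the LM head is expected to reuse it, so the tensor's absence is likely by design rather than file corruption.

```python
import json
import struct

def inspect_safetensors(path):
    """List tensor names, dtypes and shapes from a .safetensors file.

    Format: 8-byte little-endian header length, then a JSON header
    mapping tensor names to {"dtype", "shape", "data_offsets"}.
    """
    with open(path, "rb") as f:
        header_len = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(header_len))
    for name, meta in sorted(header.items()):
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        print(f"{name}: dtype={meta['dtype']}, shape={meta['shape']}")
    return header

# inspect_safetensors("model.safetensors")
```

Running this over the downloaded file should show whether `lm_head.weight` is genuinely absent and which embedding tensor is present instead.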
# Help Needed: Building a Standalone PyTorch SmolLM2-360M Model

The Hugging Face Hub hosts the SmolLM2-360M model ([HuggingFaceTB/SmolLM2-360M](https://huggingface.co/HuggingFaceTB/SmolLM2-360M/)), but currently lacks a standalone PyTorch `model.py` file for loading, fine-tuning, and inference. This limits the model's usability outside the Hugging Face ecosystem.

I've started creating a `SmolLM2_360M_model.py` file to address this gap, aiming for compatibility with all SmolLM2 models. The initial goal is to enable inference using the published weights and config. A successful PyTorch implementation would pave the way for exporting a TorchScript version, broadening accessibility to non-Python environments like microcontrollers, RISC-V machines, smartphones, and other edge devices.

**The Challenge:**

While my `SmolLM2_360M_model.py` runs, it encounters problems loading the `safetensors` data. I'm receiving the following error:

```
# Insert the full error message here, including traceback. This will help others diagnose the problem quickly.
# For example:
Traceback (most recent call last):
  File "SmolLM2_360M_model.py", line 32, in <module>
    model.load_state_dict(torch.load("pytorch_model.bin"))
  File ".../python3.8/site-packages/torch/serialization.py", line 781, in load
    with _open_file_like(f, 'rb') as opened_file:
FileNotFoundError: [Errno 2] No such file or directory: 'pytorch_model.bin'
```
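
Since the traceback above is only a placeholder, here is a hedged sketch of the loading step that most often trips people up: a checkpoint whose LM head is tied to the input embedding. `TinyLM` and its layer names are illustrative stand-ins, not the real architecture; with the real file you would obtain the state dict via `safetensors.torch.load_file("model.safetensors")` instead of the simulated dict below.

```python
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """Minimal stand-in for the real model; names are illustrative."""
    def __init__(self, vocab=16, dim=8):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab, dim)
        self.lm_head = nn.Linear(dim, vocab, bias=False)

model = TinyLM()

# Simulate a checkpoint that omits lm_head.weight (tied embeddings); with
# the real model, use: ckpt = safetensors.torch.load_file("model.safetensors")
ckpt = {k: v for k, v in TinyLM().state_dict().items() if k != "lm_head.weight"}

# If the config ties word embeddings, reuse the embedding matrix as the head.
if "lm_head.weight" not in ckpt and "embed_tokens.weight" in ckpt:
    ckpt["lm_head.weight"] = ckpt["embed_tokens.weight"]

# strict=False reports rather than raises on key mismatches, which makes
# the remaining naming problems visible.
missing, unexpected = model.load_state_dict(ckpt, strict=False)
print("missing:", missing, "unexpected:", unexpected)
```

Printing the `missing`/`unexpected` lists is often enough to see whether the remaining failures are naming mismatches between the script's modules and the checkpoint keys.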
**Call to Action:**

I'm seeking assistance from experienced PyTorch developers to debug the loading issue and complete the `SmolLM2_360M_model.py` implementation. Your contributions will significantly expand the potential applications of SmolLM2.

**Specific Areas Where Help is Needed:**

* **Safetensors Loading:** Resolving the error encountered when loading the model weights from the safetensors file.
* **Model Architecture Verification:** Confirming the correctness of the PyTorch model architecture based on the config file.
* **Inference Implementation:** Ensuring the model can perform inference correctly.
* **Fine-tuning Support (Optional):** Adding functionality for fine-tuning the model on downstream tasks.
* **TorchScript Export (Optional):** Enabling export to TorchScript for deployment on resource-constrained devices.
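
On the inference-implementation item, the decoding loop itself is simple once the forward pass works. A minimal greedy-decoding sketch, assuming the model's forward returns raw logits of shape `(batch, seq, vocab)` (no KV cache, so the full sequence is recomputed each step):

```python
import torch

@torch.no_grad()
def greedy_generate(model, input_ids, max_new_tokens=8, eos_id=None):
    """Append the argmax token until max_new_tokens or EOS is reached."""
    for _ in range(max_new_tokens):
        logits = model(input_ids)                        # (batch, seq, vocab)
        next_id = logits[:, -1, :].argmax(-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_id], dim=-1)
        if eos_id is not None and (next_id == eos_id).all():
            break
    return input_ids
```

With a working `model.py` this would be called with tokenized prompt IDs; sampling, temperature, and a KV cache are refinements on top of this skeleton.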
**How to Contribute:**

1. Fork the repository containing the `SmolLM2_360M_model.py` file.
2. Debug the code and implement the missing functionality.
3. Submit a pull request with your changes.

By working together, we can make SmolLM2 more accessible and empower a wider range of users to leverage its capabilities. Thank you for your time and expertise!
P.S. Here's a technical breakdown of the process for creating a TorchScript version of the model and deploying it to various platforms:

**1. TorchScript Creation:**

* **Trace or Script:** TorchScript offers two ways to convert your PyTorch model: tracing and scripting. Tracing records the operations performed on example inputs, creating a static graph. Scripting directly parses the model code, supporting control flow. Scripting is preferred if your model uses dynamic control flow.

```python
import torch

# Tracing example: for a causal language model, the example input is a
# batch of token IDs rather than an image tensor.
example_input = torch.randint(0, 49152, (1, 16))  # (batch, seq_len); vocab size per config.json
traced_model = torch.jit.trace(model, example_input)

# Scripting example: parses the module's code, preserving control flow.
scripted_model = torch.jit.script(model)
```
* **Optimization (Optional):** TorchScript provides optimization passes to improve the performance of the exported model. Put the module in `eval()` mode first, since `optimize_for_inference` freezes the model.

```python
scripted_model.eval()
optimized_model = torch.jit.optimize_for_inference(scripted_model)
```

* **Saving:** Save the TorchScript model to a file.

```python
torch.jit.save(optimized_model, "smolLM2_360m.pt")
```
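
Before moving on to C++, it is worth confirming that the saved artifact round-trips in Python. A small sanity check with a stand-in module (`torch.jit.save`/`torch.jit.load` also accept file-like objects, so no file on disk is needed here):

```python
import io
import torch
import torch.nn as nn

class Tiny(nn.Module):
    # Stand-in module; the real scripted SmolLM2 model round-trips the same way.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2

scripted = torch.jit.script(Tiny())
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)   # same call as saving to "smolLM2_360m.pt"
buffer.seek(0)
reloaded = torch.jit.load(buffer)

x = torch.arange(4.0)
print(torch.equal(scripted(x), reloaded(x)))  # outputs must match exactly
```

If the reloaded module's outputs diverge from the original, the problem is in the export, not in the C++ host code.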
**2. Deployment to Target Environments:**

* **C++:** LibTorch, the C++ API for PyTorch, can load and execute TorchScript models. Integrate LibTorch into your C++ application for microcontroller, RISC-V, or other edge-device deployments. This typically involves compiling your C++ code and linking against the LibTorch libraries.

* **Android/iOS:** Use the respective PyTorch Mobile libraries for these platforms. These libraries offer optimized runtime environments for executing TorchScript models within mobile applications.

* **Other Edge Devices:** Depending on the device and its capabilities, explore options like a custom runtime or, if available, a cross-compilation toolchain to target the device from your development environment.
**Example C++ Deployment (Simplified):**

```c++
#include <torch/script.h>

int main() {
  // Load the TorchScript model
  torch::jit::script::Module module = torch::jit::load("smolLM2_360m.pt");

  // Prepare an input tensor of token IDs; shape and vocab size must
  // match what the model was traced with (49152 per the SmolLM2 config).
  torch::Tensor input_tensor = torch::randint(0, 49152, {1, 16}, torch::kLong);

  // Run inference
  std::vector<torch::jit::IValue> inputs;
  inputs.push_back(input_tensor); // Add input tensor(s)
  auto output = module.forward(inputs);

  // Process output
  // ... (Handle output tensor on the device) ...

  return 0;
}
```
**Key Considerations:**

* **Hardware Limitations:** Microcontrollers and other edge devices have limited resources. Model size and complexity may need adjustments (quantization, pruning) for optimal performance.

* **Platform-Specific Tooling:** Each target platform has its own build system and toolchain. Familiarize yourself with these tools for successful deployment.

* **Cross-Compilation:** If building directly on the target device isn't feasible, cross-compilation is necessary. This typically involves setting up a cross-compilation toolchain for the target architecture.

* **Debugging:** Debugging on edge devices can be challenging. Thoroughly test the TorchScript model in a more accessible environment (e.g., your development machine) before deploying.
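
On the quantization point, PyTorch's dynamic quantization is the lowest-effort starting place for a transformer: a one-line call converts `nn.Linear` weights to int8. A sketch on a stand-in network (for SmolLM2 you would pass the loaded model the same way; backend support varies by platform):

```python
import torch
import torch.nn as nn

# Stand-in network; for SmolLM2, pass the loaded model instead.
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 2))

# Replace Linear layers with dynamically quantized (int8-weight) versions.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 8)
print(quantized(x).shape)  # interface unchanged: torch.Size([1, 2])
```

Static quantization and pruning can shrink the model further, but they require calibration data and retraining respectively, so dynamic quantization is the sensible first experiment.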
This expanded explanation provides a more complete roadmap for creating and deploying TorchScript versions of the SmolLM2 model. Remember to consult the official PyTorch and LibTorch documentation for platform-specific instructions and best practices.
---
license: apache-2.0
---
|