Mohamed Mekkouri committed
Commit 8449921 · 1 Parent(s): 71671c1
update README
README.md CHANGED
@@ -4,7 +4,7 @@ tags:
 - gptoss
 ---

-#
+# gpt-oss-metal-kernels

 Metal kernels that back the OpenAI GPT-OSS reference implementation, repackaged for local experiments on Apple Silicon GPUs. The GPT-OSS project distributes optimized inference primitives for the `gpt-oss-20b` and `gpt-oss-120b` open-weight models, including MXFP4-packed linear layers and fused attention paths that target Metal Performance Shaders on macOS [gpt-oss](https://github.com/openai/gpt-oss).

@@ -14,7 +14,7 @@ Metal kernels that back the OpenAI GPT-OSS reference implementation, repackaged
 pip install kernels # we just need to install the kernels package
 ```

-The package exposes Python bindings through `
+The package exposes Python bindings through `gpt_oss_metal_kernels.ops`; these symbols are re-exported in `gpt_oss_metal_kernels.__init__` for convenience. All kernels expect Metal (`mps`) tensors and operate in place on user-provided outputs to minimize additional allocations.

 ## Available Ops

@@ -39,7 +39,7 @@ Each example below compares a Metal kernel against the canonical PyTorch equival
 import torch
 from kernels import get_kernel

-gptoss_kernels = get_kernel("kernels-community/
+gptoss_kernels = get_kernel("kernels-community/gpt-oss-metal-kernels")

 torch.manual_seed(0)
 device = "mps"
@@ -74,7 +74,7 @@ torch.testing.assert_close(out_kernel, out_ref, atol=1e-3, rtol=1e-3)
 from kernels import get_kernel
 import torch

-gptoss_kernels = get_kernel("kernels-community/
+gptoss_kernels = get_kernel("kernels-community/gpt-oss-metal-kernels")
 device = "mps"

 hidden = 4096
@@ -101,7 +101,7 @@ from kernels import get_kernel
 import torch

 device = "mps"
-gptoss_kernels = get_kernel("kernels-community/
+gptoss_kernels = get_kernel("kernels-community/gpt-oss-metal-kernels")

 vocab, dim = 1024, 256
 token_ids = torch.randint(0, vocab, (16,), device=device, dtype=torch.int32)
@@ -125,7 +125,7 @@ import torch
 import torch.nn as nn

 device = "mps"
-gptoss_kernels = get_kernel("kernels-community/
+gptoss_kernels = get_kernel("kernels-community/gpt-oss-metal-kernels")


 head_dim = 64
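
Reviewer note: the same `get_kernel(...)` line is updated in four README examples, and each one assumes an `mps` device. A minimal sketch of how that repeated line could be guarded is shown here. `REPO_ID`, `mps_ready`, and `load_gptoss_kernels` are hypothetical names introduced only for illustration; they are not part of the README or of the `kernels` package. Only `get_kernel` and the repository id come from the diff.

```python
import importlib.util

# Repository id as it appears in the updated README examples.
REPO_ID = "kernels-community/gpt-oss-metal-kernels"


def mps_ready() -> bool:
    """Return True only if torch and kernels are importable and MPS is usable."""
    for name in ("torch", "kernels"):
        if importlib.util.find_spec(name) is None:
            return False
    import torch  # imported lazily so the check also runs without torch installed

    return bool(torch.backends.mps.is_available())


def load_gptoss_kernels(repo_id: str = REPO_ID):
    """Hypothetical helper: fetch the kernel module, or return None when MPS is unavailable."""
    if not mps_ready():
        return None
    from kernels import get_kernel  # same entry point the README examples use

    return get_kernel(repo_id)


if __name__ == "__main__":
    module = load_gptoss_kernels()
    print("kernels loaded" if module is not None else "MPS or dependencies unavailable")
```

Keeping the id in one constant means a later rename touches a single line. Whether that suits a README whose examples are meant to be copied individually is a judgment call for the author.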