Update README.md
added single chunk profile note
README.md
CHANGED
@@ -10,15 +10,22 @@ tags:
 ---
 # ANEMLL
 
-PREFILL
-after unzipping (please see below)
+PREFILL Test for M3 Ultra
+after unzipping ( please see below, "find . -type f -name "*.zip" -exec unzip {} \;" )
 run
 python prefill.py --meta meta.yaml
 
+The repo has an extra file : nemotron_prefill_chunk_01of16_64x64.mlpackage which will be interesting to profile with Xcode on:
+M1 Ultra, M2 Ultra and M4 Max
+It is single chunk for Batch=64/Window=64
+See https://docs.google.com/spreadsheets/d/1OCxn730D5h8rvS2IHsSi0UBYbsP_lV-W-0uVdVDCvIk
+FP16 tab for baseline numbers
+
 For M3U/M4P see original post:
 https://x.com/anemll/status/1919796143787278623
 
 
+
 **ANEMLL** (pronounced like "animal") is an open-source project focused on accelerating the porting of Large Language Models (LLMs) to tensor processors, starting with the Apple Neural Engine (ANE).
 
 The goal is to provide a fully open-source pipeline from model conversion to inference for common LLM architectures running on ANE.
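
The unzip step in the added README line (`find . -type f -name "*.zip" -exec unzip {} \;`) can also be sketched in Python, for setups where `find` or `unzip` is not available. This is only an illustrative equivalent, not part of the repo: the helper name `unzip_all` and its `root`/`dest` parameters are hypothetical. Like the shell command, it extracts every archive into one destination directory (the current directory by default). Unlike `unzip`, `ZipFile.extractall` overwrites existing files without prompting.

```python
import zipfile
from pathlib import Path


def unzip_all(root: str = ".", dest: str = ".") -> list[str]:
    """Extract every *.zip found under `root` into `dest`.

    Hypothetical helper mirroring `find . -type f -name "*.zip" -exec unzip {} \\;`.
    Returns the paths of the archives that were extracted.
    """
    extracted = []
    # sorted() builds the full list before any extraction starts, so files
    # unpacked during the loop are not picked up by the same scan.
    for archive in sorted(Path(root).rglob("*.zip")):
        if not archive.is_file():  # same filter as `-type f`
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)  # overwrites silently, unlike interactive unzip
        extracted.append(str(archive))
    return extracted
```

Calling `unzip_all()` with no arguments from the repo root should leave the extracted files in place for the `python prefill.py --meta meta.yaml` step, assuming the archives unpack to the paths that `meta.yaml` expects.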