Commit b5481f6 by anemll · verified · 1 Parent(s): 41c446d

Update README.md

added single chunk profile note

Files changed (1)
  1. README.md +9 -2
README.md CHANGED
@@ -10,15 +10,22 @@ tags:
  ---
  # ANEMLL

- PREFILL test for M3 Ultra
- after unzipping (please see below)
+ PREFILL Test for M3 Ultra
+ after unzipping ( please see below, "find . -type f -name "*.zip" -exec unzip {} \;" )
  run
  python prefill.py --meta meta.yaml

+ The repo has an extra file : nemotron_prefill_chunk_01of16_64x64.mlpackage which will be interesting to profile with Xcode on:
+ M1 Ultra, M2 Ultra and M4 Max
+ It is single chunk for Batch=64/Window=64
+ See https://docs.google.com/spreadsheets/d/1OCxn730D5h8rvS2IHsSi0UBYbsP_lV-W-0uVdVDCvIk
+ FP16 tab for baseline numbers
+
  For M3U/M4P see original post:
  https://x.com/anemll/status/1919796143787278623


+
  **ANEMLL** (pronounced like "animal") is an open-source project focused on accelerating the porting of Large Language Models (LLMs) to tensor processors, starting with the Apple Neural Engine (ANE).

  The goal is to provide a fully open-source pipeline from model conversion to inference for common LLM architectures running on ANE.
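
The unzip step added in this change (`find . -type f -name "*.zip" -exec unzip {} \;`) could also be done from Python on machines where `unzip` is not available. The following is only a minimal sketch, not part of the repository: the helper name `unzip_all` and its `root` and `dest` parameters are hypothetical. Like the `find` one-liner, it extracts every archive into a single destination directory rather than next to each archive. Unlike `unzip`, `ZipFile.extractall` overwrites existing files without prompting and does not restore Unix permissions.

```python
import zipfile
from pathlib import Path


def unzip_all(root=".", dest=None):
    """Extract every *.zip file found under `root` into `dest`.

    Hypothetical helper, roughly mirroring:
        find . -type f -name "*.zip" -exec unzip {} \\;
    `dest` defaults to the current working directory, which is where
    `unzip` would place the extracted files.
    """
    root = Path(root)
    dest = Path(dest) if dest is not None else Path.cwd()
    extracted = []
    # sorted() collects all matches up front, so files created during
    # extraction are not picked up by the same walk.
    for archive in sorted(root.rglob("*.zip")):
        if not archive.is_file():
            continue  # skip directories that happen to end in .zip
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        extracted.append(archive)
    return extracted
```

Under these assumptions, calling `unzip_all(".")` from the repository root would stand in for the `find` command, after which `python prefill.py --meta meta.yaml` is run as the README describes.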