model_info:
  name: anemll-Llama-3.1-Nemotron-Nano-8B-v1-ctx512
  version: 0.3.0
  description: |
    Demonstrates running Llama-3.1-Nemotron-Nano-8B-v1 on Apple Neural Engine
    Context length: 512
    Batch size: 64
    Chunks: 16
  license: MIT
  author: Anemll
  framework: Core ML
  language: Python
  parameters:
    context_length: 512
    batch_size: 64
    lut_embeddings: none
    lut_ffn: none
    lut_lmhead: none
    num_chunks: 16
    model_prefix: nemo_
    embeddings: nemo__embeddings.mlmodelc
    lm_head: nemo__lm_head.mlmodelc
    ffn: nemo__FFN_PF.mlmodelc
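
As a quick sanity check of the parameters above, the compiled parts can be loaded with coremltools and pinned to the Neural Engine. This is a minimal sketch, not the ANEMLL runtime: the model directory, the coremltools version, and the per-chunk FFN file naming pattern are assumptions noted in the comments, while the prefix, chunk count, and bundle names come from the parameters block.

```python
# Minimal sketch, assuming coremltools >= 7 and that the .mlmodelc bundles
# listed above sit in MODEL_DIR. The per-chunk FFN file naming
# (..._chunk_XXofYY.mlmodelc) is an assumed convention, not taken from the
# config above; check the actual file names in the model directory.
import coremltools as ct

MODEL_DIR = "."      # assumed location of the converted model
PREFIX = "nemo_"     # model_prefix from the parameters block
NUM_CHUNKS = 16      # num_chunks from the parameters block

# Prefer the Apple Neural Engine, with CPU fallback where an op is unsupported.
UNITS = ct.ComputeUnit.CPU_AND_NE

# Embeddings and LM head are single compiled bundles, named as listed above.
embeddings = ct.models.CompiledMLModel(
    f"{MODEL_DIR}/{PREFIX}_embeddings.mlmodelc", compute_units=UNITS
)
lm_head = ct.models.CompiledMLModel(
    f"{MODEL_DIR}/{PREFIX}_lm_head.mlmodelc", compute_units=UNITS
)

# The transformer body (FFN + prefill) is split into 16 chunks so each part
# fits Neural Engine size limits; a forward pass chains
# embeddings -> FFN chunks 1..16 -> lm_head.
ffn_chunks = [
    ct.models.CompiledMLModel(
        f"{MODEL_DIR}/{PREFIX}_FFN_PF_chunk_{i + 1:02d}of{NUM_CHUNKS:02d}.mlmodelc",
        compute_units=UNITS,
    )
    for i in range(NUM_CHUNKS)
]
```

For actual chat inference, use the inference scripts distributed with ANEMLL, which wire these parts together with the tokenizer and KV cache.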