# ExecuTorch [[executorch]]

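[`ExecuTorch`](https://github.com/pytorch/executorch) is an end-to-end solution for on-device inference across mobile and edge devices, including wearables, embedded devices, and microcontrollers. It is part of the PyTorch ecosystem and supports deploying PyTorch models with a focus on portability, productivity, and performance.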

ExecuTorch defines clear entry points for model-, device-, or use-case-specific optimizations such as backend delegation, user-defined compiler transformations, and memory planning. The first step in running a PyTorch model on an edge device with ExecuTorch is to export the model. This is done with the PyTorch API [`torch.export`](https://pytorch.org/docs/stable/export.html).

## ExecuTorch Integration [[transformers.TorchExportableModuleWithStaticCache]]

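An integration point is being developed so that 🤗 Transformers models can be exported with `torch.export`. The goal of this integration is not only to enable export, but also to ensure that the exported artifact can be further lowered and optimized to run efficiently in `ExecuTorch`, particularly for mobile and edge use cases.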

#### transformers.TorchExportableModuleWithStaticCache[[transformers.TorchExportableModuleWithStaticCache]]

[Source](https://github.com/huggingface/transformers/blob/v5.1.0/src/transformers/integrations/executorch.py#L448)

A recipe module designed to make a `PreTrainedModel` exportable with `torch.export`,
specifically for decoder-only LMs that use a `StaticCache`. This module ensures that the
exported model is compatible with further lowering and execution in `ExecuTorch`.

Note:
This class is specifically designed to support the export process using `torch.export`
in a way that ensures the model can be further lowered and run efficiently in `ExecuTorch`.

#### forward[[transformers.TorchExportableModuleWithStaticCache.forward]]

[Source](https://github.com/huggingface/transformers/blob/v5.1.0/src/transformers/integrations/executorch.py#L533)

`forward(input_ids: torch.LongTensor | None = None, inputs_embeds: torch.Tensor | None = None, cache_position: torch.Tensor | None = None)`

Forward pass of the module, which is compatible with the ExecuTorch runtime.

This forward adapter serves two primary purposes:

1. **Making the Model `torch.export`-Compatible**:
   The adapter hides unsupported objects, such as the `Cache`, from the graph inputs and outputs,
   enabling the model to be exportable using `torch.export` without encountering issues.

2. **Ensuring Compatibility with `ExecuTorch` runtime**:
   The adapter matches the model's forward signature with that in `executorch/extension/llm/runner`,
   ensuring that the exported model can be executed in `ExecuTorch` out-of-the-box.
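The first point can be illustrated with a toy module. The sketch below is not the Transformers implementation: `ToyStaticCacheAdapter`, its layer sizes, and the sum-based stand-in for attention are all hypothetical. It only shows the shape of the idea, namely that the cache lives inside the module as a fixed-size buffer while `forward` accepts and returns plain tensors.

```python
import torch


class ToyStaticCacheAdapter(torch.nn.Module):
    """Hypothetical toy adapter; not the transformers implementation."""

    def __init__(self, vocab_size: int = 16, hidden_size: int = 8, max_cache_len: int = 4) -> None:
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, hidden_size)
        self.lm_head = torch.nn.Linear(hidden_size, vocab_size)
        # The fixed-size cache is module state (a buffer), so it never
        # appears among the forward inputs or outputs.
        self.register_buffer("cache", torch.zeros(max_cache_len, hidden_size))

    def forward(self, input_ids: torch.Tensor, cache_position: torch.Tensor) -> torch.Tensor:
        hidden = self.embed(input_ids)  # (1, seq_len, hidden_size)
        # Write the new states into the slots selected by cache_position.
        self.cache.index_copy_(0, cache_position, hidden[0])
        # Stand-in for attention: mix in the sum of everything cached so far
        # (unfilled slots are still zero and contribute nothing).
        context = self.cache.sum(dim=0)
        return self.lm_head(hidden + context)  # logits: (1, seq_len, vocab_size)


adapter = ToyStaticCacheAdapter().eval()
with torch.no_grad():
    # Decode two tokens, one per step, advancing cache_position each time.
    logits_step0 = adapter(torch.tensor([[3]]), torch.tensor([0]))
    filled_after_step0 = adapter.cache.abs().sum(dim=1) > 0
    logits_step1 = adapter(torch.tensor([[5]]), torch.tensor([1]))
    filled_after_step1 = adapter.cache.abs().sum(dim=1) > 0
```

Each call fills exactly one more cache slot, and the caller only ever passes token ids and a position tensor.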

**Parameters:**

input_ids (`torch.Tensor`) : Tensor representing current input token id to the module.

inputs_embeds (`torch.Tensor`) : Tensor representing current input embeddings to the module.

cache_position (`torch.Tensor`) : Tensor representing current input position in the cache.

**Returns:**

`torch.Tensor`

Logits output from the model.

#### transformers.convert_and_export_with_cache[[transformers.convert_and_export_with_cache]]

[Source](https://github.com/huggingface/transformers/blob/v5.1.0/src/transformers/integrations/executorch.py#L732)

Convert a `PreTrainedModel` into an exportable module and export it using `torch.export`,
ensuring the exported model is compatible with `ExecuTorch`.

**Parameters:**

model (`PreTrainedModel`) : The pretrained model to be exported.

example_input_ids (`Optional[torch.Tensor]`) : Example input token id used by `torch.export`.

example_cache_position (`Optional[torch.Tensor]`) : Example current cache position used by `torch.export`.

dynamic_shapes (`Optional[dict]`) : Dynamic shapes used by `torch.export`.

strict (`Optional[bool]`) : Flag to instruct `torch.export` to use `torchdynamo`.

**Returns:**

`torch.export.ExportedProgram`

The exported program generated via `torch.export`.

