Filesystem API
The HfFileSystem class provides a Pythonic file interface to the Hugging Face Hub, built on fsspec.
HfFileSystem
HfFileSystem is built on fsspec, so it is compatible with most of the APIs that fsspec provides. For more details, see the guide and the fsspec API reference.
class huggingface_hub.HfFileSystem
( *args, **kwargs )
Parameters
- token (str or bool, optional): A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- endpoint (str, optional): Endpoint of the Hub. Defaults to https://huggingface.co.
Access a remote Hugging Face Hub repository as if it were a local file system.
HfFileSystem provides fsspec compatibility, which is useful for libraries that require it (e.g., reading Hugging Face datasets directly with pandas). However, this compatibility layer adds overhead. For better performance and reliability, use HfApi methods when possible.
Usage:
>>> from huggingface_hub import HfFileSystem
>>> fs = HfFileSystem()
>>> # List files
>>> fs.glob("my-username/my-model/*.bin")
['my-username/my-model/pytorch_model.bin']
>>> fs.ls("datasets/my-username/my-dataset", detail=False)
['datasets/my-username/my-dataset/.gitattributes', 'datasets/my-username/my-dataset/README.md', 'datasets/my-username/my-dataset/data.json']
>>> # Read/write files
>>> with fs.open("my-username/my-model/pytorch_model.bin") as f:
...     data = f.read()
>>> with fs.open("my-username/my-model/pytorch_model.bin", "wb") as f:
...     f.write(data)
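The usage above reads the whole file into memory before writing it back. For large files, a chunked copy through the same `fs.open` interface may be preferable. The following is a minimal sketch, not part of the library: `copy_in_chunks` and its `chunk_size` default are hypothetical, and `fs` is assumed to be any fsspec-style filesystem whose `open` returns a binary file object usable as a context manager.

```python
from typing import Any


def copy_in_chunks(fs: Any, src: str, dst: str, chunk_size: int = 8 * 1024 * 1024) -> int:
    """Copy `src` to `dst` through `fs.open`, one chunk at a time.

    Hypothetical helper for illustration; returns the number of bytes copied.
    """
    copied = 0
    with fs.open(src, "rb") as fin, fs.open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)  # read at most chunk_size bytes
            if not chunk:                 # an empty read means end of file
                break
            fout.write(chunk)
            copied += len(chunk)
    return copied
```

With a real filesystem this could be called as `copy_in_chunks(HfFileSystem(), "my-username/my-model/pytorch_model.bin", "my-username/my-model/backup.bin")`, where `backup.bin` is a made-up destination name. Writing to a repository requires a token with write access.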
__init__
( *args, endpoint: typing.Optional[str] = None, token: typing.Union[bool, str, NoneType] = None, **storage_options )
resolve_path
( path: str, revision: typing.Optional[str] = None ) → HfFileSystemResolvedPath
Parameters
- path (str): Path to resolve.
- revision (str, optional): The revision of the repo to resolve. Defaults to the revision specified in the path.
Returns
HfFileSystemResolvedPath
Resolved path information containing repo_type, repo_id, revision and path_in_repo.
Raises
ValueError or NotImplementedError
- ValueError: If path contains conflicting revision information.
- NotImplementedError: If trying to list repositories.
Resolve a Hugging Face file system path into its components.
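As an illustration of how the returned object and the documented exceptions might be handled together, here is a small sketch. `describe_path` is a hypothetical helper, not part of huggingface_hub; it assumes only that `fs` exposes `resolve_path(path, revision=...)` and that the result carries the four attributes listed above.

```python
from typing import Any, Dict, Optional


def describe_path(fs: Any, path: str, revision: Optional[str] = None) -> Dict[str, Any]:
    """Return the resolved components of `path` as a plain dict.

    Hypothetical helper for illustration. If resolution is rejected with one of
    the documented exceptions, the dict holds an "error" entry instead.
    """
    try:
        resolved = fs.resolve_path(path, revision=revision)
    except (ValueError, NotImplementedError) as exc:
        # ValueError: conflicting revision information in the path.
        # NotImplementedError: the path would require listing repositories.
        return {"path": path, "error": f"{type(exc).__name__}: {exc}"}
    return {
        "repo_type": resolved.repo_type,
        "repo_id": resolved.repo_id,
        "revision": resolved.revision,
        "path_in_repo": resolved.path_in_repo,
    }
```

A call such as `describe_path(HfFileSystem(), "datasets/my-username/my-dataset/data.json")` would use the real filesystem and may need network access to the Hub.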
ls
( path: str, detail: bool = True, refresh: bool = False, revision: typing.Optional[str] = None, **kwargs ) → List[Union[str, Dict[str, Any]]]
Parameters
- path (str): Path to the directory.
- detail (bool, optional): If True, returns a list of dictionaries containing file information. If False, returns a list of file paths. Defaults to True.
- refresh (bool, optional): If True, bypass the cache and fetch the latest data. Defaults to False.
- revision (str, optional): The git revision to list from.
Returns
List[Union[str, Dict[str, Any]]]
List of file paths (if detail=False) or list of file information dictionaries (if detail=True).
List the contents of a directory.
For more details, refer to the fsspec documentation.
Note: When possible, use HfApi.list_repo_tree() for better performance.
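To show how the `detail=True` output might be consumed, the sketch below keeps only regular files from a listing. `list_file_paths` is a hypothetical helper, and it assumes each entry follows the usual fsspec convention of a dict with at least `"name"` and `"type"` keys.

```python
from typing import Any, List, Optional


def list_file_paths(
    fs: Any,
    path: str,
    revision: Optional[str] = None,
    refresh: bool = False,
) -> List[str]:
    """Return the sorted paths of regular files directly under `path`.

    Hypothetical helper for illustration; directories are skipped.
    """
    entries = fs.ls(path, detail=True, refresh=refresh, revision=revision)
    # Each entry is expected to be a dict; "type" is "file" or "directory".
    return sorted(entry["name"] for entry in entries if entry.get("type") == "file")
```

With a real filesystem, `list_file_paths(HfFileSystem(), "datasets/my-username/my-dataset")` would query the Hub. Passing `refresh=True` bypasses the cache, as described in the parameters above.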