Spaces:

Allex21
/

LT

Sleeping

App Files Files Community

LT / INSTALACAO.md

Allex21

Upload 4 files

077989f verified 2 months ago

preview code

raw

history blame

7.62 kB

	# 🛠️ Guia de Instalação - LoRA Trainer Funcional

	## 📋 Pré-requisitos

	### Sistema Operacional
	- Linux: Ubuntu 20.04+ (recomendado)
	- Windows: Windows 10/11 com WSL2
	- macOS: macOS 12+ (limitado, sem GPU)

	### Hardware
	- GPU: NVIDIA com 6GB+ VRAM (obrigatório)
	- RAM: 16GB+ (recomendado)
	- Armazenamento: 20GB+ livres
	- CPU: Qualquer CPU moderno

	### Software
	- Python: 3.8 a 3.11
	- CUDA: 11.8 ou 12.1
	- Git: Para clonar repositórios

	## 🚀 Instalação Local

	### Método 1: Instalação Completa

	```bash
	# 1. Clone o repositório
	git clone <repository-url>
	cd lora_trainer_hf

	# 2. Crie ambiente virtual
	python -m venv venv
	source venv/bin/activate # Linux/Mac
	# ou
	venv\Scripts\activate # Windows

	# 3. Instale dependências
	pip install -r requirements.txt

	# 4. Execute a aplicação
	python app.py
	```

	### Método 2: Instalação com Conda

	```bash
	# 1. Crie ambiente conda
	conda create -n lora_trainer python=3.10
	conda activate lora_trainer

	# 2. Instale PyTorch com CUDA
	conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

	# 3. Clone e instale
	git clone <repository-url>
	cd lora_trainer_hf
	pip install -r requirements.txt

	# 4. Execute
	python app.py
	```

	## 🐳 Instalação com Docker

	### Dockerfile Incluído

	```bash
	# 1. Build da imagem
	docker build -t lora-trainer .

	# 2. Execute o container
	docker run -p 7860:7860 --gpus all lora-trainer
	```

	### Docker Compose

	```yaml
	version: '3.8'
	services:
	lora-trainer:
	build: .
	ports:
	- "7860:7860"
	volumes:
	- ./data:/tmp/lora_training
	deploy:
	resources:
	reservations:
	devices:
	- driver: nvidia
	count: 1
	capabilities: [gpu]
	```

	## ☁️ Deploy no Hugging Face Spaces

	### Configuração do Space

	1. Crie um novo Space:
	- Vá para [huggingface.co/new-space](https://huggingface.co/new-space)
	- Escolha "Gradio" como SDK
	- Selecione hardware com GPU

	2. Configure o Space:
	```yaml
	# space_config.yml
	title: LoRA Trainer Funcional
	emoji: 🎨
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 4.0.0
	app_file: app.py
	pinned: false
	hardware: t4-medium # ou a100-large
	```

	3. Upload dos arquivos:
	- `app.py`
	- `requirements.txt`
	- `README.md`
	- Pasta `sd-scripts/`

	### Configuração de Hardware

	\| Hardware \| VRAM \| RAM \| Recomendado Para \|
	\|----------\|------\|-----\|------------------\|
	\| CPU Basic \| 0GB \| 16GB \| Apenas teste \|
	\| T4 Small \| 16GB \| 15GB \| Projetos pequenos \|
	\| T4 Medium \| 16GB \| 30GB \| Projetos médios \|
	\| A10G Small \| 24GB \| 30GB \| Projetos grandes \|
	\| A100 Large \| 40GB \| 80GB \| Projetos profissionais \|

	## 🔧 Configuração de Dependências

	### Dependências Principais

	```txt
	# Core ML
	torch>=2.0.0
	torchvision>=0.15.0
	diffusers>=0.21.0
	transformers>=4.25.0
	accelerate>=0.20.0

	# LoRA Training
	safetensors>=0.3.0
	huggingface-hub>=0.16.0
	xformers>=0.0.20
	bitsandbytes>=0.41.0

	# Interface
	gradio>=4.0.0

	# Utilities
	Pillow>=9.0.0
	opencv-python>=4.7.0
	numpy>=1.21.0
	toml>=0.10.0
	tqdm>=4.64.0
	```

	### Instalação Manual de Dependências

	```bash
	# PyTorch (ajuste para sua versão CUDA)
	pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

	# Diffusers e Transformers
	pip install diffusers transformers accelerate

	# Otimização
	pip install xformers bitsandbytes

	# Interface
	pip install gradio

	# Utilitários
	pip install safetensors huggingface-hub Pillow opencv-python numpy toml tqdm
	```

	## 🐛 Solução de Problemas de Instalação

	### Erro: CUDA não encontrado

	```bash
	# Verifique instalação CUDA
	nvidia-smi
	nvcc --version

	# Reinstale PyTorch com CUDA
	pip uninstall torch torchvision torchaudio
	pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
	```

	### Erro: xFormers não compatível

	```bash
	# Instale versão específica
	pip install xformers==0.0.20

	# Ou compile do código fonte
	pip install -U xformers --index-url https://download.pytorch.org/whl/cu118
	```

	### Erro: Memória insuficiente

	```bash
	# Aumente swap (Linux)
	sudo fallocate -l 8G /swapfile
	sudo chmod 600 /swapfile
	sudo mkswap /swapfile
	sudo swapon /swapfile

	# Configure variáveis de ambiente
	export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
	```

	### Erro: Dependências conflitantes

	```bash
	# Limpe cache pip
	pip cache purge

	# Crie ambiente limpo
	python -m venv fresh_env
	source fresh_env/bin/activate
	pip install --upgrade pip
	pip install -r requirements.txt
	```

	## 🔒 Configuração de Segurança

	### Variáveis de Ambiente

	```bash
	# .env
	HUGGINGFACE_TOKEN=your_token_here
	WANDB_API_KEY=your_wandb_key
	CUDA_VISIBLE_DEVICES=0
	PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512
	```

	### Limitações de Recursos

	```python
	# No app.py
	import resource

	# Limite de memória (8GB)
	resource.setrlimit(resource.RLIMIT_AS, (810241024*1024, -1))

	# Limite de processos
	resource.setrlimit(resource.RLIMIT_NPROC, (100, -1))
	```

	## 📊 Verificação da Instalação

	### Script de Teste

	```python
	# test_installation.py
	import torch
	import diffusers
	import transformers
	import gradio as gr
	import safetensors

	print("✅ Verificando instalação...")
	print(f"Python: {sys.version}")
	print(f"PyTorch: {torch.__version__}")
	print(f"CUDA disponível: {torch.cuda.is_available()}")
	print(f"GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'Não disponível'}")
	print(f"Diffusers: {diffusers.__version__}")
	print(f"Transformers: {transformers.__version__}")
	print(f"Gradio: {gr.__version__}")
	print("✅ Instalação verificada!")
	```

	### Teste de GPU

	```python
	# test_gpu.py
	import torch

	if torch.cuda.is_available():
	device = torch.device("cuda")
	x = torch.randn(1000, 1000).to(device)
	y = torch.randn(1000, 1000).to(device)
	z = torch.mm(x, y)
	print(f"✅ GPU funcionando: {torch.cuda.get_device_name(0)}")
	print(f"VRAM total: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
	print(f"VRAM livre: {torch.cuda.memory_reserved(0) / 1024**3:.1f} GB")
	else:
	print("❌ GPU não disponível")
	```

	## 🚀 Otimização de Performance

	### Configurações de Sistema

	```bash
	# Aumentar limites de arquivo
	echo "* soft nofile 65536" >> /etc/security/limits.conf
	echo "* hard nofile 65536" >> /etc/security/limits.conf

	# Otimizar scheduler
	echo "performance" > /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
	```

	### Configurações PyTorch

	```python
	# No início do app.py
	import torch
	torch.backends.cudnn.benchmark = True
	torch.backends.cuda.matmul.allow_tf32 = True
	torch.backends.cudnn.allow_tf32 = True
	```

	## 📝 Logs e Monitoramento

	### Configuração de Logs

	```python
	import logging
	logging.basicConfig(
	level=logging.INFO,
	format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
	handlers=[
	logging.FileHandler('lora_trainer.log'),
	logging.StreamHandler()
	]
	)
	```

	### Monitoramento de Recursos

	```bash
	# Instalar htop e nvidia-ml-py
	pip install nvidia-ml-py3 psutil

	# Monitorar em tempo real
	watch -n 1 nvidia-smi
	```

	## 🔄 Atualizações

	### Atualizar Dependências

	```bash
	# Atualizar requirements
	pip install --upgrade -r requirements.txt

	# Atualizar kohya-ss
	cd sd-scripts
	git pull origin main
	```

	### Backup de Configurações

	```bash
	# Backup de modelos e configurações
	tar -czf backup_$(date +%Y%m%d).tar.gz /tmp/lora_training/
	```

	---

	Nota: Para suporte adicional, consulte a documentação oficial do kohya-ss e a comunidade Hugging Face.