Refined BitTransformerLM: Organized codebase with best practices
- scripts/README.md +94 -0
scripts/README.md
ADDED
@@ -0,0 +1,94 @@
# BitTransformerLM Scripts

This directory contains organized scripts for BitTransformerLM development, training, and evaluation.

## Directory Structure

```
scripts/
├── training/     # Training scripts and experiments
├── examples/     # Example usage and demonstrations
├── testing/      # Test scripts and validation
├── benchmarks/   # Performance benchmarks
└── tools/        # Utility scripts and data processing
```

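
The layout above can be inspected programmatically. The following is a minimal sketch, not part of the repository: `list_scripts` and `CATEGORIES` are hypothetical names, and the only assumption is that each category is a direct subdirectory of `scripts/` holding `.py` files.

```python
from pathlib import Path

# Subdirectory names taken from the directory tree in this README.
CATEGORIES = ("training", "examples", "testing", "benchmarks", "tools")


def list_scripts(scripts_root):
    """Map each category to the sorted names of the .py files it contains."""
    root = Path(scripts_root)
    found = {}
    for category in CATEGORIES:
        folder = root / category
        # A missing folder yields an empty list instead of raising.
        if folder.is_dir():
            found[category] = sorted(p.name for p in folder.glob("*.py"))
        else:
            found[category] = []
    return found


if __name__ == "__main__":
    for category, names in list_scripts("scripts").items():
        print(f"{category}/: {len(names)} script(s)")
        for name in names:
            print(f"  {name}")
```

Run from the repository root, this prints one block per category, which is a quick way to check that a new script landed in the intended folder.
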
## Training Scripts (`training/`)

- **basic_training.py** - Simple training setup for small models
- **breakthrough_training.py** - Advanced training with breakthrough techniques
- **cpu_edge_training.py** - CPU-optimized training for edge deployment
- **final_breakthrough_training.py** - Production training pipeline
- **full_attention_training.py** - Full attention mechanism training
- **full_bits_train.py** - Complete bit-level training
- **production_training.py** - Production-ready training script
- **progressive_scaleup.py** - Progressive model scaling
- **quick_training_run.py** - Fast training for development

## Example Scripts (`examples/`)

- **example.py** - Basic usage example
- **better_sampling.py** - Advanced sampling techniques
- **debug_generation.py** - Generation debugging utilities
- **raw_generation.py** - Low-level generation examples
- **simple_test.py** - Simple model testing

## Testing Scripts (`testing/`)

- **code_test.py** - Code functionality testing
- **diffusion_tests.py** - Diffusion mode testing
- **enhanced_generation_test.py** - Advanced generation testing
- **full_attention_inference_test.py** - Attention mechanism tests
- **test_conversation.py** - Conversational AI testing

## Benchmark Scripts (`benchmarks/`)

- **wikitext_benchmark.py** - WikiText dataset benchmarking
- **wikitext_schedule.py** - WikiText training schedule

## Utility Tools (`tools/`)

- **build_full_bits.py** - Bit sequence construction
- **create_dataset.py** - Dataset creation utilities
- **enhanced_checkpoint_system.py** - Advanced checkpointing
- **integration_flow.py** - Integration workflow
- **integration_schedule.py** - Integration scheduling
- **sync_to_hf.py** - Hugging Face synchronization
- **unified_workflow.py** - Unified training workflow
- **watcher.py** - File system monitoring

## Usage

All scripts support the standardized command-line interface provided by `bit_transformer.cli_standards`. Run any script with `--help` to see its available options.

### Quick Start

```bash
# Train a small model
python scripts/training/basic_training.py --model-size small --epochs 5

# Run a simple test
python scripts/examples/simple_test.py --d-model 64

# Benchmark on WikiText
python scripts/benchmarks/wikitext_benchmark.py --dataset-name wikitext-2
```

### Environment Variables

Scripts can also be configured through environment variables that carry the `BT_` prefix:

```bash
export BT_D_MODEL=128
export BT_NUM_LAYERS=4
export BT_BATCH_SIZE=16
python scripts/training/basic_training.py
```

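
To illustrate how `BT_`-prefixed variables could feed a script's options, here is a minimal sketch built on plain `argparse`. It does not reproduce `bit_transformer.cli_standards`, whose API is not shown in this README. The helper `env_default`, the fallback values, the `--num-layers` and `--batch-size` flag names, and the precedence (explicit flag, then environment variable, then fallback) are all assumptions made for illustration.

```python
import argparse
import os


def env_default(name, fallback, cast=int):
    """Return BT_<name> from the environment, cast with `cast`, else `fallback`."""
    raw = os.environ.get(f"BT_{name}")
    return fallback if raw is None else cast(raw)


def build_parser():
    # Hypothetical parser: environment values become argparse defaults,
    # so a flag given on the command line still overrides them.
    parser = argparse.ArgumentParser(description="Sketch of BT_* environment defaults")
    parser.add_argument("--d-model", type=int, default=env_default("D_MODEL", 64))
    parser.add_argument("--num-layers", type=int, default=env_default("NUM_LAYERS", 2))
    parser.add_argument("--batch-size", type=int, default=env_default("BATCH_SIZE", 8))
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Because the environment is read when the parser is built, exporting a variable after the script has started has no effect in this sketch.
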
## Development Guidelines

- All scripts should use `bit_transformer.cli_standards` for argument parsing
- Include proper logging and error handling
- Support both CPU and GPU execution
- Follow the naming conventions established in existing scripts
- Add documentation for any new hyperparameters or features

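
As a rough starting point, the guidelines above might translate into a script skeleton like the one below. It is a sketch under stated assumptions: plain `argparse` stands in for `bit_transformer.cli_standards` (whose API is not shown here), the `--device` flag and the names `pick_device`, `run`, and `main` are hypothetical, and GPU detection assumes PyTorch is the backend.

```python
import argparse
import importlib.util
import logging
import sys

logger = logging.getLogger("bt_script")


def pick_device(requested="auto"):
    """Return "cuda" or "cpu"; "auto" picks CUDA only if PyTorch reports it."""
    if requested != "auto":
        return requested
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so fall back to CPU
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


def run(args):
    device = pick_device(args.device)
    logger.info("running on %s with d_model=%d", device, args.d_model)
    # Placeholder: the script's real work (training, evaluation, ...) goes here.
    return device


def main(argv=None):
    # Stand-in for bit_transformer.cli_standards; replace with the project helper.
    parser = argparse.ArgumentParser(description="Skeleton for a new script")
    parser.add_argument("--d-model", type=int, default=64)
    parser.add_argument("--device", choices=("auto", "cpu", "cuda"), default="auto")
    args = parser.parse_args(argv)

    logging.basicConfig(level=logging.INFO,
                        format="%(asctime)s %(levelname)s %(message)s")
    try:
        run(args)
    except Exception:
        # Log the full traceback and signal failure through the exit code.
        logger.exception("script failed")
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Returning an exit code from `main` rather than calling `sys.exit` inside it keeps the function easy to call from tests.
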