# DEPRECATED

All tasks in this file have been implemented (Stages 1–5). The document remains for historical reference only.

## Stage 1: Compression Algorithm Implementation

### Task 1: Choose Compression Method

Prompt:

Codex: Provide a concise PyTorch-compatible implementation of lossless binary compression and decompression (e.g., RLE, Huffman, or LZ-based) suitable for binary input sequences represented as tensors of bits.
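
Run-length encoding is often the simplest fit for binary streams with long runs of identical bits. The snippet below is a minimal, hypothetical illustration of the run-length idea on a PyTorch bit tensor; a Huffman or LZ-based codec would follow the same tensor-in/tensor-out contract.

```python
import torch

# Hypothetical illustration: run-length encoding collapses runs of identical bits.
bits = torch.tensor([1, 1, 1, 1, 0, 0, 1, 1, 1], dtype=torch.uint8)

# Mark positions where a new run starts, then measure each run.
change = torch.ones_like(bits, dtype=torch.bool)
change[1:] = bits[1:] != bits[:-1]
starts = torch.nonzero(change).flatten()
lengths = torch.diff(torch.cat([starts, torch.tensor([bits.numel()])]))

print(bits[starts].tolist(), lengths.tolist())  # [1, 0, 1] [4, 2, 3]
```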

### Task 2: Implement Compression Functions

Prompt:

Codex: Implement PyTorch functions compress_bits(input_tensor) and decompress_bits(compressed_tensor) that accept and return PyTorch tensors (dtype=torch.bool or torch.uint8). Ensure the compress → decompress cycle perfectly reconstructs the original data, and include simple unit tests.
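
A minimal sketch of what this pair could look like, assuming a run-length scheme that stores flat (value, run_length) uint8 pairs; the function names match the prompt, but the encoding details are an assumption rather than the shipped implementation.

```python
import torch

def compress_bits(input_tensor: torch.Tensor) -> torch.Tensor:
    """Run-length encode a bit tensor into flat (value, run_length) uint8 pairs."""
    bits = input_tensor.to(torch.uint8).flatten()
    if bits.numel() == 0:
        return torch.empty(0, dtype=torch.uint8)
    change = torch.ones_like(bits, dtype=torch.bool)
    change[1:] = bits[1:] != bits[:-1]
    starts = torch.nonzero(change).flatten()
    lengths = torch.diff(torch.cat([starts, torch.tensor([bits.numel()])]))
    pairs = []
    for value, length in zip(bits[starts].tolist(), lengths.tolist()):
        while length > 255:            # split runs that would overflow uint8
            pairs.extend([value, 255])
            length -= 255
        pairs.extend([value, length])
    return torch.tensor(pairs, dtype=torch.uint8)

def decompress_bits(compressed_tensor: torch.Tensor) -> torch.Tensor:
    """Invert compress_bits, reconstructing the original bit tensor exactly."""
    values = compressed_tensor[0::2]
    lengths = compressed_tensor[1::2].long()
    return torch.repeat_interleave(values, lengths)

# Simple round-trip unit test: compress -> decompress must be the identity.
original = (torch.rand(1024) > 0.7).to(torch.uint8)
assert torch.equal(decompress_bits(compress_bits(original)), original)
```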

⸻

## Stage 2: Encoder/Decoder Integration

### Task 3: Add Compression to Encoder Input

Prompt:

Codex: Modify BitTransformerLM’s input pipeline by wrapping the existing model forward pass with a forward_compressed(bits_tensor) method. This method should decompress incoming compressed bit tensors before embedding. For verification, ensure it returns outputs identical to those produced from the equivalent uncompressed inputs.
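
One way the wrapper could look, assuming the model's existing forward accepts a (batch, seq) bit tensor and reusing the Stage 1 sketch; it is written as a free function here for brevity, but attaching it to the class as forward_compressed is the same idea.

```python
import torch

def forward_compressed(model: torch.nn.Module, bits_tensor: torch.Tensor) -> torch.Tensor:
    """Decompress an RLE-encoded input, then run the unchanged forward pass."""
    raw_bits = decompress_bits(bits_tensor)   # Stage 1 helper (assumed available)
    return model(raw_bits.unsqueeze(0))       # add a batch dimension before embedding

# Verification: the compressed path should match the uncompressed path exactly.
# raw = (torch.rand(256) > 0.5).to(torch.uint8)
# assert torch.allclose(model(raw.unsqueeze(0)), forward_compressed(model, compress_bits(raw)))
```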

### Task 4: Add Decompression to Decoder Output

Prompt:

Codex: Implement a PyTorch-compatible function model_output_decompress(output_bits_tensor) to decompress bit sequences output by BitTransformerLM. Integrate this function as an optional post-processing step after the model’s bitstream generation.
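
A sketch of the optional post-processing hook, assuming generated bitstreams use the same run-length scheme as the Stage 1 helpers; generate_bits is a placeholder name for whatever generation entry point the model exposes.

```python
import torch

def model_output_decompress(output_bits_tensor: torch.Tensor) -> torch.Tensor:
    """Decompress a generated bit sequence back to raw bits."""
    return decompress_bits(output_bits_tensor)   # Stage 1 helper (assumed available)

# Optional post-processing after generation:
# generated = generate_bits(model, prompt_bits)                       # placeholder call
# output = model_output_decompress(generated) if compression_enabled else generated
```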

⸻

## Stage 3: Training and Evaluation Enhancements

### Task 5: Toggle Compression During Training

Prompt:

Codex: Modify the existing training loop to randomly compress input bit sequences with a configurable probability (compress_prob=0.5). Ensure that when compression is on, inputs are compressed and decompressed transparently, and when off, inputs bypass compression.
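
A sketch of how the toggle could sit inside an existing training step; the model call, loss shapes, and optimizer wiring are assumptions standing in for the real loop.

```python
import torch

def training_step(model, optimizer, criterion, batch_bits, compress_prob=0.5):
    """Round-trip inputs through the codec with probability compress_prob."""
    use_compression = torch.rand(1).item() < compress_prob
    if use_compression:
        # Transparent path: the model still sees raw bits, but the codec is exercised.
        flat = batch_bits.flatten()
        inputs = decompress_bits(compress_bits(flat)).view_as(batch_bits)
    else:
        inputs = batch_bits                       # compression bypassed entirely
    logits = model(inputs)
    loss = criterion(logits, batch_bits)          # next-bit targets; shape handling elided
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), use_compression
```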

### Task 6: Evaluate Compressed vs Raw Performance

Prompt:

Codex: Extend the current training evaluation metrics to separately track loss, accuracy, and compression ratio for both compressed and raw sequences. Log these metrics clearly in the training output.
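
One possible shape for the bookkeeping, with illustrative metric names; the existing evaluation loop would call record_batch once per step and log_summary once per epoch.

```python
from collections import defaultdict

metrics = defaultdict(list)

def record_batch(loss, num_correct, num_bits, raw_len, stored_len, compressed: bool):
    """Accumulate per-batch metrics under a 'compressed/' or 'raw/' prefix."""
    key = "compressed" if compressed else "raw"
    metrics[f"{key}/loss"].append(loss)
    metrics[f"{key}/accuracy"].append(num_correct / num_bits)
    if compressed:
        # Ratio of original bits to stored values; > 1.0 means the codec saved space.
        metrics["compressed/ratio"].append(raw_len / stored_len)

def log_summary(epoch: int) -> None:
    """Print epoch averages for every tracked metric."""
    averages = {name: sum(vals) / len(vals) for name, vals in metrics.items() if vals}
    print(f"epoch {epoch}: " + ", ".join(f"{k}={v:.4f}" for k, v in averages.items()))
```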

⸻

## Stage 4: Advanced Integration (Optional)

### Task 7: Multi-task Training for Compression Learning

Prompt:

Codex: Implement an optional multi-task training mode where the model occasionally sees compressed inputs directly, without decompression. Add a separate loss calculation to monitor its performance on these compressed inputs, and track and log this loss separately from the normal next-bit prediction loss.
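
A sketch of the optional branch, assuming the model can consume the compressed stream through its normal forward pass; the sampling probability and loss keys are illustrative.

```python
import torch

def multitask_step(model, criterion, batch_bits, compressed_input_prob=0.25):
    """Occasionally feed compressed bits directly and report that loss under its own key."""
    if torch.rand(1).item() < compressed_input_prob:
        compressed = compress_bits(batch_bits.flatten()).unsqueeze(0)
        logits = model(compressed)                # no decompression on this branch
        return {"loss/compressed_input": criterion(logits, compressed)}
    logits = model(batch_bits)
    return {"loss/next_bit": criterion(logits, batch_bits)}

# The caller logs each key on its own curve so the two losses are never mixed.
```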

### Task 8: Compression-aware Safety Telemetry

Prompt:

Codex: Adjust the existing BitTransformerLM telemetry (K, C, and S metrics) to handle compressed sequences appropriately. Modify the telemetry calculations so that, when compression is enabled, the metrics can optionally be applied to decompressed outputs instead of the raw bitstream.
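
A sketch of the switch, where compute_telemetry stands in for the existing K/C/S calculations (its name and signature are assumptions) and model_output_decompress is the Task 4 hook.

```python
def telemetry_for_output(output_bits, compression_enabled: bool, on_decompressed: bool = True):
    """Compute K/C/S on the decompressed stream when compression is active."""
    if compression_enabled and on_decompressed:
        output_bits = model_output_decompress(output_bits)  # Task 4 hook
    return compute_telemetry(output_bits)                   # existing K, C, S metrics (assumed name)
```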

⸻

## Stage 5: Dashboard and Runtime Integration

### Task 9: Dashboard Compression UI Toggle

Prompt:

Codex: Add a simple UI toggle labeled “Enable Compression” to the existing BitTransformerLM dashboard, controlling whether inputs and outputs are automatically compressed and decompressed. Display compression ratio metrics when enabled.
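
A framework-agnostic sketch only; the real dashboard's widgets and wiring are not shown here. The toggle flips a flag that gates the codec and records the compression ratio for display.

```python
dashboard_state = {"enable_compression": False, "compression_ratio": None}

def on_compression_toggle(checked: bool) -> None:
    """Handler wired to the 'Enable Compression' checkbox."""
    dashboard_state["enable_compression"] = checked

def prepare_input(bits):
    """Compress dashboard inputs when the toggle is on and record the ratio."""
    if not dashboard_state["enable_compression"]:
        return bits
    compressed = compress_bits(bits)
    dashboard_state["compression_ratio"] = bits.numel() / max(compressed.numel(), 1)
    return compressed
```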

### Task 10: Error Handling and User Feedback

Prompt:

Codex: Implement graceful error handling in the dashboard for compression and decompression failures. Provide clear user-facing feedback in the UI if decompression fails, along with suggestions or fallbacks.
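
A sketch of the fallback path; the message wording and the decision to fall back to the raw stream are illustrative choices, not the dashboard's actual behavior.

```python
def safe_decompress(compressed_bits):
    """Return (bits, error_message); error_message is None on success."""
    try:
        return model_output_decompress(compressed_bits), None
    except Exception as exc:                      # e.g. a malformed or truncated stream
        message = (
            f"Decompression failed ({exc}). Showing the raw bitstream instead; "
            "try disabling compression or regenerating the output."
        )
        return compressed_bits, message           # the UI surfaces this message to the user
```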

⸻

These ten tasks enable incremental, testable integration of binary compression/decompression into BitTransformerLM without fundamentally altering the core transformer model itself.