---
license: mit
tags:
- arc
- abstract-reasoning
- neural-network
- transformer
- pytorch
library_name: pytorch
pipeline_tag: other
---
# ARC Neural Network - Rule Paradigm
This model is part of an Abstraction and Reasoning Corpus (ARC) neural network project focused on developing models capable of solving abstract reasoning challenges.
## Model Information
- **Checkpoint**: model_rule_paradigm_epoch_20_20250626_125402_best_val_loss.pth
- **Type**: rule_paradigm
- **Framework**: PyTorch
- **Architecture**: Two-stage rule-imprinted attention with RuleGenerator and RuleApplier transformers
## Training Details
- **Epoch**: 20
- **Loss**: 0.909375
## Architecture Details
### RuleGenerator Configuration
- **Model Dimension**: 1024
- **Attention Heads**: N/A
- **Encoder Layers**: N/A
- **Rule Token Dimension**: 256
### RuleApplier Configuration
- **Model Dimension**: 256
- **Attention Heads**: 16
- **Rule Imprint Layers**: 4
- **Spatial Layers**: 8
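
Below is a minimal PyTorch sketch of how the configuration above might map onto module skeletons. The class names and dimensions come from this card; everything else (layer wiring, pooling, the RuleGenerator's head and layer counts, which are listed as N/A above) is an assumption for illustration, not the project's actual implementation:

```python
import torch
import torch.nn as nn

class RuleGenerator(nn.Module):
    """Encodes demonstration pairs into a compact rule token (hypothetical skeleton)."""
    def __init__(self, d_model=1024, rule_dim=256, n_heads=16, n_layers=6):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.to_rule = nn.Linear(d_model, rule_dim)  # project to the 256-dim rule token

    def forward(self, demo_tokens):                  # (B, T, 1024)
        h = self.encoder(demo_tokens)
        return self.to_rule(h.mean(dim=1))           # (B, 256) pooled rule embedding

class RuleApplier(nn.Module):
    """Two-stage applier: rule imprinting, then spatial reasoning (hypothetical skeleton)."""
    def __init__(self, d_model=256, n_heads=16, imprint_layers=4, spatial_layers=8):
        super().__init__()
        self.imprint = nn.ModuleList(
            [nn.MultiheadAttention(d_model, n_heads, batch_first=True)
             for _ in range(imprint_layers)]
        )
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.spatial = nn.TransformerEncoder(layer, num_layers=spatial_layers)

    def forward(self, grid_tokens, rule_token):      # (B, N, 256), (B, 256)
        rule = rule_token.unsqueeze(1)               # rule as a single key/value token
        h = grid_tokens
        for attn in self.imprint:                    # Stage 1: cross-attend to the rule
            h = h + attn(h, rule, rule)[0]
        return self.spatial(h)                       # Stage 2: spatial self-attention
```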
## Usage
```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint
checkpoint_path = hf_hub_download(
    repo_id="artnoage/your-model-name",
    filename="model_rule_paradigm_epoch_20_20250626_125402_best_val_loss.pth"
)

# Load the checkpoint (on PyTorch >= 2.6 you may need to pass
# weights_only=False if the checkpoint stores non-tensor objects)
checkpoint = torch.load(checkpoint_path, map_location='cpu')

# The checkpoint contains model configurations for easy loading
if 'rule_generator_config' in checkpoint:
    # Model configs are included - the architecture can be reconstructed automatically
    print("Model configurations found in checkpoint")
    print(f"Epoch: {checkpoint['epoch']}, Loss: {checkpoint['loss']}")
else:
    # Legacy checkpoint - requires manual architecture specification
    print("Legacy checkpoint - manual architecture specification needed")
```
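
When the config dicts are present, the modules can in principle be rebuilt and loaded directly. The sketch below uses the skeleton classes from the Architecture Details section; the `rule_applier_config` and `*_state_dict` key names are assumptions for illustration (only `rule_generator_config`, `epoch`, and `loss` are confirmed by the snippet above):

```python
# Hypothetical reconstruction; state-dict key names are assumed, not confirmed.
if 'rule_generator_config' in checkpoint:
    generator = RuleGenerator(**checkpoint['rule_generator_config'])
    applier = RuleApplier(**checkpoint['rule_applier_config'])
    generator.load_state_dict(checkpoint['rule_generator_state_dict'])
    applier.load_state_dict(checkpoint['rule_applier_state_dict'])
    generator.eval()
    applier.eval()
```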
## Project Overview
This is part of an ARC (Abstraction and Reasoning Corpus) neural network project that implements a rule-based paradigm built on a two-stage rule-imprinted attention architecture.
### Key Features
- **Two-stage rule-imprinted attention**: Stage 1 for rule imprinting via cross-attention, Stage 2 for spatial reasoning via self-attention
- **Rule consistency training**: Multiple rule extractions per task with a consistency loss (see the sketch after this list)
- **Configurable tokenization**: Row-based (30 tokens) or meta-pixel (900 tokens) strategies
- **Mixed precision training**: AMP training with robust gradient scaling
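
The following is a minimal sketch of how rule-consistency training under AMP could be combined in one step. All names, the loss weighting `lam`, and the assumption that the applier emits per-cell logits over the 10 ARC colours are illustrative, not the project's actual code:

```python
import torch
import torch.nn.functional as F
from torch.amp import autocast, GradScaler

scaler = GradScaler('cuda')  # gradient scaling for mixed-precision stability

def training_step(generator, applier, demos_a, demos_b, grid, target, optimizer, lam=0.1):
    # demos_a / demos_b: two demonstration encodings of the same task, so the
    # generator is pushed toward extracting consistent rule tokens.
    optimizer.zero_grad(set_to_none=True)
    with autocast('cuda'):
        rule_a = generator(demos_a)
        rule_b = generator(demos_b)
        logits = applier(grid, rule_a)              # assumed (B, N, 10) colour logits
        task_loss = F.cross_entropy(logits.transpose(1, 2), target)
        consistency = F.mse_loss(rule_a, rule_b)    # penalise divergent rule extractions
        loss = task_loss + lam * consistency
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.detach()
```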
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{arc-neural-network,
  title={ARC Neural Network with Rule-Imprinted Attention},
  author={Your Name},
  year={2025},
  url={https://github.com/your-username/ARC_NN}
}
```
## License
MIT License - See repository for full license details.