---
license: mit
tags:
- arc
- abstract-reasoning
- neural-network
- transformer
- pytorch
library_name: pytorch
pipeline_tag: other
---

# ARC Neural Network - Rule Paradigm

This model is part of an Abstraction and Reasoning Corpus (ARC) neural network project focused on developing models capable of solving abstract reasoning challenges.

## Model Information

- **Checkpoint**: model_rule_paradigm_epoch_20_20250626_125402_best_val_loss.pth
- **Type**: rule_paradigm
- **Framework**: PyTorch
- **Architecture**: Two-stage rule-imprinted attention with RuleGenerator and RuleApplier transformers

## Training Details

- **Epoch**: 20
- **Best Validation Loss**: 0.909375

## Architecture Details

### RuleGenerator Configuration
- **Model Dimension**: 1024
- **Attention Heads**: N/A
- **Encoder Layers**: N/A
- **Rule Token Dimension**: 256

### RuleApplier Configuration
- **Model Dimension**: 256
- **Attention Heads**: 16
- **Rule Imprint Layers**: 4
- **Spatial Layers**: 8
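
The two-stage design described above can be sketched in a few lines. This is an illustrative reconstruction, not the project's actual code: the class name, wiring, and residual connections are assumptions; only the dimensions (d_model = 256, 16 heads, 4 rule-imprint layers, 8 spatial layers) come from the configuration listed above.

```python
import torch
import torch.nn as nn

class RuleImprintedApplier(nn.Module):
    """Hypothetical sketch of the RuleApplier's two-stage attention."""

    def __init__(self, d_model=256, n_heads=16, n_imprint=4, n_spatial=8):
        super().__init__()
        # Stage 1: cross-attention layers that imprint rule tokens onto grid tokens
        self.imprint_layers = nn.ModuleList(
            nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            for _ in range(n_imprint)
        )
        # Stage 2: self-attention encoder layers for spatial reasoning
        self.spatial = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True),
            num_layers=n_spatial,
        )

    def forward(self, grid_tokens, rule_tokens):
        x = grid_tokens
        for attn in self.imprint_layers:
            # Query: grid tokens; key/value: rule tokens (cross-attention)
            out, _ = attn(x, rule_tokens, rule_tokens)
            x = x + out  # residual connection (assumed)
        return self.spatial(x)

applier = RuleImprintedApplier()
grid = torch.randn(1, 30, 256)   # row-based tokenization: 30 grid tokens
rules = torch.randn(1, 8, 256)   # rule tokens produced by the RuleGenerator
print(applier(grid, rules).shape)
```

The key point of the split: the rule context is injected once via cross-attention, after which spatial reasoning proceeds with standard self-attention over the (now rule-conditioned) grid tokens.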

## Usage

```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint
checkpoint_path = hf_hub_download(
    repo_id="artnoage/your-model-name",
    filename="model_rule_paradigm_epoch_20_20250626_125402_best_val_loss.pth"
)

# Load the checkpoint (if this fails on PyTorch >= 2.6, where torch.load
# defaults to weights_only=True, pass weights_only=False explicitly)
checkpoint = torch.load(checkpoint_path, map_location='cpu')

# The checkpoint contains model configurations for easy loading
if 'rule_generator_config' in checkpoint:
    # Model configs are included - can reconstruct architecture automatically
    print("Model configurations found in checkpoint")
    print(f"Epoch: {checkpoint['epoch']}, Loss: {checkpoint['loss']}")
else:
    # Legacy checkpoint - requires manual architecture specification
    print("Legacy checkpoint - manual architecture specification needed")
```

## Project Overview

This is part of an ARC (Abstraction and Reasoning Corpus) neural network project that implements a rule-based paradigm built on a two-stage rule-imprinted attention architecture.

### Key Features
- **Two-stage rule-imprinted attention**: Stage 1 for rule imprinting via cross-attention, Stage 2 for spatial reasoning via self-attention
- **Rule consistency training**: Multiple rule extractions per task with consistency loss
- **Configurable tokenization**: Row-based (30 tokens) or meta-pixel (900 tokens) strategies
- **Mixed precision training**: AMP training with robust gradient scaling
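
The two tokenization strategies can be illustrated for a 30×30 ARC grid. This is a minimal sketch; the function names are illustrative and not taken from the codebase.

```python
def row_tokens(grid):
    # Row-based strategy: each row of the grid becomes one token
    # (30 tokens for a 30x30 grid)
    return [tuple(row) for row in grid]

def meta_pixel_tokens(grid):
    # Meta-pixel strategy: each cell becomes its own token
    # (900 tokens for a 30x30 grid)
    return [cell for row in grid for cell in row]

grid = [[0] * 30 for _ in range(30)]  # empty 30x30 grid
print(len(row_tokens(grid)), len(meta_pixel_tokens(grid)))  # 30 900
```

The trade-off is sequence length versus granularity: row tokens keep attention cheap (30² interactions), while meta-pixel tokens let attention address individual cells at the cost of a 900-token sequence.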

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{arc-neural-network,
  title={ARC Neural Network with Rule-Imprinted Attention},
  author={Your Name},
  year={2025},
  url={https://github.com/your-username/ARC_NN}
}
```

## License

MIT License - See repository for full license details.