# Astra Project Setup Instructions

## Prerequisites
Make sure you have the following installed before proceeding:
- Python 3.12.4
- Git
- Git Large File Storage (LFS)

## Step 1: Install Git LFS
Git LFS (Large File Storage) is required for managing large files in the Astra project. Follow these steps to install Git LFS:

### Windows
1. Download the Git LFS installer from [Git LFS Releases](https://git-lfs.github.com/).
2. Run the installer and follow the setup instructions.
3. Open a terminal (Command Prompt or PowerShell) and run:
   ```sh
   git lfs install
   ```

### macOS
1. Install Git LFS using Homebrew:
   ```sh
   brew install git-lfs
   ```
2. Initialize Git LFS:
   ```sh
   git lfs install
   ```

### Linux
1. Install Git LFS using your package manager:
   - Debian/Ubuntu:
     ```sh
     sudo apt install git-lfs
     ```
   - Fedora:
     ```sh
     sudo dnf install git-lfs
     ```
   - Arch Linux:
     ```sh
     sudo pacman -S git-lfs
     ```
2. Initialize Git LFS:
   ```sh
   git lfs install
   ```

## Step 2: Install Python (Alternative: pyenv)
Astra requires Python 3.12.4. If you need several Python versions side by side, or if installing dependencies fails under your system Python, install 3.12.4 with `pyenv` instead.

### Installing pyenv
#### macOS & Linux:
```sh
curl https://pyenv.run | bash
```
After installation, restart your terminal and install Python:
```sh
pyenv install 3.12.4
pyenv global 3.12.4
```

#### Windows:
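Use [pyenv-win](https://github.com/pyenv-win/pyenv-win) from Command Prompt (the commands use `cmd` syntax):
```sh
REM Command Prompt does not expand "~", so clone into %USERPROFILE% explicitly
git clone https://github.com/pyenv-win/pyenv-win.git "%USERPROFILE%\.pyenv"
REM The pyenv-win executables live in the pyenv-win subfolder of the clone
setx PYENV "%USERPROFILE%\.pyenv\pyenv-win"
setx PATH "%USERPROFILE%\.pyenv\pyenv-win\bin;%USERPROFILE%\.pyenv\pyenv-win\shims;%PATH%"
REM setx only affects new sessions: open a new terminal before continuing
pyenv install 3.12.4
pyenv global 3.12.4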
```

## Step 3: Clone the Repository
Clone the Astra project repository using Git:
```sh
git clone <repository_url>
cd astra
```
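
If Git LFS was not initialized before cloning, large files such as model checkpoints may be checked out as small text pointers instead of their real content. The sketch below is one way to spot that, relying on the fact that every Git LFS pointer file starts with a fixed `version` line. The helper names `is_lfs_pointer` and `find_lfs_pointers` are illustrative and not part of the Astra project.

```python
from pathlib import Path

# First line of every Git LFS pointer file (Git LFS pointer format, spec v1)
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is an unfetched Git LFS pointer, not real content."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(root="."):
    """List files under root that are still LFS pointers, skipping the .git folder."""
    return [
        p for p in Path(root).rglob("*")
        if p.is_file() and ".git" not in p.parts and is_lfs_pointer(p)
    ]


if __name__ == "__main__":
    for pointer in find_lfs_pointers("."):
        print(f"Not fetched yet: {pointer}")
```

If the script lists any files, running `git lfs pull` inside the repository should download the real content.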

## Step 4: Install Dependencies
Install all required dependencies from the `requirements.txt` file:
```sh
pip install -r requirements.txt
```

## Step 5: Verify Installation
Confirm that the interpreter reports version 3.12.4 and that the packages from `requirements.txt` appear in the installed list:
```sh
python --version
pip list
```
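
The same check can be scripted. The following is a minimal sketch, assuming that `git` and `git-lfs` are reachable on the `PATH`; some installs place `git-lfs` elsewhere, in which case `git lfs version` is the more reliable test. The function name `check_environment` is hypothetical.

```python
import shutil
import sys

REQUIRED_PYTHON = (3, 12, 4)  # version listed under Prerequisites


def check_environment(required=REQUIRED_PYTHON):
    """Return a dict mapping each prerequisite name to True or False."""
    return {
        "python": tuple(sys.version_info[:3]) == tuple(required),
        "git": shutil.which("git") is not None,
        # Assumption: Git LFS is installed as an executable named git-lfs on PATH
        "git-lfs": shutil.which("git-lfs") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or wrong version'}")
```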

## Step 6: Run the Application or Test the Model
You have two options to proceed:

### Option 1: Run the Gradio App
To open the Gradio app in your web browser and interact with the application, run:
```sh
python app.py
```

### Option 2: Test the Model with a Sample File
To test the fine-tuned model using a sample file, navigate to the root folder of the project and run the following command:
```sh
cd <root_folder>
python new_test_saved_finetuned_model.py \
    -workspace_name "ratio_proportion_change3_2223/sch_largest_100-coded" \
    -finetune_task "<finetune_task>" \
    -test_dataset_path "../../../../fileHandler/selected_rows.txt" \
    -finetuned_bert_classifier_checkpoint "ratio_proportion_change3_2223/sch_largest_100-coded/output/highGRschool10/bert_fine_tuned.model.ep42" \
    -e 1 \
    -b 1000
```
Replace `<finetune_task>` with one of the task names listed under `-finetune_task` in the Arguments section.

### Arguments

**`-workspace_name`**  
- Description: The folder/workspace name where the project, dataset, and model outputs are organized.  
- Example: `"ratio_proportion_change3_2223/sch_largest_100-coded"`

**`-finetune_task`**  
- Description: Specifies which fine-tuning strategy was applied to the model.  
- Options:  
  - **ASTRA-FT-HGR** → Fine-tuned with 10% of the data from schools that have a **High Graduation Rate (HGR)**.  
  - **ASTRA-FT-FIRST10-WSKILLS**  
    - Checkpoint: `first10/bert_fine_tuned.model.first10%.wskills.ep24`  
    - Description: Fine-tuned with 10% of initial problems from both **HGR + LGR schools**, with **Prior Skills encoded** using **Bayesian Knowledge Tracing (BKT)**.  

  - **ASTRA-FT-FIRST10-WTIME**  
    - Checkpoint: `first10/bert_fine_tuned.model.first10%.wfaopttime.wttime.wttopttime.wttnoopttime.ep23`  
    - Description: Fine-tuned with 10% of initial problems from both **HGR + LGR schools**, using **temporal features** measuring student engagement in MATHia.  

  - **ASTRA-FT-FIRST10-WSKILLS_WTIME**  
    - Checkpoint: `first10/bert_fine_tuned.model.first10%.wskills.wfaopttime.wttime.wttopttime.wttnoopttime.ep40`  
    - Description: Fine-tuned with 10% of initial problems from both **HGR + LGR schools**, combining **Prior Skills (BKT) + temporal features**.

**`-test_dataset_path`**  
- Description: Path to the test dataset file that you want to use for evaluation.  
- Example: `"../../../../fileHandler/selected_rows.txt"`

**`-finetuned_bert_classifier_checkpoint`**  
- Description: The path to the saved fine-tuned BERT model checkpoint (specific `.model.epXX` file).  
- Example:  
  `"ratio_proportion_change3_2223/sch_largest_100-coded/output/highGRschool10/bert_fine_tuned.model.ep42"`  
- Note: `ep42` means the checkpoint from **epoch 42** during training.  
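
As a small illustration of the naming convention, the epoch number can be read back from a checkpoint name. This is a sketch that assumes every checkpoint ends in `.ep` followed by digits, as the examples in this README do; `checkpoint_epoch` is a hypothetical helper, not a function from the project.

```python
import re


def checkpoint_epoch(checkpoint_path):
    """Return the epoch encoded in a trailing '.epXX' suffix, or None if absent."""
    match = re.search(r"\.ep(\d+)$", checkpoint_path)
    return int(match.group(1)) if match else None


print(checkpoint_epoch("output/highGRschool10/bert_fine_tuned.model.ep42"))  # 42
print(checkpoint_epoch("first10/bert_fine_tuned.model.first10%.wskills.ep24"))  # 24
```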

**`-e`**  
- Description: Number of epochs to run during testing (or evaluation).  
- Example: `-e 1` → run evaluation once.  

**`-b`**  
- Description: Batch size for testing, which determines how many test samples are processed together in each forward pass.  
- Example: `-b 1000` → each batch will contain **1000 examples**.  
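
To avoid retyping long checkpoint paths, the arguments can be assembled programmatically. The sketch below rests on two assumptions that this README does not confirm: that `ASTRA-FT-HGR` pairs with the `highGRschool10` checkpoint from the example command, and that the other checkpoints sit under `<workspace>/output/`. `CHECKPOINTS` and `build_test_command` are illustrative names only.

```python
import shlex

# Checkpoint names as listed in this README.
# ASSUMPTIONS: ASTRA-FT-HGR uses the highGRschool10 checkpoint from the example
# command, and every checkpoint is stored under <workspace>/output/.
CHECKPOINTS = {
    "ASTRA-FT-HGR": "highGRschool10/bert_fine_tuned.model.ep42",
    "ASTRA-FT-FIRST10-WSKILLS": "first10/bert_fine_tuned.model.first10%.wskills.ep24",
    "ASTRA-FT-FIRST10-WTIME": (
        "first10/bert_fine_tuned.model.first10%"
        ".wfaopttime.wttime.wttopttime.wttnoopttime.ep23"
    ),
    "ASTRA-FT-FIRST10-WSKILLS_WTIME": (
        "first10/bert_fine_tuned.model.first10%"
        ".wskills.wfaopttime.wttime.wttopttime.wttnoopttime.ep40"
    ),
}


def build_test_command(task, workspace, test_dataset_path, epochs=1, batch_size=1000):
    """Return the argument list for new_test_saved_finetuned_model.py."""
    if task not in CHECKPOINTS:
        raise ValueError(f"Unknown finetune task: {task!r}")
    checkpoint = f"{workspace}/output/{CHECKPOINTS[task]}"
    return [
        "python", "new_test_saved_finetuned_model.py",
        "-workspace_name", workspace,
        "-finetune_task", task,
        "-test_dataset_path", test_dataset_path,
        "-finetuned_bert_classifier_checkpoint", checkpoint,
        "-e", str(epochs),
        "-b", str(batch_size),
    ]


if __name__ == "__main__":
    command = build_test_command(
        "ASTRA-FT-HGR",
        "ratio_proportion_change3_2223/sch_largest_100-coded",
        "../../../../fileHandler/selected_rows.txt",
    )
    # shlex.join quotes each argument so the line can be pasted into a POSIX shell
    print(shlex.join(command))
```

Returning a list rather than a single string means the result can also be passed directly to `subprocess.run` without shell quoting concerns.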

---

✅ Your Astra project should now be fully set up and ready to use!