---
title: Snow Predictor Basel
emoji: 🌨️
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 3.50.2
app_file: app.py
pinned: false
---

# 🌨️ Snow Predictor Basel - My First ML Model! 🚀

Welcome to my first machine learning project! This repository contains a **7-day ahead snow prediction model** for Basel, Switzerland that I built from scratch during my Python learning journey.

## 🎯 What This Model Does

**Predicts snow in Basel 7 days in advance** using weather data patterns. Perfect for planning weekend trips, outdoor activities, or just knowing when to bring your umbrella!
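
"7 days in advance" means each day's features are paired with a snow label from one week later. The README does not show how the label was built, so the following is only a minimal sketch with hypothetical data and hypothetical column names (`snow_today`, `snow_in_7d`):

```python
import pandas as pd

# Hypothetical daily snow flags (1 = snow observed that day); not real Basel data.
dates = pd.date_range("2024-01-01", periods=10, freq="D")
snow_today = pd.Series([0, 0, 0, 0, 0, 0, 0, 1, 0, 1], index=dates, name="snow_today")

# The label for each day is the snow flag 7 days later.
frame = snow_today.to_frame()
frame["snow_in_7d"] = frame["snow_today"].shift(-7)

# The last 7 rows have no label yet (their target day is in the future), so drop them.
labeled = frame.dropna(subset=["snow_in_7d"]).astype({"snow_in_7d": int})
print(labeled)
```

Only rows whose target day exists in the data can be used for training, which is why the most recent week always drops out.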

## 🏆 Model Performance

After training on **25 years of Basel weather data**, here's how well it performs:

- **🎯 Accuracy:** 77.4% - Overall prediction accuracy
- **❄️ Recall:** 84.0% - Catches most snow events (prioritizes safety!)
- **⚠️ Precision:** 16.4% - Roughly five of every six snow alerts are false alarms, the price paid for missing few snow days
- **ROC AUC:** 0.894 - Strong separation between snow and no-snow days
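
The metrics above can be cross-checked against each other. As a rough sketch, assuming accuracy, precision, and recall were all measured on the same test set at the same decision threshold, they imply the share of snow days and therefore the accuracy of a trivial "never snow" baseline. The function name `implied_snow_rate` is only illustrative:

```python
def implied_snow_rate(accuracy: float, precision: float, recall: float) -> float:
    """Estimate the share of positive (snow) days implied by three metrics.

    Per actual snow day there are (1 - recall) misses and
    recall * (1 / precision - 1) false alarms. Together these errors
    must make up the (1 - accuracy) share of all days.
    """
    misses = 1.0 - recall
    false_alarms = recall * (1.0 / precision - 1.0)
    return (1.0 - accuracy) / (misses + false_alarms)


snow_rate = implied_snow_rate(accuracy=0.774, precision=0.164, recall=0.840)
print(f"Implied share of snow days: {snow_rate:.1%}")
print(f"Accuracy of always predicting 'no snow': {1 - snow_rate:.1%}")
```

Under these assumptions the snow rate comes out near 5%, so recall and precision are more informative here than accuracy alone.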

## Key Features

- **⏰ 7-day ahead prediction** - Plan your week with confidence
- **🌡️ 22 weather features** - Temperature trends, precipitation patterns, seasonal indicators
- **🛡️ High recall design** - Built to catch snow events rather than avoid false alarms
- **25 years of data** - Trained on comprehensive Basel weather history (2000-2025)

## 🏗️ How I Built This

### **Data Collection & Processing**
- **Source:** Meteostat API for real Basel weather data
- **Location:** Basel, Switzerland (47.5584° N, 7.5733° E)
- **Processing:** Handled missing values, temperature inconsistencies, and date gaps
- **Features:** Engineered rolling weather patterns, seasonal indicators, and volatility measures

### **Model Architecture**
- **Algorithm:** Logistic Regression (chosen for interpretability and reliability)
- **Training:** 80% of data for training, 20% for testing
- **Class Balancing:** Used balanced class weights to handle snow/no-snow imbalance
- **Feature Scaling:** Standardized all features (zero mean, unit variance) so they contribute on a comparable scale
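
The setup described above can be sketched as follows. This is not the original training script: the data is synthetic, and the stratified random split, `max_iter`, and the random seeds are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 2,000 "days", 22 features, a rare positive class.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 22))
y = (X[:, 0] + 0.5 * rng.normal(size=2000) < -1.8).astype(int)

# 80/20 split; stratify keeps the rare class present in both parts
# (an assumption, the original split strategy is not documented).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the scaler on the training part only, so no test statistics leak in.
scaler = StandardScaler().fit(X_train)

# Balanced class weights make errors on the rare snow class count more.
model = LogisticRegression(class_weight="balanced", max_iter=1000)
model.fit(scaler.transform(X_train), y_train)

test_recall = recall_score(y_test, model.predict(scaler.transform(X_test)))
print(f"Recall on the synthetic test split: {test_recall:.2f}")
```

For time-ordered weather data, a chronological split (train on earlier years, test on later ones) is a common alternative that mimics real forecasting more closely.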

### **Feature Engineering**
The engineered features capture these weather patterns:
- **Temperature trends** over 7-day windows
- **Precipitation accumulation** patterns
- **Atmospheric pressure** changes
- **Seasonal indicators** and day-of-year patterns
- **Weather volatility** measures
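
A few of these rolling features could be computed with pandas along the following lines. The exact definitions used in this project are not documented, so the formulas below (for example, "trend" as the change across the 7-day window, "cold day" as `tmin` below 0 °C, winter as December to February) are assumptions, and the data is synthetic:

```python
import numpy as np
import pandas as pd

# Synthetic daily data standing in for Meteostat-style columns.
dates = pd.date_range("2024-01-01", periods=30, freq="D")
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "tavg": rng.normal(2.0, 3.0, size=30),
        "tmin": rng.normal(-1.0, 3.0, size=30),
        "prcp": rng.gamma(1.0, 2.0, size=30),
        "pres": rng.normal(1015.0, 5.0, size=30),
    },
    index=dates,
)

# Rolling 7-day statistics; the first 6 rows are NaN because the window is incomplete.
df["temp_std_7d"] = df["tavg"].rolling(7).std()
df["precip_sum_7d"] = df["prcp"].rolling(7).sum()
df["cold_days_7d"] = (df["tmin"] < 0).astype(int).rolling(7).sum()

# Trend taken as the change between the first and last day of the window (assumed).
df["temp_trend_7d"] = df["tavg"] - df["tavg"].shift(6)
df["pressure_trend_7d"] = df["pres"] - df["pres"].shift(6)

# Calendar-based seasonal indicators.
df["month"] = df.index.month
df["day_of_year"] = df.index.dayofyear
df["is_winter_season"] = df["month"].isin([12, 1, 2]).astype(int)

print(df.tail(3).round(2))
```

Rows with an incomplete window would need to be dropped or filled before training.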

## 🔧 How to Use This Model

### **Quick Start**
```python
import joblib
import numpy as np

# Load the saved bundle (a dict with the model, the scaler, and the feature order)
model_data = joblib.load('snow_predictor.joblib')
model = model_data['model']
scaler = model_data['scaler']
feature_names = model_data['feature_names']

# Prepare your weather data: one value per feature, in the order of feature_names.
# The zeros below are placeholders; replace them with real measurements.
weather_values = [0.0] * len(feature_names)
weather_features = np.asarray(weather_values, dtype=float).reshape(1, -1)

# Scale the features with the scaler fitted during training
weather_features_scaled = scaler.transform(weather_features)

# Make prediction (column 1 of predict_proba is the snow class)
snow_probability = model.predict_proba(weather_features_scaled)[0][1]
will_snow = model.predict(weather_features_scaled)[0]

print(f"❄️ Snow probability: {snow_probability:.1%}")
print(f"🌨️ Will it snow? {'Yes' if will_snow else 'No'}")
```

### **Required Features (in order)**
Your weather data must include these 22 features:
1. `tavg` - Average temperature
2. `tmin` - Minimum temperature  
3. `tmax` - Maximum temperature
4. `prcp` - Precipitation
5. `wspd` - Wind speed
6. `wpgt` - Wind gust
7. `pres` - Pressure
8. `temp_range` - Temperature range
9. `temp_below_freezing` - Below freezing indicator
10. `high_precipitation` - High precipitation indicator
11. `windy_day` - Windy day indicator
12. `month` - Month of year
13. `day_of_year` - Day of year
14. `is_winter_season` - Winter season indicator
15. `temp_trend_7d` - 7-day temperature trend
16. `temp_std_7d` - 7-day temperature standard deviation
17. `precip_sum_7d` - 7-day precipitation sum
18. `pressure_trend_7d` - 7-day pressure trend
19. `cold_days_7d` - 7-day cold days count
20. `temp_volatility` - Temperature volatility
21. `pressure_change` - Pressure change rate
22. `temp_drop_rate` - Temperature drop rate
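
Because the order matters, it can help to build the input row from named values rather than by position. The helper below is a hypothetical convenience (`to_feature_row` is not part of this repository); in practice the `feature_names` list stored in the model file should be treated as the authoritative order.

```python
FEATURE_ORDER = [
    "tavg", "tmin", "tmax", "prcp", "wspd", "wpgt", "pres",
    "temp_range", "temp_below_freezing", "high_precipitation", "windy_day",
    "month", "day_of_year", "is_winter_season",
    "temp_trend_7d", "temp_std_7d", "precip_sum_7d", "pressure_trend_7d",
    "cold_days_7d", "temp_volatility", "pressure_change", "temp_drop_rate",
]


def to_feature_row(values: dict) -> list:
    """Return the values as a list in FEATURE_ORDER, rejecting missing or unknown names."""
    missing = [name for name in FEATURE_ORDER if name not in values]
    unexpected = [name for name in values if name not in FEATURE_ORDER]
    if missing or unexpected:
        raise ValueError(f"missing={missing}, unexpected={unexpected}")
    return [float(values[name]) for name in FEATURE_ORDER]


# Placeholder values only; replace with real measurements.
example = {name: 0.0 for name in FEATURE_ORDER}
example["tavg"] = -1.5
example["month"] = 1

row = to_feature_row(example)
print(len(row), row[:3])
```

The resulting list can be passed to `np.asarray(row).reshape(1, -1)` before scaling.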

## 🌍 Real-World Applications

**Perfect for:**
- **🏠 Personal planning** - Weekend trips, outdoor activities, daily commutes
- **🏢 Business operations** - Logistics, event planning, supply chain management
- **🌤️ Weather enthusiasts** - Understanding Basel's weather patterns
- **📚 Students & researchers** - Learning about weather prediction and ML

## 🎓 My Learning Journey

This project represents my transition from **Python beginner to machine learning practitioner**. I started with basic Python concepts and gradually built up to:

- **Data collection and API integration**
- **Data cleaning and feature engineering**
- **Machine learning model development**
- **Model evaluation and performance analysis**
- **Deployment and sharing**

## Technical Details

### **Dependencies**
- Python 3.8+
- scikit-learn (also installs joblib, which loads the model file)
- pandas
- numpy
- meteostat (for weather data)

### **Installation**
```bash
# Clone the repository
git clone https://github.com/Tuminha/snow-predictor-basel.git
cd snow-predictor-basel

# Install dependencies
pip install -r requirements.txt

# Check that the model bundle loads
python -c "import joblib; bundle = joblib.load('snow_predictor.joblib'); print('Model loaded successfully!')"
```

## 📊 Training Data Insights

- **Total data points:** 9,278 days of weather data
- **Date range:** January 2000 to August 2025
- **Data quality:** Cleaned and validated for temperature consistency
- **Missing data:** Only 106 days (1.2%) - handled with forward-fill
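
Forward-filling combined with closing date gaps might look like the sketch below. The data is synthetic and the steps are an assumption about how such cleaning is commonly done with pandas, not a copy of the project's pipeline:

```python
import numpy as np
import pandas as pd

# Synthetic series with one missing calendar day (Jan 3) and one NaN reading (Jan 4).
raw = pd.DataFrame(
    {"tavg": [1.0, 0.5, np.nan, -2.0], "prcp": [0.0, 1.2, 0.4, 0.0]},
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04", "2024-01-05"]),
)

# Reindex to a continuous daily calendar so date gaps become explicit NaN rows.
full_index = pd.date_range(raw.index.min(), raw.index.max(), freq="D")
daily = raw.reindex(full_index)

# Count missing temperature readings before filling.
missing_days = int(daily["tavg"].isna().sum())

# Forward-fill: each missing value takes the most recent earlier observation.
cleaned = daily.ffill()

print(f"Missing tavg days before filling: {missing_days}")
print(cleaned)
```

Forward-fill is a reasonable default for slowly changing values such as temperature; for precipitation, filling with the previous day's amount is a cruder approximation.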

## 🎯 Why This Model Works

**The high recall (84%) means:**
- **You'll rarely be caught unprepared** for snow
- **Some false alarms** (better safe than sorry!)
- **Perfect for planning** when snow is a possibility

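**The 77.4% accuracy means:**
- **About three in four test days** are classified correctly
- **Read it alongside recall and precision:** snow days are rare, so accuracy alone can look good even for a model that never predicts snow
- **A solid result for a first ML model!**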

## Acknowledgements

- **Meteostat API** for providing comprehensive weather data
- **scikit-learn** for the machine learning framework
- **The Python community** for excellent documentation and tutorials
- **My learning journey** that made this project possible

## 📝 License

This project is open source and available under the [MIT License](LICENSE).

## Let's Connect!

**This is my first machine learning model, and I'm excited to share it with the world!**

### **Contact Information**
- **Name:** Francisco Teixeira Barbosa
- **Personal Portfolio:** [https://franciscodds.framer.ai/](https://franciscodds.framer.ai/)
- **GitHub:** [https://github.com/Tuminha](https://github.com/Tuminha)
- **Twitter/X:** [@Cisco_research](https://x.com/Cisco_research)

### **Questions & Feedback**
- **Found a bug?** Open an issue!
- **Want to improve the model?** Submit a pull request!
- **Just want to chat?** Reach out on Twitter or GitHub!

## What's Next?

This is just the beginning! Future improvements could include:
- **Web application** for easy snow checking
- **Mobile app** for on-the-go predictions
- **More weather locations** across Switzerland
- **Advanced ML algorithms** (Random Forest, XGBoost, Neural Networks)

---

**Happy snow predicting! ❄️**

*Built with ❤️ during my Python learning journey*