# Day 2: Sentiment & Zero-Shot Smackdown (Arabic Edition)
Today, I dove deeper into two powerful Hugging Face pipelines, **Sentiment Analysis** and **Zero-Shot Classification**, with a twist: I focused on Arabic and multilingual performance. The goal? To find models that _actually_ handle Arabic well, dialects and all.
---
## Part 1: Sentiment Analysis Recap
Previously, I found that the default `pipeline("sentiment-analysis")` worked okay for English but... Arabic? Not so much. So today was all about discovering **a better Arabic sentiment model**.
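For context, this is the minimal setup I mean. A quick sketch (the two test sentences here are illustrative examples of mine, not ones from my actual test set):

```python
from transformers import pipeline

# No model argument, so Hugging Face falls back to its default
# English SST-2 sentiment checkpoint.
classifier = pipeline("sentiment-analysis")

print(classifier("I love this phone!"))
# [{'label': 'POSITIVE', 'score': 0.99...}]

print(classifier("هذا المنتج سيء جدا"))  # "This product is very bad"
# The default model wasn't trained on Arabic, so outputs like this
# are close to a coin flip.
```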
### Models Tested:
| Model | Result |
| -------------------------------------------------------- | ---------------------------------------------------------- |
| `default` (`pipeline("sentiment-analysis")`) | Good on English <br> ~55% accurate on Arabic |
| `Anwaarma/Improved-Arabert-twitter-sentiment-No-dropout` | Inaccurate, struggled with meaning |
| `Abdo36/Arabert-Sentiment-Analysis-ArSAS` | Slightly better than default, but dialect handling weak |
### Winner: `CAMeL-Lab/bert-base-arabic-camelbert-mix-sentiment`
- ✅ **Accuracy**: 95–99% on clear Arabic sentences
- **Dialect-friendly**: correctly classified Egyptian slang like
  "اولاد سواق التوك توك جارنا عسل" (roughly, "our neighbor the tuk-tuk driver's kids are sweethearts") → **Positive, 97%**
- ⚠️ **Weakness**: lower performance on English (61%) and French (50%)
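Swapping in the winner is a one-line change. A minimal sketch (the example sentence is my own, and the exact label strings depend on the model's config):

```python
from transformers import pipeline

# Same pipeline, but pointed explicitly at the CAMeL-Lab checkpoint.
arabic_sentiment = pipeline(
    "sentiment-analysis",
    model="CAMeL-Lab/bert-base-arabic-camelbert-mix-sentiment",
)

# "Excellent service and very respectful staff"
print(arabic_sentiment("الخدمة ممتازة والموظفين محترمين جدا"))
# Expect a positive label with a high score.
```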
### Key Takeaways
- **Fine-tuned, language-specific models** seriously outperform the defaults.
- **Dialect support = must-have** for real-world, diverse data.
---
## Part 2: Zero-Shot Classification (Arabic & Multilingual Trials)
Could zero-shot models understand Arabic prompts _and_ interpret labels in multiple languages? Let's see how they fared in a multilingual arena!
### 1. `morit/arabic_xlm_xnli`
- ❌ Inaccurate, even on Arabic-only prompts
- ❌ Misaligned labels and scores
### 2. ✅ `MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`
| Scenario | Accuracy | Improvement |
| --- | --- | --- |
| Arabic → Arabic labels | ✅ 96.1% | +21% |
| Arabic → English labels | ✅ 86% | +37% |
| English → English labels | ✅ 84% | +9% |
| English → Arabic labels | ✅ 84% | vs. ~30% default |
| Mixed labels (Arabic + English) | ✅ 92% | RTL handled properly |
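Here's roughly how I drive these pairings. A minimal sketch; the Arabic prompt and labels below are illustrative stand-ins, not my actual test set:

```python
from transformers import pipeline

# Zero-shot classification backed by a multilingual NLI model.
zero_shot = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7",
)

# Arabic input with Arabic candidate labels -- the strongest pairing above.
result = zero_shot(
    "أبحث عن وظيفة جديدة في مجال البرمجة",     # "I'm looking for a new job in programming"
    candidate_labels=["عمل", "رياضة", "طبخ"],  # work, sports, cooking
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```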
**UI Smartness**:
- Keeps English labels left-aligned, Arabic right-aligned
- No visual bugs or mis-scored outputs from RTL quirks
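If you're rendering mixed-direction labels yourself, the alignment rule can be as simple as checking the first strongly directional character of each label. A rough sketch of the idea (my own illustration, not code from the pipelines):

```python
import unicodedata

def label_alignment(text: str) -> str:
    """Return 'right' for RTL labels (Arabic, Hebrew, ...), else 'left'."""
    for ch in text:
        direction = unicodedata.bidirectional(ch)
        if direction in ("R", "AL"):  # right-to-left / Arabic letter
            return "right"
        if direction == "L":          # left-to-right letter
            return "left"
    return "left"  # strings with only neutral characters default to left

print(label_alignment("politics"))  # left
print(label_alignment("سياسة"))     # right
```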
---
## Final Summary for Day 2
- **Model selection matters**: the right model can boost performance by **20–25%**!
- **Dialect support** is key: generic models don't cut it in nuanced use cases.
- **Language pairing (input → label)** is critical for zero-shot reliability.
- **Proper RTL handling** helps avoid UI headaches and scoring issues.
---
## What's Next?
- Try out another Hugging Face pipeline: **translation** or **summarization** sound exciting!
- Keep expanding my **language-aware model notebook** with more dialects, labels, and real-world tests.