|
# Day 2: Sentiment & Zero-Shot Smackdown (Arabic Edition)
|
|
|
Today, I dove deeper into two powerful Hugging Face pipelines, **Sentiment Analysis** and **Zero-Shot Classification**, with a twist: I focused on Arabic and multilingual performance. The goal? To find models that _actually_ handle Arabic well, dialects and all.
|
|
|
--- |
|
|
|
## Part 1: Sentiment Analysis Recap
|
|
|
Previously, I found that the default `pipeline("sentiment-analysis")` worked okay for English but... Arabic? Not so much. So today was all about discovering **a better Arabic sentiment model**. |
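For context, here's a minimal sketch of that baseline. With no model specified, `pipeline("sentiment-analysis")` falls back to the English-only checkpoint `distilbert-base-uncased-finetuned-sst-2-english`, which explains the gap; the Arabic sentence and the result comments below are illustrative, not logged outputs:

```python
from transformers import pipeline

# With no model specified, transformers falls back to an English-only
# checkpoint (distilbert-base-uncased-finetuned-sst-2-english)
classifier = pipeline("sentiment-analysis")

print(classifier("This movie was fantastic!"))
# -> [{'label': 'POSITIVE', 'score': 0.99...}]  solid on English

print(classifier("الفيلم ده كان رائع"))  # Egyptian Arabic: "That movie was great"
# -> close to a coin flip, since the model never saw Arabic during training
```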
|
|
|
### Models Tested
|
|
|
| Model | Result |
| --- | --- |
| `default` (`pipeline("sentiment-analysis")`) | Good on English<br>~55% accurate on Arabic |
| `Anwaarma/Improved-Arabert-twitter-sentiment-No-dropout` | Inaccurate, struggled with meaning |
| `Abdo36/Arabert-Sentiment-Analysis-ArSAS` | Slightly better than default, but dialect handling weak |
|
### Winner: `CAMeL-Lab/bert-base-arabic-camelbert-mix-sentiment`
|
|
|
- **Accuracy**: 95–99% on clear Arabic sentences
|
|
|
- **Dialect-Friendly**: correctly classified Egyptian slang like
  `"الواد سواق التوك توك جارنا عسل"` ("the tuk-tuk driver kid, our neighbor, is a sweetheart") → **Positive 97%** (reproduced in the sketch below)
|
|
|
- **Weakness**: lower performance on English (61%) and French (50%)
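Swapping in the winner is a one-line change. Here's a minimal sketch, assuming `transformers` and a backend like PyTorch are installed; the printed score is illustrative:

```python
from transformers import pipeline

# CAMeLBERT-mix: pre-trained on a mix of MSA, dialectal, and classical
# Arabic, then fine-tuned for sentiment
arabic_sentiment = pipeline(
    "sentiment-analysis",
    model="CAMeL-Lab/bert-base-arabic-camelbert-mix-sentiment",
)

# The Egyptian slang example from above
print(arabic_sentiment("الواد سواق التوك توك جارنا عسل"))
# -> [{'label': 'positive', 'score': 0.97...}]  (illustrative)
```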
|
|
|
|
|
### Key Takeaways
|
|
|
- **Fine-tuned, language-specific models** seriously outperform the defaults. |
|
|
|
- **Dialect support = must-have** for real-world, diverse data. |
|
|
|
|
|
--- |
|
|
|
## Part 2: Zero-Shot Classification (Arabic & Multilingual Trials)
|
|
|
Could zero-shot models understand Arabic prompts _and_ interpret labels in multiple languages? Let's see how they fared in a multilingual arena!
|
|
|
### 1. `morit/arabic_xlm_xnli`
|
|
|
- Inaccurate, even on Arabic-only prompts

- Misaligned labels and scores
|
|
|
|
|
### 2. `MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7`
|
|
|
| Scenario | Accuracy | Improvement |
| --- | --- | --- |
| Arabic → Arabic labels | 96.1% | +21% |
| Arabic → English labels | 86% | +37% |
| English → English labels | 84% | +9% |
| English → Arabic labels | 84% | vs. ~30% default |
| Mixed labels (Arabic + English) | 92% | RTL handled properly |
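Here's a minimal sketch of the Arabic → Arabic-labels scenario. The example sentence, candidate labels, and Arabic hypothesis template are placeholders of mine, not the exact prompts from these runs:

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7",
)

text = "افتتحت الحكومة مطارا جديدا في العاصمة"  # "The government opened a new airport in the capital"
labels = ["سياسة", "رياضة", "اقتصاد"]  # politics, sports, economy

# The pipeline's default hypothesis template is English ("This example is {}."),
# so for all-Arabic runs it can help to pass an Arabic one instead.
result = classifier(
    text,
    candidate_labels=labels,
    hypothesis_template="هذا المثال عن {}.",  # "This example is about {}."
)
print(result["labels"][0], result["scores"][0])  # top label and its score
```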
|
|
|
**UI Smartness**:
|
|
|
- Keeps English labels left-aligned, Arabic right-aligned |
|
|
|
- No visual bugs or mis-scored outputs from RTL quirks (one way to detect text direction is sketched after this list)
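For reference, the alignment decision can be driven by Unicode bidirectional classes. This is a sketch of my own (the `is_rtl` helper is hypothetical, not from any library):

```python
import unicodedata

def is_rtl(text: str) -> bool:
    """True if the first strong directional character reads right-to-left."""
    for ch in text:
        direction = unicodedata.bidirectional(ch)
        if direction in ("R", "AL"):  # Hebrew / Arabic letters
            return True
        if direction == "L":          # Latin and other LTR scripts
            return False
    return False  # no strong character found; default to left-to-right

for label in ["sports", "رياضة"]:
    alignment = "right" if is_rtl(label) else "left"
    print(f"{label!r}: align {alignment}")
```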
|
|
|
|
|
--- |
|
|
|
## Final Summary for Day 2
|
|
|
- **Model selection matters**: The right model can boost performance by **20–25%**!

- **Dialect support** is key: generic models don't cut it in nuanced use cases.

- **Language pairing (input → label)** is critical for zero-shot reliability.

- **Proper RTL handling** helps avoid UI headaches and scoring issues.
|
|
|
|
|
--- |
|
|
|
## What's Next?
|
|
|
- Try out another Hugging Face pipeline: **translation** or **summarization** sounds exciting!

- Keep expanding my **language-aware model notebook** with more dialects, labels, and real-world tests.