---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---
# MARTINI_enrich_BERTopic_MartinCostelloNews
This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
## Usage
To use this model, please install BERTopic:
```
pip install -U bertopic
```
You can use the model as follows:
```python
from bertopic import BERTopic
topic_model = BERTopic.load("AIDA-UPM/MARTINI_enrich_BERTopic_MartinCostelloNews")
topic_model.get_topic_info()
```
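Once loaded, the model can also assign topics to unseen text. The helper below is a minimal sketch rather than part of the BERTopic API: `describe_documents` is a hypothetical name, and it only relies on the `transform` and `get_topic` methods of a loaded model. It assumes the embedding model referenced by the saved model can be loaded in your environment.

```python
def describe_documents(topic_model, docs, n_keywords=5):
    """Assign each document to a topic and list that topic's top keywords.

    `topic_model` is expected to be a loaded BERTopic model (see the snippet
    above); `describe_documents` itself is an illustrative helper.
    """
    # transform returns the predicted topic IDs and, optionally, probabilities
    topics, _ = topic_model.transform(docs)

    results = []
    for doc, topic_id in zip(docs, topics):
        topic_id = int(topic_id)
        # get_topic returns (keyword, score) pairs, or a falsy value
        # when the topic ID is unknown to the model
        keywords = topic_model.get_topic(topic_id) or []
        results.append(
            {
                "document": doc,
                "topic": topic_id,
                "keywords": [word for word, _ in keywords[:n_keywords]],
            }
        )
    return results
```

Calling `describe_documents(topic_model, ["some new message"])` returns one dictionary per document. A topic ID of `-1` marks a document the model treats as an outlier.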
## Topic overview
* Number of topics: 38 (including the outlier topic `-1`)
* Number of training documents: 4471
<details>
<summary>Click here for an overview of all topics.</summary>

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | vaccinated - nhs - boris - freedom - next | 20 | -1_vaccinated_nhs_boris_freedom |
| 0 | mariupol - zelensky - crimea - russians - nazis | 2286 | 0_mariupol_zelensky_crimea_russians |
| 1 | globalists - davos - trudeau - sunak - hitler | 227 | 1_globalists_davos_trudeau_sunak |
| 2 | propaganda - traitor - goebbels - guilty - headline | 140 | 2_propaganda_traitor_goebbels_guilty |
| 3 | cashless - hyperinflation - monetary - cbdc - communism | 130 | 3_cashless_hyperinflation_monetary_cbdc |
| 4 | police - arrested - whitehall - offended - golliwog | 115 | 4_police_arrested_whitehall_offended |
| 5 | bbc - broadcaster - freesat - misinformation - goebbels | 111 | 5_bbc_broadcaster_freesat_misinformation |
| 6 | paedophilia - teachers - perverted - lgbtq - inappropriate | 110 | 6_paedophilia_teachers_perverted_lgbtq |
| 7 | migrants - britain - invasion - illegally - 4calais | 106 | 7_migrants_britain_invasion_illegally |
| 8 | france - liberte - protesters - presidential - austrians | 95 | 8_france_liberte_protesters_presidential |
| 9 | lockdowns - scaremongering - h5n1 - contagious - 2020 | 95 | 9_lockdowns_scaremongering_h5n1_contagious |
| 10 | swindon - corbyn - lockdown - poster - democratically | 74 | 10_swindon_corbyn_lockdown_poster |
| 11 | vaccination - 17yr - child - brainwashed - jcvi | 69 | 11_vaccination_17yr_child_brainwashed |
| 12 | canberra - protesting - quarantine - palaszczuk - apartheid | 60 | 12_canberra_protesting_quarantine_palaszczuk |
| 13 | trudeau - ottawa - convoy - manitoba - freedom | 58 | 13_trudeau_ottawa_convoy_manitoba |
| 14 | england - pandemic - died - stillbirths - 29 | 57 | 14_england_pandemic_died_stillbirths |
| 15 | jets - davos - greta - hypocrites - degrowth | 54 | 15_jets_davos_greta_hypocrites |
| 16 | unvaccinated - nurses - sacked - compulsory - javid | 52 | 16_unvaccinated_nurses_sacked_compulsory |
| 17 | pfizer - vaccinations - dangers - whistleblower - adenovirus | 49 | 17_pfizer_vaccinations_dangers_whistleblower |
| 18 | vaccinated - vaxxers - coerced - discrimination - anothers | 43 | 18_vaccinated_vaxxers_coerced_discrimination |
| 19 | netherlands - brussels - rutte - agricultural - despots | 41 | 19_netherlands_brussels_rutte_agricultural |
| 20 | nhs - nurses - underfunded - consocialists - strike | 39 | 20_nhs_nurses_underfunded_consocialists |
| 21 | dublin - migrants - irishmen - ballymun - stabbings | 38 | 21_dublin_migrants_irishmen_ballymun |
| 22 | oxford - councillors - counties - gloabalist - newcastle | 34 | 22_oxford_councillors_counties_gloabalist |
| 23 | gates - vaxxines - eugenics - malaria - funded | 33 | 23_gates_vaxxines_eugenics_malaria |
| 24 | petrol - 37bn - britons - skyrocketing - corruption | 33 | 24_petrol_37bn_britons_skyrocketing |
| 25 | racists - discriminated - prejudice - supremacy - slaughter | 32 | 25_racists_discriminated_prejudice_supremacy |
| 26 | epstein - ghislaine - clinton - trafficking - released | 28 | 26_epstein_ghislaine_clinton_trafficking |
| 27 | injections - deaths - janssen - pericarditis - thrombocytopenia | 27 | 27_injections_deaths_janssen_pericarditis |
| 28 | midazolam - fentanyl - overdosed - euthanised - nhs | 27 | 28_midazolam_fentanyl_overdosed_euthanised |
| 29 | masks - pandemic - biohazard - distancing - snot | 26 | 29_masks_pandemic_biohazard_distancing |
| 30 | kneeling - booed - footballers - wembley - racially | 25 | 30_kneeling_booed_footballers_wembley |
| 31 | teslas - diesel - charging - lithium - tonnes | 25 | 31_teslas_diesel_charging_lithium |
| 32 | democratically - ukip - disenfranchise - voters - sheeple | 25 | 32_democratically_ukip_disenfranchise_voters |
| 33 | farmers - shortages - starve - fertiliser - uk | 23 | 33_farmers_shortages_starve_fertiliser |
| 34 | watford - footballer - sudden - collapses - coincidence | 22 | 34_watford_footballer_sudden_collapses |
| 35 | lockdowns - shanghai - communism - xi - qr | 21 | 35_lockdowns_shanghai_communism_xi |
| 36 | censorship - facistbook - instagram - unpublish - libsoftiktok | 21 | 36_censorship_facistbook_instagram_unpublish |
</details>
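The frequencies in the table add up to the 4471 training documents, so each topic's share of the corpus follows directly from its row. The sketch below copies the first few rows by hand for illustration; `FREQUENCIES` and `topic_share` are illustrative names, not part of BERTopic.

```python
# Total taken from "Number of training documents" above
TOTAL_DOCUMENTS = 4471

# Topic ID -> Topic Frequency, copied from the first rows of the table
FREQUENCIES = {-1: 20, 0: 2286, 1: 227, 2: 140, 3: 130}


def topic_share(topic_id, frequencies=FREQUENCIES, total=TOTAL_DOCUMENTS):
    """Return the fraction of training documents assigned to a topic."""
    return frequencies[topic_id] / total


for topic_id in FREQUENCIES:
    print(f"topic {topic_id}: {topic_share(topic_id):.1%}")
```

Topic 0 alone accounts for roughly 51% of the training documents, while the outlier topic `-1` holds under 1%.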
## Training hyperparameters
* calculate_probabilities: True
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: False
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None
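These values map onto keyword arguments of the `BERTopic` constructor. The sketch below shows how a model with the same settings might be instantiated; `HYPERPARAMS` and `build_topic_model` are illustrative names. Note that `language: None` usually indicates that a custom embedding model was supplied, and this card does not say which one, so any `embedding_model` you pass in is your own assumption.

```python
# Hyperparameters copied from the list above
HYPERPARAMS = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(**overrides):
    """Create an untrained BERTopic model with the settings listed above.

    Extra keyword arguments (for example an `embedding_model` of your
    choice) override or extend the listed hyperparameters.
    """
    # Imported lazily so the settings can be inspected without bertopic
    from bertopic import BERTopic

    return BERTopic(**{**HYPERPARAMS, **overrides})
```

An untrained model built this way still has to be fitted on your own documents with `fit_transform`; it will not reproduce the topics above unless the data, embedding model, and random state also match.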
## Framework versions
* Numpy: 1.26.4
* HDBSCAN: 0.8.40
* UMAP: 0.5.7
* Pandas: 2.2.3
* Scikit-Learn: 1.5.2
* Sentence-transformers: 3.3.1
* Transformers: 4.46.3
* Numba: 0.60.0
* Plotly: 5.24.1
* Python: 3.10.12
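Loading a saved model tends to be most reliable when the installed packages are close to these versions. The following sketch compares a local environment against the list using only the standard library. The distribution names (for example `umap-learn` for UMAP and `scikit-learn` for Scikit-Learn) are assumed to be the usual PyPI names, and `version_mismatches` is an illustrative helper.

```python
from importlib import metadata

# Versions copied from the list above, keyed by assumed PyPI distribution name
EXPECTED_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.40",
    "umap-learn": "0.5.7",
    "pandas": "2.2.3",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.3.1",
    "transformers": "4.46.3",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Return {package: (expected, installed)} for every differing package.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in version_mismatches().items():
    print(f"{package}: expected {wanted}, found {installed}")
```

A mismatch is not necessarily a problem; the check is only a quick way to spot large version gaps before debugging a loading error.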