Climate policy instrument classifiers
Models to identify and classify scientific literature on climate policy instruments
Predicts whether an abstract relates to a climate policy instrument.
The model was fine-tuned from ClimateBERT (climatebert/distilroberta-base-climate-f) on a dataset of manually labelled scientific abstracts.
A text can be classified using the text classification pipeline:
from transformers import TextClassificationPipeline, AutoTokenizer, AutoModelForSequenceClassification

# The tokenizer comes from the base ClimateBERT model; inputs are capped at 512 tokens
tokenizer = AutoTokenizer.from_pretrained(
    'climatebert/distilroberta-base-climate-f',
    model_max_length=512
)
# The fine-tuned relevance classifier
model = AutoModelForSequenceClassification.from_pretrained(
    'evidence-for-climate-solutions/climatebert-policyinstruments-relevant'
)
# top_k=None returns a score for every label; softmax converts logits to probabilities
pipe = TextClassificationPipeline(
    model=model,
    tokenizer=tokenizer,
    truncation=True,
    top_k=None,
    function_to_apply='softmax'
)
res = pipe('Carbon pricing is thought of as an economically efficient policy to reduce GHG emissions')
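With top_k=None the pipeline returns one score per label, and depending on the transformers version the result for a single text may be wrapped in an extra list. The following is a minimal sketch of a helper that flattens either shape into a label-to-score mapping. The helper name scores_by_label and the label names LABEL_0 and LABEL_1 are assumptions for illustration; the labels this model actually emits are defined in its config.

```python
from typing import Dict


def scores_by_label(result) -> Dict[str, float]:
    """Flatten the pipeline output for ONE text into {label: score}.

    Accepts either [{'label': ..., 'score': ...}, ...] or the same list
    wrapped in one or more extra lists.
    """
    # Unwrap nested lists until we reach the list of label/score dicts.
    while result and isinstance(result[0], list):
        result = result[0]
    return {item["label"]: float(item["score"]) for item in result}


# Hypothetical pipeline output; the label names are placeholders, not
# necessarily the ones this model uses.
example = [[
    {"label": "LABEL_1", "score": 0.97},
    {"label": "LABEL_0", "score": 0.03},
]]

scores = scores_by_label(example)
best = max(scores, key=scores.get)  # label with the highest probability
```

In practice, scores_by_label(res) could be applied to the res value produced above, and a probability threshold chosen to suit the screening task.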
The model was trained on a limited subset of scientific abstracts (see paper for details). Its use as a general classifier for identifying climate-policy-relevant texts should be approached with caution.
The model was trained using nested cross-validation to optimise hyperparameters (see the linked code).
Evaluation results from nested cross-validation are detailed in the linked results.
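For readers unfamiliar with nested cross-validation, the general idea can be sketched in plain Python: an inner loop selects hyperparameters using only the outer training data, and an outer loop scores the selected configuration on a held-out fold. This is an illustrative sketch, not the authors' training code. The function names, fold counts, parameter grid, and the toy fit_and_score callback are all assumptions.

```python
import random
from statistics import mean


def k_folds(indices, k, seed=0):
    """Shuffle indices and split them into k disjoint folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv(n_samples, param_grid, fit_and_score, outer_k=5, inner_k=3):
    """Return one held-out score per outer fold.

    fit_and_score(params, train_idx, test_idx) -> float is a hypothetical
    callback that trains on train_idx and evaluates on test_idx.
    """
    outer_folds = k_folds(range(n_samples), outer_k)
    outer_scores = []
    for i, test_idx in enumerate(outer_folds):
        train_idx = [j for f, fold in enumerate(outer_folds) if f != i for j in fold]
        # Inner loop: pick hyperparameters using only the outer training data.
        inner_folds = k_folds(train_idx, inner_k, seed=i + 1)

        def inner_score(params):
            scores = []
            for v, val_idx in enumerate(inner_folds):
                fit_idx = [j for f, fold in enumerate(inner_folds) if f != v for j in fold]
                scores.append(fit_and_score(params, fit_idx, val_idx))
            return mean(scores)

        best_params = max(param_grid, key=inner_score)
        # Outer loop: refit with the chosen params, score on the untouched fold.
        outer_scores.append(fit_and_score(best_params, train_idx, test_idx))
    return outer_scores


def toy_fit_and_score(params, train_idx, test_idx):
    """Stand-in for real training: the score peaks at lr == 0.1."""
    return 1.0 - abs(params["lr"] - 0.1)


grid = [{"lr": 0.01}, {"lr": 0.1}, {"lr": 1.0}]
cv_scores = nested_cv(20, grid, toy_fit_and_score)
```

Because the test fold of each outer split is never seen during hyperparameter selection, the outer scores give a less optimistic estimate of performance than a single cross-validation loop would.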
@article{callaghanMachineLearningMap2025,
title = {Machine Learning Map of Climate Policy Literature Reveals Disparities between Scientific Attention, Policy Density, and Emissions},
author = {Callaghan, Max and Banisch, Lucy and {Doebbeling-Hildebrandt}, Niklas and Edmondson, Duncan and Flachsland, Christian and Lamb, William F. and Levi, Sebastian and {M{\"u}ller-Hansen}, Finn and Posada, Eduardo and Vasudevan, Shraddha and Minx, Jan C.},
year = {2025},
month = feb,
journal = {npj Climate Action},
volume = {4},
number = {1},
pages = {1--14},
publisher = {Nature Publishing Group},
issn = {2731-9814},
doi = {10.1038/s44168-024-00196-0},
urldate = {2025-02-19},
abstract = {Current climate mitigation policies are not sufficient to meet the Paris temperature target, and ramping up efforts will require rapid learning from the scientific literature on climate policies. This literature is vast and widely dispersed, as well as hard to define and categorise, hampering systematic efforts to learn from it. We use a machine learning pipeline using transformer-based language models to systematically map the relevant scientific literature on climate policies at scale and in real-time. Our ``living systematic map'' of climate policy research features a set of 84,990 papers, and classifies each of them by policy instrument type, sector, and geography. We explore how the distribution of these papers varies across countries, and compare this to the distribution of emissions and enacted climate policies. Results suggests a potential stark under-representation of industry sector policies, as well as diverging attention between science and policy with respect to economic and regulatory instruments.},
copyright = {2025 The Author(s)},
langid = {english},
keywords = {Climate-change mitigation},
}
Base model: climatebert/distilroberta-base-climate-f