Model Card for evidence-for-climate-solutions/climatebert-policyinstruments-relevant

Predicts whether an abstract relates to a climate policy instrument.

Model Details

Model Description

The model was finetuned from climatebert, using a dataset of manually labelled scientific abstracts.

Developed by: Max Callaghan
Model type: Text classification model
Language(s) (NLP): English
License: MIT
Finetuned from model: climatebert/distilroberta-base-climate-f

Model Sources

Paper: [Machine learning map of climate policy literature reveals disparities between scientific attention, policy density, and emissions](https://www.nature.com/articles/s44168-024-00196-0)

Uses

Direct Use

A text can be classified using the text classification pipeline:

from transformers import TextClassificationPipeline, AutoTokenizer, AutoModelForSequenceClassification

# The tokenizer is loaded from the base model the classifier was finetuned from
tokenizer = AutoTokenizer.from_pretrained(
    'climatebert/distilroberta-base-climate-f',
    model_max_length=512
)

# The finetuned classification weights
model = AutoModelForSequenceClassification.from_pretrained(
    'evidence-for-climate-solutions/climatebert-policyinstruments-relevant'
)

# top_k=None returns a score for every label; softmax turns logits into probabilities
pipe = TextClassificationPipeline(
    model=model,
    tokenizer=tokenizer,
    truncation=True,
    top_k=None,
    function_to_apply='softmax'
)

res = pipe('Carbon pricing is thought of as an economically efficient policy to reduce GHG emissions')
print(res)
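
Because top_k=None is set, the pipeline returns a label and score for every class instead of only the top one. The helper below is a minimal sketch for turning that output into a plain dictionary. The function name scores_by_label is hypothetical, and the label names and scores in the example are made up for illustration (the model's actual label names are not stated in this card). The nesting of the returned list can differ between transformers versions, so the helper accepts both shapes.

```python
def scores_by_label(pipeline_output):
    """Return {label: score} for a single classified text.

    Accepts either a flat list of {"label", "score"} dicts or the same
    list wrapped in one outer list.
    """
    items = pipeline_output
    # Unwrap one level of nesting if the first element is itself a list.
    if items and isinstance(items[0], list):
        items = items[0]
    return {item["label"]: item["score"] for item in items}


# Hypothetical output for illustration; real labels and scores will differ.
example = [[
    {"label": "LABEL_1", "score": 0.93},
    {"label": "LABEL_0", "score": 0.07},
]]
scores = scores_by_label(example)
best_label = max(scores, key=scores.get)
print(best_label, scores[best_label])
```

With the real pipeline, scores_by_label(res) would be called on the res variable from the example above.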

Bias, Risks, and Limitations

The model was trained on a limited subset of scientific abstracts (see paper for details). Its use as a general classifier for identifying climate policy relevant texts should be approached with caution.
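
One cautious way to apply the model to texts outside its training domain is to send uncertain predictions to a human reviewer instead of accepting every label. The sketch below is an illustration only and is not taken from the paper. The function name triage and the thresholds 0.3 and 0.7 are assumptions, not tuned values.

```python
def triage(score, lower=0.3, upper=0.7):
    """Map a relevance probability to a screening decision.

    `lower` and `upper` are illustrative values, not tuned thresholds.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if score >= upper:
        return "relevant"
    if score <= lower:
        return "not relevant"
    # Anything in between is too uncertain to decide automatically.
    return "manual review"


for probability in (0.9, 0.5, 0.1):
    print(probability, triage(probability))
```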

Training details

The model was trained using nested cross-validation to optimise hyperparameters (code).
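
For readers unfamiliar with nested cross-validation, the sketch below shows the general idea in plain Python: an inner loop picks hyperparameters using only the outer training data, and an outer loop scores the chosen setting on data the inner loop never saw. This is not the authors' training code. The names k_folds, nested_cv, and fit_and_score, the fold counts, and the toy learning-rate candidates are all assumptions made for illustration.

```python
from statistics import mean


def k_folds(n_items, k):
    """Split indices 0..n_items-1 into k contiguous (train, test) pairs.

    No shuffling is done here; real data would normally be shuffled first.
    """
    indices = list(range(n_items))
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


def nested_cv(n_items, candidates, fit_and_score, outer_k=5, inner_k=3):
    """Return one held-out score per outer fold.

    fit_and_score(params, train_idx, test_idx) is a hypothetical callable
    that trains on train_idx and returns a score on test_idx.
    """
    outer_scores = []
    for outer_train, outer_test in k_folds(n_items, outer_k):
        # Inner loop: choose hyperparameters using only outer training data.
        def inner_score(params):
            return mean(
                fit_and_score(
                    params,
                    [outer_train[i] for i in tr],
                    [outer_train[i] for i in te],
                )
                for tr, te in k_folds(len(outer_train), inner_k)
            )

        best = max(candidates, key=inner_score)
        # Outer loop: refit with the chosen setting, score on unseen data.
        outer_scores.append(fit_and_score(best, outer_train, outer_test))
    return outer_scores


def toy_fit_and_score(params, train_idx, test_idx):
    # Stand-in for real training: the score depends only on the setting.
    return 1.0 - abs(params["lr"] - 2e-5) * 1e4


candidates = [{"lr": 1e-5}, {"lr": 2e-5}, {"lr": 5e-5}]
scores = nested_cv(20, candidates, toy_fit_and_score)
print(scores)
```

Because the outer test folds are never used when selecting hyperparameters, the outer scores give a less optimistic performance estimate than reusing a single validation split for both selection and evaluation.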

Evaluation

Evaluation results from nested cross-validation are detailed here

Citation

BibTeX

@article{callaghanMachineLearningMap2025,
  title = {Machine Learning Map of Climate Policy Literature Reveals Disparities between Scientific Attention, Policy Density, and Emissions},
  author = {Callaghan, Max and Banisch, Lucy and {Doebbeling-Hildebrandt}, Niklas and Edmondson, Duncan and Flachsland, Christian and Lamb, William F. and Levi, Sebastian and {M{\"u}ller-Hansen}, Finn and Posada, Eduardo and Vasudevan, Shraddha and Minx, Jan C.},
  year = {2025},
  month = feb,
  journal = {npj Climate Action},
  volume = {4},
  number = {1},
  pages = {1--14},
  publisher = {Nature Publishing Group},
  issn = {2731-9814},
  doi = {10.1038/s44168-024-00196-0},
  urldate = {2025-02-19},
  abstract = {Current climate mitigation policies are not sufficient to meet the Paris temperature target, and ramping up efforts will require rapid learning from the scientific literature on climate policies. This literature is vast and widely dispersed, as well as hard to define and categorise, hampering systematic efforts to learn from it. We use a machine learning pipeline using transformer-based language models to systematically map the relevant scientific literature on climate policies at scale and in real-time. Our ``living systematic map'' of climate policy research features a set of 84,990 papers, and classifies each of them by policy instrument type, sector, and geography. We explore how the distribution of these papers varies across countries, and compare this to the distribution of emissions and enacted climate policies. Results suggests a potential stark under-representation of industry sector policies, as well as diverging attention between science and policy with respect to economic and regulatory instruments.},
  copyright = {2025 The Author(s)},
  langid = {english},
  keywords = {Climate-change mitigation},
}
