# Model Card for Llama-2-7b Fine-Tuned on SciNLI

This model is a fine-tuned version of Llama-2-7b for sentence-pair classification on the SciNLI dataset. It classifies the relationship between scientific sentence pairs into four categories: Contrasting, Reasoning, Entailment, and Neutral.

## Model Details

### Model Description

This model performs Natural Language Inference (NLI) on scientific text. It was fine-tuned on the SciNLI dataset, which consists of sentence pairs extracted from scholarly papers in NLP and computational linguistics, and it predicts the semantic relationship between a pair of sentences.

- **Developed by:** Firoz Shaik
- **Funded by:** [More Information Needed]
- **Shared by:** [More Information Needed]
- **Model type:** Causal language model
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Finetuned from model:** Llama-2-7b

## Model Sources

- **Repository:** [More Information Needed]
- **Paper:** [More Information Needed]
- **Demo:** [More Information Needed]

## Uses

### Direct Use

The model can be used directly to classify the relationship between sentence pairs in scientific literature.

### Downstream Use

The model can be integrated into larger NLP pipelines for tasks like scientific text summarization, question answering, and commonsense reasoning.

### Out-of-Scope Use

The model is not intended for general-purpose text classification outside the scientific domain, and it should not be used to generate text that requires factual accuracy without further validation.

## Bias, Risks, and Limitations

The model may inherit biases present in the SciNLI dataset. It might not perform well on non-scientific texts or texts from domains not represented in the training data.

### Recommendations

Users should be aware of these potential biases and limitations and should validate the model's predictions, especially in critical applications.

## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Replace "path_to_model" with the local path or Hub ID of this checkpoint.
tokenizer = AutoTokenizer.from_pretrained("path_to_model")
model = AutoModelForCausalLM.from_pretrained("path_to_model")

# As a causal language model, the predicted category is produced as generated text.
inputs = tokenizer("Your input text", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
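
The prompt template used during fine-tuning is not documented in this card. As a rough illustration only, a sentence-pair input might be formatted along these lines (the instruction wording below is an assumption based on the SciNLI label set, not the verified training format):

```python
# Hypothetical prompt builder; verify against the actual fine-tuning template.
def build_prompt(sentence1: str, sentence2: str) -> str:
    return (
        "Classify the relationship between the two sentences as "
        "Contrasting, Reasoning, Entailment, or Neutral.\n"
        f"Sentence 1: {sentence1}\n"
        f"Sentence 2: {sentence2}\n"
        "Label:"
    )

inputs = tokenizer(
    build_prompt("We use a transformer encoder.",
                 "Our model is based on self-attention."),
    return_tensors="pt",
)
```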


## Training Details

### Training Data

The model was fine-tuned on the SciNLI dataset, which can be downloaded from https://shorturl.at/gQKY6.
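
The file layout of the download is not described here. Assuming the archive provides the standard SciNLI splits as CSV files with `sentence1`, `sentence2`, and `label` columns (an assumption to verify against the actual release), a minimal loading sketch could look like this:

```python
import pandas as pd

# File name and column names are assumptions; adjust to the downloaded archive.
train = pd.read_csv("scinli/train.csv")
print(train.columns)
print(train["label"].value_counts())  # expect Contrasting / Reasoning / Entailment / Neutral
```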

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** bf16 mixed precision
- **Learning rate:** 2e-4
- **Batch size:** 1
- **Gradient accumulation steps:** 4
- **Warmup steps:** 2
- **Max steps:** 20
- **Optimizer:** paged_adamw_8bit
- **Evaluation strategy:** steps
- **Evaluation steps:** 1
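
The training script is not published with this card. As a sketch, assuming the Hugging Face `Trainer` was used, the hyperparameters above would map onto `transformers.TrainingArguments` roughly as follows (the output directory is a placeholder):

```python
from transformers import TrainingArguments

# Reconstruction of the listed hyperparameters; not the verified training script.
training_args = TrainingArguments(
    output_dir="llama2-7b-scinli",       # placeholder path
    bf16=True,                           # bf16 mixed precision
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    warmup_steps=2,
    max_steps=20,
    optim="paged_adamw_8bit",
    evaluation_strategy="steps",
    eval_steps=1,
)
```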

#### Speeds, Sizes, Times

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

[More Information Needed]

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on the SciNLI test set, containing sentence pairs from scientific papers.
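
The metrics used are not reported below. If one simply wanted to measure accuracy on the test set, a sketch reusing `tokenizer`, `model`, and the hypothetical `build_prompt` from the quick-start section might look like this (the example structure and label parsing are assumptions):

```python
# Illustrative accuracy loop; prompt format and label extraction are assumed.
correct = 0
for example in test_examples:  # assumed: dicts with sentence1, sentence2, label
    inputs = tokenizer(
        build_prompt(example["sentence1"], example["sentence2"]),
        return_tensors="pt",
    )
    output = model.generate(**inputs, max_new_tokens=5)
    prediction = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:],  # strip the prompt tokens
        skip_special_tokens=True,
    ).strip()
    correct += prediction.startswith(example["label"])
accuracy = correct / len(test_examples)
```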

#### Factors

[More Information Needed]

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]

#### Summary

[More Information Needed]

## Model Examination

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]

## More Information

[More Information Needed]

## Model Card Authors

[More Information Needed]

## Model Card Contact

[More Information Needed]