metadata

base_model:
  - Qwen/Qwen3-4B
tags:
  - text-generation-inference
  - transformers
  - trl
  - qwen
  - wsd
  - ambiguity
license: apache-2.0
language:
  - en
datasets:
  - deshanksuman/Reasoning_WSD_dataset
pipeline_tag: text-classification

Uploaded model

Developed by: deshanksuman
License: apache-2.0
Finetuned from model: Qwen/Qwen3-4B

Dataset

The FEWS training data is arranged in an Instruction, Input, and Output format, with detailed reasoning for sense identification. Data generation was semi-automated using the Arcee models. The data has been validated by a human for its structure and correctness.

The data source can be accessed here: deshanksuman/Reasoning_WSD_dataset
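
As a rough illustration of the Instruction, Input, and Output layout described above, the sketch below assembles one record into a single training string. The field names (`instruction`, `input`, `output`), the template wording, and the example sentence are assumptions made for illustration; they are not taken from the dataset itself, so check the dataset card for the actual schema.

```python
# Hypothetical record layout: key names, template text, and the example
# sentence are assumptions, not the actual dataset schema.
def format_example(record: dict) -> str:
    """Join instruction, input, and output into one prompt string."""
    return (
        f"### Instruction:\n{record['instruction']}\n\n"
        f"### Input:\n{record['input']}\n\n"
        f"### Output:\n{record['output']}"
    )


example = {
    "instruction": "Identify the sense of the word 'bank' in the sentence.",
    "input": "She sat on the bank of the river.",
    "output": (
        "Reasoning: the sentence mentions a river, so 'bank' refers to land. "
        "Sense: the land alongside a river."
    ),
}

prompt = format_example(example)
print(prompt)
```

Keeping the reasoning and the final sense together in the output field is one possible way to let the model learn to explain its choice before naming the sense.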

Hyperparameters for Training

per_device_train_batch_size=4,
gradient_accumulation_steps=8,
warmup_steps=50,
num_train_epochs=2,
learning_rate=2e-4,
fp16=not torch.cuda.is_bf16_supported(),
bf16=torch.cuda.is_bf16_supported(),
logging_steps=10,
optim="adamw_torch",
weight_decay=0.01,
lr_scheduler_type="linear",
seed=3407
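
The values above imply an effective batch size of 4 × 8 = 32 sequences per device for each optimizer step. The sketch below collects the same settings in a plain dictionary and computes that figure. The helper names `build_training_kwargs` and `effective_batch_size` are hypothetical, and the `bf16_supported` flag stands in for the `torch.cuda.is_bf16_supported()` call so the snippet runs without a GPU. The keys match parameter names of `transformers.TrainingArguments`, so the dictionary could plausibly be passed as keyword arguments, although this card does not state which trainer class was used.

```python
def build_training_kwargs(bf16_supported: bool) -> dict:
    """Return the hyperparameters listed above as keyword arguments."""
    return {
        "per_device_train_batch_size": 4,
        "gradient_accumulation_steps": 8,
        "warmup_steps": 50,
        "num_train_epochs": 2,
        "learning_rate": 2e-4,
        # Use bf16 where the hardware supports it, otherwise fall back to fp16.
        "fp16": not bf16_supported,
        "bf16": bf16_supported,
        "logging_steps": 10,
        "optim": "adamw_torch",
        "weight_decay": 0.01,
        "lr_scheduler_type": "linear",
        "seed": 3407,
    }


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Sequences consumed per optimizer step across all devices."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


kwargs = build_training_kwargs(bf16_supported=True)
print(effective_batch_size(kwargs))  # 4 * 8 * 1 = 32
```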

This model was developed by Deshan Sumanathilaka (https://sumanathilaka.github.io).

Acknowledgement

We acknowledge the support of the Supercomputing Wales project, which is part-funded by the European Regional Development Fund (ERDF) via Welsh Government.