---
tags:
- gpt-neo
- gpt-peter
- chatbot
inference: false
base_model: EleutherAI/gpt-neo-2.7B
---

# pszemraj/gpt-peter-2.7B

- This model is a fine-tuned version of [EleutherAI/gpt-neo-2.7B](https://huggingface.co/EleutherAI/gpt-neo-2.7B) on about 80k WhatsApp and iMessage texts.
- The model is too large for the hosted inference API; a notebook for testing it in Colab is linked [here](https://colab.research.google.com/gist/pszemraj/a59b43813437b43973c8f8f9a3944565/testing-pszemraj-gpt-peter-2-7b.ipynb).
- Alternatively, you can message [a bot on Telegram](http://t.me/GPTPeter_bot), where I test LLMs for dialogue generation.
- The Telegram bot code and the model training code can be found [in this repository](https://github.com/pszemraj/ai-msgbot).

## Usage in Python

Install the transformers library if you don't have it:

```sh
pip install -U transformers
```

Load the model into a `pipeline` object:

```python
from transformers import pipeline
import torch

# use GPU 0 if one is available, otherwise fall back to CPU
my_chatbot = pipeline(
    'text-generation',
    'pszemraj/gpt-peter-2.7B',
    device=0 if torch.cuda.is_available() else -1,
)
```
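
If you are short on GPU memory (the fp32 weights alone are roughly 10 GB), you can load the model in half precision instead. This is a minimal sketch, assuming a CUDA GPU and a transformers version that supports `torch_dtype` in `from_pretrained`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch

# load the weights in fp16 to roughly halve memory use
tokenizer = AutoTokenizer.from_pretrained('pszemraj/gpt-peter-2.7B')
model = AutoModelForCausalLM.from_pretrained(
    'pszemraj/gpt-peter-2.7B',
    torch_dtype=torch.float16,
)
my_chatbot = pipeline(
    'text-generation',
    model=model,
    tokenizer=tokenizer,
    device=0,  # fp16 inference assumes a GPU is present
)
```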

Generate text!

```python
my_chatbot('Did you ever hear the tragedy of Darth Plagueis The Wise?')
```

_(The example above is kept minimal for simplicity; adding generation parameters such as `no_repeat_ngram_size` is recommended to get better generations.)_
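
For instance, here is a sketch of a call with a few common generation parameters; the specific values are illustrative, not tuned for this model:

```python
response = my_chatbot(
    'Did you ever hear the tragedy of Darth Plagueis The Wise?',
    max_length=128,          # cap the total token length of the output
    do_sample=True,          # sample instead of greedy decoding
    top_k=50,                # sample only from the 50 most likely tokens
    top_p=0.95,              # nucleus sampling
    no_repeat_ngram_size=3,  # block verbatim repetition of 3-grams
    temperature=0.7,         # soften the distribution slightly
)
print(response[0]['generated_text'])
```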
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

- learning_rate: 6e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 32
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1
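
These settings map onto the Hugging Face `Trainer` API roughly as follows. This is an illustrative sketch using transformers-4.17-era argument names, not the exact training script (which lives in the ai-msgbot repository linked above); the Adam betas and epsilon listed are the `TrainingArguments` defaults, and the output path is hypothetical:

```python
from transformers import TrainingArguments

# rough equivalent of the hyperparameters listed above (sketch, not the real script)
training_args = TrainingArguments(
    output_dir='gpt-peter-2.7B',     # hypothetical output path
    learning_rate=6e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=32,  # 2 per device x 32 steps -> total batch of 64
    lr_scheduler_type='cosine',
    warmup_ratio=0.05,
    num_train_epochs=1,
)
```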

### Framework versions

- Transformers 4.17.0
- PyTorch 1.10.0+cu113
- Datasets 2.0.0
- Tokenizers 0.11.6