Pranav Deshpande

pranav-deshpande

https://pranavdeshpande.com

AI & ML interests

Game playing agents

Recent Activity

liked a model about 1 month ago

sesame/csm-1b

replied to Sentdex's post about 1 year ago

Working through the Reddit dataset, one thing that occurs to me is we pretty much always train LLMs to be a conversation between 2 parties like Bot/Human or Instruction/Response. It seems far more common with internet data that we have multi-speaker/group discussions with a dynamic number of speakers. This also seems to be more realistic to the real world too and requires a bit more understanding to model. Is there some research into this? I have some ideas of how I'd like to implement it, but I wonder if some work has already been done here?

Organizations

None yet

pranav-deshpande's activity

liked a model about 1 month ago

sesame/csm-1b

Text-to-Speech • Updated Mar 16 • 55k • 1.99k

replied to Sentdex's post about 1 year ago

I was thinking exactly the same thing when ChatGPT first came out! I have run some minor experiments with causal language modeling by having a fixed number of users/speakers and then instruct fine-tuning the base/foundational model. "Dynamic number of speakers" sounds interesting, though! Maybe there is a clever way to inject new tokens into the vocabulary to achieve this.

Would love to contribute to this initiative.
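
To make the fixed-number-of-speakers idea concrete, here is a minimal sketch of how a group thread might be serialized for causal language modeling. It is an illustration under stated assumptions, not the setup used in the experiments mentioned above: the token strings `<|speaker_i|>` and `<|end_of_turn|>`, the cap `MAX_SPEAKERS`, and the helper `format_thread` are all hypothetical names. Authors are mapped to reserved speaker slots in order of first appearance, so any thread with up to `MAX_SPEAKERS` participants reuses the same small set of tokens.

```python
from typing import Dict, List, Tuple

# Assumed upper bound on distinct speakers per thread (hypothetical value).
MAX_SPEAKERS = 8

# Hypothetical reserved tokens that would be added to the tokenizer vocabulary.
SPEAKER_TOKENS = [f"<|speaker_{i}|>" for i in range(MAX_SPEAKERS)]
END_OF_TURN = "<|end_of_turn|>"


def format_thread(turns: List[Tuple[str, str]]) -> str:
    """Serialize (author, text) turns into one training string.

    Each author is assigned a speaker slot in order of first appearance,
    so identities stay consistent within a thread without depending on
    the actual usernames.
    """
    slots: Dict[str, str] = {}
    lines = []
    for author, text in turns:
        if author not in slots:
            if len(slots) >= MAX_SPEAKERS:
                raise ValueError("thread has more speakers than reserved tokens")
            slots[author] = SPEAKER_TOKENS[len(slots)]
        lines.append(f"{slots[author]} {text.strip()} {END_OF_TURN}")
    return "\n".join(lines)


if __name__ == "__main__":
    thread = [
        ("alice", "Has anyone tried this?"),
        ("bob", "Yes, last week."),
        ("alice", "How did it go?"),
        ("carol", "I'm curious too."),
    ]
    print(format_thread(thread))
```

With the Hugging Face `transformers` library, tokens like these are typically registered through `tokenizer.add_special_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))` before fine-tuning. A truly dynamic number of speakers would still need something beyond this fixed pool of slots.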

reacted to Sentdex's post with 👍 about 1 year ago

Post

Working through the Reddit dataset, one thing that occurs to me is we pretty much always train LLMs to be a conversation between 2 parties like Bot/Human or Instruction/Response.

It seems far more common with internet data that we have multi-speaker/group discussions with a dynamic number of speakers. This also seems to be more realistic to the real world too and requires a bit more understanding to model.

Is there some research into this? I have some ideas of how I'd like to implement it, but I wonder if some work has already been done here?

5 replies
