Does StarCoder support multiple languages?

by paulcx - opened

I can't find any information about the natural languages (not programming languages) in the training dataset, and I'm wondering whether StarCoder can understand queries written in languages other than English.

BigCode org

@paulcx Yes, it can. We focus on English-language understanding, but in my personal experience it can also respond to Chinese prompts.

BigCode org

As @SivilTaram said, it can respond in some of the most popular natural languages, probably because they were well represented in the Markdown and HTML data, for example, as well as in comments inside code. Here's a figure of the natural-language distribution in comments included in Python code, from The Stack paper (we didn't do a similar analysis for other programming languages).
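For anyone who wants a rough feel for this kind of comment analysis on their own code, here is a minimal sketch. It is not the method used in The Stack paper: it only guesses the writing script (Latin, CJK, Cyrillic) of each comment from Unicode character names, which is a much cruder signal than real language identification. The function names `extract_comments`, `guess_script`, and `comment_script_distribution` are made up for this example.

```python
import io
import tokenize
import unicodedata
from collections import Counter


def extract_comments(source: str) -> list[str]:
    """Return the text of every '#' comment in a Python source string."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        tok.string.lstrip("#").strip()
        for tok in tokens
        if tok.type == tokenize.COMMENT
    ]


def guess_script(text: str) -> str:
    """Crude heuristic (an assumption of this sketch, not real language
    detection): label text by the most common script among its letters."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name.startswith("CJK"):
            counts["cjk"] += 1
        elif name.startswith("CYRILLIC"):
            counts["cyrillic"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
        else:
            counts["other"] += 1
    return counts.most_common(1)[0][0] if counts else "none"


def comment_script_distribution(source: str) -> Counter:
    """Count how many comments fall into each guessed script."""
    return Counter(guess_script(c) for c in extract_comments(source))


sample = '''
# 计算两个数的和
def add(a, b):
    return a + b  # return the sum

# Привет, мир
x = 1
'''

print(comment_script_distribution(sample))
```

Telling apart languages that share a script (English and French, for instance) would need a proper language-identification model instead of `guess_script`.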


loubnabnl changed discussion status to closed
