Does StarCoder support multiple natural languages?
#29, opened by paulcx
I can't find any information about natural languages (as opposed to programming languages) in the training dataset. Can StarCoder understand queries written in multiple natural languages?
@paulcx Yes, it can. Although we focus on English language understanding, in my personal experience it can also respond to Chinese prompts.
As @SivilTaram noted, it can respond in some of the most popular natural languages, probably because they were well represented in the Markdown and HTML data, as well as in comments inside code. Here's a figure of the natural language distribution in comments in Python code, from The Stack paper (we didn't do a similar analysis for other programming languages).
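For anyone who wants a rough feel for this kind of comment analysis on their own code, here is a minimal sketch. It is not the method used in The Stack paper: it only pulls `#` comments out of Python source with the standard `tokenize` module and labels each one by its dominant Unicode script. The function names (`extract_comments`, `guess_script`, `comment_script_distribution`) are made up for illustration, and a script is only a crude proxy for a language (Latin script covers English, French, Spanish, and many others).

```python
import io
import tokenize
import unicodedata
from collections import Counter


def extract_comments(source: str) -> list[str]:
    """Return the text of every `#` comment in a Python source string."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        tok.string.lstrip("#").strip()
        for tok in tokens
        if tok.type == tokenize.COMMENT
    ]


def guess_script(text: str) -> str:
    """Label text by the most common Unicode script among its letters.

    Hypothetical heuristic: this identifies a writing system, not a language.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, whitespace
        name = unicodedata.name(ch, "")
        if name.startswith("CJK"):
            counts["cjk"] += 1
        elif name.startswith("CYRILLIC"):
            counts["cyrillic"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
        else:
            counts["other"] += 1
    return counts.most_common(1)[0][0] if counts else "none"


def comment_script_distribution(source: str) -> Counter:
    """Count comments per dominant script in one Python source string."""
    return Counter(guess_script(c) for c in extract_comments(source))


if __name__ == "__main__":
    sample = (
        "# compute the sum\n"
        "x = 1 + 2\n"
        "# 计算总和\n"
        "y = x * 2  # удвоить\n"
    )
    print(comment_script_distribution(sample))
```

A real analysis would swap `guess_script` for a proper language-identification model; the script heuristic is only here to keep the sketch dependency-free.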
loubnabnl changed discussion status to closed