
Yi Cui

onekq
https://onekq.ai
  • onekq_ai
  • onekq
  • yicui

AI & ML interests

Benchmark, Code Generation Model

Recent Activity

posted an update about 19 hours ago
Okay, Qwen3 Coder does much better than Qwen3 (a coding model for coding), but GPT OSS still holds SOTA among open-source models. https://huggingface.co/spaces/onekq-ai/WebApp1K-models-leaderboard
updated a Space about 19 hours ago
onekq-ai/WebApp1K-models-leaderboard
posted an update 3 days ago
Kimi K2 is a bit disappointing relative to my expectations. It is on par with Codex mini. https://huggingface.co/spaces/onekq-ai/WebApp1K-models-leaderboard

Organizations

MLX Community, ONEKQ AI

authored a paper 6 months ago

Tests as Prompt: A Test-Driven-Development Benchmark for LLM Code Generation

Paper • 2505.09027 • Published May 13
authored 3 papers about 1 year ago

A Case Study of Web App Coding with OpenAI Reasoning Models

Paper • 2409.13773 • Published Sep 19, 2024 • 6

WebApp1K: A Practical Code-Generation Benchmark for Web App Development

Paper • 2408.00019 • Published Jul 30, 2024 • 1

Insights from Benchmarking Frontier Language Models on Web App Code Generation

Paper • 2409.05177 • Published Sep 8, 2024 • 7