Measuring Hong Kong Massive Multi-Task Language Understanding Paper • 2505.02177 • Published May 4
Towards Advanced Mathematical Reasoning for LLMs via First-Order Logic Theorem Proving Paper • 2506.17104 • Published Jun 20
SafeLawBench: Towards Safe Alignment of Large Language Models Paper • 2506.06636 • Published Jun 7