Swiss AI Initiative

Team

university

https://www.swiss-ai.org/

swiss-ai

AI & ML interests

None defined yet.

Recent Activity

mjaggi updated a model about 2 hours ago

swiss-ai/Apertus-70B-2509

mjaggi updated a model about 2 hours ago

swiss-ai/Apertus-70B-Instruct-2509

mjaggi updated a model about 2 hours ago

swiss-ai/Apertus-8B-Instruct-2509

mjaggi

updated 4 models about 2 hours ago

mansaripo

updated 2 models about 4 hours ago

swiss-ai/Apertus-70B-2509

Text Generation • 71B • Updated about 2 hours ago • 456 • 62

swiss-ai/Apertus-8B-2509

Text Generation • 8B • Updated about 2 hours ago • 3k • 68

mjaggi

authored a paper 3 days ago

Benchmarking Optimizers for Large Language Model Pretraining

Paper • 2509.01440 • Published 4 days ago • 19

mjaggi

authored a paper 2 months ago

FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language

Paper • 2506.20920 • Published Jun 26 • 70

nathanrchn

authored a paper 3 months ago

zip2zip: Inference-Time Adaptive Vocabularies for Language Models via Token Compression

Paper • 2506.01084 • Published Jun 1 • 7

atcbosselut

authored 4 papers 6 months ago

Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation

Paper • 2412.03304 • Published Dec 4, 2024 • 21

The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units

Paper • 2411.02280 • Published Nov 4, 2024 • 1

DRIVINGVQA: Analyzing Visual Chain-of-Thought Reasoning of Vision Language Models in Real-World Scenarios with Driving Theory Tests

Paper • 2501.04671 • Published Jan 8

PICLe: Pseudo-Annotations for In-Context Learning in Low-Resource Named Entity Detection

Paper • 2412.11923 • Published Dec 16, 2024

nathanrchn

authored a paper 6 months ago

Generating Structured Outputs from Language Models: Benchmark and Studies

Paper • 2501.10868 • Published Jan 18 • 2

atcbosselut

authored 6 papers 9 months ago

COMET: Commonsense Transformers for Automatic Knowledge Graph Construction

Paper • 1906.05317 • Published Jun 12, 2019

Discovering Knowledge-Critical Subnetworks in Pretrained Language Models

Paper • 2310.03084 • Published Oct 4, 2023

RECKONING: Reasoning through Dynamic Knowledge Encoding

Paper • 2305.06349 • Published May 10, 2023 • 1

Breaking the Language Barrier: Improving Cross-Lingual Reasoning with Structured Self-Attention

Paper • 2310.15258 • Published Oct 23, 2023 • 2

CRAB: Assessing the Strength of Causal Relationships Between Real-world Events

Paper • 2311.04284 • Published Nov 7, 2023

Mitigating Label Biases for In-context Learning

Paper • 2305.19148 • Published May 28, 2023

Team members 17
