Resources for "Measure what Matters: Psychometric Evaluation of AI with Situational Judgment Tests" (https://arxiv.org/abs/2510.22170)
AI & ML interests
We work with you to develop a high-impact AI strategy for your industry, refine your data foundations, and design meaningful human-AI interactions. We also empower you to develop, integrate, and test the latest AI technologies responsibly.
models 10
thoughtworks/DeepSeek-R1-Distill-Qwen-14B-Eagle3
Text Generation • Updated • 266
thoughtworks/DeepSeek-R1-Distill-Qwen-7B-Eagle3
Text Generation • Updated • 258
thoughtworks/Qwen2.5-7B-Instruct-Eagle3
Text Generation • Updated • 257
thoughtworks/Llama-3.2-3B-Instruct-Eagle3
Text Generation • Updated • 253
thoughtworks/Qwen3-32B-Eagle3
Text Generation • Updated • 241
thoughtworks/Qwen3-14B-Eagle3
Text Generation • Updated • 232
thoughtworks/Qwen3-8B-Eagle3
Text Generation • Updated • 234
thoughtworks/Qwen2.5-14B-Instruct-Eagle3
Text Generation • Updated • 218
thoughtworks/Llama-3.1-8B-Instruct-Eagle3
Text Generation • Updated • 215
thoughtworks/GLM-4.7-Flash-Eagle3
Text Generation • 0.1B • Updated • 195 • 2
datasets 12
thoughtworks/ablation_psychometrics_personas
Viewer • Updated • 500 • 20
thoughtworks/gemma_psychometrics_personas_responses
Viewer • Updated • 3.98M • 183 • 1
thoughtworks/psychometric_personas
Viewer • Updated • 23.6k • 132
thoughtworks/psychometric_sjts_analysis
Viewer • Updated • 1.85k • 76
thoughtworks/psychometric_personas_responses
Viewer • Updated • 4.57M • 161 • 1
thoughtworks/CulturalCounterfactuals
Updated • 8
thoughtworks/psychometric_human_annotations
Viewer • Updated • 55 • 8
thoughtworks/parliamentary_personas
Viewer • Updated • 2.2k • 11
thoughtworks/psychometric_personas_temp
Viewer • Updated • 50 • 11
thoughtworks/wiki_bio
Viewer • Updated • 728k • 23