InternLM

company

Recent Activity

internlm's activity

clefourrier 
posted an update 19 days ago
Always surprised that so few people actually read the FineTasks blog, on
✨how to select training evals with the highest signal✨

If you're serious about training models without wasting compute on shitty runs, you absolutely should read it!!

A high-signal eval tells you precisely, during training, how well your model is learning and what it is learning, allowing you to discard the bad runs/bad samplings/...!

The blog covers in depth prompt choice, metrics, dataset, across languages/capabilities, and my fave section is "which properties should evals have"👌
(so you know how to select the best evals for your use case)

Blog: HuggingFaceFW/blogpost-fine-tasks
vansin 
posted an update 3 months ago
🔥MedAgentBench Amazing Work🚀

Just explored #MedAgentBench from @Yale researchers and it's mind-blowing! They've created a cutting-edge benchmark that finally exposes the true capabilities of LLMs in complex medical reasoning.

⚡ Key discoveries:

DeepSeek R1 & OpenAI o3 dominate clinical reasoning tasks
Agent-based frameworks deliver exceptional performance-cost balance
Open-source alternatives are closing the gap at a fraction of the cost

This work shatters previous benchmarks that failed to challenge today's advanced models.
The future of medical AI is here: https://github.com/gersteinlab/medagents-benchmark
#MedicalAI #MachineLearning #AIinHealthcare 🔥