Florent Daudens

fdaudens

AI & ML interests

AI & Journalism

fdaudens's activity

posted an update 1 day ago
Fascinating point from @thomwolf at Web Summit: AI misuse (deepfakes, fake news) is actually easier with closed models than with open-source ones.

This challenges the common narrative that open-source AI is inherently more dangerous. The reality is more nuanced - while we may think open source is technically easier to misuse, closed models' accessibility and product-focused design appear to be driving more actual harm.

Important context for current AI safety discussions and regulation debates.

Do you agree? 👇
posted an update 2 days ago
🤯 AI progress keeps blowing my mind! Just experienced Qwen's new Coder demo - built a complete flashcard web app with a single prompt. The results are incredible!

This demo is part of the new Qwen2.5 Coder family (0.5B to 32B models), matching or surpassing GPT-4o and Claude 3.5 Sonnet across multiple coding benchmarks.

- 128K context window for 14B/32B models
- Drop-in replacement for GPT-4 in Cursor & Artifacts
- Models on the Hub under Apache 2.0 license

🔗 Try it yourself: Qwen/Qwen2.5-Coder-Artifacts

This is democratization of coding in real-time. Excited to see AI tools becoming more capable and accessible.

What would you build with this? Share your ideas below! 👇

#AI #Programming #TechInnovation #OpenSource #SoftwareDevelopment
posted an update 8 days ago
Just tested Argilla's new data annotation feature - it's a game changer for AI project quality.

Upload CSVs, work with published datasets, or improve existing ones directly on the Hugging Face Hub. Setup took under 2 minutes, no code needed (see example below where I selected a dataset to classify tweets into categories).

Real-world impact: Missing in Chicago won a Pulitzer using a similar approach - 200 volunteers labeled police misconduct files to train their model. That's the power of good data annotation.

Three immediate use cases I see:
- Build collaborative training sets with your community (surprisingly underused in AI journalism)
- Turn your website chatbot logs into high-quality fine-tuning data
- Compare generated vs published content (great for SEO headlines)

Works for solo projects or teams of up to 100 people. All integrated with the Hugging Face Hub for immediate model training.

Interesting to see tools like this making data quality more accessible. Data quality is the hidden driver of AI success that we don't talk about enough.

- Check out the blogpost: https://huggingface.co/blog/argilla-ui-hub
- And the quickstart guide: https://docs.argilla.io/latest/getting_started/quickstart/

reacted to m-ric's post with 🚀 8 days ago
Hunyuan-Large just released by Tencent: Largest ever open MoE LLM, only 52B active parameters but beats LLaMA 3.1-405B on most academic benchmarks 🚀

⚡ Mixture of Experts (MoE) architecture: 389B parameters in total, but only 52B are activated for any input

🧪 Trained on 7T tokens, including 1.5T tokens of synthetic data

🏗️ Architecture: Novel "recycle routing" prevents token dropping when experts are overloaded

📊 Great benchmark results: Surpasses Llama-3.1-405B-Instruct in most benchmarks although it has 8x fewer active parameters
‣ Impressive perf on MATH: 77.4

🐋 Large context length: up to 256K tokens

🔒 License:
‣ Commercial use allowed, except if your products have >100M monthly active users
‣ No access in the EU

🤗 Model weights available on HF!

Read the full paper here 👉 Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent (2411.02265)
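
The ratios quoted in the post can be checked with a few lines of arithmetic. This is only a sketch using the parameter counts stated above (389B total, 52B active, 405B for the dense Llama baseline); the constant and function names are illustrative, not taken from the paper.

```python
# Parameter counts quoted in the post, in billions.
TOTAL_PARAMS_B = 389   # Hunyuan-Large, all experts combined
ACTIVE_PARAMS_B = 52   # Hunyuan-Large, activated for any single input
DENSE_PARAMS_B = 405   # dense Llama baseline, every parameter is active


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the MoE model's parameters used for a single input."""
    return active_b / total_b


def active_ratio(dense_b: float, active_b: float) -> float:
    """How many times more active parameters the dense model uses."""
    return dense_b / active_b


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    ratio = active_ratio(DENSE_PARAMS_B, ACTIVE_PARAMS_B)
    print(f"Active share of parameters: {share:.1%}")
    print(f"Dense vs. active parameters: {ratio:.1f}x")
```

With these inputs, about 13% of the model's parameters are active per input, and the "8x fewer" figure is a rounding of roughly 7.8x.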
posted an update 12 days ago
🎙️ "We need digital sobriety." @sasha challenges Big Tech's race for nuclear energy on BBC AI Decoded. Instead of pursuing more power, shouldn't we first ask if we really need AI everywhere?

Such an eye-opening chat! Check it out here: https://www.youtube.com/watch?v=3wAduy52mGc

posted an update 13 days ago
First AI Journalism Lab cohort just wrapped - endless inspiration for newsrooms:
- Ludwig Siegele built an AI style checker for The Economist
- Rodney Gibbs created a tool helping small newsrooms analyze stories through user needs
- Monsur Hussain developed an AI trend-monitoring system for fact-checking WhatsApp claims
- David Cohn built a system for analyzing audience engagement
- Clare Spencer crafted video personas with AI

The insights on adoption during the discussion were fascinating - their approach really resonated with me. Instead of forcing AI tools onto teams, they emphasized getting skeptics involved early in testing and creating safe spaces for open discussion. Start small with enthusiastic participants, build a community of internal AI champions, and focus on solving specific problems rather than pushing for adoption.

As a coach, I also learned a lot. My 5 key takeaways:
- Newsrooms are bursting with AI x journalism innovation
- Internal alignment > technical challenges. Strong dev/PM relationships = magic
- Early prototyping + user involvement = better adoption. Set realistic expectations & embrace feedback
- Cross-newsroom collaboration supercharges innovation
- Great products can emerge in weeks with proper scoping

See the projects: https://www.youtube.com/watch?v=5PMxMDfDI_0&

Kudos to Kyle Plantz, Nikita Roy, and the Craig Newmark Graduate School of Journalism at CUNY for making it happen!
posted an update 15 days ago
🔍 NYT leveraged AI to investigate election interference by analyzing 400+ hours of recorded meetings - that's 5M words of data!

AI spotted patterns, humans verified facts. Every AI-flagged quote was manually verified against source recordings. Really appreciate that they published their full methodology - transparency matters when using AI in journalism.

A perfect blend of tech & journalism.

The future of journalism isn't robots replacing reporters - it's AI helping humans process massive datasets more efficiently. Sometimes the most powerful tech solutions are the least flashy ones.

Read the article: https://www.nytimes.com/interactive/2024/10/28/us/politics/inside-the-movement-behind-trumps-election-lies.html?unlocked_article_code=1.Vk4.ucv9.dbHVquTQaf0G&smid=nytcore-ios-share
reacted to albertvillanova's post with 🚀 15 days ago
🚀 Exciting update! You can now compare multiple models side-by-side with the Hugging Face Open LLM Comparator! 📊

open-llm-leaderboard/comparator

Dive into multi-model evaluations, pinpoint the best model for your needs, and explore insights across top open LLMs all in one place. Ready to level up your model comparison game?
reacted to clem's post with 🔥 19 days ago
This is no Woodstock AI but will be fun nonetheless haha. I’ll be hosting a live workshop with team members next week about the Enterprise Hugging Face hub.

1,000 spots available, first-come, first-served, with some surprises during the stream!

You can register and add to your calendar here: https://streamyard.com/watch/JS2jHsUP3NDM
posted an update 20 days ago
🤯 Plot twist: Size isn't everything in AI! A lean 32B parameter model just showed up to the party and outperformed a 70B one. Efficiency > Scale? The AI world just got more interesting...

Cohere For AI released Aya Expanse, a new family of multilingual models (8B and 32B) spanning 23 popular languages.

Models: CohereForAI/c4ai-aya-expanse-671a83d6b2c07c692beab3c3
Blog post: https://huggingface.co/blog/aya-expanse
Demo: CohereForAI/aya_expanse
posted an update 20 days ago
Just watched @thomwolf tear down the over-hyped AGI narrative in 30 seconds - and it's refreshingly grounded.

No wild speculation about superintelligence timelines or consciousness. Just practical insights from someone who really understands the technology.

This is the kind of level-headed perspective that helps us focus on what AI can actually do today (which is already transformative) rather than getting lost in AGI fantasy. Worth your time if you want to understand AI progress without the hype.

Watch the full interview at CogX here: https://www.youtube.com/watch?v=IjL_6Th6Ea0
posted an update 29 days ago
New York Times to Perplexity: Stop Using Our Stuff

The publisher has sent generative-AI startup Perplexity a “cease and desist” notice demanding that the firm stop accessing and using its content, according to a copy of the letter reviewed by The Wall Street Journal.

Perplexity CEO Aravind Srinivas said in an interview that Perplexity isn’t ignoring the Times’s efforts to block crawling of its site. He said the company plans on responding to the legal notice by the Times’s deadline of Oct. 30.

“We are very much interested in working with every single publisher, including the New York Times,” Srinivas said. “We have no interest in being anyone’s antagonist here.”

https://www.wsj.com/business/media/new-york-times-to-bezos-backed-ai-startup-stop-using-our-stuff-20faf2eb?mod=rss_Technology
posted an update about 1 month ago
newsrooms, i see you using deepl or an llm for translation without logging your adjustments. you're wasting gold and you know it's bad!

with this notebook from the argilla team, you can:
- adjust your translations in a neat interface,
- log them to build custom datasets,
- fine-tune your model.

your translation will become better and better, gradually aligning more with your style guide. no more starting from scratch!
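
The logging step in the list above can be approximated without special tooling. The following is a minimal sketch, not the Argilla notebook's code: it appends each machine translation and its human correction to a JSONL file that could later feed a fine-tuning run. The file layout, field names, and both helper functions are assumptions made for illustration.

```python
import json
from pathlib import Path


def log_correction(log_path: Path, source: str, machine: str, edited: str) -> dict:
    """Append one translation correction to a JSONL log and return the record."""
    record = {
        "source": source,    # original text
        "machine": machine,  # raw machine translation
        "edited": edited,    # translation after the editor's adjustments
        "changed": machine.strip() != edited.strip(),
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def load_training_pairs(log_path: Path) -> list[tuple[str, str]]:
    """Return (source, edited) pairs for entries an editor actually changed."""
    pairs = []
    with log_path.open(encoding="utf-8") as f:
        for line in f:
            rec = json.loads(line)
            if rec["changed"]:
                pairs.append((rec["source"], rec["edited"]))
    return pairs
```

Keeping only the changed entries is a design choice: unchanged translations carry no style-guide signal, while the edited ones record exactly where the newsroom's preferences differ from the model's output.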

Notebook by @sdiazlor : https://colab.research.google.com/drive/1sR1wfOs_pNrdm3Mwjo_qJRG7NnEb9a7W#scrollTo=yNm8N5GoRD2o

#AITranslation #JournalismTech
posted an update about 1 month ago
This is how AI can be useful in journalism: Just tested DataTalk - a tool that lets you dig through campaign finance data with just your words.

It's transforming complex FEC filings and OpenSecrets datasets into actionable insights for journalists.

Key features for newsrooms:
- Natural language queries on FEC data
- Rapid insights on donors, spending, special interests
- SQL access for deep dives

Tested it out:
- Retrieved how much Harris and Trump raised
- Found top donors instantly (#1 is Timothy Mellon - have you heard about him?)
- Uncovered big self-funders like David Trone ($62M)

Pros:
- Saves hours of data wrangling
- Surfaces story leads quickly
- Transparent AI retrieval steps make this tool auditable

Awesome work by Stanford University Open Virtual Assistant Lab, Big Local News, and Columbia University - Graduate School of Journalism. Expert-guided.

Remember: Always verify. Use for leads, not final copy. But this is gold for finding new leads.

How might this change campaign finance reporting? What other datasets need this treatment?

Try it out: https://www.datatalk.genie.stanford.edu/

#AIJournalism #campaignfinance #datajournalism #election2024
posted an update about 1 month ago
The Nobel Prize background for Hopfield and Hinton's work on neural networks is pure gold. It's a masterclass in explaining AI basics.

Key takeaways from the conclusion:
- ML applications are expanding rapidly. We're still figuring out which will stick.
- Ethical discussions are crucial as the tech develops.
- Physics 🤝 AI: A two-way street of innovation.

Some mind-blowing AI applications in physics:
- Discovering the Higgs particle
- Cleaning up gravitational wave data
- Hunting exoplanets
- Predicting molecular structures
- Designing better solar cells

We're just scratching the surface. The interplay between AI and physics is reshaping both fields.

Bonus: The illustrations accompanying the background document are really neat. (Credit: Johan Jarnestad/The Royal Swedish Academy of Sciences)

#AI #MachineLearning #Physics #Ethics #Innovation
reacted to clem's post with 🚀 about 1 month ago
Open-source AI creates healthy competition in a field where natural tendencies lead to extreme concentration of power. Imagine a world where only one or two companies could build software. This is the biggest risk and ethical challenge of them all IMO. Let's fight this!
posted an update about 1 month ago
still sending all your info to a black-box api when using an llm? you know it's bad. try this instead: you can run dozens of models from Google, Microsoft, Mistral, Meta, Qwen, and SmolLM right in your browser. private and secure. you'll thank me later.

cfahlgren1/webllm-playground
posted an update about 1 month ago
🔊 Great new tool for audio: Voice Restoration with a Transformer-based Model!

Enable your sound to hear the improvement.

Try it out:
jadechoghari/VoiceRestore
posted an update about 1 month ago
Want to supercharge your journalism with AI but don't know where to start? I've got you covered. 🚀

Ran two workshops at Media Party w/ Brown Institute for Media Innovation at Columbia University this weekend, packed with open-source AI tools for journalists. Thought you might find 'em useful too, so I'm open-sourcing my slides 😉

Here's a taste of what's in the beginner's deck (no-code tools focus):
- Scrape websites without coding
- Analyze bias in AI image generators
- Transcribe audio/video on your device
- Edit images with words
- Extract info from docs, websites, PDFs
- Analyze images, handwriting, videos
- Create custom AI assistants

Tons more in the deck.

👉 Full presentation link: https://docs.google.com/presentation/d/1Q887BhrcrDDgfi0O-Mbx2GI1De2H3GkIiXmd9MvmxYE/edit?usp=sharing

#AIinJournalism #MediaInnovation #OpenSourceAI #MediaParty