Today in Privacy & AI Tooling - introducing a nifty new tool to examine where data goes in open-source apps on 🤗
HF Spaces host hundreds of thousands of cool demos leveraging or examining AI systems - and because most of them are OSS, we can see exactly how they handle user data 👀👀
That requires actually reading the code, though, which isn't always easy or quick! Good news: code LMs have gotten pretty good at automated review, so we can offload some of the work - here I'm using Qwen/Qwen2.5-Coder-32B-Instruct to generate reports, and it works pretty OK 👍
The app works in four stages:
1. Download all of the app's code files
2. Use the code LM to generate a detailed report pointing to the code where data is transferred or (AI-)processed (screen 1)
3. Summarize the app's main functionality and data journeys (screen 2)
4. Build a Privacy TLDR from those inputs
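The LM-backed stages can be sketched roughly as follows - the helper names and prompt wording here are my own illustration, not the Space's actual code, and they assume the files from stage 1 are already downloaded as a `{filename: source}` dict:

```python
# Illustrative sketch of the pipeline above; names and prompts are hypothetical.

CODE_EXTENSIONS = (".py", ".js", ".ts")

def is_code_file(filename: str) -> bool:
    """Stage 1 filter: keep only source files worth reviewing."""
    return filename.endswith(CODE_EXTENSIONS)

def build_report_prompt(files: dict) -> str:
    """Stage 2: one prompt asking the code LM to point to every place
    where user data is transferred or (AI-)processed."""
    sections = "\n\n".join(
        f"### {name}\n{code}" for name, code in sorted(files.items())
    )
    return (
        "For each file below, list the lines where user data is sent to an "
        "external service or passed to an AI model, and explain briefly.\n\n"
        + sections
    )

def build_tldr_prompt(report: str, summary: str) -> str:
    """Stage 4: combine the detailed report (stage 2) with the functionality
    summary (stage 3) into a request for a short Privacy TLDR."""
    return (
        "Write a short privacy TLDR for end users, based on this code review "
        f"report and app summary.\n\n## Report\n{report}\n\n## Summary\n{summary}"
    )

# The prompts would then be sent to the code LM
# (e.g. Qwen/Qwen2.5-Coder-32B-Instruct via an inference endpoint).
files = {"app.py": "resp = requests.post(API_URL, json=user_input)", "README.md": "docs"}
prompt = build_report_prompt({k: v for k, v in files.items() if is_code_file(k)})
```

The model call itself is left out so the sketch stays self-contained; in practice each prompt would go to the hosted code LM and the responses would feed the two report screens and the final TLDR.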
It comes with a bunch of pre-reviewed apps/Spaces - great to see how many process data locally or through (private) HF endpoints 🤗
📫 ...And we're live! 📫 Seasonal newsletter from ethicsy folks at Hugging Face, exploring the ethics of "AI Agents": https://huggingface.co/blog/ethics-soc-7
Our analyses found:
- There's a spectrum of "agent"-ness
- *Safety* is a key issue, leading to many other value-based concerns
Read for details & what to do next! With @evijit, @giadap, and @sasha
🤖🤖 💻 Speaking of AI agents... ...is easier with the right words ;)
My colleagues @meg, @evijit, @sasha, and @giadap just published a wonderful blog post outlining some of the main relevant notions with their signature blend of value-informed and risk/benefit-contrasting analysis. Go have a read!
🇪🇺 Policy Thoughts on EU AI Act Implementation 🇪🇺
There is a lot to like in the first draft of the EU GPAI Code of Practice, especially regarding transparency requirements. The Systemic Risks section, on the other hand, is concerning for both smaller developers and external stakeholders.
I wrote more on this topic ahead of the next draft. TLDR: more attention to immediate large-scale risks and to collaborative solutions supported by evidence can help everyone - as long as developers disclose sufficient information about their design choices and deployment contexts.
👷🏽‍♀️📝🚨 Announcing the Foundation Model Development Cheatsheet!
My first 🤗Post🤗 ever, announcing the release of a fantastic collaborative resource to support model developers across the full development stack: the FM Development Cheatsheet, available here: https://fmcheatsheet.org/
The cheatsheet is a growing database of the many crucial resources coming from open research and development efforts to support the responsible development of models. This new resource highlights essential yet often underutilized tools to make it as easy as possible for developers to adopt best practices, covering among other aspects:
🧑🏼‍🤝‍🧑🏼 data selection, curation, and governance;
📖 accurate and limitations-aware documentation;
⚡ energy efficiency throughout the training phase;
📊 thorough capability assessments and risk evaluations;
🌍 environmentally and socially conscious deployment strategies.
We strongly encourage developers working on creating and improving models to make full use of the tools listed here, and to help keep the resource up to date by adding the resources you have developed or found useful in your own practice 🤗