Michael Anthony

MikeDoes

AI & ML interests

Privacy, Large Language Models, Explainable AI

Recent Activity

reacted to their post with 🧠 2 days ago
reacted to their post with 👀 2 days ago
posted an update 2 days ago
In data privacy, 92% accuracy is not an A-grade. Privacy AI needs to be better.

That's the stark takeaway from a recent benchmark by Diego Mouriño (Making Science), who put today's top PII detection methods to the test on call center transcripts using the Ai4Privacy dataset. The benchmark pitted cutting-edge LLMs (like GPT-4 and Gemini) against traditional systems (like cloud DLPs). The results show that our trust in these tools might be misplaced.

📊 The Hard Numbers:
- Even top-tier LLMs peaked at a reported 92% accuracy, leaving a potentially dangerous 8% gap where your customers' data can leak. They particularly struggled with basics like last names and street addresses.
- The old guard? Traditional rule-based systems reportedly achieved a shocking 50% accuracy. A coin toss with your customers' privacy.

This tells us that for privacy tasks, off-the-shelf accuracy is a vanity metric. The real metric is the cost of a single failure: one leaked name, one exposed address.

While no tool is perfect, some are better than others. Diego's full analysis breaks down which models offer the best cost-to-accuracy balance in this flawed landscape. It's a must-read for anyone serious about building trustworthy AI.

#DataPrivacy #AI #LLM #RiskManagement #MetricsThatMatter #InfoSec

Find the full post here: https://www.makingscience.com/blog/protecting-customer-privacy-how-to-remove-pii-from-call-center-transcripts/
Dataset: https://huggingface.co/datasets/ai4privacy/pii-masking-400k/tree/main
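The per-entity failures (last names, street addresses) are worth checking against your own detector rather than trusting a single aggregate number. Below is a minimal sketch of how one might compute per-entity recall on the Ai4Privacy dataset. The field names (`source_text`, `privacy_mask` with `label`/`value` keys), the `validation` split, and the `detect_pii` stub are assumptions for illustration, not details confirmed by the post; check the dataset card and plug in your own detector before relying on the numbers.

```python
# Minimal sketch: per-entity-type recall for a PII detector on the
# ai4privacy/pii-masking-400k dataset. Field names ("source_text",
# "privacy_mask" with "label"/"value" keys) and the split name are
# assumptions about the schema -- verify against the dataset card.
from collections import defaultdict
from datasets import load_dataset


def detect_pii(text: str) -> list[str]:
    """Placeholder detector: return the PII strings found in `text`.

    Swap in the system under test here (LLM prompt, cloud DLP API,
    regex rules, ...).
    """
    return []


def per_entity_recall(split: str = "validation", limit: int = 1000) -> dict[str, float]:
    ds = load_dataset("ai4privacy/pii-masking-400k", split=split)
    found: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)

    for row in ds.select(range(min(limit, len(ds)))):
        detected = detect_pii(row["source_text"])
        # Assumed schema: privacy_mask is a list of {"label": ..., "value": ...} spans.
        for span in row["privacy_mask"]:
            label, value = span["label"], span["value"]
            total[label] += 1
            # Crude string-containment match; a real evaluation would align spans.
            if any(value in d for d in detected):
                found[label] += 1

    return {label: found[label] / total[label] for label in total}


if __name__ == "__main__":
    # Print the worst-recalled entity types first.
    for label, recall in sorted(per_entity_recall().items(), key=lambda kv: kv[1]):
        print(f"{label:<20} recall={recall:.2%}")
```

Breaking the score out by entity type surfaces exactly the failure mode the post calls out: an aggregate 92% can coexist with much weaker recall on specific fields like last names or street addresses.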

Organizations

Ai4Privacy, Social Post Explorers, Mistral AI Game Jam, AI STATUS CODES