Future of AI

AI Safety & Ethics

Alignment, interpretability, AI risks, and building safe AI systems

7,271 lessons
Skills in this topic
AI Alignment Basics (beginner): Explain the alignment problem
AI Ethics & Policy (beginner): Identify types of bias in ML systems
AI Safety Engineering (intermediate): Implement input and output guardrails

Showing 611 reads from curated sources

Medium · ChatGPT 🛡️ AI Safety & Ethics ⚡ AI Lesson 51m ago
DISPATCHES
The Age of Sanitised AI
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 2h ago
Your Team Is the Part That Makes AI Safe
(And Most Founders Are About to Lose It)
Forbes Innovation 🛡️ AI Safety & Ethics ⚡ AI Lesson 2h ago
Federal Prosecutors Indicted An Innocent Person On A Deepfake
How did a deepfake indict an innocent person in federal court without anyone catching it? The first federal survey of judges on deepfake challenges just dropped
Medium · Data Science 🛡️ AI Safety & Ethics ⚡ AI Lesson 3h ago
The Human-in-the-Loop Trap
Most enterprise AI teams treat human-in-the-loop as a compliance checkbox.
Medium · Machine Learning 🛡️ AI Safety & Ethics ⚡ AI Lesson 4h ago
Beyond Accuracy: Why Clinical AI Must Learn to Communicate Uncertainty
Why reliable AI systems must communicate confidence, instability, and uncertainty before humans can safely trust them.
Daring Fireball 🛡️ AI Safety & Ethics ⚡ AI Lesson 4h ago
Geoffrey Fowler and the Launch of the Youth AI Safety Institute
Geoffrey Fowler, on his blog, which, alas, he calls “a Substack”: I’m joining the Youth AI Safety Institute as its first new employee. It’s a research and te
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 4h ago
The Missing Middle in AI Safety: How PRISM Turns Interest Into Published Research
Applications for PRISM research fellows are open until May 25, 2026.
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 4h ago
A Teenager Died Following ChatGPT’s Advice.
A 14-year-old in Florida asked ChatGPT about drug combinations. The model told him he would be “OK.” He died that night. Today’s lawsuit…
Hacker News 🛡️ AI Safety & Ethics ⚡ AI Lesson 7h ago
The Artificial Intelligence Commission [pdf]
Article URL: https://download.ssrn.com/2026/4/20/6615258.pdf?response-content-disposition=inline&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEJr%2F%2F%2F%2F%2F%2F%2F%2
ZDNet 🛡️ AI Safety & Ethics ⚡ AI Lesson 7h ago
Anthropic's Mythos is evolving faster than expected, reports AI safety agency
Only a month after its initial release, Anthropic's storied Mythos model is breaking new testing boundaries.
Medium · Machine Learning 🛡️ AI Safety & Ethics ⚡ AI Lesson 8h ago
Shadow AI: The Invisible Risk Already Inside Your Organization
Your employees are using AI. Just not the AI you approved.
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 10h ago
“AI is an Apparatus” – A Warning from the Gods
How Restriction Protects Our Being
Dev.to AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 13h ago
How to Protect PII/PHI in AI Systems
How to Protect PII/PHI in AI Systems: A Founder's Perspective. Navigating the Complexity of PII/PHI Protection in AI. Imagine waking up to headlines of a major da
Medium · ChatGPT 🛡️ AI Safety & Ethics ⚡ AI Lesson 14h ago
OpenAI Faces Class-Action Privacy Lawsuit Over Alleged Data Sharing Practices
Artificial Intelligence continues to reshape how organizations work, communicate, and innovate. However, as AI adoption accelerates…
Dev.to AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 15h ago
Standardization in AI Governance: ISO/IEC 42001
Standardization in AI Governance: The Rise of ISO/IEC 42001. Navigating the complex landscape of AI governance is like trying to solve a Rubik's Cube with one ha
MIT Technology Review 🛡️ AI Safety & Ethics ⚡ AI Lesson 16h ago
The shock of seeing your body used in deepfake porn
When Jennifer got a job doing research for a nonprofit in 2023, she ran her new professional headshot through a facial recognition program. She wanted to see if
Dev.to AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 19h ago
When Control Becomes Authority: Calibration Governance in STEM BIO-AI 1.7.x
Control slowly becomes authority when nobody marks the boundary. That is the calibration problem I kept running into while building STEM BIO-AI. At first, STEM
ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 21h ago
DisaBench: A Participatory Evaluation Framework for Disability Harms in Language Models
arXiv:2605.12702v1. Abstract: General-purpose safety benchmarks for large language models do not adequately evaluate disability-related harms.
ArXiv cs.AI 🛡️ AI Safety & Ethics 📄 Paper ⚡ AI Lesson 21h ago
Sustaining AI safety: Control-theoretic external impossibility, intrinsic necessity, and structural requirements
arXiv:2605.12963v1. Abstract: As AI systems become increasingly capable, safety strategies must be evaluated not only by how much they reduce
Dev.to · Stevie G 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
Foreboding AI, One Year Later: What Are We Really Building?
A year ago, I wrote “Foreboding AI: The Inevitable Collapse We’re Funding Ourselves.” At the time, my...
Medium · Machine Learning 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
Why “Trust in AI” Is the Wrong Metric: What the 2025 Global Dialogues Data Is Actually Telling Us!
A cross-national analysis of the 21-point gap between AI tool trust and institutional trust, and what it means for governance.
Medium · Machine Learning 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
Yuandong Tian, Grokking, and the New Rule of Human Value in AI
A former Meta FAIR research director asks the question most people are avoiding: if AI becomes the center of production, what exactly are…
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
I Recently Started Researching the AI SaaS Space.
On AI SDRs, the inbound conversion problem, and what the market is actually saying.
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
AI Doesn’t Need to Want You Dead
A recent research paper reframes the AI threat without a single Hollywood trope. That’s exactly why it’s worth reading.
Dev.to · Adrian Alexandru Stinga 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
The AI Persona Problem: Your Next Threat Actor Doesn't Exist
Let me say something that will make most security vendors uncomfortable: The traditional "know your...
Dev.to · 晖丁 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
I Built an AI That Tries to Phish Me Every Week — Here's What I Learned
A personal experiment in phishing awareness: AI-generated phishing emails delivered to my real inbox every week. After 3 months, my click rate dropped from 25%
Dev.to AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
Hackers Used AI to Develop First Known Zero-Day 2FA Bypass for Mass Exploitation
Google has disclosed the discovery of a zero-day exploit weaponized by an unknown threat actor using an AI system, marking a significant milestone in malicious
Dev.to AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
GTIG AI Threat Tracker: Adversaries Leverage AI for Vulnerability Exploitation, Augmented Operations, and Initial Access
⚠️ Region Alert: UAE/Middle East. The Google Threat Intelligence Group (GTIG) report highlights a significant shift in the threat landscape, where adversaries ha
Dev.to · Ayush Singh 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
Your LLM Is Being Attacked Right Now — Here's What's Happening
You shipped an AI feature. It works great. Then someone types something weird — and your model does...
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
OpenAI Dissolved Its Safety Teams. Then 75 Employees Cashed Out $30 Million Each.
What the Musk v. Altman trial testimony and the $6.6 billion tender offer reveal, read together and three questions worth asking before…
Medium · Machine Learning 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
3 Points We Need to Address to Advance AI for Sustainability
AI for sustainability is a growing field that evokes both enthusiasm and skepticism.
Medium · Cybersecurity 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
The Invisible Needle in the Never-Ending Haystack: Where AI Stops Being Optional
Some haystacks are too large to search by hand, not because we lack the patience, but because the haystack keeps growing and the needle…
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 1d ago
When the Pattern Looks Like a Threat: Is AI Safe, or Does It Just Look Safe?
What an unintended jailbreak revealed about how AI safety really works
Medium · LLM 🛡️ AI Safety & Ethics ⚡ AI Lesson 2d ago
From Ingestion to Final Verdict: THREATRADAR’s Poisoning Detection Pipeline
Welcome to the fourth article in the THREATRADAR series. We recommend reading Part 1 Design and Implementation of THREATRADAR: Open-Source…
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 2d ago
How to Design an Organization So Employees Do Not Accidentally or Intentionally Leak Sensitive Data…
Introduction: AI Is Now a Data Governance Challenge
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 2d ago
AI Manipulation and Mind Control: How Algorithms Are Quietly Shaping Human Thoughts, Decisions, and…
Most people think they’re making independent decisions online. They believe the videos they watch, the products they buy, and even the…
Medium · Machine Learning 🛡️ AI Safety & Ethics ⚡ AI Lesson 2d ago
When Government Information Degrades Over Time in AI Systems
Why repeated interpretation causes drift, and why structure becomes necessary to preserve meaning
Medium · AI 🛡️ AI Safety & Ethics ⚡ AI Lesson 2d ago
The New Architect of Discovery: Why Research Still Needs a Human Perspective
Data is everywhere. But meaning? That’s becoming harder to find.
Medium · ChatGPT 🛡️ AI Safety & Ethics ⚡ AI Lesson 2d ago
A Teen Asked ChatGPT About Drugs. Months Later, He Was Dead.
The lawsuit against OpenAI may become one of the most important AI safety cases we’ve seen so far, not because of what the chatbot said…
Medium · Machine Learning 🛡️ AI Safety & Ethics ⚡ AI Lesson 2d ago
Fooling Machine Learning: Notes on Adversarial Attacks
Picture a stop sign. Someone has stuck a few strips of black and white tape across it. Cheap tape, the kind you would walk past without…