AI Safety Engineering

Implement guardrails, red-team prompts, and build safer AI applications.

intermediate 🛡️ AI Safety & Ethics

After this skill you can…

Implement input and output guardrails
Red-team a deployed LLM application
Use Llama Guard or NeMo Guardrails
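
As a rough illustration of the first objective, the sketch below wraps a model call with an input check and an output filter. It is a minimal sketch under stated assumptions, not how Llama Guard or NeMo Guardrails work: the pattern lists, `GuardrailResult`, `check_input`, `filter_output`, and `guarded_call` are all hypothetical names invented for this example, and the model is any callable that maps a prompt string to a completion string.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical deny-list for illustration only; production systems usually
# pair patterns like these with a trained safety classifier.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
]

# Deliberately simple email matcher used to redact addresses from outputs.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class GuardrailResult:
    allowed: bool
    text: str
    reason: str = ""


def check_input(prompt: str) -> GuardrailResult:
    """Input guardrail: block prompts that match a known injection pattern."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            return GuardrailResult(False, "", f"matched {pattern.pattern!r}")
    return GuardrailResult(True, prompt)


def filter_output(completion: str) -> GuardrailResult:
    """Output guardrail: redact email addresses before returning text."""
    redacted = EMAIL_PATTERN.sub("[REDACTED EMAIL]", completion)
    return GuardrailResult(True, redacted)


def guarded_call(model: Callable[[str], str], prompt: str) -> str:
    """Run the input check, call the model, then filter its output."""
    checked = check_input(prompt)
    if not checked.allowed:
        return "Request blocked by input guardrail."
    return filter_output(model(checked.text)).text
```

Keeping the model behind a plain callable means the same two checks can sit in front of any backend, and a classifier-based check could later replace the regular expressions without changing `guarded_call`.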

Prerequisites

AI Alignment Basics

Watch (10 videos)

I Broke Threads

John Hammond · intermediate hands-on

→ Design and test secure systems to prevent crashes and errors
→ Develop and implement safety protocols for app development
→ Analyze and mitigate potential security risks

How #AirForce train to survive #plane crashes in water. #military #survivalskills

Business Insider · intermediate hands-on

→ Develop emergency procedures
→ Conduct safety training

Simplify and Accelerate your Google Cloud Security Operations with Wiz and Gemini

Google Cloud · intermediate hands-on

→ Integrate Google SecOps with Wiz and Gemini
→ Simplify security operations

From Assistant to Adversary: When Agentic AI Becomes an Insider Threat

SANS Institute · intermediate hands-on

→ Design least-privilege agents
→ Implement real-time policy guards

Keynote | Threat Modeling Agentic AI Systems: Proactive Strategies for Security and Resilience

SANS Institute · beginner hands-on

→ Design and deploy secure agentic AI systems
→ Ensure safety and reliability

Will AI take over the world?

HuggingFace · advanced hands-on

→ Develop secure AI systems
→ Test AI systems for safety
→ Improve AI reliability

5 essential preventative controls for Generative AI workloads | Amazon Web Services

Amazon Web Services · advanced hands-on

→ Design secure and well-governed AWS environments
→ Enforce consistent permissions and audit access

Miao (Mia) Zhang - Common-Sense Bias Discovery and Mitigation for Classification Tasks

Cohere · advanced hands-on

→ Implement common-sense bias discovery and mitigation in image classification
→ Adjust sampling weights for bias mitigation

How Would You Implement Guardrails For An LLM Application? #Shorts #LLM #GfG #GeeksforGeeks

GeeksforGeeks · intermediate hands-on

→ Implement guardrails for LLM applications
→ Validate inputs for LLMs
→ Filter outputs for LLMs

Safeguard your users and brand with W&B Weave Guardrails

Weights & Biases · intermediate hands-on

→ Ensure AI system safety with guardrails
→ Prevent harmful outputs from AI agents

Read (10 articles)

⚠️ AI with a survival instinct? Claude once tried blackmail — now models are lying to avoid being shut down

Dev.to · FJRG2007 ツ · 2025-07-10

The GPT-5 Paradox: Genius in Thought, Gaps in Safety

Dev.to · Ayush kumar · 2025-08-14

ChatGPT Safety: Parental Controls, GPT-5 Routing, and Crisis Handling

Dev.to · Ali Farhat · 2025-09-02

The Perilous Pursuit of Superintelligence: Heeding Mustafa Suleyman's AI Safety Warning

Dev.to · Yathin Chandra · 2025-09-14

AI Unleashed, Privacy Preserved: The Future of Secure LLMs by Arvind Sundararajan

Dev.to · Arvind Sundara Rajan · 2025-09-14

Claude 4.6 Opus: Advanced Reasoning or a New Monitoring Nightmare?

Dev.to · Claudius Papirus · 2026-02-08

Measuring Model Hallucinations: When AI Invents Facts

Dev.to · Erica · 2026-02-08

"Add a Kill Switch to Any AI Agent in 5 Lines of Python"

Dev.to · DC · 2026-02-08

Engineering-Grade Strategies for Hallucination Prevention in GenAI Systems

Dev.to · Shreekansha · 2026-02-09

How Constraints Make Automation Safer

Dev.to · Automation Systems Lab · 2026-02-11