Applied AI

MLOps & LLMOps

Model deployment, experiment tracking, monitoring, inference optimisation and AI pipelines

250 lessons

Skills in this topic

5 skills

Experiment Tracking

Log experiments with MLflow or Weights & Biases

Model Deployment

Wrap a model in a FastAPI endpoint

Model Monitoring

Set up drift detection with Evidently AI

Define feature views in Feast

Set up LangSmith or Langfuse for LLM tracing

Videos 23 Reads 227

ArXiv cs.AI 🏭 MLOps & LLMOps 📄 Paper ⚡ AI Lesson 2mo ago

Revealing Domain-Spatiality Patterns for Configuration Tuning: Domain Knowledge Meets Fitness Landscapes

arXiv:2603.19897v1. Configuration tuning for better performance is crucial in quality assurance. Yet, there has long been a myster

InfoQ AI/ML 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Spring News Roundup: Third Milestone Releases of Boot, Security, Integration, AI and AMQP

There was a flurry of activity in the Spring ecosystem during the week of March 16th, 2026, highlighting the third milestone releases of: Spring Boot, Spring Se

Forbes Innovation 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

‘Major Issues’—Microsoft Confirms Emergency Update For Windows Users

Microsoft update breaks Windows — emergency fix suddenly released and available now.

InfoQ AI/ML 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

QCon London 2026: Introducing Tansu.io — Rethinking Kafka for Lean Operations

Peter Morgan introduced Tansu at QCon London, an open-source, Kafka-compatible, stateless, leaderless broker that scales to zero, with pluggable storage (S3, SQ

The Verge 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Microsoft is ending the Windows Update nightmare — and letting you pause them indefinitely

In 2015, Microsoft decided that you shouldn't be in control of updating your PC anymore. At first, it seemed like a good idea to keep malware at bay - but soon,

ZDNet AI 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Chainguard is racing to fix trust in AI-built software - here's how

Chainguard is expanding beyond open-source security to protect open-core software, AI agent skills, and GitHub Actions.

InfoQ AI/ML 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Article: Configuration as a Control Plane: Designing for Safety and Reliability at Scale

Configuration has evolved from static deployment files into a live control plane that directly shapes system behavior. The evolution of configuration management

Forbes Innovation 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Why Vibe Coders Still Need To Think Like Software Engineers

This article explains why project scoping, architecture, testing, and human oversight remain essential, even as AI changes how software gets built.

ZDNet AI 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

6 reasons a minimal Linux install might be the smartest move you make

It turns out, there are reasons why those tiny 'minimal install' options are available on Linux.

AWS Machine Learning 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Enhanced metrics for Amazon SageMaker AI endpoints: deeper visibility for better performance

SageMaker AI endpoints now support enhanced metrics with configurable publishing frequency. This launch provides the granular visibility needed to monitor, trou

AWS Machine Learning 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Enforce data residency with Amazon Quick extensions for Microsoft Teams

In this post, we will show you how to enforce data residency when deploying Amazon Quick Microsoft Teams extensions across multiple AWS Regions. You will learn

Towards Data Science 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Vibe Coding with AI: Best Practices for Human-AI Collaboration in Software Development

Accelerate coding with AI while staying in control and building reliable, production-ready software.

InfoQ AI/ML 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Microsoft Adds DRA-Backed NVIDIA vGPU Support to AKS

The Azure Kubernetes Service team shared a detailed guide on how to use Dynamic Resource Allocation (DRA) with NVIDIA vGPU technology on AKS. This update improve

InfoQ AI/ML 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

QCon London 2026: Wrangling Telemetry at Scale, a Guide to Self-Hosted Observability

At QCon London 2026, Colin Douch discussed building and operating self-hosted monitoring stacks, surveyed the current tooling landscape, and explained how to bu

OpenAI News 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

OpenAI to acquire Astral

Accelerates Codex growth to power the next generation of Python developer tools

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Securing Enterprise AI with Weaviate

A complete guide on how to secure Weaviate enterprise deployments with OIDC, RBAC, and multi-tenant isolation.

InfoQ AI/ML 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

QCon London 2026: SBOMs Move From Best Practice to Legal Obligation as CRA Enforcement Looms

In a talk at QCon London 2026, Viktor Petersson argued that software teams are running out of time to adopt SBOMs (Software Bills of Materials) due to pending l

KDnuggets 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

AIOps 101: The 3 Pillars of Reliably Deploying AI Models (Sponsored)

In the lab, your AI model might seem perfect, but the real world is often where it breaks.

AWS Machine Learning 🏭 MLOps & LLMOps ⚡ AI Lesson 2mo ago

Build an offline feature store using Amazon SageMaker Unified Studio and SageMaker Catalog

This blog post provides step-by-step guidance on implementing an offline feature store using SageMaker Catalog within a SageMaker Unified Studio domain. By adop

AWS Machine Learning 🏭 MLOps & LLMOps ⚡ AI Lesson 3mo ago

Improve operational visibility for inference workloads on Amazon Bedrock with new CloudWatch metrics for TTFT and Estimated Quota Consumption

Today, we’re announcing two new Amazon CloudWatch metrics for Amazon Bedrock, TimeToFirstToken and EstimatedTPMQuotaUsage. In this post, we cover how these work

GitHub Engineering 🏭 MLOps & LLMOps ⚡ AI Lesson 3mo ago

How we rebuilt the search architecture for high availability in GitHub Enterprise Server

Here's how we made the search experience better, faster, and more resilient for GHES customers.

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 3mo ago

Weaviate 1.36 Release

This release introduces HFresh vector index (Preview), and brings Server-side Batching, Object TTL, Async Replication Improvements, Drop Inverted Indices, and B

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 3mo ago

Weaviate Authentication & Authorization: A Complete Security Guide

Learn how to secure your Weaviate vector database with API keys, OIDC, and role-based access control (RBAC). Includes practical examples and setup steps.

Engineering at Meta 🏭 MLOps & LLMOps ⚡ AI Lesson 4mo ago

Building Prometheus: How Backend Aggregation Enables Gigawatt-Scale AI Clusters

We’re sharing details of the role backend aggregation (BAG) plays in building Meta’s gigawatt-scale AI clusters like Prometheus. BAG allows us to seamlessly con

GitHub Engineering 🏭 MLOps & LLMOps ⚡ AI Lesson 4mo ago

When protections outlive their purpose: A lesson on managing defense systems at scale

User feedback led us to clean up outdated mitigations. See why observability and lifecycle management are critical for defense systems.

OpenAI News 🏭 MLOps & LLMOps ⚡ AI Lesson 5mo ago

Datadog uses Codex for system-level code review

OpenAI News 🏭 MLOps & LLMOps ⚡ AI Lesson 6mo ago

How We Used Codex to Ship Sora for Android in 28 Days

OpenAI shipped Sora for Android in 28 days using Codex. AI-assisted planning, translation, and parallel coding workflows helped a nimble team deliver rapid, rel

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 6mo ago

Announcing the new Weaviate Java Client v6

The Weaviate Java client v6 is now generally available! This release brings a completely redesigned API that embraces modern Java patterns, simplifies common op

OpenAI News 🏭 MLOps & LLMOps ⚡ AI Lesson 6mo ago

Inside JetBrains—the company reshaping how the world writes code

JetBrains is integrating GPT-5 across its coding tools, helping millions of developers design, reason, and build software faster.

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 7mo ago

Weaviate security release - Medium and High severity fixes for CVE-2025-67818 and CVE-2025-67819

Weaviate announces two CVEs that are fixed in updated versions of our product.

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 8mo ago

Weaviate is now ISO 27001 compliant

Announcing Weaviate has achieved ISO 27001 compliance.

OpenAI News 🏭 MLOps & LLMOps ⚡ AI Lesson 8mo ago

Introducing upgrades to Codex

Codex just got faster, more reliable, and better at real-time collaboration and tackling tasks independently anywhere you develop—whether via the terminal, IDE,

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 9mo ago

Using Weaviate Cloud Queries in MacOS apps

A practical guide on using Weaviate Cloud Queries in MacOS apps.

Replicate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 9mo ago

Torch compile caching for inference speed

Cache your compiled models for faster boot and inference times

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 9mo ago

Evals and Guardrails in Enterprise workflows (Part 1)

Evals and Guardrails in enterprise workflows part 1

Hugging Face Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 10mo ago

Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training

Hugging Face Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 10mo ago

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 11mo ago

Introducing the New Weaviate Confluent Apache Kafka® Connector: Real-Time Vector Data Pipelines Made Easy

Learn about the new certified Weaviate Confluent Apache Kafka Connector!

Hugging Face Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 11mo ago

Three Mighty Alerts Supporting Hugging Face’s Production Infrastructure

GitHub Engineering 🏭 MLOps & LLMOps ⚡ AI Lesson 1y ago

How GitHub engineers tackle platform problems

Our best practices for quickly identifying, resolving, and preventing issues at scale.

OpenAI News 🏭 MLOps & LLMOps ⚡ AI Lesson 1y ago

Shipping code faster with o3, o4-mini, and GPT-4.1

CodeRabbit uses OpenAI models to revolutionize code reviews—boosting accuracy, accelerating PR merges, and helping developers ship faster with fewer bugs and hi

OpenAI News 🏭 MLOps & LLMOps ⚡ AI Lesson 1y ago

Introducing Stargate UAE

We’re launching Stargate UAE – the first international deployment of Stargate, OpenAI’s AI infrastructure platform.

Replicate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 1y ago

NVIDIA H100 GPUs are here

NVIDIA H100 GPUs are here, with better performance and lower cost.

Hugging Face Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 1y ago

How to Build an MCP Server with Gradio

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 1y ago

Building AI Search APIs with Hono.js

Master modern search capabilities by building a powerful API server with Hono.js. Learn how to implement vector, hybrid, and generative search while maintaining

Weaviate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 1y ago

Hardening your Weaviate OSS Installation

Moving your Weaviate OSS instance into production? Check out our handy guide to securing your Weaviate database in the cloud, including all the helpful modules

OpenAI News 🏭 MLOps & LLMOps ⚡ AI Lesson 1y ago

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

We introduce MLE-bench, a benchmark for measuring how well AI agents perform at machine learning engineering.

Replicate Blog 🏭 MLOps & LLMOps ⚡ AI Lesson 1y ago

FLUX is fast and it's open source

FLUX is now much faster on Replicate, and we’ve made our optimizations open-source so you can see exactly how they work and build upon them.