Applied AI
Image & Video AI
Stable Diffusion, Midjourney, DALL-E, Sora, ControlNet and AI video generation
ArXiv cs.AI
3d ago
Histogram-constrained Image Generation
arXiv:2606.31683v1 Announce Type: cross Abstract: Diffusion models have emerged as a dominant paradigm in generative modeling, enabling high-fidelity sampling f
ArXiv cs.AI
4d ago
Learning to Adaptively Allocate Gaussians for Arbitrary-Scale Image Super-Resolution
arXiv:2606.29400v1 Announce Type: cross Abstract: In computer graphics, visual content is continuously warped, zoomed and resampled. This occurs when engines up
ArXiv cs.AI
4d ago
Resonant Brane Splatting for Arbitrary-Scale Super-Resolution
arXiv:2606.29453v1 Announce Type: cross Abstract: Arbitrary-Scale Super-Resolution (ASR) reconstructs images at continuous magnification factors. Recent methods
ArXiv cs.AI
5d ago
Home3D 1.0: A High-Fidelity Image-to-3D Asset Generation System for Interior Design
arXiv:2606.27923v1 Announce Type: cross Abstract: We present Home3D 1.0, a modular image-to-3D generation system that produces high-quality 3D assets from a sin
ArXiv cs.AI
5d ago
BiDeMem: Bidirectional Degradation Memory for Explainable Image Restoration
arXiv:2606.28112v1 Announce Type: cross Abstract: Degradation-aware prompts, conditions, and latent priors are increasingly used in image restoration, yet they
ArXiv cs.AI
5d ago
Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook
arXiv:2411.19537v2 Announce Type: replace-cross Abstract: We survey deepfake generation and detection techniques, covering all deepfake media types: image, vide
ArXiv cs.AI
1w ago
Scaling Multi-Reference Image Generation with Dynamic Reward Optimization
arXiv:2606.26947v1 Announce Type: cross Abstract: While personalized image generation has achieved remarkable progress, multi-reference image generation (MRIG)
ArXiv cs.AI
1w ago
Safe Autoregressive Image Generation with Iterative Self-Improving Codebooks
arXiv:2606.27147v1 Announce Type: cross Abstract: Unlike diffusion-based models that operate in continuous latent spaces, autoregressive unified multimodal mode
ArXiv cs.AI
1w ago
BELDE: Building a Large-scale Earth-observation Land-cover Dataset for Europe
arXiv:2606.20909v1 Announce Type: cross Abstract: Earth observation imagery plays a critical role in environmental monitoring, urban planning, disaster assessme
ArXiv cs.AI
1w ago
Text-to-Image Generative AI for Modeling and Simulation: Methods, Opportunities, and Applications
arXiv:2606.20991v1 Announce Type: cross Abstract: Text-to-image generation is a form of generative artificial intelligence (GenAI) that converts textual descrip
ArXiv cs.AI
1w ago
MoECodec: Image Compression for joint human and machine perception via Mixture-of-Experts
arXiv:2606.21033v1 Announce Type: cross Abstract: Image compression for machines calls for a unified codec that serves multiple downstream vision tasks. Existin
ArXiv cs.AI
1w ago
Semantic Browsing: Controllable Diversity for Image Generation
arXiv:2606.23679v1 Announce Type: cross Abstract: Modern text-to-image models excel in visual fidelity and prompt adherence. However, this strict adherence come
ArXiv cs.AI
1w ago
Hierarchical Concept-to-Appearance Guidance for Multi-Subject Image Generation
arXiv:2602.03448v2 Announce Type: replace-cross Abstract: Multi-subject image generation aims to synthesize images that faithfully preserve the identities of mu
ArXiv cs.AI
2w ago
SketchXplain: Intuitive Visual Explanations of Image Classifiers with Sketches
arXiv:2606.17646v1 Announce Type: cross Abstract: Saliency map visualizations explain image-based AI predictions by pointing to regions, but these are often uni
ArXiv cs.AI
2w ago
ActiveSAM: Image-Conditional Class Pruning for Fast and Accurate Open-Vocabulary Segmentation
arXiv:2606.16996v1 Announce Type: cross Abstract: Segment Anything Model 3 (SAM 3) provides a strong frozen backbone for concept-prompted segmentation, but appl
ArXiv cs.AI
3w ago
Visual-TCAV: Concept-based Attribution and Saliency Maps for Post-hoc Explainability in Image Classification
arXiv:2411.05698v3 Announce Type: replace-cross Abstract: Convolutional Neural Networks (CNNs) have shown remarkable performance in image classification. Howeve
ArXiv cs.AI
3w ago
CleanPatrick: A Benchmark for Image Data Cleaning
arXiv:2505.11034v2 Announce Type: replace-cross Abstract: Robust machine learning depends on clean data, yet current image data cleaning benchmarks rely on synt
ArXiv cs.AI
3w ago
ZIPP: Zero-shot Image Personalization from Personas
arXiv:2606.08841v1 Announce Type: new Abstract: Text-to-image diffusion models are increasingly deployed in open-ended creative contexts, yet their outputs rema
ArXiv cs.AI
3w ago
Page image classifier fine-tuned on century-spanning archives of scanned documents for further content-specific processing
arXiv:2606.07558v1 Announce Type: cross Abstract: Purpose: Digitization projects in the humanities produce vast, heterogeneous archives of historical documents,
ArXiv cs.AI
3w ago
MedVision: Benchmarking Quantitative Medical Image Analysis
arXiv:2511.18676v2 Announce Type: replace-cross Abstract: Current vision-language models (VLMs) in medicine are primarily designed for categorical question answ
Reddit r/MachineLearning
3w ago
Open image generation models are closer to closed-source quality than this sub thinks [D]
I run evaluations on generative image models as part of my workflow, mostly comparing coherence, prompt adherence, and compositional accuracy across different a
ArXiv cs.AI
3w ago
STREAM: Stochastic Riemannian Flow Matching with Anisotropic Decoder for Digital Histopathology Image Generation
arXiv:2606.07036v1 Announce Type: cross Abstract: Synthetic histopathology image generation addresses critical challenges in computational pathology, including
ArXiv cs.AI
4w ago
Balancing Image Compression and Generation with Bootstrapped Tokenization
arXiv:2606.05552v1 Announce Type: cross Abstract: Despite progress in image tokenization, standard methods encode redundant information by mixing all granularit
ArXiv cs.AI
1mo ago
Diffusion Image Generation with Explicit Modeling of Data Manifold Geometry
arXiv:2606.00094v1 Announce Type: cross Abstract: Image generative models aim to sample data points from the underlying data manifold, a task that requires lear
ArXiv cs.AI
1mo ago
Initialization is Half the Battle: Generating Diverse Images from a Guidance Potential Posterior
arXiv:2606.02453v1 Announce Type: cross Abstract: Despite the remarkable fidelity of generative models, they frequently suffer from mode collapse. Existing stra
ArXiv cs.AI
1mo ago
GPIC: A Giant Permissive Image Corpus for Visual Generation
arXiv:2605.30341v1 Announce Type: cross Abstract: Studying scalable methods for visual generative modeling requires large, accessible, and stable datasets. We i
ArXiv cs.AI
1mo ago
Self-Cascaded Diffusion Models for Arbitrary-Scale Image Super-Resolution
arXiv:2506.07813v2 Announce Type: replace-cross Abstract: Arbitrary-scale image super-resolution aims to upsample images to any desired resolution, offering gre
ArXiv cs.AI
1mo ago
Phase-Aware Wavelet-Based-Scattering Encoder-Decoder for Dense Predictions
arXiv:2605.24621v1 Announce Type: cross Abstract: Scattering transforms achieve Lipschitz stability and translation invariance, but dense prediction tasks requi
ArXiv cs.AI
1mo ago
Leveraging pretrained RGB denoisers for hyperspectral image restoration
arXiv:2605.24769v1 Announce Type: cross Abstract: Hyperspectral image restoration faces several challenges, including limited training data, strong sensor speci
ArXiv cs.AI
1mo ago
Fusion Embedding for Pose-Guided Person Image Synthesis with Diffusion Model
arXiv:2412.07333v2 Announce Type: replace-cross Abstract: Pose-Guided Person Image Synthesis (PGPIS) aims to generate human images in specified poses while pres
ArXiv cs.AI
1mo ago
EditCaption: Human-Refined SFT and HAE-DPO for Image Editing Instruction Synthesis
arXiv:2604.08213v2 Announce Type: replace-cross Abstract: High-quality source-target image pairs with precise editing instructions are essential for instruction
ArXiv cs.AI
1mo ago
Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super Resolution
arXiv:2605.23264v1 Announce Type: cross Abstract: Generative priors in Image Super-Resolution (SR) often compromise faithful restoration, we attribute this limi
ArXiv cs.AI
1mo ago
VDE Bench: Evaluating The Capability of Image Editing Models to Modify Visual Documents
arXiv:2602.00122v2 Announce Type: replace-cross Abstract: In recent years, image editing models have made significant progress, enabling users to manipulate vis
ArXiv cs.AI
1mo ago
Perception-based Image Denoising via Generative Compression
arXiv:2602.11553v2 Announce Type: replace-cross Abstract: Image denoising aims to remove noise while preserving structural details and perceptual realism, yet d
ArXiv cs.AI
1mo ago
FSCM: Frequency-Enhanced Spatial-Spectral Coupled Mamba for Infrared Hyperspectral Image Colorization
arXiv:2605.15880v1 Announce Type: cross Abstract: Thermal infrared imaging is robust to illumination variations and smoke interference, making it important for
ArXiv cs.AI
1mo ago
A Paired Point-of-Care Ultrasound Dataset for Image Quality Enhancement and Benchmarking via a cGAN Baseline
arXiv:2605.08282v1 Announce Type: cross Abstract: Purpose: We aim to enhance the image quality of point-of-care ultrasound (POCUS) devices using deep learning a
ArXiv cs.AI
1mo ago
Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts
arXiv:2605.09296v1 Announce Type: cross Abstract: Recent generative models can produce images that appear highly realistic, raising challenges in distinguishing
ArXiv cs.AI
1mo ago
CASCADE: Context-Aware Relaxation for Speculative Image Decoding
arXiv:2605.07230v1 Announce Type: cross Abstract: Autoregressive generation is a powerful approach for high-fidelity image synthesis, but it remains computation
ArXiv cs.AI
1mo ago
Leveraging Image Generators to Address Training Data Scarcity: The Gen4Regen Dataset for Forest Regeneration Mapping
arXiv:2605.05627v1 Announce Type: cross Abstract: Sustainable forest management relies on precise species composition mapping, yet traditional ground surveys ar
ArXiv cs.AI
1mo ago
DBMSolver: A Training-free Diffusion Bridge Sampler for High-Quality Image-to-Image Translation
arXiv:2605.05889v1 Announce Type: cross Abstract: Diffusion-based image-to-image (I2I) translation excels in high-fidelity generation but suffers from slow samp
ArXiv cs.AI
1mo ago
Continuous Expert Assembly: Instance-Conditioned Low-Rank Residuals for All-in-One Image Restoration
arXiv:2605.06127v1 Announce Type: cross Abstract: Real-world image degradation is often unknown, spatially non-uniform, and compositional, requiring all-in-one
ArXiv cs.AI
1mo ago
PixelGen: Improving Pixel Diffusion with Perceptual Supervision
arXiv:2602.02493v2 Announce Type: replace-cross Abstract: Pixel diffusion generates images directly in pixel space, avoiding the VAE artifacts and representatio
ArXiv cs.AI
1mo ago
DiffCap-Bench: A Comprehensive, Challenging, Robust Benchmark for Image Difference Captioning
arXiv:2605.04503v1 Announce Type: cross Abstract: Image Difference Captioning (IDC) generates natural language descriptions that precisely identify differences
ArXiv cs.AI
2mo ago
Towards High Fidelity Face Swapping: A Comprehensive Survey and New Benchmark
arXiv:2605.00883v1 Announce Type: cross Abstract: Face swapping has witnessed significant progress in recent years, largely driven by advances in deep generativ
ArXiv cs.AI
2mo ago
TOC-SR: Task-Optimal Compact diffusion for Image Super Resolution
arXiv:2605.02767v1 Announce Type: cross Abstract: Diffusion models have recently demonstrated strong performance for image restoration tasks, including super-re
ArXiv cs.AI
2mo ago
DIPLI: Deep Image Prior Lucky Imaging for Blind Astronomical Image Restoration
arXiv:2503.15984v3 Announce Type: replace-cross Abstract: Modern image restoration and super-resolution methods utilize deep learning due to its superior perfor
ArXiv cs.AI
2mo ago
VIPaint: Image Inpainting with Pre-Trained Diffusion Models via Variational Inference
arXiv:2411.18929v2 Announce Type: replace-cross Abstract: Diffusion probabilistic models learn to remove noise added during training, generating novel data (e.g
ArXiv cs.AI
2mo ago
EAD-Net: Emotion-Aware Talking Head Generation with Spatial Refinement and Temporal Coherence
arXiv:2604.23325v1 Announce Type: cross Abstract: Emotionally talking head video generation aims to generate expressive portrait videos with accurate lip synchr
DeepCamp AI