📰 AI News
6,245 articles · Updated every 3 hours
ArXiv cs.AI · OpenAI News · Hugging Face Blog · Forbes Innovation · Dev.to AI · Weaviate Blog
ArXiv cs.AI
📄 Paper
⚡ AI Lesson
16h ago
GeoSURGE: Geo-localization using Semantic Fusion with Hierarchy of Geographic Embeddings
arXiv:2510.01448v2 Announce Type: replace-cross Abstract: Worldwide visual geo-localization aims to determine the geographic location of an image anywhere on Ea
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
Attention-Aligned Reasoning for Large Language Models
arXiv:2510.03223v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) tend to generate a long reasoning chain when solving complex tasks. Howev
ArXiv cs.AI
📄 Paper
⚡ AI Lesson
16h ago
Multi-Dimensional Autoscaling of Stream Processing Services on Edge Devices
arXiv:2510.06882v2 Announce Type: replace-cross Abstract: Edge devices have limited resources, which inevitably leads to situations where stream processing serv
ArXiv cs.AI
📄 Paper
16h ago
Gelina: Unified Speech and Gesture Synthesis via Interleaved Token Prediction
arXiv:2510.12834v3 Announce Type: replace-cross Abstract: Human communication is multimodal, with speech and gestures tightly coupled, yet most computational me
ArXiv cs.AI
📄 Paper
16h ago
Generating the Modal Worker: A Cross-Model Audit of Race and Gender in LLM-Generated Personas Across 41 Occupations
arXiv:2510.21011v2 Announce Type: replace-cross Abstract: As generative AI tools are increasingly used to portray people in professional roles, understanding th
ArXiv cs.AI
📄 Paper
16h ago
Compositional Image Synthesis with Inference-Time Scaling
arXiv:2510.24133v2 Announce Type: replace-cross Abstract: Despite their impressive realism, modern text-to-image models still struggle with compositionality, of
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding
arXiv:2511.00810v3 Announce Type: replace-cross Abstract: Graphical user interface (GUI) grounding is a key capability for computer-use agents, mapping natural-
ArXiv cs.AI
📄 Paper
16h ago
Causal Graph Neural Networks for Healthcare
arXiv:2511.02531v5 Announce Type: replace-cross Abstract: Healthcare artificial intelligence systems often degrade in performance when deployed across instituti
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
Route Experts by Sequence, not by Token
arXiv:2511.06494v2 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) architectures scale large language models (LLMs) by activating only a subset
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
16h ago
Binary Verification for Zero-Shot Vision
arXiv:2511.10983v2 Announce Type: replace-cross Abstract: We propose a training-free, binary verification workflow for zero-shot vision with off-the-shelf VLMs.
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
Any4D: Open-Prompt 4D Generation from Natural Language and Images
arXiv:2511.18746v2 Announce Type: replace-cross Abstract: While video-generation-based embodied world models have gained increasing attention, their reliance on
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
Aligning LLMs with Biomedical Knowledge using Balanced Fine-Tuning
arXiv:2511.21075v2 Announce Type: replace-cross Abstract: Aligning Large Language Models (LLMs) with biomedical knowledge requires understanding both concepts a
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
StreamGaze: Gaze-Guided Temporal Reasoning and Proactive Understanding in Streaming Videos
arXiv:2512.01707v2 Announce Type: replace-cross Abstract: Streaming video understanding requires models not only to process temporally incoming frames, but also
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning
arXiv:2512.02425v2 Announce Type: replace-cross Abstract: Recent advances in video large language models have demonstrated strong capabilities in understanding
ArXiv cs.AI
📄 Paper
⚡ AI Lesson
16h ago
Fluent Alignment with Disfluent Judges: Post-training for Lower-resource Languages
arXiv:2512.08777v2 Announce Type: replace-cross Abstract: We propose a post-training method for lower-resource languages that preserves the fluency of language
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
16h ago
Particulate: Feed-Forward 3D Object Articulation
arXiv:2512.11798v2 Announce Type: replace-cross Abstract: We introduce Particulate, a feed-forward model that, given a 3D mesh of an object, infers its articula
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models
arXiv:2512.13607v2 Announce Type: replace-cross Abstract: Building general-purpose reasoning models with reinforcement learning (RL) entails substantial cross-d
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations
arXiv:2512.14080v2 Announce Type: replace-cross Abstract: Mixture of Experts (MoE) models have emerged as the de facto architecture for scaling up language mode
ArXiv cs.AI
📄 Paper
⚡ AI Lesson
16h ago
PathFinder: Advancing Path Loss Prediction for Single-to-Multi-Transmitter Scenario
arXiv:2512.14150v3 Announce Type: replace-cross Abstract: Radio path loss prediction (RPP) is critical for optimizing 5G networks and enabling IoT, smart city,
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
Dual-objective Language Models: Training Efficiency Without Overfitting
arXiv:2512.14549v3 Announce Type: replace-cross Abstract: This paper combines autoregressive and masked-diffusion training objectives without any architectural
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
MRG-R1: Reinforcement Learning for Clinically Aligned Medical Report Generation
arXiv:2512.16145v2 Announce Type: replace-cross Abstract: Medical report generation aims to automatically produce radiology-style reports from medical images, s
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs
arXiv:2512.16378v3 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) expand beyond text, integrating speech as a native modality has given
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
16h ago
The Dual-State Architecture for Reliable LLM Agents
arXiv:2512.20660v2 Announce Type: replace-cross Abstract: Large Language Models deployed as code generation agents exhibit stochastic behavior incompatible with
ArXiv cs.AI
👁️ Computer Vision
📄 Paper
⚡ AI Lesson
16h ago
RoAD Benchmark: How LiDAR Models Fail under Coupled Domain Shifts and Label Evolution
arXiv:2601.07855v2 Announce Type: replace-cross Abstract: For 3D perception systems to operate reliably in real-world environments, they must remain robust to e
DeepCamp AI