📰 ArXiv cs.AI

Articles from ArXiv cs.AI · 5,060 articles · Updated every 3 hours

ArXiv cs.AI 📄 Paper 5d ago
PinpointQA: A Dataset and Benchmark for Small Object-Centric Spatial Understanding in Indoor Videos
arXiv:2604.08991v1 Announce Type: cross Abstract: Small object-centric spatial understanding in indoor videos remains a significant challenge for multimodal lar
ArXiv cs.AI 📄 Paper 5d ago
ASTRA: Adaptive Semantic Tree Reasoning Architecture for Complex Table Question Answering
arXiv:2604.08999v1 Announce Type: cross Abstract: Table serialization remains a critical bottleneck for Large Language Models (LLMs) in complex table question a
ArXiv cs.AI 📄 Paper 5d ago
Towards Linguistically-informed Representations for English as a Second or Foreign Language: Review, Construction and Application
arXiv:2604.09008v1 Announce Type: cross Abstract: The widespread use of English as a Second or Foreign Language (ESFL) has sparked a paradigm shift: ESFL is not
ArXiv cs.AI 📄 Paper 5d ago
Identification and Anonymization of Named Entities in Unstructured Information Sources for Use in Social Engineering Detection
arXiv:2604.09016v1 Announce Type: cross Abstract: This study addresses the challenge of creating datasets for cybercrime analysis while complying with the requi
ArXiv cs.AI 📄 Paper 5d ago
Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA
arXiv:2604.09019v1 Announce Type: cross Abstract: Two-hop QA retrieval splits queries into two regimes determined by whether the hop-2 entity is explicitly name
ArXiv cs.AI 📄 Paper 5d ago
Noise-Aware In-Context Learning for Hallucination Mitigation in ALLMs
arXiv:2604.09021v1 Announce Type: cross Abstract: Auditory large language models (ALLMs) have demonstrated strong general capabilities in audio understanding an
ArXiv cs.AI 📄 Paper 5d ago
Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection
arXiv:2604.09024v1 Announce Type: cross Abstract: Multi-modal large language models (MLLMs) have emerged as powerful tools for analyzing Internet-scale image da
ArXiv cs.AI 📄 Paper 5d ago
Skill-Conditioned Visual Geolocation for Vision-Language
arXiv:2604.09025v1 Announce Type: cross Abstract: Vision-language models (VLMs) have shown a promising ability in image geolocation, but they still lack structu
ArXiv cs.AI 📄 Paper 5d ago
CONDESION-BENCH: Conditional Decision-Making of Large Language Models in Compositional Action Space
arXiv:2604.09029v1 Announce Type: cross Abstract: Large language models have been widely explored as decision-support tools in high-stakes domains due to their
ArXiv cs.AI 📄 Paper 5d ago
U-Cast: A Surprisingly Simple and Efficient Frontier Probabilistic AI Weather Forecaster
arXiv:2604.09041v1 Announce Type: cross Abstract: AI-based weather forecasting now rivals traditional physics-based ensembles, but state-of-the-art (SOTA) model
ArXiv cs.AI 📄 Paper 5d ago
Watt Counts: Energy-Aware Benchmark for Sustainable LLM Inference on Heterogeneous GPU Architectures
arXiv:2604.09048v1 Announce Type: cross Abstract: While the large energy consumption of Large Language Models (LLMs) is recognized by the community, system oper
ArXiv cs.AI 📄 Paper 5d ago
PDE-regularized Dynamics-informed Diffusion with Uncertainty-aware Filtering for Long-Horizon Dynamics
arXiv:2604.09058v1 Announce Type: cross Abstract: Long-horizon spatiotemporal prediction remains a challenging problem due to cumulative errors, noise amplifica
ArXiv cs.AI 📄 Paper 5d ago
Learning Vision-Language-Action World Models for Autonomous Driving
arXiv:2604.09059v1 Announce Type: cross Abstract: Vision-Language-Action (VLA) models have recently achieved notable progress in end-to-end autonomous driving b
ArXiv cs.AI 📄 Paper 5d ago
Frequency-Enhanced Diffusion Models: Curriculum-Guided Semantic Alignment for Zero-Shot Skeleton Action Recognition
arXiv:2604.09063v1 Announce Type: cross Abstract: Human action recognition is pivotal in computer vision, with applications ranging from surveillance to human-r
ArXiv cs.AI 📄 Paper 5d ago
NyayaMind- A Framework for Transparent Legal Reasoning and Judgment Prediction in the Indian Legal System
arXiv:2604.09069v1 Announce Type: cross Abstract: Court Judgment Prediction and Explanation (CJPE) aims to predict a judicial decision and provide a legally gro
ArXiv cs.AI 📄 Paper 5d ago
Beyond Isolated Clients: Integrating Graph-Based Embeddings into Event Sequence Models
arXiv:2604.09085v1 Announce Type: cross Abstract: Large-scale digital platforms generate billions of timestamped user-item interactions (events) that are crucia
ArXiv cs.AI 📄 Paper 5d ago
DeepGuard: Secure Code Generation via Multi-Layer Semantic Aggregation
arXiv:2604.09089v1 Announce Type: cross Abstract: Large Language Models (LLMs) for code generation can replicate insecure patterns from their training data. To
ArXiv cs.AI 📄 Paper 5d ago
CLIP-Inspector: Model-Level Backdoor Detection for Prompt-Tuned CLIP via OOD Trigger Inversion
arXiv:2604.09101v1 Announce Type: cross Abstract: Organisations with limited data and computational resources increasingly outsource model training to Machine L
ArXiv cs.AI 📄 Paper 5d ago
Scheming in the wild: detecting real-world AI scheming incidents with open-source intelligence
arXiv:2604.09104v1 Announce Type: cross Abstract: Scheming, the covert pursuit of misaligned goals by AI systems, represents a potentially catastrophic risk, ye
ArXiv cs.AI 📄 Paper 5d ago
TensorHub: Scalable and Elastic Weight Transfer for LLM RL Training
arXiv:2604.09107v1 Announce Type: cross Abstract: Modern LLM reinforcement learning (RL) workloads require a highly efficient weight transfer system to scale tr