Large Language Models
Deep dives into GPT, Claude, Gemini, Llama and the transformers powering modern AI
Skills in this topic
5 skills
LLM Foundations
beginner
Explain how transformers generate text
Prompt Craft
beginner
Write zero-shot and few-shot prompts
LLM Engineering
intermediate
Call LLM APIs with function/tool use
Fine-tuning LLMs
advanced
Prepare fine-tuning datasets
Multimodal LLMs
advanced
Use GPT-4V / Claude Vision for image understanding
All Reads (29,819)
Articles (12,686)
Blog Posts (5,636)
Tutorials (2,392)
Research Papers (8,231)
News (874)

Medium · Machine Learning
🧠 Large Language Models
⚡ AI Lesson
5d ago
The Illusion of Deep Learning: Why AI Needs Brainwaves to Remember
Static MLPs are holding back Large Language Models. Discover how the Continuum Memory System (CMS) escapes the “Frequency Zero” trap and…
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Recursive Self-Evolving Agents via Held-Out Selection
arXiv:2606.28374v1 Announce Type: new Abstract: LLM agents are increasingly improved without weight updates by evolving a natural-language artifact, such as ref
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Data and Evaluation Closed-Loop for Model Capability Enhancement
arXiv:2606.28471v1 Announce Type: new Abstract: Model capability is the central variable in LLM pre-training, yet is never observed directly: data shapes it pro
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
GPTNT: Benchmarking Real-Time Collaboration Between Multimodal Agents on Keep Talking And Nobody Explodes
arXiv:2606.28514v1 Announce Type: new Abstract: Multimodal models are increasingly deployed to solve tasks collaboratively with humans or other artificial agent
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
IMCBench: A benchmark for multimodal LLMs in Image-grounded Medical Conversations
arXiv:2606.28556v1 Announce Type: new Abstract: Recent advances in large language models and vision-language models have enabled reasoning over multimodal data,
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Search for Truth from Reasoning: A Dynamic Representation Editing Framework for Steering LLM Trajectories
arXiv:2606.28589v1 Announce Type: new Abstract: Current approaches to enhance Large Language Model (LLM) reasoning, such as Chain-of-Thought and "Wait" prompts,
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Aristotelian Virtue Profiling of LLMs through Ethical Dilemmas
arXiv:2606.28683v1 Announce Type: new Abstract: Large Language Models (LLMs) often face ethical tradeoffs in which several responses may be defensible but expre
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
BV-Blend: Uncertainty-Weighted Historical Baselines for Stable Critic-Free RL with Verifiable Rewards
arXiv:2606.28707v1 Announce Type: new Abstract: Critic-free reinforcement learning with verifiable rewards (RLVR), exemplified by Group Relative Policy Optimiza
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Mechanistic Personality Analysis of LLMs: Steering Personality via Latent Feature Interventions
arXiv:2606.28770v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated the ability to simulate human-like OCEAN personality traits in ge
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Primary ICD Category Prediction using LLM-based Probing
arXiv:2606.28798v1 Announce Type: new Abstract: Objective: ICD codes are central to reimbursement, research, and population health surveillance, yet automated c
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Customized Generative AI Agent for Transportation Engineering Practice: A Development and Continued Pre-training Guideline
arXiv:2606.29014v1 Announce Type: new Abstract: Recent advancements in generative artificial intelligence (AI) and large language models (LLMs) have shown signi
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Memory as an Attack Surface in LLM Agents: A Study on Multiple-Choice Question Answering
arXiv:2606.29030v1 Announce Type: new Abstract: AI agents extend conventional large language model (LLM) applications by integrating language understanding with
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Low-cost concept-based localized explanations: How far can we get with training-free approaches?
arXiv:2606.29069v1 Announce Type: new Abstract: Concept-based Explainable AI (C-XAI) seeks human-understandable explanations grounded in semantic concepts, yet
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Characterizing Large Language Model Agentic Workflows: A Study on N8n Ecosystem
arXiv:2606.29116v1 Announce Type: new Abstract: Large Language Models (LLMs) are rapidly being adopted in low-code and no-code automation platforms, where non-e
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
HiComm: Hierarchical Communication for Multi-agent Reinforcement Learning
arXiv:2606.29126v1 Announce Type: new Abstract: Cooperative multi-agent reinforcement learning (MARL) often relies on communication to mitigate partial observab
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Flow Reasoning Models: Scaling Reasoning Through Iterative Self-Refinement
arXiv:2606.29150v1 Announce Type: new Abstract: Discrete flow models have recently shown promising performance on few-step text generation; however, when naivel
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Selective Memory Retention for Long-Horizon LLM Agents
arXiv:2606.29178v1 Announce Type: new Abstract: When does retention matter for memory-augmented LLM agents? We study this with TraceRetain, a lightweight framew
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Measuring Graph-to-Graph Semantic Similarity in Knowledge Graphs: An Empirical Evaluation of Knowledge Graph Embeddings
arXiv:2606.29180v2 Announce Type: new Abstract: A Knowledge Graph (KG) represents facts as structured triples and is widely used to organize relational knowledg
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Evidence-Informed LLM Beliefs for Continual Scientific Discovery
arXiv:2606.29182v1 Announce Type: new Abstract: Open-ended scientific discovery with large language models (LLMs) increasingly operates as a long-horizon loop o
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
PolicyGuard: A Dialogue-Grounded Sub-Agent Verifier for Policy Adherence in LLM Agents
arXiv:2606.29225v1 Announce Type: new Abstract: LLM agents handle user requests on behalf of organizations through tool calls and must follow the company polici
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
When Summaries Distort Decisions: Information Fidelity in LLM-Compressed Financial Analysis
arXiv:2606.29251v1 Announce Type: new Abstract: Financial decision-makers face more information than they can directly inspect, making context compression neces
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
The Complexity Ceiling Benchmark: A Multi-Domain Evaluation of Sequential Reasoning Under Depth Scaling
arXiv:2606.29278v1 Announce Type: new Abstract: We introduce the Complexity Ceiling Benchmark (CCB), a controlled evaluation of how language-model reasoning dec
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Process Advantage Signal Shaping: A Paradigm-Agnostic Middleware for Process-Supervised RL in LLM Reasoners
arXiv:2606.29296v1 Announce Type: new Abstract: Group Relative Policy Optimization (GRPO) is a default recipe for process-supervised reinforcement learning of L
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Hierarchical Experimentalist Agents
arXiv:2606.29315v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used to take actions in the real world and support human decision-
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
PHF: Privileged Hidden Flow for On-Policy Self-Distillation
arXiv:2606.29340v1 Announce Type: new Abstract: On-policy self-distillation (OPSD) trains a reasoning model on rollouts sampled from its own policy by matching
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
When LLMs Develop Languages: Symbolic Communication for Efficient Multi-Agent Reasoning
arXiv:2606.29354v1 Announce Type: new Abstract: Chain-of-Thought (CoT) improves large language models (LLMs) on difficult reasoning tasks, but it often incurs l
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Diagnosing and Repairing Factual Errors in RAG under Budget Constraints
arXiv:2606.29377v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) improves the factuality of large language models by grounding responses in
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
LLM-Guided Planning for Multi-hop Reasoning over Multimodal Nuclear Regulatory Documents
arXiv:2606.29399v1 Announce Type: new Abstract: Reviewing nuclear regulatory documents requires multi-hop reasoning across tens of thousands of pages, where jud
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
FADE: Mitigating Hallucinations by Reducing Language-Prior Dominance in Large Vision-Language Models
arXiv:2606.29431v1 Announce Type: new Abstract: Despite the impressive capabilities of Large Vision-Language Models (LVLMs), they remain susceptible to hallucin
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Faults in Our Formal Benchmarking: Dataset Defects and Evaluation Failures in Lean Theorem Proving
arXiv:2606.29493v1 Announce Type: new Abstract: Benchmarks for LLM-assisted theorem proving in Lean are often treated as intrinsically reliable because every so
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Cognitive World Models for Process-Level Social Influence Evaluation
arXiv:2606.29495v1 Announce Type: new Abstract: Social influence dialogue changes user behavior by altering internal cognitive states. The central evaluation qu
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
SFBench: The SciFy Scientific Feasibility Benchmark
arXiv:2606.29630v1 Announce Type: new Abstract: We present SFBench, a benchmark dataset for evaluating systems that assess the feasibility of scientific claims.
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Budgeted Act-or-Defer Multi-Agent LLM Deliberation with Local Reliability Bounds
arXiv:2606.29654v1 Announce Type: new Abstract: Multi-agent deliberation among LLMs can improve reasoning, but deployment requires deciding when the current ans
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Diversity is the Strength of the AI Crowd
arXiv:2606.29661v1 Announce Type: new Abstract: Top AI forecasting systems are approaching superforecaster-level accuracy on future world events, but still rely
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Toward Secure and Reliable PDDL Formalization of Large Language Models with Planner-in-the-Loop Feedback
arXiv:2606.29700v1 Announce Type: new Abstract: Planning often requires symbolic specifications that are both executable and verifiable. For large language mode
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
GUICrafter: Weakly-Supervised GUI Agent Leveraging Massive Unannotated Screenshots
arXiv:2606.29705v1 Announce Type: new Abstract: Data, as the fundamental substrate of modern intelligence, has greatly driven the development of current foundat
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
DEEPMED Search: An Open-Source Agentic Platform for Medical Deep Research with Introspective Verification
arXiv:2606.29746v1 Announce Type: new Abstract: Navigating the deluge of heterogeneous medical data, from academic literature (PubMed) to clinical guidelines (W
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Rethinking Generative Reconstruction Attacks against Graph Neural Network Models
arXiv:2606.29748v1 Announce Type: new Abstract: The application of graph data in numerous disciplines raises the need for gathering and analyzing huge volumes o
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
CLQT: A Closed-Loop, Cost-Aware, Strategy-Consistent Benchmark for Diagnostic Evaluation of LLM Portfolio-Management Agents
arXiv:2606.29771v1 Announce Type: new Abstract: LLM agents are increasingly cast as autonomous portfolio managers, and benchmarks have moved from financial ques
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
The CRISTAL Method: Neurosymbolic analysis from AI-synthesized world models
arXiv:2606.29799v1 Announce Type: new Abstract: This project introduces the CRISTAL Method (Coherent Reliable Intentional Synthesis of Truthful Analysis Logic),
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Beyond Triplet Plausibility: Relation Set Completion in Knowledge Graphs
arXiv:2606.29860v2 Announce Type: new Abstract: Knowledge graphs (KGs) organize real-world knowledge as triplets and underpin many downstream applications. Due
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
AI Training Manager: Bounded Closed-Loop Control of Adaptive Training Recipes
arXiv:2606.29871v1 Announce Type: new Abstract: We present the AI Training Manager, a bounded LLM-based supervisory controller for adaptive machine learning tra
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
HippoSpark: An On-Demand Experience System for LLM Reasoning
arXiv:2606.29929v1 Announce Type: new Abstract: Distilling historical trajectories into reusable experience to enhance future problem-solving has become a focal
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
First-Order Temporal Logic Tensor Networks
arXiv:2606.29972v1 Announce Type: new Abstract: Most of the existing neuro-symbolic AI methods focus on the scenario of static knowledge where objects do not ch
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Exploration and Online Transfer with Behavioral Foundation Models
arXiv:2606.29980v2 Announce Type: new Abstract: Zero-shot Transfer in Reinforcement Learning (RL) aims to train an agent that can generate optimal policies for
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Be Faithful When Response: Returning Fluent and Grounded Answers for Vision-Language Models Reinforcement Learning
arXiv:2606.29984v1 Announce Type: new Abstract: Reinforcement Learning (RL) is an important paradigm for improving the reasoning capabilities of Vision-Language
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Structural Certification for Reliable Physical Design with Language Models
arXiv:2606.30107v1 Announce Type: new Abstract: An unreliable language model can be made to produce reliable physical designs if the authority to assert is move
ArXiv cs.AI
🧠 Large Language Models
📄 Paper
⚡ AI Lesson
5d ago
Open Problems in Constitutional Preference Reconstruction
arXiv:2606.30116v1 Announce Type: new Abstract: Pairwise preference data is widely used for training and evaluating language models (e.g., RLHF), but each datap
DeepCamp AI