Using LLMs for safe low-level programming | Microsoft Research Forum
Key Takeaways
The video discusses using Large Language Models (LLMs) for safe low-level programming: inferring the annotations needed to make legacy C code memory safe, and automatically fixing compilation errors in Rust code. It covers the Checked C compiler, the MSA tool, program dependence graphs, and RustAssistant.
Full Transcript
[Music] The following talk combines two projects that both harness the LLM's capabilities to understand and produce code. Both aim to help developers tackle the difficulties of safe low-level programming: one to ensure memory safety in legacy C code; the other presents RustAssistant, a tool for developers to automatically fix compilation errors in Rust. [Music]
Hi, my name is Aseem Rastogi, and I'm a researcher in the Future of Scalable Software Engineering organization in Microsoft Research. I'm going to talk to you about our paper, "LLM Assistance for Memory Safety." This paper will be presented at the 47th International Conference on Software Engineering in May later this year.
The lack of memory safety in low-level languages like C and C++ is one of the leading causes of software security vulnerabilities. For instance, a study by Microsoft estimated that 70% of the security bugs that Microsoft fixes and assigns a CVE every year are due to memory safety issues. Researchers have proposed safe dialects of C, for example Checked C, that, with the help of additional source-level annotations, provide memory safety guarantees with low performance overheads. However, the cost of adding these annotations, and the code restructuring required to enable them, becomes a bottleneck in the adoption of these tools. In general, the application of formal verification to real software faces the same challenge. In our paper, we explore the use of pre-trained large language models to help with the task of code restructuring and inferring the source annotations required to adopt Checked C.
Let's consider an example: a function that takes an array of integers as input and sums the first n elements. To reason about the memory safety of this function, Checked C requires an annotation on p. One such annotation is as shown here. This tells the compiler that p is an array with at least n elements, which is enough to ensure the safety of memory accesses in this function. It also helps impose an explicit obligation on the callers of this function that they must pass an appropriately sized array to it. Our goal is to infer such annotations with the help of LLMs.
For this problem, LLMs seem like a perfect match. First, it is hard to encode reasoning about real-world code and complex code patterns in symbolic tools. LLMs, on the other hand, have demonstrated tremendous code comprehension and reasoning capabilities, similar to what programmers have, even for real-world code. Second, LLM hallucinations might lead to incorrect annotations, but they cannot compromise memory safety. Once the annotations are added to the code, the Checked C compiler guarantees memory safety even when the annotations are incorrect. This way we get the best of both worlds.
However, working with LLMs for whole-program transformation in large codebases presents another challenge. We need to break the task into smaller subtasks that can fit into LLM prompts, while adding relevant symbolic context to each prompt. Put another way, in order for LLMs to be able to reason like programmers, we need to provide them the context that a programmer would otherwise consider. Our paper presents a framework for doing just that, with the help of program dependence graphs working in tandem with LLMs. We implement our ideas in a tool called MSA and evaluate it on real-world codebases ranging up to 20,000 lines of code. We observe that MSA can infer 86% of the annotations that state-of-the-art symbolic tools cannot.
Although our paper focuses on memory safety, our methodology is more general and can be used to effectively leverage LLMs for scaling the use of formal verification to real software and, most importantly, doing so without compromising on the soundness guarantees. We are really excited about this research direction. Up next, my colleague Pantazis will tell you about how we are leveraging LLMs to make it easier for programmers to adopt Rust. Thank you.
Hello everyone, I'm Pantazis, and today I will be presenting our work on leveraging the power of large language models for safe low-level programming. Specifically, I will focus on our recent paper about RustAssistant, which is a tool that uses LLMs to automatically fix compilation errors in code written in Rust. This work was done together with other individuals that are listed on the screen, and it will appear in the International Conference on Software Engineering later this spring.
OK, let's dive in. Why do we care about safe low-level programming with Rust? The Rust programming language, with its memory and concurrency safety guarantees, has established itself as a viable choice for building low-level software systems over the traditional unsafe alternatives like C and C++. These guarantees come from a strong ownership-based type system, which enforces memory and concurrency safety at compile time. However, Rust poses a steep learning curve for developers, especially when they encounter compilation errors related to advanced language features such as ownership, lifetimes, or traits. At the same time, Rust is becoming more popular every year. So as more and more developers adopt Rust for writing critical software systems, it is essential to tackle the difficulty of writing code in Rust.
In Microsoft Research, we created a tool called RustAssistant that leverages the power of state-of-the-art LLMs to help developers by automatically suggesting fixes for Rust compilation errors. Our tool uses a careful combination of prompting techniques, as well as iteration between a large language model and the Rust compiler, to deliver high accuracy of fixes. RustAssistant is able to achieve an impressive peak accuracy of roughly 74% on real-world compilation errors in popular open-source Rust repositories on GitHub.
OK, let's now see how RustAssistant works, step by step. Let's begin with the first step: building the code and parsing the build errors. Such errors can range from simple syntax mistakes to very complicated issues involving traits, lifetimes, or ownership rules in Rust code spread across multiple files. When a developer writes Rust code that doesn't compile, the Rust compiler generates detailed error messages that include the error code, the location of the error, as well as documentation and examples related to this error code. To illustrate this process, let's look at the very simple example on the screen. In this case, the developer is trying to compare a custom verbosity level enumeration in their code using the greater-or-equal operator. However, the Rust compiler throws an error stating that this binary operation cannot be applied to the verbosity level type. The compiler suggests that the reason behind this error is that the verbosity level type does not implement a trait that is required for performing such comparisons in Rust. This detailed error message is precisely what RustAssistant captures at this step, preparing it for the next stage of processing.
At the next step, RustAssistant takes the detailed error information that was generated in the previous step and focuses on extracting the specific parts of the code that are directly relevant to this error. Looking at the example on the screen, the code snippets related to the enumeration and its use in the log error function are automatically extracted by our tool. This includes not only the problematic line of code but also other code snippets that provide the necessary context for understanding and resolving the error. The tool also captures the error details, such as the error code and the accompanying compiler suggestion about the missing trait for performing the comparison. These extracted code snippets and error details are then packaged into a prompt for the LLM. This ensures that the LLM receives only the essential information required to suggest an accurate fix, without being overwhelmed by irrelevant parts of the codebase. This careful localization step is crucial for both efficiency and accuracy, especially when dealing with very large codebases.
Now let's move to the last step. Here, RustAssistant sends the carefully localized prompt, which includes the error details and the relevant code snippets, to the large language model API. The LLM generates a proposed fix formatted as a code diff. In other words, for efficiency it does not include the entire code snippet, but only the new, edited, or deleted code lines. For example, in the case of our build error, the LLM suggests adding the missing trait to the enumeration, as shown here on the screen. This fix ensures that the comparison using the greater-or-equal operator will now work as intended. Next, RustAssistant parses this suggested fix and applies the changes to the appropriate file in the codebase.
Once the fixes are applied, our tool runs the Rust compiler again to verify whether the build error has been resolved. If the code compiles, then great news: the process is now complete, and we can do further validation, like running any unit tests. However, if new errors appear, or if the fix doesn't fully resolve the issue, RustAssistant sends the updated context back to the LLM, iterating until the code compiles error-free. This iterative process allows our tool to handle complex multi-step fixes while ensuring correctness and alignment with the developer's intent. Of course, the example that I showed here is a very simple one, but you can imagine the tool being able to fix much more complicated build errors.
To summarize, I presented a quick walkthrough of how RustAssistant can be used to help developers automatically fix build errors in their Rust codebases. In our paper, we evaluated RustAssistant on the top 100 Rust repositories on GitHub and showed that it can achieve an impressive peak accuracy of roughly 74% on real-world compilation errors. We invite you to read our paper, as it not only discusses the evaluation results in detail but also dives into interesting technical details, such as how we designed our prompts, as well as various techniques that we developed for scaling RustAssistant to very large codebases without losing accuracy. Thank you for listening.
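The on-screen code for the verbosity-level example is not part of the transcript, so the following is only a minimal reconstruction of the kind of fix described. The names VerbosityLevel and should_log and the variant names are assumptions made for illustration. Without the PartialOrd derive, rustc rejects the `>=` comparison with error E0369; adding the derive is the sort of one-line change the talk attributes to the LLM.

```rust
// Hypothetical reconstruction: type, variant, and function names are assumptions,
// not the code shown in the talk.
// Deriving PartialOrd (which also needs PartialEq) is what makes `>=` legal here.
// Derived ordering on an enum follows the declaration order of its variants.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
enum VerbosityLevel {
    Quiet,
    Normal,
    Verbose,
}

// Returns true when a message at `current` verbosity should be logged
// given the configured `threshold`.
fn should_log(current: VerbosityLevel, threshold: VerbosityLevel) -> bool {
    // Without `PartialOrd` on VerbosityLevel, this line fails to compile
    // with E0369 (binary operation `>=` cannot be applied to the type).
    current >= threshold
}

fn main() {
    let threshold = VerbosityLevel::Normal;
    for level in [
        VerbosityLevel::Quiet,
        VerbosityLevel::Normal,
        VerbosityLevel::Verbose,
    ] {
        println!("{:?} logged: {}", level, should_log(level, threshold));
    }
}
```

Removing PartialOrd from the derive list reproduces the class of build error the talk walks through.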
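The build, fix, and rebuild iteration described in the talk can be sketched as a small loop. This is not RustAssistant's implementation: repair_loop, toy_build, toy_fix, and the iteration cap are all hypothetical, and the compiler and the LLM are replaced by toy stand-in functions so that the sketch runs on its own.

```rust
// A build attempt either succeeds or returns the compiler's error messages.
type BuildResult = Result<(), Vec<String>>;

// Hypothetical sketch of the iterate-until-it-compiles loop.
// `build` stands in for invoking the compiler; `propose_fix` stands in for
// prompting an LLM with the source and errors and applying its suggested change.
// Returns the repaired source, or None if it still fails after `max_fixes` attempts.
fn repair_loop<B, F>(
    mut source: String,
    build: B,
    mut propose_fix: F,
    max_fixes: usize,
) -> Option<String>
where
    B: Fn(&str) -> BuildResult,
    F: FnMut(&str, &[String]) -> String,
{
    for _ in 0..max_fixes {
        match build(&source) {
            Ok(()) => return Some(source),
            // Feed the current errors back and try the next candidate fix.
            Err(errors) => source = propose_fix(&source, &errors),
        }
    }
    // Verify the result of the last fix attempt.
    if build(&source).is_ok() {
        Some(source)
    } else {
        None
    }
}

// Toy "compiler": accepts the source only if the enum derives PartialOrd.
fn toy_build(src: &str) -> BuildResult {
    if src.contains("PartialOrd") {
        Ok(())
    } else {
        Err(vec!["E0369: binary operation `>=` cannot be applied".to_string()])
    }
}

// Toy "LLM": adds the missing derive, ignoring the error text.
fn toy_fix(src: &str, _errors: &[String]) -> String {
    src.replace("#[derive(PartialEq)]", "#[derive(PartialEq, PartialOrd)]")
}

fn main() {
    let broken = "#[derive(PartialEq)] enum VerbosityLevel { Quiet, Verbose }".to_string();
    match repair_loop(broken, toy_build, toy_fix, 3) {
        Some(fixed) => println!("builds after repair: {fixed}"),
        None => println!("gave up after 3 fix attempts"),
    }
}
```

The cap on fix attempts is a design choice of this sketch; the talk only says that the tool iterates until the code compiles.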
Original Description
Aseem Rastogi, Principal Researcher, and Pantazis Deligiannis, Principal Research Engineer from Microsoft Research FoSSE (Future of Scalable Software Engineering) discuss the technical results from ICSE'2025 on using Large Language Models (LLMs) for safe low-level programming. The results demonstrate LLMs inferring machine-checkable memory safety invariants in legacy C code, and how LLMs assist in fixing compilation errors in Rust codebases.
LLM assistance for memory safety: https://www.microsoft.com/en-us/research/uploads/prod/2024/08/main.pdf
RustAssistant: Using LLMs to fix compilation errors in Rust code: https://www.microsoft.com/en-us/research/uploads/prod/2024/08/paper.pdf
This session aired on February 25, 2025, at Microsoft Research Forum, Episode 5.
Register for the series: https://aka.ms/registerresearchforumYTe5
Continue watching episode 5: https://aka.ms/researchforumYTe5
Explore all previous episodes: https://aka.ms/researchforumYTplaylist