Master's thesis: Large Language Models for Code Generation and Verification
We are looking for a dedicated master’s student to join us in the Connected Intelligence Unit at RISE.
The Connected Intelligence Unit is part of RISE Computer Science in Kista. The current research focus is on devising intelligent autonomous systems for controlling and allocating resources in future computer and communication networks. Among the group's key technologies are the Internet of Things (IoT) and Edge computing. The unit conducts projects together with industry and academic partners from Sweden and across the world.
Background and Purpose
Large Language Models (LLMs) have shown remarkable potential in generating complex code, scripts, and configurations, and are increasingly being integrated into real-world systems for automation. Despite these capabilities, "hallucinations", i.e., incorrect or nonsensical outputs produced by the model, remain a significant challenge. These errors can undermine the trust of developers and operators, especially when code or configurations are deployed into production environments where mistakes can have severe consequences. This calls for rigorous formal verification to ensure the correctness and reliability of LLM-generated outputs before they are deployed in production systems.
Thesis Description
This thesis aims to evaluate and enhance the performance (i.e., accuracy and reliability) of LLMs in generating correct and functional code and scripts. The main objectives are to (i) systematically analyze the limitations of LLMs in code generation and verification, (ii) integrate formal verification tools into the LLM code generation process, and (iii) explore the potential development of a system incorporating specialized smaller models for generating verifiable code. You will deploy and evaluate one or more widely used open- and closed-source LLMs, acquire foundational knowledge of formal verification techniques, and identify opportunities to integrate these techniques into the LLM code generation workflow.
Terms:
Start Time: As soon as possible
Scope: 30 hp (higher education credits, equivalent to 30 ECTS credits)
Location: RISE Computer Science, Kista, Stockholm. Partial remote work is possible.
Who are you?
We expect you to have solid knowledge of machine learning theory, good programming skills (especially Python and C; experience with Dafny or SMT solvers is a plus), and an interest in computer systems and solving complex problems.
We are looking forward to receiving your application!
To learn more, please contact Dejan Kostic (dejan.kostic@ri.se). The thesis will be conducted together with the KTH Networked System Laboratory. Applications should include a brief personal letter, CV, recent grades, and a code excerpt. Candidates are encouraged to apply as soon as possible, and no later than 15 January 2025. Suitable applicants will be interviewed on a rolling basis as applications are received.
Master's thesis, Large Language Models, Systems, Code Generation, Code Verification, RISE, Stockholm
About the job
Location: Kista
Employment type: Temporary employment
Job type: Student - degree project/internship
Contact person: Dejan Kostic, +46737652043
Reference number: 2024/274
Application deadline: 2025-01-15