
Master's thesis: Efficient AI Model Inference in the 6G Edge-Cloud Compute Continuum

We are looking for a dedicated master’s student to join us in the Connected Intelligence Unit at RISE. The Connected Intelligence Unit is part of RISE Computer Science in Kista. The current research focus is on devising intelligent autonomous systems for monitoring and allocating resources in future computer and communication networks. The unit conducts projects together with industry and academic partners from Sweden and across the world.

Background and Purpose
The convergence of AI and next-generation communication networks (6G) is set to revolutionize the future of computing, enabling smarter and more autonomous network systems. Efficient utilization of resources, particularly GPUs, is crucial to support the real-time inference needs of AI-driven applications in the 6G era. This thesis explores how AI model inference can be optimized within an Edge-Cloud environment, where dynamic resource allocation and monitoring are key to ensuring seamless service delivery, scalability, and reduced energy consumption in future networks.

Thesis Description
The aim of this thesis is to investigate how GPU resources are allocated and shared across multiple AI inference tasks in Edge-Cloud compute clusters. The focus will be on analyzing, designing, and implementing efficient AI model inference that minimizes resource fragmentation and improves GPU utilization. You will deploy multiple large language models (LLMs) in a virtualized environment, e.g., a Kubernetes cluster, and propose improved strategies for monitoring GPU resources and for passing that information to the scheduler to improve its decisions.
The outcomes will contribute to more efficient resource allocation strategies in AI-driven 6G networks.

Duration: 6 months of full-time work (with potential for extension).
Application: as soon as possible. You are strongly encouraged to apply before October 15, 2024.
Start date: as soon as possible, or by January 2025 at the latest.
Scope: 30 hp.
Location: RISE Computer Science, Kista, Stockholm. Option to partially work remotely.

Who are you?
We expect you to have solid knowledge of algorithms and data structures, good programming skills (C/C++, Python, or similar), basic knowledge of machine learning, and an interest in solving complex problems.

Welcome with your application!
To learn more, please contact Daniel Pérez (daniel.perez@ri.se, tel. 073 806 2917). Applications should include a brief personal letter, a CV/resume, a recent transcript of records, and a code excerpt (an example of a code file written by you). Candidates are encouraged to apply as soon as possible, and no later than January 15, 2025. Suitable applicants will be interviewed as applications are received.

Keywords: Master's thesis, Edge/Cloud Computing, Distributed Systems, LLM Inference, Kubernetes, Resource Allocation, RISE, Stockholm

About the position

City: Kista
Contract type: Temporary position
Job type: Student - Master Thesis/Internship
Contact person: Daniel Pérez, 0738062917
Reference number: 2024/275
Last application date: 2025-01-15
