Master's thesis: Efficient AI Model Inference in the 6G Edge-Cloud Compute Continuum
We are looking for a dedicated master’s student to join us in the Connected Intelligence Unit at RISE. The Connected Intelligence Unit is part of RISE Computer Science in Kista. The current research focus is on devising intelligent autonomous systems for monitoring and allocating resources in future computer and communication networks. The unit conducts projects together with industry and academic partners from Sweden and across the world.
Background and Purpose
The convergence of AI and next-generation communication networks (6G) is set to revolutionize the future of computing, enabling smarter and more autonomous network systems. Efficient utilization of resources, particularly GPUs, is crucial to support the real-time inference needs of AI-driven applications in the 6G era. This thesis explores how AI model inference can be optimized within an Edge-Cloud environment, where dynamic resource allocation and monitoring are key to ensuring seamless service delivery, scalability, and reduced energy consumption in future networks.
Thesis Description
The aim of this thesis is to investigate how GPU resources are allocated and shared across multiple AI inference tasks in Edge-Cloud compute clusters. The focus will be on analyzing, designing, and implementing efficient AI model inference that minimizes resource fragmentation and improves GPU utilization. You will deploy multiple large language models (LLMs) in a virtualized environment, e.g., a Kubernetes cluster, and propose improved strategies for monitoring GPU resources and for passing that information to the scheduler so that it can make better decisions.
The outcomes will contribute to more efficient resource allocation strategies in AI-driven 6G networks.
Duration: 6 months of full-time work (with potential for extension).
Application: as soon as possible. It is strongly encouraged to apply before October 15th, 2024.
Start date: as soon as possible, or by January 2025 at the latest.
Scope: 30 hp.
Location: RISE Computer Science, Kista, Stockholm. Option to partially work remotely.
Who are you?
We expect you to have solid knowledge of algorithms and data structures, good programming skills (C/C++, Python, or similar), basic knowledge of machine learning, and an interest in solving complex problems.
Welcome with your application!
To learn more, please contact Daniel Pérez (daniel.perez@ri.se, tel 073 806 2917). Applications should include a brief personal letter, a CV/resume, a recent transcript of records, and a code excerpt (an example of a code file you have written). Candidates are encouraged to apply as soon as possible, and no later than January 15, 2025. Suitable applicants will be interviewed as applications are received.
Keywords: Master's thesis, Edge/Cloud Computing, Distributed Systems, LLM Inference, Kubernetes, Resource Allocation, RISE, Stockholm
About the job
Location: Kista
Form of employment: Temporary employment
Job type: Student - degree project/internship
Contact person: Daniel Pérez, 0738062917
Reference number: 2024/275
Last application date: 2025-01-15