
Master's thesis: Machine Learning - Resource Efficient Large Language Models

Join a cutting-edge project to develop more sustainable and resource-efficient Large Language Models (LLMs) through advanced techniques such as Knowledge Distillation and Model Quantization.

Background
Large Language Models (LLMs) have revolutionized natural language understanding and generation, with applications in chatbots, translation, summarization, and more. However, their widespread adoption poses significant challenges related to energy efficiency, model size, and resource utilization. Addressing these challenges, for example through Tiny-ML techniques, is crucial for the sustainable development of AI. Knowledge Distillation (KD) is a model compression technique in which a smaller student model is trained to mimic the behaviour of a larger pre-trained teacher network, reducing resource use while retaining language capabilities. Model Quantization reduces the precision of the model's weights and activations, significantly lowering resource requirements.
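As an illustration of the distillation idea, a minimal sketch of a distillation loss in PyTorch might look as follows: the student is pulled towards the teacher's softened output distribution while still learning from the ground-truth labels. The temperature and weighting values are placeholder assumptions, not values prescribed by the project.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Soft-target KL loss (teacher -> student) combined with hard-label CE.

    temperature and alpha are illustrative defaults, not project requirements.
    """
    # Soften both distributions; the KL divergence pulls the student
    # towards the teacher's output distribution.
    soft_teacher = F.log_softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean",
                  log_target=True) * (temperature ** 2)

    # Ordinary cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    return alpha * kd + (1.0 - alpha) * ce
```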

Tiny-ML applications have demonstrated, for example, that DistilBERT compresses the BERT-base model by 40% while preserving 97% of its language capabilities. Moreover, Q8BERT quantizes BERT model parameters from 32-bit to 8-bit precision, achieving a 4x reduction in model size without sacrificing accuracy.
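For orientation, post-training dynamic quantization of a Hugging Face model to 8-bit weights can be prototyped in a few lines with stock PyTorch. The checkpoint name below is just an example, and more elaborate schemes (static quantization or quantization-aware training, as in Q8BERT) go beyond this sketch.

```python
import os
import torch
from transformers import AutoModelForSequenceClassification

# Example checkpoint; any transformer model from the Hugging Face Hub would do.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased-finetuned-sst-2-english")

# Post-training dynamic quantization: weights of all Linear layers are
# stored as int8 and dequantized on the fly, cutting their storage roughly 4x.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8)

def size_mb(m, path="tmp.pt"):
    """Rough on-disk size of a model's weights in megabytes."""
    torch.save(m.state_dict(), path)
    size = os.path.getsize(path) / 1e6
    os.remove(path)
    return size

print(f"fp32: {size_mb(model):.1f} MB, int8: {size_mb(quantized):.1f} MB")
```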

Thesis Project Description
We are looking for a Master’s student to undertake a research project focused on implementing and evaluating one or more of the above techniques to create more energy-efficient transformer-based language models using the Hugging Face library.

Key Responsibilities

  • Literature Review: Explore the existing research on Knowledge Distillation and Model Quantization in the context of LLMs to understand the state of the art.
  • Implementation: Utilize the Hugging Face library to create a transformer-based language model with a specific focus on either Knowledge Distillation or Model Quantization.
  • Experimental Evaluation: Train and evaluate the model, measuring its resource utilization, model size reduction, and language capabilities. Compare the results with traditional LLMs (a minimal comparison sketch is given after this list).
  • Reporting: Document your work in a scientific report. Optionally, publish your code open source.
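
A simple starting point for the comparison could be to profile parameter counts and per-forward-pass latency of a teacher and a distilled student; the checkpoints below are chosen for illustration only, and a single forward pass on CPU is a crude latency proxy compared with a proper benchmark such as GLUE.

```python
import time
import torch
from transformers import AutoModel, AutoTokenizer

def profile(name, text="Knowledge distillation makes models smaller."):
    """Report parameter count and rough CPU latency for one checkpoint."""
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name).eval()
    params = sum(p.numel() for p in model.parameters()) / 1e6
    batch = tok(text, return_tensors="pt")
    with torch.no_grad():
        start = time.perf_counter()
        model(**batch)
        latency = (time.perf_counter() - start) * 1000
    print(f"{name}: {params:.0f}M parameters, {latency:.0f} ms/forward")

# Teacher vs. distilled student.
profile("bert-base-uncased")
profile("distilbert-base-uncased")
```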

Qualifications

  • A strong background in machine learning, deep learning, and natural language processing.
  • Proficiency in Python and experience with libraries such as PyTorch and Hugging Face Transformers.
  • Background in computer science, mathematics or engineering physics. A talent for mathematical modelling and statistical analysis will be considered a plus.

Terms
Scope: 30 hp, one semester of full-time work, with a flexible starting date.
Location: You are expected to be at a RISE office regularly during the thesis period, preferably a few days each week, primarily at our offices in Luleå, Gothenburg, or Kista, with some flexibility.
Benefits: A scholarship of 30,000 SEK is granted upon approval of the final report.

We welcome your application
For questions and further information regarding this project opportunity contact Rickard Brännvall, rickard.brannvall@ri.se, +46 730-753 713. Application deadline: November 30, 2024.

Keywords: Deep-learning, LLMs, Transformers, Energy efficiency, Quantization, Knowledge distillation, Processor architecture, Tiny-ML

About the job

Location

Luleå, Gothenburg, Kista, Flexible

Type of employment

Fixed-term employment

Job type

Student - thesis project/internship

Contact person

Rickard Brännvall
+46 730-753 713

Reference number

2024/307

Last application date

2024-11-30
