
Simplifying Large Language Models

Published on: 6 November 2023

Primary Category: Machine Learning

Paper Authors: Xuan Li, Zhanke Zhou, Jianing Zhu, Jiangchao Yao, Tongliang Liu, Bo Han


Key Details

Proposes knowledge distillation to simplify large language models

Trains smaller 'student' model to reproduce outputs of larger 'teacher'

Reduces model size while retaining accuracy

Creates lightweight models with near state-of-the-art performance

Demonstrates technique on a variety of model sizes and datasets

AI generated summary


This paper proposes a method to reduce the complexity of large language models while retaining performance. The authors apply knowledge distillation: a smaller 'student' model is trained to reproduce the outputs of a larger 'teacher' model, compressing the teacher's knowledge into a far more compact network. They show that this technique can produce lightweight models with accuracy close to that of state-of-the-art large models.
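The summary does not spell out the paper's exact training objective, but teacher–student distillation is commonly implemented by training the student to match the teacher's temperature-softened output distribution (Hinton-style soft targets). The sketch below is a minimal, self-contained illustration of that idea, not the authors' method: the "teacher" and "student" are toy linear models (the student sees fewer features, standing in for a smaller model), and all names (`softmax`, `distillation_loss`, `W_student`, the temperature `T`) are illustrative assumptions.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-softened softmax; higher T flattens the distribution.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Mean KL divergence KL(teacher || student) over softened distributions.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q))) / len(p))

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 8))            # toy inputs, 8 features
W_teacher = rng.normal(size=(8, 4))     # "large" teacher: uses all 8 features
teacher_logits = X @ W_teacher

# "Small" student: a linear model over only the first 4 features.
Xs = X[:, :4]
W_student = np.zeros((4, 4))
T = 2.0

before = distillation_loss(Xs @ W_student, teacher_logits, T)

# Gradient descent on the softened cross-entropy; its gradient w.r.t. the
# student logits is simply (q - p).
for step in range(500):
    p = softmax(teacher_logits, T)
    q = softmax(Xs @ W_student, T)
    grad = Xs.T @ (q - p) / len(Xs)
    W_student -= 0.5 * grad

after = distillation_loss(Xs @ W_student, teacher_logits, T)
```

After training, `after` is well below `before`: the student cannot match the teacher exactly (it sees fewer features), but it recovers much of the teacher's output behavior, which is the core intuition behind compressing a large model into a lightweight one.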
