Routing Language Models to Specialized Experts

Published on:

8 February 2024

Primary Category:

Machine Learning

Paper Authors:

Mohammed Muqeeth,

Haokun Liu,

Yufan Liu,

Colin Raffel

Key Details

Proposes PHATGOOSE method for routing tokens to experts

Experts are modules produced by parameter-efficient fine-tuning

Routing is based on a learned gate for each expert module

Outperforms past routing methods

Sometimes matches multitask training performance

AI generated summary

This paper explores improving zero-shot generalization by routing tokens within a language model to different specialized expert modules at each layer. Its method, PHATGOOSE, trains a routing gate for each expert module; the gate determines which tokens should use that module. Experiments find that PHATGOOSE outperforms past routing methods and sometimes matches multitask training.
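
To make the per-token routing idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the summary does not say how gate scores are computed or combined, so the dot-product scoring, the top-k selection, the softmax weighting, and the names route_token and mix_experts are all hypothetical choices made for this example.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def route_token(token, gates, k=2):
    """Pick the k experts whose gate vectors score highest for this token.

    Assumption: each expert has one gate vector and the score is a plain
    dot product. Returns [(expert_index, weight), ...], weights sum to 1.
    """
    scores = [dot(token, g) for g in gates]
    top = sorted(range(len(gates)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only (assumed weighting scheme).
    m = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def mix_experts(token, gates, experts, k=2):
    """Weighted sum of the selected experts' outputs for one token.

    Assumption: every expert maps a vector to a vector of the same length.
    """
    out = [0.0] * len(token)
    for i, w in route_token(token, gates, k):
        y = experts[i](token)
        out = [o + w * v for o, v in zip(out, y)]
    return out


# Toy setup: three stand-in "experts" and one gate vector per expert.
gates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
experts = [
    lambda x: [2.0 * v for v in x],   # stand-in for one fine-tuned module
    lambda x: [v + 1.0 for v in x],   # stand-in for another module
    lambda x: [0.0 for _ in x],       # stand-in for a third module
]
token = [3.0, 1.0]

routing = route_token(token, gates, k=2)
mixed = mix_experts(token, gates, experts, k=2)
```

In this toy setup the token scores highest against the first two gates, so only those two stand-in experts contribute to the mixed output. In a real model, the same decision would be made independently for every token at every layer that holds expert modules.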
