Language model for math

Published on: 16 October 2023

Primary Category: Computation and Language

Paper Authors: Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen McAleer, Albert Q. Jiang, Jia Deng, Stella Biderman, Sean Welleck


Key Details

Llemma is pretrained on Proof-Pile-2, a new 55B-token dataset for math

It improves on Code Llama, the model from which it is initialized

Llemma outperforms other open base models on math benchmarks

It can solve problems using Python code and theorem provers

The models, data, and code are publicly released

AI-generated summary

This paper introduces Llemma, a large language model specialized for mathematical reasoning by continued pretraining on a mixture of scientific text, web pages about math, and mathematical code. Llemma outperforms other available models on benchmarks for mathematical problem solving. It can also use tools like Python and theorem provers.
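To make the Python tool-use claim concrete, here is a minimal sketch of the general pattern: a model writes a short program, and a harness runs it and reads the printed answer. Everything in it is an illustrative assumption rather than the paper's actual evaluation code: the problem, the `GENERATED_PROGRAM` text (written by hand here, not produced by Llemma), and the `run_generated_program` helper are all hypothetical.

```python
import contextlib
import io

# Hypothetical stand-in for model output: a short program that prints
# its final answer. This text was written by hand for illustration.
GENERATED_PROGRAM = """
from math import comb

# Number of ways to choose 3 items from 10
print(comb(10, 3))
"""


def run_generated_program(source: str) -> str:
    """Execute program text and return everything it printed, stripped."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        # A fresh globals dict keeps the program away from our own names.
        exec(source, {"__name__": "__generated__"})
    return buffer.getvalue().strip()


answer = run_generated_program(GENERATED_PROGRAM)
print(answer)
```

Calling `exec` on untrusted model output is unsafe outside a sandbox; a real harness would isolate the process and enforce a timeout.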
