Transformer-XL (257M)

Carnegie Mellon University (CMU) | Google Brain | Language modeling/generation | Open weights

Transformer-XL (257M) is a language modeling/generation model from Carnegie Mellon University (CMU) and Google Brain, released in 2019 with approximately 257 million parameters.

About Transformer-XL (257M)

Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length ...

Details

Provider
Carnegie Mellon University (CMU), Google Brain
Task
Language modeling/generation
Parameters
257 million (approx.)
Released
2019-01-09
Open weights
Yes
