lmp

Language model playground source code.

This module provides utilities for training language models, evaluating perplexity, and generating continual text.

  • lmp.dset contains all available datasets.

  • lmp.model contains all available language models.

  • lmp.script provides scripts for training and inference on all tokenizers and language models.

  • lmp.tknzr contains all available tokenizers.

  • lmp.util contains utilities shared throughout this project.
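
As a quick sanity check, the subpackages listed above can be probed for importability. The following is a minimal sketch, assuming the package is installed under the name lmp; the helper find_available and the constant SUBPACKAGES are hypothetical names introduced for illustration and are not part of lmp itself.

```python
import importlib

# Subpackage names taken from the overview list; nothing else is assumed
# about their contents.
SUBPACKAGES = ["lmp.dset", "lmp.model", "lmp.script", "lmp.tknzr", "lmp.util"]


def find_available(names):
    """Hypothetical helper: map each module name to True if it imports."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            # Raised when lmp (or one of its dependencies) is not installed.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in find_available(SUBPACKAGES).items():
        print(f"{name}: {'available' if ok else 'not importable'}")
```

Running the script prints one line per subpackage, which can help confirm that an installation exposes every part of the project before using the training or inference scripts.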