Sample dataset
Use this script to sample data points from a dataset.
The following command samples text from WikiText2Dset.
python -m lmp.script.sample_dset wiki-text-2
The default sampling index is 0 and the default version of WikiText2Dset is train.
Thus the following command produces the same sample as the one above.
python -m lmp.script.sample_dset wiki-text-2 --idx 0 --ver train
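The way omitted flags fall back to their defaults can be illustrated with a small, self-contained argparse sketch. This is not the actual implementation of lmp.script.sample_dset: the parser name, the build_parser helper, and the subparser layout are assumptions made for illustration. Only the wiki-text-2 dataset name, the --idx and --ver flags, and their defaults (0 and train) are taken from this page.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a hypothetical parser that mirrors the CLI shape described above."""
    # Hypothetical program name; the real script is run as `python -m lmp.script.sample_dset`.
    parser = argparse.ArgumentParser(prog="sample_dset_sketch")

    # One subcommand per dataset, so `-h` after a dataset name lists that dataset's arguments.
    subparsers = parser.add_subparsers(dest="dset_name", required=True)
    wiki = subparsers.add_parser("wiki-text-2")

    # Defaults stated on this page: index 0 and version "train".
    wiki.add_argument("--idx", type=int, default=0, help="Sampling index.")
    wiki.add_argument("--ver", type=str, default="train", help="Dataset version.")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    implicit = parser.parse_args(["wiki-text-2"])
    explicit = parser.parse_args(["wiki-text-2", "--idx", "0", "--ver", "train"])
    # Both invocations resolve to the same arguments, so they select the same sample.
    print(vars(implicit) == vars(explicit))
```

Because both argument lists parse to the same namespace, any code that consumes the parsed arguments would behave identically for the two invocations.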
The following command samples text from WikiText2Dset with the index set to 1 and the version set to test.
python -m lmp.script.sample_dset wiki-text-2 --idx 1 --ver test
You can use the -h or --help option to get a list of available datasets.
python -m lmp.script.sample_dset -h
You can also use the -h or --help option on a specific dataset to get a list of supported CLI arguments, including all available versions of that dataset.
python -m lmp.script.sample_dset wiki-text-2 -h
See also
- lmp.dset: All available datasets.