# Experiment 1: Model Performance Baseline

## Tokenizers

Tokenizer configuration shared by all models in this experiment.

| tknzr_name | dset_name | exp_name | max_vocab | min_count | ver   | is_uncased |
|------------|-----------|----------|-----------|-----------|-------|------------|
| whitespace | WNLI      | ws-tknzr | -1        | 1         | train | True       |

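The configuration above describes an uncased whitespace tokenizer with no vocabulary cap (`max_vocab = -1`) and no frequency cutoff (`min_count = 1`). A minimal sketch of such a tokenizer, assuming a hypothetical `WhitespaceTokenizer` class (not the project's actual implementation):

```python
from collections import Counter

class WhitespaceTokenizer:
    """Hypothetical whitespace tokenizer mirroring the table's settings."""

    def __init__(self, max_vocab=-1, min_count=1, is_uncased=True):
        self.max_vocab = max_vocab    # -1 means no vocabulary size limit
        self.min_count = min_count    # drop tokens rarer than this count
        self.is_uncased = is_uncased  # lowercase text before splitting
        self.vocab = {"[unk]": 0}     # id 0 reserved for unknown tokens

    def build_vocab(self, texts):
        counter = Counter()
        for text in texts:
            if self.is_uncased:
                text = text.lower()
            counter.update(text.split())
        # Keep tokens meeting min_count, most frequent first.
        tokens = [t for t, c in counter.most_common() if c >= self.min_count]
        if self.max_vocab != -1:
            tokens = tokens[: self.max_vocab - len(self.vocab)]
        for tok in tokens:
            self.vocab[tok] = len(self.vocab)

    def encode(self, text):
        if self.is_uncased:
            text = text.lower()
        return [self.vocab.get(t, self.vocab["[unk]"]) for t in text.split()]

tknzr = WhitespaceTokenizer(max_vocab=-1, min_count=1, is_uncased=True)
tknzr.build_vocab(["The cat sat", "the dog ran"])
print(tknzr.encode("The cat ran"))  # [1, 2, 5]
```

With `is_uncased=True`, "The" and "the" collapse into one vocabulary entry, which is why the encoded ids above reuse id 1 for the leading "The".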
## Shared Model Parameters

| parameters     | value    |
|----------------|----------|
| batch_size     | 2        |
| beta1          | 0.9      |
| beta2          | 0.99     |
| ckpt_step      | 1000     |
| dset_name      | WNLI     |
| eps            | 1e-8     |
| log_step       | 200      |
| lr             | 1e-3     |
| max_norm       | 1        |
| max_seq_len    | 256      |
| n_epoch        | 100      |
| seed           | 42       |
| tknzr_exp_name | ws-tknzr |
| ver            | train    |
| d_emb          | 100      |
| d_hid          | 300      |
| n_hid_lyr      | 2        |
| n_post_hid_lyr | 2        |
| n_pre_hid_lyr  | 2        |
| p_emb          | 0.1      |
| p_hid          | 0.1      |
| weight_decay   | 1e-2     |

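A quick sanity check on these settings, as a sketch: assuming WNLI's training split has 635 examples (the standard GLUE figure; verify against the dataset you actually load), `batch_size = 2` gives about 318 optimizer steps per epoch, so 100 epochs lands at roughly 31,800 steps.

```python
import math

# Shared hyperparameters from the table above, gathered into one dict.
config = {
    "batch_size": 2, "beta1": 0.9, "beta2": 0.99, "ckpt_step": 1000,
    "dset_name": "WNLI", "eps": 1e-8, "log_step": 200, "lr": 1e-3,
    "max_norm": 1, "max_seq_len": 256, "n_epoch": 100, "seed": 42,
    "tknzr_exp_name": "ws-tknzr", "ver": "train", "d_emb": 100,
    "d_hid": 300, "n_hid_lyr": 2, "n_post_hid_lyr": 2, "n_pre_hid_lyr": 2,
    "p_emb": 0.1, "p_hid": 0.1, "weight_decay": 1e-2,
}

WNLI_TRAIN_SIZE = 635  # assumption: standard GLUE WNLI train split size

steps_per_epoch = math.ceil(WNLI_TRAIN_SIZE / config["batch_size"])
total_steps = steps_per_epoch * config["n_epoch"]
print(steps_per_epoch, total_steps)  # 318 31800
```

This is consistent with the final "step 32k" checkpoint reported in the loss table: 31,800 steps is the last full training step, rounded up to 32k in the column header.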
## Model Loss Performance

| model_name | step 10k | step 20k | step 32k |
|------------|----------|----------|----------|
| RNN        | 0.167    | 0.073    | 0.056    |
| GRU        | 0.187    | 0.061    | 0.046    |
| LSTM       | 0.182    | 0.063    | 0.044    |
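To compare how much each model improves over training, the relative loss reduction from step 10k to step 32k can be computed directly from the table:

```python
# Training losses from the table above, keyed by checkpoint.
losses = {
    "RNN":  {"10k": 0.167, "20k": 0.073, "32k": 0.056},
    "GRU":  {"10k": 0.187, "20k": 0.061, "32k": 0.046},
    "LSTM": {"10k": 0.182, "20k": 0.063, "32k": 0.044},
}

# Fractional loss reduction between the first and last checkpoint.
for name, l in losses.items():
    reduction = (l["10k"] - l["32k"]) / l["10k"]
    print(f"{name}: {reduction:.1%} lower loss at step 32k")
```

By this measure the gated models (GRU, LSTM) reduce their loss by roughly three quarters, a larger relative drop than the plain RNN, and both end below it in absolute loss as well.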