Any info for tweaking training settings for those with little background in LSTMs?

Hi All, 

Not sure if this is the right place to post this, but I'm looking for a little extra info on how to choose parameters for model training. I have very little background in neural networks and no real programming skills, but I find this software fascinating and have been curious to try a little experiment. I am tech-savvy enough that I have installed everything successfully and started training using the default settings. My data set is about 3,000,000 characters. Right now it seems I have reached a point of diminishing returns: the model is consistently underfitting and the loss value barely changes from one checkpoint to the next. By underfitting I mean that the output consistently contains many gibberish words and erratic sentence structures, despite the structured nature of the data set. A few questions:

1) How many epochs does training a model generally require to produce effective results? I made it to about 13/50 and it goes quite slowly (CPU mode on a crap computer; getting to this point took more than 48 hours of constant running). Am I just being impatient? Could the loss value start to change again even after a perceived plateau? Is loss the be-all and end-all of evaluating a training run, or could the model still be improving even if the loss value doesn't change?

2) If I am faced with underfitting, which model parameters should I change first to improve it? -rnn_size, -num_layers, -batch_size, something else? 

3) Does anybody have any resources designed for beginners that explain the theory behind neural networks, so I can understand exactly what is going on and answer these questions myself?
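
To make questions 1 and 2 a bit more concrete, here is a rough Python sketch. It is not part of the software being discussed. `has_plateaued` checks whether the loss recorded at recent checkpoints has stopped improving, and `lstm_param_count` estimates how `-rnn_size` and `-num_layers` change the number of trainable parameters, assuming a standard stacked LSTM with one-hot character input and a linear output layer (real implementations may count biases slightly differently). The function names, the thresholds, the vocabulary size of 96, and the loss values are all hypothetical and only for illustration.

```python
def has_plateaued(losses, window=3, min_rel_improvement=0.01):
    """Return True if the best loss over the last `window` checkpoints
    improved on the best earlier loss by less than `min_rel_improvement`
    (as a fraction of the earlier best)."""
    if len(losses) <= window:
        return False  # not enough history to judge
    best_before = min(losses[:-window])
    best_recent = min(losses[-window:])
    return (best_before - best_recent) / best_before < min_rel_improvement


def lstm_param_count(vocab_size, rnn_size, num_layers):
    """Approximate parameter count for a stacked LSTM character model.

    Assumes each layer has 4 gates, each with weights over the layer
    input and the previous hidden state plus one bias vector, followed
    by a linear layer mapping the top hidden state to vocab_size."""
    total = 0
    input_size = vocab_size  # one-hot characters feed the first layer
    for _ in range(num_layers):
        total += 4 * rnn_size * (input_size + rnn_size + 1)
        input_size = rnn_size  # deeper layers read the layer below
    total += rnn_size * vocab_size + vocab_size  # output layer
    return total


if __name__ == "__main__":
    # Made-up checkpoint losses, not taken from any real run.
    example_losses = [2.0, 1.6, 1.45, 1.44, 1.44, 1.44]
    print("plateaued:", has_plateaued(example_losses))

    data_chars = 3_000_000  # data set size mentioned in the post
    for size in (128, 512):
        n = lstm_param_count(vocab_size=96, rnn_size=size, num_layers=2)
        print(f"rnn_size={size}: ~{n:,} parameters vs {data_chars:,} characters")
```

Under these assumptions, `-rnn_size` and `-num_layers` change the capacity of the model, while `-batch_size` does not change the parameter count at all. A common rule of thumb is to compare the parameter count with the number of characters in the data set; a model that is much smaller than the data is one plausible cause of underfitting.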

Thanks all
J
