How to estimate the maximum allowed batch size for a TensorFlow training?
What is the optimal batch size for a TensorFlow training?
What does «num_steps» in «train_config» mean in TensorFlow?
The batch size (batch_size) is the number of training samples fed into your network in one training pass. You are approximating the loss, and therefore the gradient, of your whole dataset by computing it over just this batch.
This basic terminology is explained in many introductory courses to neural networks.
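A toy sketch of that approximation, in pure Python with made-up numbers (10,000 random per-sample losses standing in for a real dataset): the mean loss over a random mini-batch is a noisy estimate of the mean loss over the full dataset.

```python
import random

# Toy "dataset": per-sample losses for a model we pretend is fixed.
random.seed(0)
per_sample_loss = [random.random() for _ in range(10_000)]

# Exact mean loss over the whole dataset.
full_loss = sum(per_sample_loss) / len(per_sample_loss)

# Mean loss over one random mini-batch of 64 samples.
batch = random.sample(per_sample_loss, 64)
batch_loss = sum(batch) / len(batch)

# The batch loss approximates the full loss, with some noise.
print(full_loss, batch_loss)
```

The same logic applies to gradients: the batch gradient is an unbiased but noisy estimate of the full-dataset gradient, and larger batches reduce the noise.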
train_steps basically counts the batches.
During training you'll read batch_size * train_steps CSV rows, so you have to make sure that this number is lower than total_rows_csv * num_epochs in your input reader; or set num_epochs=None, so it cycles indefinitely through your data.
You'll train on the whole data once after train_steps = total_rows_csv / batch_size steps; that is 1 epoch. Then it'll go over the same data again, and so on.
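The arithmetic above can be sketched as follows (the row count, batch size, and epoch count are made-up example values):

```python
total_rows_csv = 10_000   # rows in the training CSV
batch_size = 32
num_epochs = 5

# One epoch = one full pass over the data.
steps_per_epoch = total_rows_csv // batch_size   # 312 (last partial batch dropped)
max_train_steps = steps_per_epoch * num_epochs   # 1560

# Rows read during training must not exceed total_rows_csv * num_epochs.
assert batch_size * max_train_steps <= total_rows_csv * num_epochs
print(steps_per_epoch, max_train_steps)
```

If you set train_steps higher than max_train_steps while num_epochs is finite, the input reader runs out of data before training finishes; with num_epochs=None the reader never runs out and train_steps alone decides when to stop.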
Batch size is the number of samples you put into the network for each training round.
So for each epoch, you can split your training sets into multiple batches.
For example, I have 1000 images.
If I set my batch size to 1, then each epoch (one full pass over the data) consists of 1000 rounds of 1 image each.
If I set my batch size to 2, each epoch consists of 500 rounds of 2 images each.
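A minimal sketch of that splitting, with integers standing in for the 1000 images:

```python
def split_into_batches(samples, batch_size):
    """Split a dataset into consecutive batches (the last one may be smaller)."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]

images = list(range(1000))          # stand-ins for 1000 images
batches = split_into_batches(images, 2)
print(len(batches))                 # 500 rounds per epoch
print(len(batches[0]))              # 2 images per round
```

In real TensorFlow input pipelines this splitting is done for you, e.g. by batching in the dataset pipeline, but the bookkeeping is the same: rounds_per_epoch = ceil(num_samples / batch_size).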
Step size is just the learning rate you use for your optimizer; usually we start with 0.001 or 0.01. (Note this is different from num_steps, which counts training iterations.)
I recommend Andrew Ng's Machine Learning videos on Coursera if you want a good overall understanding of ML.
This field defines the number of samples in each batch.
TensorFlow requires a fixed number and does not take GPU memory or data size into consideration.
The right value depends heavily on your GPU hardware and image dimensions, and a large batch isn't strictly necessary for quality results.
TensorFlow requires each input array in a batch to have the same dimensionality, which means that any batch_size > 1 requires all samples to be reshaped to identical dimensions (e.g. a fixed-size image resizer in the input pipeline).
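As for the original question of estimating the maximum allowed batch size: TensorFlow itself won't compute it for you, so in practice people either raise the batch size until they hit an out-of-memory error, or do a back-of-envelope estimate like the sketch below. The function name, the 8 GiB free-memory figure, and the 40 MiB-per-sample activation figure are all assumptions for illustration, not measured values.

```python
def estimate_max_batch_size(free_gpu_bytes, per_sample_activation_bytes,
                            fixed_overhead_bytes=0, safety_factor=0.8):
    """Back-of-envelope estimate of how many samples fit in GPU memory.

    free_gpu_bytes              -- memory left after weights and optimizer state
    per_sample_activation_bytes -- activations + input for ONE sample
    safety_factor               -- headroom for fragmentation and workspaces
    """
    usable = (free_gpu_bytes - fixed_overhead_bytes) * safety_factor
    return max(1, int(usable // per_sample_activation_bytes))

# Example: 8 GiB free, ~40 MiB of activations per image (made-up numbers).
print(estimate_max_batch_size(8 * 1024**3, 40 * 1024**2))
```

In practice the empirical approach (double batch_size until you hit OOM, then back off) is usually more reliable than any static estimate, because framework workspaces and memory fragmentation are hard to predict.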