When it comes to hyperparameter tuning in Keras, there are a few very good options. Hyperas is great and very intuitive. Talos is very easy to use and neat. Keras Tuner is relatively easy to use as well, but with more tuners packed inside. The drawback of Hyperas is that everything needs to be wrapped in data() and model() functions, while Talos would be even better if it had Bayesian Optimization. Keras Tuner, on the other hand, comes with Bayesian Optimization, Hyperband, and Random Search algorithms built in, and is also designed to be easy for researchers to extend in order to experiment with new search algorithms.

There are a few pitfalls I came across while using Keras Tuner that I would like to share with you.
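As a quick orientation (a minimal sketch, not from the original post), the three built-in tuners live in the kerastuner package (renamed keras_tuner in later releases) and share the same basic call pattern; build_model here stands in for a model-building function like the one shown later in this post:

from kerastuner.tuners import RandomSearch, Hyperband, BayesianOptimization

tuner = Hyperband(
    build_model,               # model-building function taking an `hp` argument
    objective='val_accuracy',
    max_epochs=30)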
1. The requirements for Keras Tuner are:
- Python 3.6
- TensorFlow 2.0
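A quick sanity check of the environment (just a sketch of what I mean by the requirements):

import sys
import tensorflow as tf

print(sys.version_info)  # expect Python 3.6 or later
print(tf.__version__)    # expect 2.0 or later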
If you want to avoid
RuntimeError: Model-building function did not return a valid Keras Model instance, found <keras.engine.sequential.Sequential object at 0x000001A7D79E3128>
make sure to import like this:
from tensorflow.keras import layers, models
If you do it like I did:
from keras import layers, models
or
from tf.keras import layers, models
you might get stuck there for hours trying to figure out the RuntimeError.
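For example, here is a minimal model-building function using the tensorflow.keras import, so the tuner receives a valid Keras Model instance; the layer sizes, hyperparameter name, and (784,) input shape are just placeholders:

from tensorflow.keras import layers, models

def build_model(hp):
    # Built from tensorflow.keras, so the object returned is the
    # Model class that Keras Tuner expects.
    model = models.Sequential()
    model.add(layers.Dense(units=hp.Choice('units', [16, 64, 128]),
                           activation='relu', input_shape=(784,)))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model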
2. If you are using TensorFlow 2.0, try not to use Hyperas: Keras Tuner can only run on TensorFlow 2.0, and, vice versa, Hyperas will give you errors under TensorFlow 2.0.
3. Tuner parameters like directory and project_name need to be unique for each run:
Trial 1: directory='my_dir_01', project_name='helloworld_01'
Trial 2: directory='my_dir_02', project_name='helloworld_02'
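In practice that means each run gets its own pair of values when you construct the tuner (a sketch; RandomSearch and build_model are the same assumptions as in the snippets above):

from kerastuner.tuners import RandomSearch

tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=10,
    directory='my_dir_02',         # unique per run
    project_name='helloworld_02')  # unique per run

If you reuse an existing directory and project_name, the tuner will try to reload the trials already stored there instead of starting a fresh search.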
The same rule applies to the hyperparameters' names, which also need to be unique:
model.add(layers.Dense(units=hp.Choice('units_01', [16, 64, 128]), activation='relu'))
model.add(layers.Dense(units=hp.Choice('units_02', [16, 64, 128]), activation='relu'))
4. Because of the requirement of unique names for each hyperparameter, I haven't figured out a way to do something like this:
hidden_layers = hp.Int('hidden_layers', 2, 4)
units_dense = hp.Choice('units', [16, 64, 128])
learning_rate = hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])
and then add each variable to the model in a loop, which would make the code less cumbersome. Instead, you end up with something like this:
import tensorflow as tf

def build_model(hp):
    inputs = tf.keras.Input(shape=(32, 32, 3))
    x = inputs
    for i in range(hp.Int('conv_blocks', 3, 5, default=3)):
        filters = hp.Int('filters_' + str(i), 32, 256, step=32)
        for _ in range(2):
            x = tf.keras.layers.Convolution2D(
                filters, kernel_size=(3, 3), padding='same')(x)
            x = tf.keras.layers.BatchNormalization()(x)
            x = tf.keras.layers.ReLU()(x)
        if hp.Choice('pooling_' + str(i), ['avg', 'max']) == 'max':
            x = tf.keras.layers.MaxPool2D()(x)
        else:
            x = tf.keras.layers.AvgPool2D()(x)
    x = tf.keras.layers.GlobalAvgPool2D()(x)
    x = tf.keras.layers.Dense(
        hp.Int('hidden_size', 30, 100, step=10, default=50),
        activation='relu')(x)
    x = tf.keras.layers.Dropout(
        hp.Float('dropout', 0, 0.5, step=0.1, default=0.5))(x)
    outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Float('learning_rate', 1e-4, 1e-2, sampling='log')),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    return model
Code like this is nowhere near neat!