Friday, August 25, 2017

Use your own dataset for deep learning

Deep Learning

Preparation
1.  VirtualBox and Vagrant

2.  Install Apache

3.  Install MySQL

4.  Install Python

5.  Hello World with Python

Deep learning programming
1. Install Keras and test deep learning

2. Save learned parameters and load the parameters

3. Save and load at the same time

4. Use your own dataset

Use your own dataset


We will use deep learning to learn what cats look like and what dogs look like. First, we need to get a dataset of dog and cat pictures from somewhere. I downloaded the pictures from here: https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition

If you download and unzip the dataset file, you will see a lot of pictures there.

We will use them for training.
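
For example, on the Vagrant machine (assuming the training archive downloaded from Kaggle is named train.zip):
$ unzip train.zip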

Before that, we will create folders to store them in. Create a "data" folder.

Create a "train" folder and a "validation" folder in the data folder.

Open the train folder, create a "cats" folder and a "dogs" folder, and store 200 cat pictures in the cats folder and 200 dog pictures in the dogs folder.

Now open the validation folder and create cats and dogs folders there as well. Store another 200 cat pictures in the cats folder and another 200 dog pictures in the dogs folder.
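
If you prefer the command line, the same folder layout can be created in one go (assuming you are working in the /vagrant folder, as in the earlier posts):
$ cd /vagrant
$ mkdir -p data/train/cats data/train/dogs
$ mkdir -p data/validation/cats data/validation/dogs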

Create "dogvscat.py" and write as follows inside:
from __future__ import print_function

import os.path

import numpy as np
import keras
from keras.models import Sequential, model_from_json
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
import keras.backend.tensorflow_backend as KTF
import tensorflow as tf


# Where the TensorBoard logs and the saved model/weights go
f_log = './log'
f_model = './model/dogvscat'
model_yaml = 'dogvscat_model.yaml'
model_filename = 'dogvscat_model.json'
weights_filename = 'dogvscat_model_weights.hdf5'

# Make sure the model folder exists before the checkpoint callback writes to it
if not os.path.isdir(f_model):
    os.makedirs(f_model)

batch_size = 32
epochs = 5
nb_train_samples = 400       # 200 cats + 200 dogs in data/train
nb_validation_samples = 400  # 200 cats + 200 dogs in data/validation

# Keep the current TensorFlow session and switch to a fresh one for this run
old_session = KTF.get_session()
print('Building model...')
session = tf.Session()
KTF.set_session(session)

# If a saved model definition exists, load it; otherwise build a new network
if os.path.isfile(os.path.join(f_model, model_filename)):
    print('Saved parameters found. I will use this file...')
    json_string = open(os.path.join(f_model, model_filename)).read()
    model = model_from_json(json_string)
    model.summary()
    model.compile(loss='categorical_crossentropy',
                optimizer='adam',
                metrics=['accuracy'])
    model.load_weights(os.path.join(f_model,weights_filename))
else:
    print('Saved parameters not found. Creating new model...')
    model = Sequential()
    # 32 filters of 3x3 over 128x128 RGB input pictures
    model.add(Conv2D(32, (3, 3), input_shape=(128, 128, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(64, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Flatten())
    model.add(Dense(64))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2))  # two output classes: cats and dogs
    model.add(Activation('softmax'))

    model.summary()

    model.compile(loss='categorical_crossentropy',
                optimizer='adam',
                metrics=['accuracy'])

# Augment the training pictures (shear, zoom, horizontal flip) and rescale pixels to 0-1
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# The validation pictures are only rescaled, not augmented
test_datagen = ImageDataGenerator(rescale=1.0 / 255)

# The sub-folder names (cats, dogs) become the class labels
train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(128, 128),
    batch_size=batch_size,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    'data/validation',
    target_size=(128, 128),
    batch_size=batch_size,
    class_mode='categorical')

# TensorBoard logging plus checkpointing of the best weights seen so far
tb_cb = keras.callbacks.TensorBoard(log_dir=f_log, histogram_freq=0)
cp_cb = keras.callbacks.ModelCheckpoint(filepath=os.path.join(f_model, weights_filename),
                                        monitor='val_loss', verbose=1,
                                        save_best_only=True, mode='auto')
cbks = [tb_cb, cp_cb]

history = model.fit_generator(
    train_generator,
    steps_per_epoch=int(np.ceil(nb_train_samples / batch_size)),
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=int(np.ceil(nb_validation_samples / batch_size)),
    callbacks=cbks)

score = model.evaluate_generator(validation_generator, int(np.ceil(nb_validation_samples / batch_size)))

print('')
print('Test score:', score[0])
print('Test accuracy:', score[1])

# Save the model architecture (JSON and YAML) and the final weights
json_string = model.to_json()
open(os.path.join(f_model, model_filename), 'w').write(json_string)
yaml_string = model.to_yaml()
open(os.path.join(f_model, model_yaml), 'w').write(yaml_string)
print('save weights')
model.save_weights(os.path.join(f_model, weights_filename))
# Restore the session that was active before this script ran
KTF.set_session(old_session)


Run these commands to start the learning:
$ cd /vagrant
$ sudo python3 dogvscat.py

Then deep learning from the dataset will start.
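
Once the training is done, you can check how the model judges a new picture with the saved parameters. Here is a minimal sketch; the file name sample.jpg is just an example, and note that flow_from_directory assigns class indices alphabetically, so cats is 0 and dogs is 1:

from keras.models import model_from_json
from keras.preprocessing.image import load_img, img_to_array
import numpy as np
import os.path

f_model = './model/dogvscat'
model = model_from_json(open(os.path.join(f_model, 'dogvscat_model.json')).read())
model.load_weights(os.path.join(f_model, 'dogvscat_model_weights.hdf5'))

img = load_img('sample.jpg', target_size=(128, 128))  # resize like the training pictures
x = img_to_array(img) / 255.0                         # same rescaling as the generators
x = np.expand_dims(x, axis=0)                         # shape (1, 128, 128, 3)
pred = model.predict(x)[0]
print('cat' if pred[0] > pred[1] else 'dog', pred)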