An Exploration of Training Techniques to Avoid Overfitting

Segmentation is the process of separating an object of interest from the background. It is a common first step in computer vision tasks such as identifying signs and pedestrians for self-driving cars or tracking tumour size in medical MRI. Segmentation is also a fundamentally challenging problem: it is not clearly defined mathematically, so segmentation algorithms are often empirical, with parameters that must be tuned to specify the object of interest. This empirical nature makes segmentation a natural fit for deep learning, which takes in images with user-defined labels that explicitly specify the object(s) of interest.

There has been much work exploring how different deep learning architectures are used to improve image segmentation; many of these are surveyed in the review Medical Image Segmentation Using Deep Learning: A Survey. In this tutorial, we look at one narrow aspect of contemporary deep learning practice: training strategies, and in particular how to avoid overfitting. Traditionally, the best way to avoid overfitting is simply to add more training data. But due to the high cost of quality annotated data, that is in

from fastai.vision.all import *

base_path = Path('.').absolute().parent.parent
path = base_path / 'data' / 'train'
test_path = path.parent/'test'
fnames = get_image_files(path/'images')
print('example file name:', fnames[0])
print('of', len(fnames), 'training examples')
example file name: D:\Dev\hedgehog_finder\data\train\images\1\1.png
of 301 training examples

Define DataLoaders

Baseline model, no image augmentation

from hedgiefinder.dataloading import get_msk

codes = ['Background', 'Hog', 'Outline']

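def get_dls_no_aug(bs=10, size=224):
    # Inputs are images; targets are masks labelled with the class codes above.
    hogvid = DataBlock(blocks=(ImageBlock, MaskBlock(codes)),
                       get_items=get_image_files,
                       # Random split; fastai's default holds out 20% for validation.
                       splitter=RandomSplitter(),
                       get_y=get_msk,
                       # Squish to size x size instead of cropping, so the full frame is kept.
                       item_tfms=Resize(size, method='squish'))

    # No batch_tfms here: this baseline trains without any augmentation.
    return hogvid.dataloaders(path / 'images', path=path, bs=bs)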
dls = get_dls_no_aug()
dls.show_batch(max_n=6, figsize=(20,10))
Due to IPython and Windows limitation, python multiprocessing isn't available now.
So `number_workers` is changed to 0 to avoid getting stuck
no_aug_learner = unet_learner(dls, resnet34, cbs=[ShowGraphCallback(), CSVLogger('no_aug_history.csv')], metrics=[foreground_acc])
no_aug_learner.fine_tune(25)
no_aug_learner.show_results(max_n=3, figsize=(20,10))
epoch train_loss valid_loss foreground_acc time
0 0.098106 0.196701 0.000000 00:32
epoch train_loss valid_loss foreground_acc time
0 0.035166 0.056939 0.000000 00:31
1 0.029195 0.065781 0.000000 00:28
2 0.024189 0.116065 0.000000 00:26
3 0.018909 0.052810 0.025158 00:26
4 0.015214 0.045097 0.074778 00:26
5 0.012535 0.060466 0.027447 00:26
6 0.010616 0.030896 0.169795 00:26
7 0.008994 0.036668 0.196021 00:26
8 0.008030 0.051598 0.071069 00:26
9 0.007538 0.061201 0.049457 00:26
10 0.006979 0.049425 0.092389 00:26
11 0.006172 0.049407 0.120063 00:26
12 0.005567 0.034542 0.248300 00:26
13 0.005065 0.060112 0.069652 00:26
14 0.004753 0.056854 0.086587 00:26
15 0.004389 0.048036 0.172763 00:26
16 0.004111 0.051139 0.149429 00:26
17 0.003851 0.047382 0.202587 00:26
18 0.003636 0.049323 0.202977 00:26
19 0.003410 0.046481 0.249485 00:26
20 0.003211 0.053093 0.218202 00:26
21 0.003066 0.048298 0.261085 00:26
22 0.002945 0.052522 0.256358 00:26
23 0.002833 0.059560 0.198065 00:26
24 0.002758 0.049443 0.264531 00:26
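
The zeros in the foreground_acc column are easier to read with the metric's definition in mind. As far as we understand it, fastai's foreground_acc is accuracy computed only over pixels whose target label is not background. The sketch below illustrates that idea in plain Python. It is not fastai's implementation, and the 4x4 toy mask and the helper names are made up for illustration.

```python
# Hypothetical toy example: class indices follow codes = ['Background', 'Hog', 'Outline'].
BACKGROUND = 0


def pixel_accuracy(pred, target):
    """Fraction of all pixels whose predicted class matches the target."""
    pairs = [(p, t) for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    return sum(p == t for p, t in pairs) / len(pairs)


def foreground_accuracy(pred, target, bkg_idx=BACKGROUND):
    """Fraction of non-background target pixels that are predicted correctly."""
    pairs = [(p, t) for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row) if t != bkg_idx]
    if not pairs:
        return float("nan")  # no foreground pixels to score
    return sum(p == t for p, t in pairs) / len(pairs)


# A made-up 4x4 target mask: 4 foreground pixels out of 16.
target = [
    [0, 0, 0, 0],
    [0, 2, 2, 0],
    [0, 2, 1, 0],
    [0, 0, 0, 0],
]

# A prediction that labels every pixel as background.
all_background = [[BACKGROUND] * 4 for _ in range(4)]

print("pixel accuracy:     ", pixel_accuracy(all_background, target))
print("foreground accuracy:", foreground_accuracy(all_background, target))
```

On this toy mask, predicting background everywhere is right on most pixels but on none of the foreground pixels. That is one plausible reading of the early epochs above, where the loss is already small while foreground_acc stays at zero.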

Terrible! Without any augmentation the model doesn't generalize at all: after 25 epochs the training loss has fallen to 0.0028, while the validation loss sits near 0.05 and foreground accuracy is only about 0.26. We simply don't have enough training data. What are we to do? Data augmentation is a major theme of weakly-supervised learning and a major focus of fastai. fastai's goal is to democratize deep learning, and one way to do that is to lower the requirement for the massive amounts of training data that most of us don't have access to.
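
One rough way to put a number on the lack of generalization is the ratio of validation loss to training loss over time. The sketch below parses a handful of rows copied from the training table above. The CSV layout is an assumption about what CSVLogger writes to no_aug_history.csv, and loss_ratios is a hypothetical helper, not part of fastai.

```python
import csv
import io

# Rows copied from the training table above (unfrozen phase, every sixth epoch).
# The column layout is assumed, not verified against the logged file.
HISTORY_CSV = """epoch,train_loss,valid_loss,foreground_acc,time
0,0.035166,0.056939,0.000000,00:31
6,0.010616,0.030896,0.169795,00:26
12,0.005567,0.034542,0.248300,00:26
18,0.003636,0.049323,0.202977,00:26
24,0.002758,0.049443,0.264531,00:26
"""


def loss_ratios(csv_text):
    """Return (epoch, valid_loss / train_loss) for each row of a history CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(int(row["epoch"]),
             float(row["valid_loss"]) / float(row["train_loss"]))
            for row in reader]


ratios = loss_ratios(HISTORY_CSV)
for epoch, ratio in ratios:
    print(f"epoch {epoch:2d}: valid/train loss ratio = {ratio:.1f}")
```

A ratio that keeps growing while the training loss keeps falling is a common symptom of overfitting: the model is fitting the training images ever more closely without a matching gain on the held-out ones.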

def get_dls_no_norm(bs=10, size=224):
    hogvid = DataBlock(blocks=(ImageBlock, MaskBlock(codes)),
                       get_items=get_image_files,
                       splitter=RandomSplitter(),
                       get_y=get_msk,
                       # fastai's default augmentations, with flips disabled and rotation capped at 2 degrees
                       batch_tfms=[*aug_transforms(do_flip=False, max_rotate=2)],
                       item_tfms=Resize(size, method='squish'))

    return hogvid.dataloaders(path / 'images', path=path, bs=bs)
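
For segmentation, augmentation only helps if each geometric transform is applied identically to the image and its mask, otherwise the labels no longer line up with the pixels. Our understanding is that fastai handles this pairing for the transforms passed as batch_tfms. The sketch below shows the principle with a simple random horizontal shift in plain Python. It is an illustration under those assumptions, not what aug_transforms does internally, and shift_columns, augment_pair and the toy grids are hypothetical.

```python
import random


def shift_columns(grid, dx, fill=0):
    """Shift every row dx pixels to the right (left if dx < 0), padding with fill."""
    width = len(grid[0])
    if abs(dx) > width:
        raise ValueError("shift larger than image width")
    if dx >= 0:
        return [[fill] * dx + row[:width - dx] for row in grid]
    return [row[-dx:] + [fill] * (-dx) for row in grid]


def augment_pair(image, mask, max_shift, rng):
    """Draw one random shift and apply it to both the image and its mask."""
    dx = rng.randint(-max_shift, max_shift)
    return shift_columns(image, dx), shift_columns(mask, dx), dx


# Toy data: the "hog" pixels have intensity 9 and mask label 1.
IMAGE = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 9, 9, 1, 1],
    [1, 1, 9, 9, 1, 1],
    [1, 1, 1, 1, 1, 1],
]
MASK = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

rng = random.Random(0)  # seeded so the demo is repeatable
new_image, new_mask, dx = augment_pair(IMAGE, MASK, max_shift=2, rng=rng)
print("shift:", dx)
for image_row, mask_row in zip(new_image, new_mask):
    print(image_row, mask_row)
```

Because a single dx is drawn per pair, the labelled pixels in the mask still sit on top of the bright pixels in the image after the shift. Drawing separate random shifts for the image and the mask would silently corrupt the training labels.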

Baseline without normalization

dls = get_dls_no_norm()
dls.show_batch(max_n=6, unique=True, figsize=(20,10))
no_norm_learner = unet_learner(dls, resnet34, cbs=[ShowGraphCallback(), CSVLogger('no_norm_history.csv')], metrics=[foreground_acc])
no_norm_learner.fine_tune(25)
epoch train_loss valid_loss foreground_acc time
0 0.141939 0.040111 0.000000 00:26