My Fastai Course Note (2): From Model to Production

ifeelfree
Oct 29, 2020


This note is based on Fastbook.

1. What’s the deep learning project strategy?

  • Iterate from end to end: get a complete working pipeline first, then improve it
  • Complete every step in a reasonable amount of time rather than perfecting any single step

2. What are high cardinality categorical variables?

  • categorical variables == discrete values
  • high cardinality == many unique values
  • Deep learning is good at analyzing tabular data that includes natural language or high-cardinality categorical columns (columns containing a large number of discrete choices, like zip code).
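
As a toy illustration (hypothetical data, using pandas), a column’s cardinality is simply its number of unique values:

```python
import pandas as pd

# Hypothetical tabular data: 'zip_code' is categorical (discrete); in a real
# dataset it would be high-cardinality (tens of thousands of unique values).
df = pd.DataFrame({'zip_code': ['98101', '10001', '94105', '98101'],
                   'income':   [52000, 61000, 75000, 48000]})
print(df['zip_code'].nunique())   # cardinality = number of unique values -> 3
```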

3. Dataset, DataLoader, DataLoaders and DataBlocks

(1) torch.utils.data.Dataset is an abstract class representing a dataset. A custom dataset should inherit from Dataset and override the following methods:

  • __len__ so that len(dataset) returns the size of the dataset.
  • __getitem__ to support indexing, so that dataset[i] returns the i-th sample
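
A minimal sketch of such a custom Dataset (the class name and the in-memory tensors are just for illustration):

```python
import torch
from torch.utils.data import Dataset

class PairDataset(Dataset):
    """Hypothetical dataset wrapping paired input/target tensors."""
    def __init__(self, xs, ys):
        assert len(xs) == len(ys)
        self.xs, self.ys = xs, ys

    def __len__(self):
        return len(self.xs)            # len(dataset) -> size of the dataset

    def __getitem__(self, i):
        return self.xs[i], self.ys[i]  # dataset[i] -> i-th (input, target) pair

dataset = PairDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))
```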

(2) torch.utils.data.DataLoader wraps a Dataset in an iterable that provides:

  • Batching the data
  • Shuffling the data
  • Loading the data in parallel using multiprocessing workers
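
Continuing the sketch above, wrapping the Dataset in a DataLoader (batch size and worker count are illustrative):

```python
from torch.utils.data import DataLoader

loader = DataLoader(dataset,
                    batch_size=32,    # batching
                    shuffle=True,     # shuffling each epoch
                    num_workers=4)    # parallel loading with worker processes

for xb, yb in loader:                 # each iteration yields one mini-batch
    print(xb.shape, yb.shape)         # torch.Size([32, 3]) torch.Size([32])
    break
```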

(3) DataLoaders is a thin class in Fastai that just stores whatever DataLoader objects you pass to it, and makes them available as train and valid.
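
For example (a sketch that reuses the toy dataset above):

```python
from torch.utils.data import DataLoader
from fastai.data.core import DataLoaders

train_dl = DataLoader(dataset, batch_size=32, shuffle=True)
valid_dl = DataLoader(dataset, batch_size=64)   # toy: same data for validation
dls = DataLoaders(train_dl, valid_dl)
dls.train   # the first DataLoader passed in
dls.valid   # the second
```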

(4) DataBlock in Fastai

  • It acts as a factory: a template describing how to build DataLoaders
  • From a DataBlock we get DataLoaders via data_block.dataloaders(path)
  • DataBlock supports resetting parameters via its .new method
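
A sketch of building DataLoaders from a DataBlock, loosely following Fastbook’s bear-classifier example (the folder path and labels are assumptions):

```python
from fastai.vision.all import *

path = Path('bears')                     # hypothetical folder of labeled images

bears = DataBlock(
    blocks=(ImageBlock, CategoryBlock),  # input type, target type
    get_items=get_image_files,           # how to list the items
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,                  # label = name of the parent folder
    item_tfms=Resize(128))

dls = bears.dataloaders(path)            # factory: DataBlock -> DataLoaders

# .new returns a copy of the DataBlock with some parameters replaced:
bears = bears.new(item_tfms=RandomResizedCrop(128, min_scale=0.3))
dls = bears.dataloaders(path)
```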

4. Data augmentation

  • In PyTorch, confusion may arise because transforms are often used both for data preparation (resizing/cropping to the expected dimensions, normalizing values, etc.) and for data augmentation (randomizing the resizing/cropping, randomly flipping the images, etc.).
  • Fastai separates these into two parameters: item_tfms and batch_tfms. The first is applied to each item (pre-processing of the input data, such as resizing); the second is applied to whole mini-batches, typically on the GPU, and is where most data augmentation happens.
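
Continuing the DataBlock sketch above, the two parameters might be set like this (the values are illustrative):

```python
# item_tfms: applied to each item on the CPU, before batching (pre-processing).
# batch_tfms: applied to whole mini-batches, typically on the GPU (augmentation).
bears = bears.new(item_tfms=RandomResizedCrop(224, min_scale=0.5),
                  batch_tfms=aug_transforms(mult=2.0))
dls = bears.dataloaders(path)
```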

5. Training in Fastai

  • Training in Fastai has been simplified: it needs (1) the DataLoaders, (2) a neural network architecture and a loss function, and (3) error metrics
  • The Learner keeps a handle that points to its DataLoaders
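
A sketch of those three inputs in code (architecture and metric are illustrative choices):

```python
from fastai.vision.all import *

learn = cnn_learner(dls,                 # (1) the DataLoaders
                    resnet34,            # (2) architecture; a suitable loss
                                         #     function is inferred from dls
                    metrics=error_rate)  # (3) error metric
learn.fine_tune(4)
learn.dls                                # the Learner's handle to its DataLoaders
```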

6. Network interpretation

  • confusion matrix
  • plot_top_losses
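
Both are available through ClassificationInterpretation, continuing the Learner above:

```python
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()      # where the model confuses classes
interp.plot_top_losses(5, nrows=1)  # the 5 items the model got most wrong
```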

7. Data cleaning using the trained network

  • ImageClassifierCleaner
  • created from a Learner
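
A sketch following Fastbook’s cleaning workflow (path is the image folder from the DataBlock sketch above):

```python
from fastai.vision.widgets import ImageClassifierCleaner
import shutil

cleaner = ImageClassifierCleaner(learn)  # created from the Learner
cleaner                                  # interactive widget in the notebook

# After marking items in the widget:
for idx in cleaner.delete():
    cleaner.fns[idx].unlink()                     # delete mislabeled files
for idx, cat in cleaner.change():
    shutil.move(str(cleaner.fns[idx]), path/cat)  # move to the correct class
```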

8. Model saving

  • .export() saves the architecture as well as the trained parameters of the neural network. It also saves how the DataLoaders were defined.
  • .pkl format (Python pickle serialization)
  • .pth format is supported by PyTorch
  • created from a Learner
  • load_learner is used to load the saved model
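
In code (the file and image names are arbitrary):

```python
learn.export('export.pkl')   # saves architecture + trained weights + how the
                             # DataLoaders were defined, via Python pickle

learn_inf = load_learner('export.pkl')   # restore the model for inference
pred, pred_idx, probs = learn_inf.predict('bears/grizzly/0001.jpg')
```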

9. Model deployment

  • Jupyter Notebook for the UI
  • Voilà is used to turn the Jupyter Notebook into a real app (see the sketch after this list)
  • Very often the trained model does not work well in real situations because of out-of-domain data: domain shift
  • Another common problem is the presence of bias: feedback loops can make bias worse and worse
  • GPUs are best for doing identical work in parallel. If you will be analyzing single pieces of data at a time (like a single image or a single sentence), CPUs may be more cost-effective, especially since there is more market competition for CPU servers than for GPU servers. GPUs can pay off if you collect user requests into batches and run inference one batch at a time, but this may make users wait for predictions, and GPU inference brings additional complexities such as memory management and queuing of the batches.
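
A minimal notebook UI sketch along the lines of Fastbook’s Voilà example (the widget wiring follows the book’s ipywidgets 7 snippet; file names are assumptions):

```python
from fastai.vision.all import *
from IPython.display import display
import ipywidgets as widgets

learn_inf = load_learner('export.pkl')   # the model exported above

btn_upload = widgets.FileUpload()
btn_run = widgets.Button(description='Classify')
out_pl = widgets.Output()
lbl_pred = widgets.Label()

def on_click_classify(change):
    img = PILImage.create(btn_upload.data[-1])   # last uploaded file
    out_pl.clear_output()
    with out_pl: display(img.to_thumb(128, 128))
    pred, pred_idx, probs = learn_inf.predict(img)
    lbl_pred.value = f'Prediction: {pred}; Probability: {probs[pred_idx]:.04f}'

btn_run.on_click(on_click_classify)
widgets.VBox([widgets.Label('Select your image!'),
              btn_upload, btn_run, out_pl, lbl_pred])
```

Running voila on this notebook then serves just the widget outputs, hiding the code cells.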
