Image Segmentation: Tips and Tricks from 39 Kaggle Competitions

by neptune.ai's Jakub Czakon, May 23rd, 2020

Too Long; Didn't Read

Senior data scientist building experiment tracking tools for ML projects at neptune.ai. I have gone over 39 Kaggle competitions, including Intel & MobileODT Cervical Cancer Screening, Airbus Ship Detection Challenge and Planet: Understanding the Amazon from Space, and extracted that knowledge for you. Here are the tips and tricks you need to get to the top of a competition, and to help you win something like the Data Science Bowl 2017 and its $1,000,000 prize.

Imagine if you could get all the tips and tricks you need to hammer a Kaggle competition. I have gone over 39 Kaggle competitions including

  • Data Science Bowl 2017 – $1,000,000
  • Intel & MobileODT Cervical Cancer Screening – $100,000
  • Airbus Ship Detection Challenge – $60,000
  • Planet: Understanding the Amazon from Space – $60,000

and several more with prizes ranging from $25,000 to $100,000 – and extracted that knowledge for you. Dig in.

Contents

  • External Data
  • Data Exploration and Gaining Insights
  • Preprocessing
  • Data Augmentations
  • Modeling
  • Hardware Setups
  • Loss Functions
  • Training Tips
  • Evaluation and Cross-validation
  • Ensembling Methods
  • Post Processing

External Data

  • Use of the data because it contains detailed annotations from radiologists
  • Use of the data because it had radiologist descriptions of each tumor that they found
  • Use , 
  • Use 
  • Use  dataset

Data Exploration and Gaining Insights

  •  with the 0.5 threshold
  • Identify if there is a 

Preprocessing

  • Perform blob detection using the Laplacian of Gaussian (LoG) approach, with the implementation available in the skimage package
  • Use of in order to reduce the time of training
  • Use  for loading data instead of  because it has a faster reader
  • Ensure that all the images have the
  • Apply contrast limited adaptive histogram equalization
  • Use  for all general image preprocessing
  • Employ automatic active learning and add manual annotations
  • Resample the scans in order to apply the same model to scans of different thicknesses
  • Convert the scans into normalized 3D numpy arrays
  • Apply single-image haze removal using the Dark Channel Prior
  • Convert all data to
  • Find duplicate images using 
  • Make labels more balanced by
  • Apply pto test data in order  
  • Contrast limited adaptive histogram equalization (CLAHE) with kernel size 32×32 (see the OpenCV sketch after this list)
  • Convert 
  • Calculate the  when there are duplicate images
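
As a concrete illustration of the CLAHE tips above, here is a minimal OpenCV sketch. The 32×32 tile size comes from the tip; the clip limit and the synthetic input image are assumptions to keep the snippet self-contained.

```python
import cv2
import numpy as np

def apply_clahe(gray_image, clip_limit=2.0, tile_size=(32, 32)):
    """Contrast Limited Adaptive Histogram Equalization on a single-channel uint8 image."""
    # tile_size=(32, 32) matches the kernel size mentioned above;
    # clip_limit=2.0 is an assumed default - tune it per dataset.
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_size)
    return clahe.apply(gray_image)

# Usage on a synthetic low-contrast image; in practice load a scan with
# cv2.imread(path, cv2.IMREAD_GRAYSCALE) instead.
image = (np.random.rand(256, 256) * 60 + 100).astype(np.uint8)
equalized = apply_clahe(image)
```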

Data Augmentations

  • Use the albumentations package for augmentations (one possible pipeline is sketched after this list)
  • Apply random
  • Use h
  • Attempt augmentations such as Elastic Transform, Perspective Transform, Piecewise Affine transforms and pincushion distortion
  • Apply 
  • Use of for generalization to prevent loss of useful image information
  • Apply 
  • Do  based on class 
  • Apply 
  • Use for data augmentation
  • Rotate by a random angle from 0 to 45 degrees
  • Scale by a random factor from 0.8 to 1.2
  •  changing
  • Randomly change 
  • Apply  augmentations
  • Contrast limited adaptive histogram equalization (CLAHE)
  • Use the  augmentation strategy
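
To tie several of the augmentation tips together (rotation by 0–45 degrees, scaling by 0.8–1.2, brightness/contrast changes, elastic transforms), here is a minimal sketch using the albumentations library. The per-transform probabilities are assumptions.

```python
import albumentations as A

# A sketch of an augmentation pipeline covering several of the tips above;
# the probabilities are assumptions - tune them for your dataset.
augment = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=(0, 45), p=0.5),          # random angle from 0 to 45 degrees
    A.RandomScale(scale_limit=0.2, p=0.5),   # random factor from 0.8 to 1.2
    A.RandomBrightnessContrast(p=0.3),       # random brightness/contrast changes
    A.ElasticTransform(p=0.2),               # elastic deformation
])

# Usage: albumentations applies the same spatial transform to image and mask.
# augmented = augment(image=image, mask=mask)
# image_aug, mask_aug = augmented["image"], augmented["mask"]
```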

Modeling

Architectures

  • Use of a U-Net based architecture. Adopted the concepts and applied them to 3D input tensors
  • Employing automatic active learning and adding manual annotations
  • The for training features with different receptive fields
  •  with adversarial training
  • , , v2 x 5 with Dense (FC) layer as the final layer
  • Use of a global max-pooling layer which returns a fixed-length output no matter the input size
  • Use of 
  • Replace the plus sign in skip connections with concat and conv1x1
  • Keras  to train the model from scratch using 224x224x3
  • Use of a sliding-window approach to slide over the images
  • ImageNet-pre-trained network as the feature extractor
  • Use  in the decoder
  • Implementing a network with adjusted receptive fields and a 64-unit bottleneck layer at the end of the network
  • Use of encoder–decoder type architectures with pre-trained weights to improve convergence and performance of binary segmentation on 8-bit RGB input images (see the sketch after this list)
  • LinkNet, since it’s fast and memory efficient
  •  and 
  •  from
  •  from
  •  from
  • A custom 
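
To make the "pre-trained encoder–decoder for binary segmentation" idea concrete, here is a minimal sketch using the segmentation_models_pytorch library. The U-Net decoder and the resnet34 encoder are assumptions, not the exact setups used by the competitors.

```python
import segmentation_models_pytorch as smp

# A minimal binary-segmentation model with an ImageNet-pretrained encoder;
# resnet34 is an assumed backbone - any encoder the library supports works.
model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights="imagenet",  # pre-trained weights help convergence
    in_channels=3,               # 8-bit RGB input
    classes=1,                   # single foreground class
)

# Usage: logits = model(batch)  # batch shape: (N, 3, H, W), returns (N, 1, H, W)
```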

Hardware Setups

Loss Functions

  • Dice loss, because it works well with imbalanced data
  • Weighted boundary loss, whose aim is to reduce the distance between the predicted segmentation and the ground truth
  • MultiLabelSoftMarginLoss, which creates a criterion that optimizes a multi-label one-versus-all loss based on max-entropy between input x and target y
  • Balanced cross entropy (BCE) that involves weighing the positive and negative examples by a certain coefficient
  • Lovász loss, which performs direct optimization of the mean intersection-over-union loss in neural networks based on the convex Lovász extension of submodular losses
  • FocalLoss + Lovász loss, obtained by summing the Focal and Lovász losses
  • Arc margin loss, which incorporates a margin in order to maximise face class separability
  • Npairs loss, which computes the npairs loss between y_true and y_pred
  • A combination of BCE and Dice loss functions
  • LSEP – a pairwise ranking loss that is smooth everywhere and thus easier to optimize
  • Center loss, which simultaneously learns a center for the deep features of each class and penalizes the distances between the deep features and their corresponding class centers
  • Ring loss, which augments standard loss functions such as Softmax and trains the network to embed features of the same class together while maximizing the embedding distance between different classes
  • A combined loss that involves subtracting the BCE and Dice losses and then adding 1
  • A loss defined as the binary cross-entropy minus the log of the Dice loss
  • Combinations of BCE, Dice and Focal losses
  • Lovász loss, which performs direct optimization of the mean intersection-over-union loss
  • Dice loss, obtained by calculating the smooth Dice coefficient function
  • Focal loss, which is an improvement on the standard cross-entropy criterion
  • BCE + Dice + Focal – basically a summation of the three loss functions (a PyTorch sketch follows this list)
  • Active contour loss, which incorporates area and size information and integrates it into a dense deep learning model
  • Kappa loss combined with Focal loss – Kappa is a loss function for multi-class classification of ordinal data in deep learning; in this case we sum it with the Focal loss
  • ArcFace loss — Additive Angular Margin Loss for Deep Face Recognition
  • Soft Dice loss – Soft Dice uses predicted probabilities
  • A custom loss used by one of the Kagglers
  • SmoothL1Loss, which creates a criterion that uses a squared term if the absolute element-wise error falls below 1 and an L1 term otherwise
  • Use of the  in scenarios where it seems to work better than .
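
The "summation of BCE, Dice and Focal" idea could look roughly like the following PyTorch sketch. The soft Dice formulation, the focal gamma of 2 and the equal weighting of the three terms are assumptions.

```python
import torch
import torch.nn.functional as F

def soft_dice_loss(logits, targets, eps=1e-6):
    # Soft Dice uses predicted probabilities rather than hard labels.
    probs = torch.sigmoid(logits)
    intersection = (probs * targets).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + targets.sum(dim=(1, 2, 3))
    return 1.0 - ((2.0 * intersection + eps) / (union + eps)).mean()

def focal_loss(logits, targets, gamma=2.0):
    # Focal loss down-weights easy examples relative to plain cross-entropy.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)  # probability assigned to the true class
    return ((1.0 - p_t) ** gamma * bce).mean()

def bce_dice_focal(logits, targets):
    # logits and targets: float tensors of shape (N, 1, H, W);
    # the equal weights of the three terms are an assumption.
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    return bce + soft_dice_loss(logits, targets) + focal_loss(logits, targets)
```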

Training Tips

  • Use 
  • Too much augmentation will reduce the accuracy
  • Train on image patches and predict on full images
  • Use of Keras’s ReduceLROnPlateau() to reduce the learning rate
  • Train  then apply soft and hard augmentation to some epochs
  •  last one and use 1000 images from 
  • Use dropout and augmentation while tuning the last layer
  •  to improve score
  • Use 
  • Use 
  • Reduce the learning rate by a factor of two if validation loss does not improve for two consecutive epochs (see the Keras callback sketch after this list)
  • Repeat the  of 10 batches
  • Use overlapping crops so that each edge pixel is covered twice
  •  with low confidence score
  • Train different models, then build an ensemble
  • Stop training once the validation loss is no longer decreasing
  • Gradually reduce the learning rate during training
  • Train ANNs in 5 folds with 30 repeats
  • Keep track of your experiments using Neptune.
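
The learning-rate and early-stopping tips above map directly onto Keras callbacks. A minimal sketch, assuming a compiled Keras model; the factor of two and two-epoch patience come from the tip, while the early-stopping patience of five is an assumption.

```python
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping

# Halve the learning rate when validation loss stalls for two epochs,
# and stop once it no longer improves at all.
callbacks = [
    ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2, verbose=1),
    EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
]

# Usage (assumes a compiled Keras `model` and train/validation datasets):
# model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=callbacks)
```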

Evaluation and Cross-validation

  • Split into stratified folds by classes
  • Avoid overfitting by applying dropout and augmentation while tuning the last layer
  • Combination  ensembles for detection
  • Sklearn’s 5-fold cross-validation (see the sketch after this list)
  • Adversarial validation
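
A minimal scikit-learn sketch of the cross-validation setup described above: 5 folds, stratified so that class proportions are preserved. The dummy ids and labels are placeholders.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Placeholder data: 100 image ids with binary labels.
image_ids = np.arange(100)
labels = np.random.randint(0, 2, size=100)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(image_ids, labels)):
    # Train on image_ids[train_idx] and validate on image_ids[val_idx].
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} val samples")
```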

Ensembling Methods

  • Use simple majority voting for the ensemble (an averaging sketch follows this list)
  •  on the  the z-location and the 
  •   for models  classes. This was done for raw data features only.
  •  for 
  • Training with 7 features for the 
  • Use curriculum learning to speed up model training. In this technique, models are first trained on simple samples and then progressively move to harder ones.
  • Ensemble with 
  •  for object detection
  • An ensemble of several segmentation architectures combined with a classification network
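
A minimal sketch of the simplest kind of segmentation ensemble, assuming each model outputs a per-pixel probability map of the same shape. Averaging followed by a 0.5 threshold is shown; hard majority voting over binarized masks is a close alternative.

```python
import numpy as np

def ensemble_predictions(prob_maps, threshold=0.5):
    """Average per-model probability maps, then binarize with a threshold."""
    # prob_maps: list of (H, W) arrays in [0, 1], one per model;
    # the 0.5 threshold is an assumption - tune it on validation data.
    mean_probs = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return (mean_probs > threshold).astype(np.uint8)

# Usage with three dummy model outputs of shape (4, 4):
masks = ensemble_predictions([np.random.rand(4, 4) for _ in range(3)])
```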

Post Processing

  • Apply test time augmentation (TTA) — presenting an image to the model several times with different random transformations and averaging the predictions (see the sketch after this list)
  • Equalize test prediction probabilities instead of only using predicted classes
  • Apply  to the 
  • Make sure every pixel is covered at least thrice, because U-Net tends to have bad predictions around edge areas
  • Non-maximum suppression and bounding box shrinkage
  • Apply watershed post-processing to detach objects in instance segmentation problems
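
A minimal sketch of test time augmentation with flips only, assuming `model` is any callable that returns a probability map with the same spatial layout as its input; more random transformations can be added the same way.

```python
import numpy as np

def predict_with_tta(model, image):
    """Average predictions over the original image and its vertical/horizontal flips."""
    preds = []
    for flip_axis in [None, 0, 1]:  # None = no flip, 0 = vertical, 1 = horizontal
        img = image if flip_axis is None else np.flip(image, axis=flip_axis)
        pred = model(img)
        if flip_axis is not None:
            pred = np.flip(pred, axis=flip_axis)  # undo the flip before averaging
        preds.append(pred)
    return np.mean(preds, axis=0)
```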

Final Thoughts

Hopefully, this article gave you some background on image segmentation tips and tricks, and gave you some tools and frameworks that you can use to start competing.

We’ve covered tips on:

  • architectures
  • training tricks
  • losses
  • pre-processing
  • post-processing
  • ensembling
  • tools and frameworks

If you want to go deeper down the rabbit hole, simply follow the links and see how the best image segmentation models are built.

Happy segmenting!

This article was originally posted by Jakub Czakon on the neptune.ai blog. If you liked it, you may like it there :)

You can also find me tweeting and posting about ML and Data Science stuff.
