Imagine if you could get all the tips and tricks you need to hammer a Kaggle competition. I have gone over 39 Kaggle competitions, including contests with prize pools of:
- $1,000,000
- $100,000
- $100,000
- $60,000
- $60,000
- $50,000
- $37,000
- $30,000
- $25,000
and extracted that knowledge for you. Dig in.
Contents
- External Data
- Data Exploration and Gaining Insights
- Preprocessing
- Data Augmentations
- Modeling
- Hardware Setups
- Loss Functions
- Training Tips
- Evaluation and Cross-validation
- Ensembling Methods
- Post Processing
External Data
- Use external data that contains detailed annotations from radiologists
- Use external data that includes radiologist descriptions of each tumor that was found
Data Exploration and Gaining Insights
- with the 0.5 threshold
- Identify if there is a
Preprocessing
- Perform blob detection using an implementation available in a standard image-processing package
- Use of in order to reduce the time of training
- Use for loading data instead of because it has a faster reader
- Ensure that all the images have the
- Apply contrast limited adaptive histogram equalization (CLAHE)
- Use for all general image preprocessing
- Employ automatic active learning and add manual annotations
- in order to apply the same model to scans of different thicknesses
- Convert the data into normalized 3D numpy arrays
- Apply single-image haze removal using the Dark Channel Prior
- Convert all data to
- Find duplicate images (see the hashing sketch after this list)
- Make labels more balanced by
- Apply pto test data in order
- Apply contrast limited adaptive histogram equalization (CLAHE) with kernel size 32×32 (see the sketch after this list)
- Calculate the when there are duplicate images
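Several items above mention CLAHE. Here is a minimal sketch with OpenCV, assuming a single-channel 8-bit image and using the 32×32 tile size quoted above (the file names are placeholders):

```python
import cv2

# Load a grayscale image (placeholder path).
img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)

# Contrast Limited Adaptive Histogram Equalization with a 32x32 tile grid,
# matching the kernel size mentioned in the list; clipLimit is a common default.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(32, 32))
img_eq = clahe.apply(img)

cv2.imwrite("scan_clahe.png", img_eq)
```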
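For the duplicate-image tip, the original tool is not named here; one common choice is perceptual hashing with the imagehash package, sketched below (the directory name is a placeholder):

```python
from collections import defaultdict
from pathlib import Path

from PIL import Image
import imagehash

# Group images by perceptual hash; images sharing a hash are likely duplicates.
groups = defaultdict(list)
for path in Path("train_images").glob("*.png"):  # placeholder directory
    groups[str(imagehash.phash(Image.open(path)))].append(path.name)

duplicates = {h: names for h, names in groups.items() if len(names) > 1}
print(duplicates)
```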
Data Augmentations
- Use package for augmentations
- Apply random
- Attempt geometric distortions: ElasticTransform, PerspectiveTransform, PiecewiseAffine transforms, pincushion distortion
- Use of for generalization to prevent loss of useful image information
- Do based on class
- Use for data augmentation
- Rotate by a random angle from 0 to 45 degrees
- Scale by a random factor from 0.8 to 1.2 (see the augmentation sketch after this list)
- Randomly change
- Apply augmentations
- Contrast limited adaptive histogram equalization (CLAHE)
- augmentation strategy
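The list above quotes concrete ranges (rotation up to 45 degrees, scaling between 0.8 and 1.2) and transforms such as ElasticTransform and PerspectiveTransform. A minimal sketch of such a pipeline using the albumentations library; the library choice and the probabilities are assumptions, not what any particular team used:

```python
import numpy as np
import albumentations as A

transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.Rotate(limit=45, p=0.5),              # random angle within +/-45 degrees
    A.RandomScale(scale_limit=0.2, p=0.5),  # random scale factor in [0.8, 1.2]
    A.ElasticTransform(p=0.3),
    A.Perspective(p=0.3),
])

# Dummy image and mask just to show the call signature; the mask is
# transformed together with the image so labels stay aligned.
image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
mask = np.random.randint(0, 2, (256, 256), dtype=np.uint8)

augmented = transform(image=image, mask=mask)
image_aug, mask_aug = augmented["image"], augmented["mask"]
```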
Modeling
Architectures
- Use of a based architecture. Adopted the concepts and applied them to 3D input tensors
- Employing automatic active learning and adding manual annotations
- The for training features with different receptive fields
- with adversarial training
- , , v2 x 5 with Dense (FC) layer as the final layer
- Use of a which returns a fixed-length output no matter the input size
- Replace plus sign in with concat and conv1x1
- Use Keras to train the model from scratch on 224x224x3 inputs
- Use a sliding window over the images
- An ImageNet-pre-trained network as the feature extractor
- Use in the decoder
- Implement the network with adjusted receptive fields and a 64-unit bottleneck layer at the end of the network
- Use architectures with pre-trained weights to improve convergence and performance of binary segmentation on 8-bit RGB input images (a pretrained-encoder sketch follows this list)
- since it’s fast and memory efficient
- A custom
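Many of the architecture tips amount to an encoder-decoder segmentation network with an ImageNet-pretrained encoder. A minimal sketch with the segmentation_models_pytorch package; the library, backbone, and input size are illustrative assumptions:

```python
import torch
import segmentation_models_pytorch as smp

# U-Net-style decoder on top of an ImageNet-pretrained ResNet-34 encoder.
model = smp.Unet(
    encoder_name="resnet34",
    encoder_weights="imagenet",
    in_channels=3,   # 8-bit RGB input
    classes=1,       # binary segmentation -> one logit map
)

x = torch.randn(2, 3, 224, 224)   # dummy batch
with torch.no_grad():
    logits = model(x)             # shape: (2, 1, 224, 224)
print(logits.shape)
```

Swapping the encoder_name string is usually all it takes to trade speed for accuracy.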
Hardware Setups
Loss Functions
- because it works well with imbalanced data
- whose aim is to reduce the distance between the predicted segmentation and the ground truth
- MultiLabelSoftMarginLoss, which creates a criterion that optimizes a multi-label one-versus-all loss based on max-entropy, between input and target
- Balanced cross entropy (BCE) that involves weighing the positive and negative examples by a certain coefficient
- The Lovász loss, which performs direct optimization of the mean intersection-over-union loss in neural networks based on the convex Lovász extension of submodular losses
- A loss obtained by summing the Focal and Lovász losses
- that incorporates margin in order to maximise face class separability
- The npairs loss, computed between y_true and y_pred
- A combination of functions
- A pairwise ranking loss that is smooth everywhere and thus easier to optimize
- Center loss, which simultaneously learns a center for the deep features of each class and penalizes the distances between the deep features and their corresponding class centers
- A loss that augments standard loss functions such as Softmax, training the network to embed features of the same class close together while maximizing the embedding distance between different classes
- A loss that involves subtracting the BCE and Dice losses and then adding 1
- A loss that is the binary cross-entropy minus the log of the Dice loss
- A combination of BCE, Dice and Focal losses (a BCE + Dice sketch follows this list)
- A loss that performs direct optimization of the mean intersection-over-union loss
- A Dice loss obtained by calculating the smooth Dice coefficient function
- A loss that is an improvement to the standard cross-entropy criterion
- A loss that is basically a summation of three loss functions
- A loss that incorporates area and size information and integrates that information into a dense deep learning model
- Kappa loss, a loss function for multi-class classification of ordinal data in deep learning. In this case it is summed with the focal loss
- ArcFace: Additive Angular Margin Loss for Deep Face Recognition
- Soft Dice loss, which uses predicted probabilities
- A custom loss used by the Kaggler
- SmoothL1Loss, which creates a criterion that uses a squared term if the absolute element-wise error falls below 1 and an L1 term otherwise
- Use of one loss in scenarios where it seems to work better than another
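Several of the losses above are combinations of binary cross-entropy and a soft Dice term. A minimal PyTorch sketch of one such combination; the equal weighting and the smoothing constant are assumptions:

```python
import torch
import torch.nn.functional as F

def soft_dice_loss(logits, targets, eps=1.0):
    """Soft Dice loss computed from predicted probabilities."""
    probs = torch.sigmoid(logits)
    intersection = (probs * targets).sum(dim=(1, 2, 3))
    union = probs.sum(dim=(1, 2, 3)) + targets.sum(dim=(1, 2, 3))
    dice = (2.0 * intersection + eps) / (union + eps)
    return 1.0 - dice.mean()

def bce_dice_loss(logits, targets):
    """Sum of binary cross-entropy (with logits) and soft Dice losses."""
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    return bce + soft_dice_loss(logits, targets)

# Example with dummy (N, 1, H, W) tensors.
logits = torch.randn(4, 1, 64, 64)
targets = (torch.rand(4, 1, 64, 64) > 0.5).float()
print(bce_dice_loss(logits, targets).item())
```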
Training tips
- Too much will reduce the accuracy
- Train on image on full images
- Use a Keras callback to adjust the learning rate during training
- Train, then apply soft and hard augmentation for some epochs
- last one and use 1000 images from
- Make labels more balanced
- Use dropout and augmentation while tuning the last layer
- to improve score
- Reduce the learning rate by a factor of two if validation loss does not improve for two consecutive epochs (see the callback sketch after this list)
- Repeat the of 10 batches
- Use overlapping crops so that each edge pixel is covered twice
- with low confidence score
- Train different models, then build an ensemble
- Stop is decreasing
- Gradually reduce the learning rate during training
- Train ANNs in 5 folds and 30 repeats
- Keep track of your experiments with a dedicated experiment tracker
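The halve-the-learning-rate tip above maps directly onto Keras's ReduceLROnPlateau callback; a minimal sketch (the model and data are placeholders and not shown):

```python
import tensorflow as tf

# Halve the learning rate when validation loss has not improved
# for two consecutive epochs, as described in the tip above.
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss",
    factor=0.5,
    patience=2,
    verbose=1,
)

# model.fit(x_train, y_train,
#           validation_data=(x_val, y_val),
#           epochs=50,
#           callbacks=[reduce_lr])
```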
Evaluation and cross-validation
- Split the data into folds by classes (stratified split)
- Avoid overfitting by applying dropout and augmentation while tuning the last layer
- Combination ensembles for detection
- Sklearn's cross-validation with 5 folds (see the sketch after this list)
- Adversarial validation (sketched after this list)
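A stratified 5-fold split with scikit-learn, as referenced above; X and y below are placeholders for image identifiers and their class labels:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.arange(100)                      # placeholder sample indices
y = np.random.randint(0, 2, size=100)   # placeholder class labels

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
    # Each fold preserves the class balance of y in both splits.
    print(f"fold {fold}: train={len(train_idx)}, val={len(val_idx)}")
```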
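Adversarial validation checks how distinguishable train and test data are: label training rows 0 and test rows 1, train a classifier, and look at the AUC; a value near 0.5 means the two sets look alike. A minimal sketch on pre-extracted features (the feature arrays are placeholders):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Placeholder per-image feature arrays for the train and test sets.
train_features = np.random.randn(500, 16)
test_features = np.random.randn(400, 16)

X = np.vstack([train_features, test_features])
y = np.concatenate([np.zeros(len(train_features)), np.ones(len(test_features))])

clf = RandomForestClassifier(n_estimators=200, random_state=0)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
print(f"adversarial validation AUC: {auc:.3f}")  # ~0.5 -> train and test look alike
```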
Ensembling methods
- Use simple averaging or voting for the ensemble (see the sketch after this list)
- on the z-location and the
- for models classes. This was done for raw data features only.
- Training with 7 features for the
- Use curriculum learning to speed up model training. In this technique, models are first trained on simple samples and then progressively move to harder ones.
- Ensemble with
- for object detection
- An ensemble of several architectures combined with a classification network
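The simplest ensembling mentioned above is to average (or vote over) the outputs of several models. A minimal sketch assuming each model has already produced a probability map for the same image:

```python
import numpy as np

# Stack per-model probability maps: shape (n_models, H, W).
predictions = np.stack([
    np.random.rand(256, 256),   # placeholder for model A's probabilities
    np.random.rand(256, 256),   # placeholder for model B's probabilities
    np.random.rand(256, 256),   # placeholder for model C's probabilities
])

mean_probs = predictions.mean(axis=0)        # simple averaging
mask = (mean_probs > 0.5).astype(np.uint8)   # final binary mask

# Majority-voting alternative: threshold each model first, then vote.
votes = (predictions > 0.5).sum(axis=0)
voted_mask = (votes >= 2).astype(np.uint8)
```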
Post Processing
- Apply test-time augmentation (TTA): present an image to the model several times with different random transformations and average the predictions you get (see the sketch after this list)
- Equalize test prediction probabilities instead of only using predicted classes
- Ensure every edge pixel is covered at least three times, because UNET tends to have bad predictions around edge areas
- and bounding box shrinkage
- Apply post-processing to detach touching objects in instance segmentation problems
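Test-time augmentation as described above: run the model on several transformed copies of the image, undo each transform on the prediction, and average. A minimal PyTorch sketch using flips (the model is a placeholder passed in by the caller):

```python
import torch

def predict_with_tta(model, image):
    """Average sigmoid predictions over identity, horizontal and vertical flips.

    image: (1, C, H, W) tensor; model is assumed to return a (1, 1, H, W) logit map.
    """
    model.eval()
    preds = []
    # (transform, inverse transform) pairs; flips are their own inverses.
    flips = [
        (lambda x: x, lambda x: x),
        (lambda x: torch.flip(x, dims=[-1]), lambda x: torch.flip(x, dims=[-1])),
        (lambda x: torch.flip(x, dims=[-2]), lambda x: torch.flip(x, dims=[-2])),
    ]
    with torch.no_grad():
        for forward, inverse in flips:
            logits = model(forward(image))
            preds.append(torch.sigmoid(inverse(logits)))
    return torch.stack(preds).mean(dim=0)
```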
Final Thoughts
Hopefully, this article gave you some background on image segmentation tips and tricks, along with some tools and frameworks you can use to start competing.
We’ve covered tips on:
- architectures
- training tricks
- losses
- pre-processing
- post-processing
- ensembling
- tools and frameworks
If you want to go deeper down the rabbit hole, simply follow the links and see how the best image segmentation models are built.
Happy segmenting!