Original and target images

Conceptually, the task seems well defined and simple, especially in comparison with, say, full recognition of a road scene for safe self-driving. Indeed, 396 teams have achieved a score above 0.99, and all further fighting will be over the 3rd and 4th decimal places of the final score. In general, the Kaggle community is extremely creative, and very non-trivial solutions are born as a result of tough competition. When it comes to semantic segmentation problems, however, non-trivial approaches are difficult to apply: ensembles of U-Net-like architectures trained at different resolutions prevail in most top-scoring solutions. In a situation where very similar approaches compete with each other, chance plays a huge role. And the following question arises:
“Is there any other way to get a competitive advantage?”
We think the answer is yes, especially if we look at the task from a different perspective: attack the data rather than the model.
In this post we will describe how we managed to generate synthetic Carvana images (plus ground truth) that are very similar to the real training data provided by the challenge organizers. What's even more important, our synthetic training set is freely available, and everyone may make use of it to obtain a higher score in the challenge.
Check out those two cars above. One of them is real, and one is synthetically generated using GTA V. Which is which?
Some intermediate magic in action

After we successfully injected our DLL into the GTA process, we programmatically place every vehicle available in GTA into the garage. Well, not quite every vehicle: after some filtering we kept only the 154 models that make sense for the Carvana challenge (an airship, for example, does not). Then we rotate each model in 10° increments under several different camera angles. Finally, we change the car color: we chose black and white.
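The capture plan above can be sketched as a plain loop. This is an illustration only: the actual GTA V API calls are omitted, and the number of camera presets (three here) is an assumption, since the post only says "several".

```javascript
// Sketch of the capture plan: for each kept vehicle model we enumerate
// every yaw angle (10° steps), every camera preset, and both paint colors.
// 154 models matches the text; 3 cameras is an assumed, illustrative value.
function buildShotList(modelCount, cameraCount) {
  const shots = [];
  for (let model = 0; model < modelCount; model++) {
    for (let yaw = 0; yaw < 360; yaw += 10) {        // 36 rotations per model
      for (let cam = 0; cam < cameraCount; cam++) {
        for (const color of ["black", "white"]) {
          shots.push({ model, yaw, cam, color });
        }
      }
    }
  }
  return shots;
}

// 154 models × 36 yaw steps × 3 cameras × 2 colors = 33264 screenshots
console.log(buildShotList(154, 3).length);  // → 33264
```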
Okay, now we can take nice screenshots like the one above, but there is no ground truth available. That's bad. Luckily, we can hook into DirectX API calls and manipulate the objects in the scene. After a few broken keyboards, we found a way to highlight the car:
As you can see, there are no windows. That's because windows are totally separate objects in GTA V, so we highlight the windows on their own as well:
Now that's something! We've got both the ground truth mask and the car image. But we also need to extract our model, place it on the Carvana scene, and make the final result as close to reality as possible. For that, we also want to extract the car's shadow from GTA:
As you can see, we’ve failed to make the floor exactly white and plain. But don’t worry: Photoshop is here to help us!
Actually, Photoshop has a lot to offer, but most people don't know that it's possible to use good ol' JavaScript to automate every action. That's what we did. We start with a screenshot from the game:
First, the easy one: we combine the car and window ground-truth masks to obtain the final mask:
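Combining the two masks is a per-pixel union. A minimal sketch, using flat grayscale buffers (255 = foreground) as a simplified stand-in for the Photoshop channel operations:

```javascript
// Combine the car-body mask and the window mask into one ground-truth mask:
// a pixel belongs to the car if it is set in either input.
// Masks are flat Uint8Array buffers of the same size, 255 = foreground.
function combineMasks(bodyMask, windowMask) {
  if (bodyMask.length !== windowMask.length) {
    throw new Error("masks must have the same dimensions");
  }
  const out = new Uint8Array(bodyMask.length);
  for (let i = 0; i < out.length; i++) {
    out[i] = bodyMask[i] || windowMask[i] ? 255 : 0;
  }
  return out;
}

const body  = Uint8Array.from([255, 255,   0, 0]);
const glass = Uint8Array.from([  0, 255, 255, 0]);
console.log(Array.from(combineMasks(body, glass)));  // → [255, 255, 255, 0]
```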
Now we can cut the car out of the screenshot and place it on the empty stage we made before:
As you can see, the car is too dark; it was shot in a darker place. Luckily, Photoshop has Auto Tone and Auto Color:
Much better! But the car is floating in the air, because there is no shadow. It is possible to generate shadows in Photoshop, but it's hard, since we would need to keep the model's rotation angle in mind. So we'll take the shadow directly from GTA: we load the screenshot with the white (kinda) floor and make a few manipulations:
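The idea behind the shadow extraction can be sketched simply: on a (nearly) white floor, the shadow is just how much darker each pixel is than pure white. The threshold value below is an illustrative assumption to drop noise from the not-quite-white floor:

```javascript
// Extract a shadow layer from a near-white floor: the darker a pixel is
// relative to pure white, the more opaque the shadow at that point.
// Grayscale input in 0..255; returns per-pixel opacity in [0, 1].
// The noise threshold (assumed value) ignores the floor's slight tint.
function extractShadow(floorPixels, noiseThreshold = 16) {
  return floorPixels.map(v => {
    const darkness = 255 - v;
    return darkness <= noiseThreshold ? 0 : darkness / 255;
  });
}

console.log(extractShadow([255, 250, 128, 0]));  // → [0, 0, ~0.498, 1]
```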
But there are still no windows! Let’s fix that by generating windows using some gradients:
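The fake glass can be as simple as a vertical gradient composited only where the window mask is set. A sketch under that assumption (the top/bottom shades are made-up defaults):

```javascript
// Fill the window area with a vertical gradient (lighter at the top,
// darker at the bottom) as a cheap stand-in for real glass.
// mask: flat Uint8Array (255 = window); width/height: image size (height > 1).
function windowGradient(mask, width, height, top = 220, bottom = 90) {
  const out = new Uint8Array(mask.length);
  for (let y = 0; y < height; y++) {
    const shade = Math.round(top + (bottom - top) * (y / (height - 1)));
    for (let x = 0; x < width; x++) {
      const i = y * width + x;
      if (mask[i]) out[i] = shade;      // paint only inside the window mask
    }
  }
  return out;
}
```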
And finally, we enlarge the car to fit the scene:
All these manipulations are done programmatically using Photoshop JS scripting and pre-recorded actions. If you think this would be an interesting topic for a tutorial, please let us know in the comments.
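To give a flavor of what such a script looks like: Photoshop is driven via ExtendScript (ES3-era JavaScript, hence `var`), where `app.open`, `app.doAction`, and `Document.saveAs` are the real DOM entry points. The sketch below is not the authors' actual script; the file path and action/set names are placeholders, and since `app` only exists inside Photoshop, the function guards on it.

```javascript
// Sketch of driving Photoshop from JavaScript (ExtendScript).
// "app" exists only when the script runs inside Photoshop, so we guard on it;
// the file path and the action/set names below are placeholders.
function processScreenshot(path) {
  if (typeof app === "undefined") {
    // Not running inside Photoshop (e.g. plain Node.js): do nothing.
    return false;
  }
  var doc = app.open(new File(path));           // open the GTA screenshot
  app.doAction("combine-masks", "CarvanaGTA");  // play a pre-recorded action
  doc.saveAs(new File(path.replace(".png", "_out.png")));
  doc.close();
  return true;
}

processScreenshot("screenshots/car_0001.png"); // returns false outside Photoshop
```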
Open Import → Datasets library and click on the “CarvanaGTA5” dataset. Enter a project name (for example, “Carvana”), then click Next and Upload. After the import task completes, you will see your new dataset on the Projects page.
Datasets library
You can check out the images in the Annotation tool by clicking on the dataset, or look at the statistics.
Annotation tool
Now you can download the dataset to your computer using the Export tool. Export is a powerful feature of Supervise.ly that uses JSON configurations to perform filtering, resizing, augmentation, and train/validation splitting, to combine multiple datasets into one, and then to save the results in the formats of popular, ready-to-train frameworks.
Go to the Export page and paste the following config into the editor:
Here we define an array of sequential transformations of the data: we tag every image as “train”, pass it to the background layer to generate the bg class, and finally use the segmentation layer to produce the ground truth images. You can read more about Export in the documentation.
Now click the Start Exporting button and enter a name (optional). Supervise.ly will prepare your archive, and after some time a Download button will appear in Tasks:
Done! If you have some time, check out our other tutorials on Supervise.ly: it has a lot to offer.
Or, in words: the quality of intellectual products based on deep learning is determined by the amount of available training data. Increasing the availability of training data is the main priority of our company. We approach the problem from two sides: