But recommending the right content to users, and getting them to actually view it, is one of the biggest factors affecting a significant portion of a company’s profit. The industry has therefore developed many methods to improve recommendation quality, and along the way it has drawn many academic researchers into addressing the more scientific questions of the field. As a striking example,
As we move toward a generation in which videos communicate better than plain text or images, it is worth looking into how YouTube, Google’s video giant subsidiary, manages to deliver effective content recommendation to its users. YouTube started as a video-sharing platform, but it has become a search engine of its own kind, specialized for music and videos. Although exact numbers are not revealed, it is speculated that .
And because the company derives most of its profits from ad placement, the recommendation algorithm lies at the heart of its business. Unfortunately, how YouTube’s recommendation algorithm actually works is shrouded in mystery, but from the paper they published, we can at least get a peek at what they tried. This work showed that deep learning can be used effectively to build a large-scale recommendation system. In this post, we will go over some results and implications of the paper, and end with a brief overview of how such a system can enhance the user’s experience at Stan World.
YouTube’s Deep Recommendation Algorithm
The authors introduce a recommendation system architecture described in the figure below. It narrows down the videos through two deep learning models: (1) a candidate generation model and (2) a ranking model (both depicted as funnel-shaped blocks in the diagram). These models aim to solve the following classification problem:
The problem is posed as extreme multi-class classification: there are millions of classes (videos), and the job is to model the probability of the user watching each video given the input features. The models are trained in a supervised manner, meaning they are trained by optimizing a loss function on a fixed set of training data. In this case, the training data are the videos and the user/contextual data.
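For reference, a problem of this kind is usually written as a softmax over all videos. The notation here is our own assumption for illustration: u is a vector summarizing the user and context (U, C), v_j is a learned vector for video j, V is the full set of videos, and w_t is the video watched at time t.

```latex
\[
P(w_t = i \mid U, C) \;=\; \frac{e^{v_i^{\top} u}}{\sum_{j \in V} e^{v_j^{\top} u}}
\]
```

With millions of videos in the denominator, evaluating this sum exactly for every request is expensive, which is why the serving side leans on cheaper approximations.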
Candidate Generation Model
The candidate generation model, shown in the figure below, takes in several kinds of user and contextual data as training data. With these data, the model is trained to predict the class probabilities. At test time, computing the class probabilities in real time is too expensive, so the algorithm instead selects the top N videos with an approximate nearest-neighbor search that simply looks for the videos lying closest to the generated user vector.
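To make the retrieval step concrete, here is a minimal sketch of picking the top N videos for a user vector. It is not YouTube’s implementation: it assumes "closest" means the highest dot product, it uses an exact brute-force scan where a real system would use an approximate index, and the video names and 3-dimensional vectors are made up for illustration.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_n_candidates(
    user_vector: Sequence[float],
    video_vectors: Dict[str, Sequence[float]],
    n: int,
) -> List[Tuple[str, float]]:
    """Return the n videos whose vectors score highest against the user.

    This scans every video exactly; an approximate nearest-neighbor index
    would replace this loop once the catalogue reaches millions of videos.
    """
    scored = ((vid, dot(user_vector, vec)) for vid, vec in video_vectors.items())
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


# Hypothetical embeddings, invented for this example.
video_vectors = {
    "cooking_101": [0.9, 0.1, 0.0],
    "guitar_lesson": [0.1, 0.8, 0.2],
    "pasta_recipe": [0.8, 0.2, 0.1],
    "chess_opening": [0.0, 0.1, 0.9],
}
user_vector = [1.0, 0.2, 0.0]  # a user who mostly watches cooking videos

candidates = top_n_candidates(user_vector, video_vectors, n=2)
```

Here the two cooking videos come out on top, and only those candidates would be handed to the next stage.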
Ranking Model
The ranking model is similar to the candidate generation model, but it can access more information about the user and the videos because the candidate generation model has already narrowed the search space (from millions of videos to a few hundred). The model structure is depicted in the figure below.
With this data, the model is trained to predict the expected watch time of each video, using weighted logistic regression at the output of the final layer. At test time, the videos with the highest expected watch time are the ones ultimately suggested to the user.
Their experiment section, although we will not go into too much detail here, verifies the following:
However, the take-away from this work is that the authors empirically verified that such extra-large-scale models also work as expected for a recommendation task. It would have been more interesting had they reported how effective their recommendations were compared with baseline methods, e.g. naive collaborative filtering with matrix factorization. Still, they did show that adding more layers and making the models more complex with richer features does help performance, given enough computation power and lengthy engineering work.
With a rich amount of data from the users, including their preferences, travel/search/purchase history, and gaze movements, we hope to provide users with a tailored set of advertisements, content, and entertainment.
A high-quality recommendation is crucial to maintaining user loyalty (avg. user-retention rate + avg. user-engagement rate) in the World, and this leads to more interesting content and fun for the users, a higher chance of being discovered for Virtual Resort owners, and more revenue for the platform.
To conclude, there are three points that we should bear in mind when building our own recommendation system at Stan World.
Deep learning can be effectively applied for recommendation systems.
This is good news. We can exploit the representational power of deep learning to develop a recommendation system. Recommendation is all about understanding the user’s preferences and suggesting whatever is most similar to them.
Training and deploying such a system online carries a high computational cost, and reasonable performance is only possible with an ample amount of training data.
This is bad news, especially in the early stage of development. The lack of data is commonly referred to as the cold start problem: the system cannot predict anything useful because too little data exists at the initial stage of deployment.
1) Unlike YouTube, a platform that users join to discover who or what they like, Stan World is a platform that users join to visit resorts launched by who or what they already like; they already have at least one specific destination in mind. A resort visit therefore already provides the first layer of preference data.
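As a rough illustration of how a single resort visit could seed recommendations before any other data exists, the sketch below ranks content by tag overlap with the visited resort. Everything in it is hypothetical: the tag vocabulary, the resort and content names, and the choice of Jaccard similarity are assumptions made for the example, not a description of an existing Stan World system.

```python
from typing import Dict, List, Set

# Hypothetical catalogue; these names and tags are invented for illustration.
resort_tags: Dict[str, Set[str]] = {
    "resort_a": {"k-pop", "dance", "fashion"},
}
content_tags: Dict[str, Set[str]] = {
    "dance_tutorial": {"dance", "k-pop"},
    "cooking_show": {"food", "travel"},
    "runway_clip": {"fashion"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def seed_recommendations(visited_resort: str, k: int) -> List[str]:
    """Rank content by tag overlap with the one resort the new user visited."""
    profile = resort_tags[visited_resort]
    ranked = sorted(
        content_tags,
        key=lambda item: jaccard(profile, content_tags[item]),
        reverse=True,
    )
    return ranked[:k]


seeds = seed_recommendations("resort_a", k=2)
```

A rule this simple needs no training data at all, so it can serve users from day one and be replaced by a learned model once enough interactions have been logged.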
The model requires a lot of trial-and-error work to balance performance, complexity, and user-friendly deployment schemes.
Industry-scale models tend to require a disproportionately large management effort, and because they are inherently complex, they are usually very difficult to debug or control. This is something we must be really careful about when deploying the algorithm online, as we do not want to startle the users. Preventing such cases will require thorough testing, both offline and online.
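On the offline side, one common check is to hide part of each user’s history and measure how much of it the recommender recovers. The sketch below computes recall@k for one user; the metric choice and the item IDs are assumptions for illustration, not a prescribed test plan.

```python
from typing import List, Set


def recall_at_k(recommended: List[str], held_out: Set[str], k: int) -> float:
    """Fraction of the held-out items that appear in the top-k recommendations."""
    if not held_out:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in held_out)
    return hits / len(held_out)


# Hypothetical ranked output of a recommender and the items the user
# actually interacted with but that were hidden from the model.
recommended = ["v1", "v2", "v3", "v4", "v5"]
held_out = {"v2", "v5", "v9"}

score = recall_at_k(recommended, held_out, k=3)
```

Only one of the three hidden items ("v2") appears in the top three, so the recall here is one third. Averaging this over many users gives a number that can be tracked between model versions before anything is exposed to live traffic.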
As our recommendation system becomes smarter, user satisfaction will increase thanks to more relevant content and advertisements. Stan World’s strong growth strategy with the stars and brands must be backed by a satisfying user experience for the loyal fanbase, which is the core of our tech infrastructure, while minimizing manual work to increase efficiency, accuracy, and revenue.
In the next post, I will discuss in more depth the solutions and implementations mentioned above: how to attract more active users, build the required database, train the model, and iterate on these steps to ultimately improve the model based on users’ feedback.
(Disclaimer: The Author is an Engineer at )