A data science interview consists of multiple rounds. One such round involves theoretical questions, which we covered previously in 160+ Data Science Interview Questions.
After you successfully pass it, there’s another round: a technical one. It typically involves live coding, and its purpose is to check whether a candidate can program and knows SQL. In this post, we’ll cover the questions you may receive during this technical interview round. This post is a summary of my interviewing experience, both as an interviewer and as a candidate. It includes questions I ask when interviewing candidates as well as questions I was asked when I was looking for a job.
This post covers the following topics:
1) The number of active ads.
2) All active campaigns. A campaign is active if there’s at least one active ad.
3) The number of active campaigns.
4) The number of events per each ad — broken down by event type.
5) The number of events over the last week per each active ad — broken down by event type and date (most recent first).
6) The number of events per campaign — by event type.
7) The number of events over the last week per each campaign — broken down by date (most recent first).
8) CTR (click-through rate) for each ad. CTR = number of clicks / number of impressions.
9) CVR (conversion rate) for each ad. CVR = number of installs / number of clicks.
10) CTR and CVR for each ad broken down by day and hour (most recent first).
11) CTR for each ad broken down by source and day.
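As a rough sketch of how question 8 might be approached, the snippet below runs a query through Python's built-in `sqlite3` module. The `events` table and its columns (`ad_id`, `event_type`) are hypothetical, as are the sample rows; CTR is taken here as clicks divided by impressions, the usual definition.

```python
import sqlite3

# Hypothetical schema: one row per event, tagged with the ad and the event type.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ad_id INTEGER, event_type TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "impression")] * 4 + [(1, "click")]
    + [(2, "impression")] * 2 + [(2, "click")],
)

# Conditional aggregation counts clicks and impressions per ad in one pass.
# NULLIF turns a zero impression count into NULL to avoid division by zero.
query = """
SELECT ad_id,
       1.0 * SUM(CASE WHEN event_type = 'click' THEN 1 ELSE 0 END)
           / NULLIF(SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END), 0) AS ctr
FROM events
GROUP BY ad_id
ORDER BY ad_id
"""
ctr_by_ad = conn.execute(query).fetchall()
print(ctr_by_ad)
```

The same conditional-aggregation pattern extends to CVR and to the breakdowns by day, hour, or source by adding columns to `GROUP BY`.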
1) FizzBuzz. Print numbers from 1 to 100
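A minimal sketch, assuming the classic FizzBuzz rules (print "Fizz" for multiples of 3, "Buzz" for multiples of 5, and "FizzBuzz" for multiples of both); the helper name `fizzbuzz` is only illustrative.

```python
def fizzbuzz(n: int) -> str:
    """Return the FizzBuzz word for n under the classic rules (an assumption)."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


for i in range(1, 101):
    print(fizzbuzz(i))
```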
2) Factorial. Calculate the factorial of a number.
factorial(5) = 5! = 1 * 2 * 3 * 4 * 5 = 120
factorial(10) = 10! = 1 * 2 * 3 * 4 * 5 * 6 * 7 * 8 * 9 * 10 = 3628800
3) Mean. Compute the mean of the numbers in a list.
mean([4, 36, 45, 50, 75]) = 42
mean([]) = NaN (use float('NaN'))
4) STD. Calculate the standard deviation of elements in a list.
std([1, 2, 3, 4]) = 1.29
std([1]) = NaN
std([]) = NaN
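The factorial, mean, and standard deviation questions above could be sketched as follows. The examples (`std([1, 2, 3, 4]) = 1.29`, `std([1]) = NaN`) are consistent with the sample standard deviation, which divides by n - 1, so that is what this sketch assumes.

```python
import math


def factorial(n: int) -> int:
    """Iterative factorial; factorial(0) is 1."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result


def mean(numbers):
    """Arithmetic mean, NaN for an empty list."""
    if not numbers:
        return float('NaN')
    return sum(numbers) / len(numbers)


def std(numbers):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(numbers)
    if n < 2:
        return float('NaN')  # undefined for fewer than two elements
    m = mean(numbers)
    return math.sqrt(sum((x - m) ** 2 for x in numbers) / (n - 1))
```

A candidate may reasonably ask whether the population version (dividing by n) is expected instead; the only change would be the denominator and the empty-list check.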
5) RMSE. Calculate the RMSE (root mean squared error) of a model. The function takes in two lists: one with actual values, one with predictions.
rmse([1, 2], [1, 2]) = 0
rmse([1, 2, 3], [3, 2, 1]) = 1.63
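One possible implementation, shown as a sketch: square the element-wise errors, average them, and take the square root. Returning NaN for empty input and raising on mismatched lengths are choices made here, not requirements from the question.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equally long lists."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        return float('NaN')  # assumption: undefined for empty input
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```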
6) Remove duplicates. Remove duplicates from a list. The list is not sorted, and the order of elements from the original list should be preserved.
[1, 2, 3, 1]
⇒ [1, 2, 3]
[1, 3, 2, 1, 5, 3, 5, 1, 4]
⇒ [1, 3, 2, 5, 4]
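A sketch of one common approach: keep a set of elements already seen and append each element only on its first occurrence. It assumes the elements are hashable.

```python
def remove_duplicates(items):
    """Drop repeated elements while keeping first-occurrence order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # O(1) membership test on average
            seen.add(item)
            result.append(item)
    return result
```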
7) Count. Count how many times each element in a list occurs.
[1, 3, 2, 1, 5, 3, 5, 1, 4]
⇒ {1: 3, 3: 2, 2: 1, 5: 2, 4: 1}
8) Palindrome. Is a string a palindrome? A palindrome is a word that reads the same backward as forward.
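The Count and Palindrome questions above could be sketched like this. Returning a plain dictionary from `count` is an assumption about the expected output format, and `is_palindrome` compares the string exactly, without ignoring case or whitespace.

```python
from collections import Counter


def count(items):
    """Map each element to the number of times it occurs."""
    return dict(Counter(items))


def is_palindrome(s: str) -> bool:
    """True if the string equals its own reverse (exact comparison)."""
    return s == s[::-1]
```

In an interview, it is worth asking whether `Counter` is allowed; if not, a loop with `dict.get(item, 0) + 1` does the same job.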
9) Counter. We have a list with identifiers of the form “id-SITE”. Calculate how many ids we have per site.
10) Top counter. We have a list with identifiers of the form “id-SITE”. Show the top 3 sites. You can break ties in any way you want.
11) RLE. Implement RLE (run-length encoding): encode each character by the number of times it appears consecutively.
'aaaabbbcca'
⇒ [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
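The Counter, Top counter, and RLE questions above could be sketched as follows. The sketch assumes the site is whatever follows the last hyphen in an identifier; the function names are illustrative only.

```python
from collections import Counter
from itertools import groupby


def count_per_site(ids):
    """Count identifiers per site; the site is the part after the last '-'."""
    return Counter(identifier.rsplit("-", 1)[1] for identifier in ids)


def top_sites(ids, k=3):
    """Return the k most frequent sites (ties broken arbitrarily)."""
    return [site for site, _ in count_per_site(ids).most_common(k)]


def rle(s):
    """Run-length encode a string into (character, run length) pairs."""
    # groupby yields one group per run of equal consecutive characters
    return [(char, len(list(group))) for char, group in groupby(s)]
```

Because `groupby` only merges adjacent equal characters, a character that reappears later in the string starts a new pair.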
Note that in the RLE example, 'a' appears in two separate runs, so it is encoded twice.
12) Jaccard. Calculate the Jaccard similarity between two sets: the size of the intersection divided by the size of the union.
jaccard({'a', 'b', 'c'}, {'a', 'd'}) = 1 / 4
13) IDF. Given a collection of already tokenized texts, calculate the IDF (inverse document frequency) of each token.
[['interview', 'questions'], ['interview', 'answers']]
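The Jaccard and IDF questions above could be sketched as below. The IDF variant used here, log(N / df) with the natural logarithm and no smoothing, is an assumption; other variants exist, so it is worth confirming the expected formula with the interviewer.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    if not union:
        return float('NaN')  # assumption: undefined for two empty sets
    return len(a & b) / len(union)


def idf(documents):
    """IDF per token, assuming idf(t) = log(N / df(t))."""
    n_docs = len(documents)
    doc_freq = {}
    for tokens in documents:
        for token in set(tokens):  # count each token once per document
            doc_freq[token] = doc_freq.get(token, 0) + 1
    return {token: math.log(n_docs / df) for token, df in doc_freq.items()}
```

Under this definition, a token that occurs in every document, such as 'interview' in the example, gets an IDF of 0.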
14) PMI. Given a collection of already tokenized texts, find the PMI (pointwise mutual information) of each pair of tokens. Return top 10 pairs according to PMI.
[['interview', 'questions'], ['interview', 'answers']]
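A sketch under stated assumptions: PMI(a, b) = log(P(a, b) / (P(a) · P(b))), where each probability is estimated as the fraction of documents that contain the token (or both tokens). Document-level co-occurrence is only one way to estimate these probabilities; a sliding window over tokens is another.

```python
import math
from collections import Counter
from itertools import combinations


def top_pmi_pairs(documents, k=10):
    """Return up to k (pair, PMI) tuples, highest PMI first."""
    n_docs = len(documents)
    token_counts = Counter()
    pair_counts = Counter()
    for tokens in documents:
        unique = sorted(set(tokens))  # sorted so each pair has one canonical order
        token_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    scores = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n_docs
        p_a = token_counts[a] / n_docs
        p_b = token_counts[b] / n_docs
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]
```

Pairs that never co-occur are left out here, since their PMI would be negative infinity.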
1) Two sum. Given an array and a number N, return True if there are numbers A, B in the array such that A + B = N. Otherwise, return False.
[1, 2, 3, 4], 5
⇒ True
[3, 4, 6], 6
⇒ False
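One way to solve this in a single pass is to remember the numbers seen so far and check whether the complement of the current number is among them. The second example ([3, 4, 6] with N = 6 gives False) implies that the same element cannot be used twice, which this sketch respects.

```python
def two_sum(numbers, target):
    """True if two elements at different positions add up to target."""
    seen = set()
    for x in numbers:
        if target - x in seen:  # complement appeared earlier in the array
            return True
        seen.add(x)
    return False
```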
2) Fibonacci. Return the n-th Fibonacci number, which is computed using this formula:
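A minimal iterative sketch, assuming the standard definition F(0) = 0, F(1) = 1, and F(n) = F(n - 1) + F(n - 2) for n > 1. If the interviewer indexes the sequence differently, only the starting values change.

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number with F(0) = 0 and F(1) = 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b  # shift the window one step forward
    return a
```

The loop runs in linear time and constant memory, unlike the naive recursive version, which recomputes the same values exponentially often.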
3) Most frequent outcome. We have two dice of different sizes (D1 and D2). We roll them and sum their face values. What are the most probable outcomes?
6, 6
⇒ [7]
2, 4
⇒ [3, 4, 5]
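A brute-force sketch, assuming fair dice with faces numbered from 1 to the die size: enumerate every pair of faces, count the sums, and keep the sums that reach the maximum count.

```python
from collections import Counter


def most_frequent_outcomes(d1: int, d2: int):
    """Return the most probable sums for two dice with d1 and d2 faces."""
    counts = Counter(a + b for a in range(1, d1 + 1) for b in range(1, d2 + 1))
    best = max(counts.values())
    return sorted(total for total, c in counts.items() if c == best)
```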
4) Reverse a linked list. Write a function for reversing a linked list.
Node(value, next)
a -> b -> c
⇒ c -> b -> a
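An iterative sketch that reverses the list in place by re-pointing each node's `next` to the previous node. The `Node` class follows the `Node(value, next)` shape given above; the default `next=None` is added here for convenience.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def reverse(head):
    """Reverse a singly linked list in place and return the new head."""
    previous = None
    current = head
    while current is not None:
        next_node = current.next  # remember the rest of the list
        current.next = previous   # flip the pointer
        previous = current
        current = next_node
    return previous
```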
5) Flip a binary tree. Write a function for rotating a binary tree.
Node(value, left, right)
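A recursive sketch, assuming that flipping means mirroring the tree, that is, swapping the left and right children of every node. If the interviewer means something else by "rotating", the function would need to change accordingly.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def flip(root):
    """Mirror a binary tree in place and return its root."""
    if root is None:
        return None
    # Both subtrees are flipped first, then swapped.
    root.left, root.right = flip(root.right), flip(root.left)
    return root
```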
6) Binary search. Return the index of a given number in a sorted array or -1 if it’s not there.
[1, 4, 6, 10], 4
⇒ 1
[1, 4, 6, 10], 3
⇒ -1
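A standard iterative sketch: keep a closed interval [low, high] of candidate positions and halve it on every step.

```python
def binary_search(numbers, target):
    """Return the index of target in a sorted list, or -1 if it is absent."""
    low, high = 0, len(numbers) - 1
    while low <= high:
        mid = (low + high) // 2
        if numbers[mid] == target:
            return mid
        if numbers[mid] < target:
            low = mid + 1   # target can only be in the right half
        else:
            high = mid - 1  # target can only be in the left half
    return -1
```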
7) Deduplication. Remove duplicates from a sorted array.
[1, 1, 1, 2, 3, 4, 4, 4, 5, 6, 6]
⇒ [1, 2, 3, 4, 5, 6]
8) Intersection. Return the intersection of two sorted arrays.
[1, 2, 4, 6, 10], [2, 4, 5, 7, 10]
⇒ [2, 4, 10]
9) Union. Return the union of two sorted arrays.
[1, 2, 4, 6, 10], [2, 4, 5, 7, 10]
⇒ [1, 2, 4, 5, 6, 7, 10]
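The three sorted-array questions above (deduplication, intersection, union) can all exploit the ordering to run in linear time. The sketch below uses a single pass for deduplication and the two-pointer technique for the other two; returning results without repeated values is an assumption based on the examples.

```python
def deduplicate(sorted_numbers):
    """Remove duplicates from a sorted list."""
    result = []
    for x in sorted_numbers:
        if not result or result[-1] != x:  # duplicates are always adjacent
            result.append(x)
    return result


def intersection(a, b):
    """Values present in both sorted lists, without repeats."""
    i = j = 0
    result = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            if not result or result[-1] != a[i]:
                result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


def union(a, b):
    """Values present in either sorted list, without repeats."""
    i = j = 0
    merged = []
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])  # at most one of these two tails is non-empty
    merged.extend(b[j:])
    return deduplicate(merged)
```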
10) Addition. Implement the addition algorithm from school. Suppose we represent numbers by a list of integers from 0 to 9:
12 is [1, 2]
1000 is [1, 0, 0, 0]
[1, 1] + [1]
⇒ [1, 2]
[9, 9] + [2]
⇒ [1, 0, 1]
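A sketch of the column-by-column method: walk both lists from the least significant digit, add the digits together with the carry, and reverse the collected digits at the end.

```python
def add(a, b):
    """Add two numbers given as lists of decimal digits, most significant first."""
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        digit_a = a[i] if i >= 0 else 0  # treat missing digits as zero
        digit_b = b[j] if j >= 0 else 0
        carry, digit = divmod(digit_a + digit_b + carry, 10)
        result.append(digit)
        i -= 1
        j -= 1
    return result[::-1]
```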
11) Sort by custom alphabet. You’re given a list of words and an alphabet (e.g., a permutation of the Latin alphabet). You need to use this alphabet to order the words in the list.
Example:
['home', 'oval', 'cat', 'egg', 'network', 'green']
'bcdfghijklmnpqrstvwxzaeiouy'
⇒ ['cat', 'green', 'home', 'network', 'egg', 'oval']
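One way to approach this, shown as a sketch: map each character to its position in the custom alphabet and sort the words by the resulting lists of positions. It assumes every character of every word occurs in the alphabet; otherwise the lookup raises `KeyError`.

```python
def sort_by_alphabet(words, alphabet):
    """Sort words lexicographically according to a custom alphabet."""
    rank = {char: position for position, char in enumerate(alphabet)}
    # Lists of ranks compare element by element, like strings do.
    return sorted(words, key=lambda word: [rank[char] for char in word])
```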
12) Check if a tree is a binary search tree. In BST, the element in the root is:
The definition of a tree node:
Node(value, left, right)
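A recursive sketch that passes down the range of values each subtree is allowed to contain. The ordering rule is an assumption here: every value in the left subtree is less than or equal to the node's value, and every value in the right subtree is strictly greater. A different convention for duplicates only changes the comparison operators.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_bst(node, low=float('-inf'), high=float('inf')):
    """True if every value in the subtree lies in the interval (low, high]."""
    if node is None:
        return True
    if not (low < node.value <= high):
        return False
    # Left subtree: values up to node.value; right subtree: strictly greater.
    return (is_bst(node.left, low, node.value)
            and is_bst(node.right, node.value, high))
```

Checking only a node against its direct children is a common mistake; the bounds are what catch a violation deeper in a subtree.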
For updates, follow me on Twitter and on LinkedIn.