AI organizations divide their work into data engineering, modeling, deployment, business analysis, and AI infrastructure. Each task requires specific skills and can be the focus of multiple roles. If you apply to a role that carries out the modeling task, such as Machine Learning Engineer (MLE), Data Scientist (DS), Machine Learning Researcher (MLR), or Software Engineer-Machine Learning (SE-ML), you’ll often encounter the machine learning algorithms interview during the onsite round. You can learn more about these roles in our AI Career Pathways report and about other types of interviews in The Skills Boost.
I What to expect in the machine learning algorithms interview
The interviewer will try to uncover how deeply you understand (usually classic) machine learning algorithms. Here’s a list of interview questions Workera candidates have been asked onsite:
- Derive the binary cross-entropy loss function.
- How does Logistic Regression differ from Linear Regression?
- What is the difference between Batch Gradient Descent and Stochastic Gradient Descent?
- Explain a classic machine learning algorithm or technique from the following list: Linear Regression, Logistic Regression, Decision Trees, Random Forest, XGBoost, Support Vector Machines, K-means, K-Nearest Neighbors, Neural Networks, Principal Component Analysis, Naive Bayes Classifier, L1/L2 regularization, etc.
- Why is the EM algorithm useful?
- Why is the Naive Bayes classifier called Naive?
- How does a discriminative model differ from a generative model?
- In K-Nearest Neighbors, how does the value of K impact bias and variance?
- In Support Vector Machines, what is the kernel trick?
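As an illustration of the kind of answer the last question invites: the kernel trick lets an algorithm work with inner products in a higher-dimensional feature space without ever constructing the feature vectors. The sketch below is only one possible demonstration, assuming 2-D inputs and a degree-2 polynomial kernel; the function names are hypothetical and chosen for this example. It checks numerically that the kernel value equals the inner product of the explicit feature maps.

```python
import math


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def poly_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel: k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


def explicit_feature_map(x):
    """Feature map phi for 2-D inputs such that phi(x) . phi(z) = (x . z)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


# Two hypothetical 2-D points.
x, z = (1.0, 2.0), (3.0, -1.0)

# Kernel route: one inner product in the original 2-D space, then a square.
kernel_value = poly_kernel(x, z)

# Explicit route: map both points to 3-D, then take the inner product there.
explicit_value = dot(explicit_feature_map(x), explicit_feature_map(z))

print(kernel_value, explicit_value)
```

Both routes give the same number up to floating-point rounding, but the kernel route never builds the 3-D vectors. For kernels such as the Gaussian (RBF) kernel, the corresponding feature space is infinite-dimensional, so the kernel route is the only practical one.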
II Interview tips
Every interview is an opportunity to show your skills and motivation for the role. Thus, it is important to prepare in advance. Here are useful rules of thumb to follow:
Listen to the hints given by your interviewer.
Example: You’re explaining PCA and state that “we should find the eigenvalues and eigenvectors of the data matrix X”. If your interviewer questions you with “are you sure?” or “can you interpret the eigenvalues of X?”, there is a high chance your answer is imprecise or wrong. You should react by reconsidering and talking through your answer. In this case, the interviewer expects you to center the data, introduce the covariance matrix of X, and find its eigenvalues and eigenvectors. Each eigenvalue is then the variance of the data along the corresponding eigenvector.
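The corrected answer can be sketched in a few lines. This is a minimal illustration, assuming NumPy is available and using randomly generated toy data; the variable names and the data are hypothetical, not part of any particular interview question.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: 200 samples (rows) with 2 correlated features (columns).
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])

# Center each feature, then form the sample covariance matrix (features x features).
X_centered = X - X.mean(axis=0)
cov = X_centered.T @ X_centered / (X.shape[0] - 1)

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order,
# so reorder to put the direction of largest variance first.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Project the centered data onto the principal directions (columns of eigenvectors).
projected = X_centered @ eigenvectors
explained_variance_ratio = eigenvalues / eigenvalues.sum()

print(eigenvalues, explained_variance_ratio)
```

The variance of each projected column matches the corresponding eigenvalue, which is the interpretation the interviewer’s hint is pointing at.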
Don’t mention methods you’re not able to explain.
Example: You’re explaining logistic regression and state that “we’re using logistic regression for binary classification problems. For a multi-class problem we would use a softmax regression.” In this scenario, you can expect the interviewer to ask: “could you explain softmax regression?”
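If you do bring up softmax regression, be ready to write the softmax function and relate it to the sigmoid. The sketch below is one way to do that in plain Python; the example scores are made up, and the max-subtraction step is a common numerical-stability choice rather than part of the definition.

```python
import math


def softmax(scores):
    """Map real-valued class scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function used by binary logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


# Three hypothetical class scores (logits) for a single example.
probs = softmax([2.0, 1.0, 0.1])

# With two classes and scores (z, 0), softmax reduces to the sigmoid of z.
two_class = softmax([1.5, 0.0])

print(probs, two_class[0], sigmoid(1.5))
```

The two-class case shows why logistic regression can be seen as a special case of softmax regression.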
Write clearly, draw charts, and introduce a notation if necessary.
The interviewer will judge your scientific rigor.
Example: You’re asked to write the binary cross-entropy cost function. Instead of writing $\mathcal{J}= -[y\log\hat{y}+(1-y)\log(1-\hat{y})]$, write $\mathcal{J}(\hat{y}, y) = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(\hat{y}^{(i)}, y^{(i)}) = - \frac{1}{m} \sum_{i=1}^m [y^{(i)}\log\hat{y}^{(i)}+(1-y^{(i)})\log(1-\hat{y}^{(i)})]$. In this fashion, you’ll display your meticulous understanding of cost functions, their arguments, and how they differ from loss functions.
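The same distinction between the per-example loss $\mathcal{L}$ and the cost $\mathcal{J}$ averaged over $m$ examples can be made explicit in code. This is a minimal sketch in plain Python; the function names, the toy predictions, and the clipping constant `eps` are assumptions added for illustration.

```python
import math


def binary_cross_entropy_loss(y_hat, y, eps=1e-12):
    """Loss L for one example. eps is an assumed clipping constant that avoids log(0)."""
    y_hat = min(max(y_hat, eps), 1.0 - eps)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1.0 - y_hat))


def binary_cross_entropy_cost(y_hats, ys):
    """Cost J: the average of the per-example losses over the m examples."""
    m = len(ys)
    return sum(binary_cross_entropy_loss(p, t) for p, t in zip(y_hats, ys)) / m


# Hypothetical predicted probabilities and ground-truth labels for m = 3 examples.
predictions = [0.9, 0.2, 0.6]
labels = [1, 0, 1]

cost = binary_cross_entropy_cost(predictions, labels)
print(cost)
```

Keeping the two functions separate mirrors the notation: the loss takes one prediction and one label, while the cost takes the whole set.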
Interviewers will often ask you questions about methods they use at work.
Before going onsite, read online about the product the company is building and try to infer the methods they might be using.
Example: If you’re interviewing with a fraud detection team, you might want to learn about methods for dealing with imbalanced datasets, and about evaluation metrics such as precision, recall, and F1 score, before going onsite.
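To see why these metrics matter on imbalanced data, it helps to compute them by hand once. The sketch below uses made-up labels (2 fraud cases among 10 transactions) and hypothetical helper names; it is an illustration, not a description of any company’s pipeline.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical imbalanced labels: 1 = fraud, 0 = legitimate.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_model = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]  # catches one fraud, one false alarm
y_always_negative = [0] * 10              # never flags anything

print(accuracy(y_true, y_model), precision_recall_f1(y_true, y_model))
print(accuracy(y_true, y_always_negative), precision_recall_f1(y_true, y_always_negative))
```

Both predictors reach the same accuracy on this toy data, yet the one that never flags fraud has zero recall, which is the usual argument for looking beyond accuracy on imbalanced problems.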
When you are not sure of your answer, be honest and say so.
Interviewers value honesty and penalize bluffing far more than lack of knowledge.
Example: Assume the interviewer asks you about Bayes error. You remember multiple concepts named after the statistician Thomas Bayes, including Bayes’ theorem and the Naive Bayes classifier, but not the Bayes error. Rather than answering vaguely, you could say “I’m familiar with Bayes’ theorem and the Naive Bayes classifier, but I don’t think I’ve been exposed to the Bayes error. Could you expand?” This allows the interviewer to add context on the Bayes error and help you answer without prior knowledge of the subject, or to move to the next question.
When out of ideas or stuck, think out loud rather than staying silent.
Talking through your thought process will help the interviewer correct you and point you in the right direction.
III Resources
The machine learning section of the Workera test is a great way to prepare for this interview. It’ll provide you with a personalized study plan which includes a list of your strengths and weaknesses, along with curated training material to prepare for interviews or transition in your career. Additionally, here’s a list of useful resources to prepare for the machine learning algorithms interview.
- Stanford’s CS229 lecture notes:
  - Linear Regression and Logistic Regression
  - Generative Learning Algorithms, Naive Bayes
  - Support Vector Machines
  - Neural Networks and Deep Learning
  - Bias/variance tradeoff and Error analysis
  - Regularization and Model Selection
  - Unsupervised Learning, K-means clustering
  - Mixtures of Gaussians and the EM algorithm
  - Principal components analysis
  - Independent components analysis
- Machine Learning on Coursera