Author: Taylor Warfield, Former Bain Manager and Interviewer

Data science case interviews test whether you can apply your technical analytical skills to solve real business problems. This guide covers the different types of cases you'll face, proven strategies for solving them, common frameworks to use, and examples with full solutions.
But first, a quick heads up:
Learning case interviews on your own can take months.
If you’re looking for a step-by-step shortcut to learn case interviews quickly, enroll in my case interview course and save yourself 100+ hours. 82% of my students land consulting offers (8x the industry average).
A data science case interview is a problem-solving exercise where you work through a business scenario using data and analytics. Unlike coding interviews that test your ability to write correct syntax, case interviews test how you think.
The interviewer presents you with a problem that a real data scientist might face: perhaps a key metric dropped unexpectedly, a new feature needs its impact measured, or a prediction model needs to be designed.
Your job is to structure the problem, identify what data you need, propose an analytical approach, and deliver an actionable recommendation. You do all this while thinking out loud so the interviewer can follow your reasoning.
Case interviews appear at multiple stages of the data scientist hiring process.
During the phone screen, you might get a 10- to 15-minute case mixed with technical questions. These are usually simpler and test your basic product sense and analytical thinking.
During the onsite, you'll face longer cases lasting 30 to 45 minutes. These go deeper and require more detailed analysis. Some companies dedicate entire interview rounds to case interviews.
You'll encounter three main formats.
Interviewers evaluate you on several dimensions: how you structure the problem, the depth of your analysis, your product sense, and how clearly you communicate your reasoning.
Companies use case interviews because they reveal things that other interview questions miss.
Your resume shows what you've done. Coding tests show you can write working code. But neither shows how you approach new problems you've never seen before.
Case interviews simulate the actual job. Data scientists spend most of their time scoping problems, choosing metrics, and communicating with stakeholders. Technical execution is important but it's only part of the role.
Case interviews also test for product sense. The best data scientists understand how products work and what drives user behavior. They don't wait for someone to hand them a well-defined problem. They identify the right problems to solve.
Finally, case interviews reveal how you handle ambiguity. Real business problems are messy. The data is incomplete. The objectives are unclear. Stakeholders disagree on priorities. Case interviews show whether you can navigate this complexity.
Different companies emphasize different types of cases. Knowing what to expect helps you prepare more effectively.
Product and metrics cases are the most common type, especially at tech companies like Meta, Google, Airbnb, and DoorDash.
Metric definition cases ask you to identify the right metrics for measuring a product's success.
The key is understanding what the company actually cares about. Start with the primary success metric, then identify supporting metrics that provide additional context and guardrail metrics that ensure you're not causing unintended harm.
Root cause analysis cases present a scenario where a metric changed unexpectedly.
For example: Daily active users dropped 10% last week. Average session duration increased but total sessions decreased. Why?
These cases test your ability to systematically investigate a problem. You need to decompose the metric, segment the data to see where the change is concentrated, and generate hypotheses about internal and external causes.
Feature impact cases ask how you would measure whether a proposed change is working.
These often involve designing experiments and thinking through potential pitfalls like selection bias or network effects.
Business and strategy cases feel more like traditional consulting interviews and are common at consulting firms and companies with strong analytics cultures.
Market sizing cases ask you to estimate the size of an opportunity.
The goal isn't to get the exact right number. It's to show you can build a logical model and make reasonable assumptions.
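To make that structure concrete, here is a minimal sketch of how you might lay out a sizing model in code. Every number is a made-up assumption for a hypothetical food delivery market, used only to show the logic.

```python
# Illustrative market sizing: monthly food delivery orders in a mid-sized metro.
# Every number below is an assumption for practice, not real data.
population = 2_000_000          # assume a metro area of ~2M people
pct_who_order_delivery = 0.30   # assume 30% of people use food delivery
orders_per_user_per_month = 4   # assume 4 orders per user per month
avg_order_value = 25            # assume $25 per order

monthly_orders = population * pct_who_order_delivery * orders_per_user_per_month
monthly_gross_value = monthly_orders * avg_order_value

print(f"Estimated monthly orders: {monthly_orders:,.0f}")
print(f"Estimated monthly gross order value: ${monthly_gross_value:,.0f}")
# The point is the structure (population -> users -> orders -> dollars),
# not the exact numbers; sensitivity-check whichever assumption matters most.
```

In an interview you would do this arithmetic out loud, but the chain of assumptions is the same.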
Growth and optimization cases ask how you would improve a business metric.
These test your ability to generate hypotheses, prioritize opportunities, and think about tradeoffs.
Investment and prioritization cases present competing options and ask you to recommend one.
You need to identify the relevant decision criteria, estimate the potential impact of each option, and make a clear recommendation with supporting rationale.
Machine learning cases are more technical and are common at companies with ML-heavy products or in specialized ML roles.
Model selection cases present a prediction problem and ask how you'd approach it.
You need to understand the tradeoffs between different algorithms and explain why your choice fits the specific problem.
Feature engineering cases focus on the inputs to a model.
These test your creativity and your understanding of what makes predictive features.
Data quality cases present realistic messiness, such as missing values, duplicated records, or labels you can't fully trust.
These show whether you understand the practical challenges of building ML systems in the real world.
A/B testing and experimentation cases are ubiquitous at tech companies where experimentation is central to product development.
Experiment design cases ask how you would set up a test.
You need to understand concepts like statistical power, sample size, and experiment duration.
Results analysis cases present experiment results and ask you to interpret them.
For example: The treatment group shows higher engagement but lower revenue. What do you recommend?
These test your ability to make decisions under uncertainty and navigate tradeoffs between competing metrics.
Edge case scenarios explore situations where standard A/B testing breaks down.
These show whether you understand the assumptions underlying common methods and what to do when those assumptions are violated.
Having a toolkit of frameworks helps you structure your thinking quickly. Here are the most useful ones for data science case interviews.
AARRR is a funnel framework that tracks the user journey through five stages. It's essential for product and growth cases.
When asked about product metrics, walk through each stage of the funnel and identify what metrics matter most at each step.
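As a concrete illustration, here is a rough sketch of how the funnel might map to metrics for a hypothetical subscription app. The specific metric choices are examples, not a prescribed standard.

```python
# AARRR funnel applied to a hypothetical subscription app.
# The metrics listed for each stage are illustrative choices.
aarrr_metrics = {
    "Acquisition": ["signups per week", "cost per acquired user"],
    "Activation":  ["% of signups who finish onboarding", "time to first key action"],
    "Retention":   ["day-7 / day-30 retention", "weekly active users"],
    "Referral":    ["invites sent per user", "invite conversion rate"],
    "Revenue":     ["free-to-paid conversion rate", "average revenue per user"],
}

for stage, metrics in aarrr_metrics.items():
    print(f"{stage}: {', '.join(metrics)}")
```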
The Success/Guardrail/Health metric framework helps you define a complete set of metrics for any product decision.
When proposing metrics for any case, always include all three types. This shows you think about second-order effects.
Segmentation analysis helps you investigate metric changes by breaking down data into meaningful groups.
When a metric changes, systematically check if the change is consistent across segments or concentrated in specific groups. This helps isolate the root cause.
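Here is a minimal pandas sketch of that kind of segment comparison. The dataset, column names, and numbers are made up purely to show the pattern.

```python
import pandas as pd

# Hypothetical aggregated data; column names are assumptions for illustration.
df = pd.DataFrame({
    "period":   ["before"] * 4 + ["after"] * 4,
    "platform": ["iOS", "iOS", "Android", "Android"] * 2,
    "region":   ["NA", "APAC"] * 4,
    "shares":          [100, 90, 110, 95, 98, 88, 70, 60],
    "stories_created": [1000, 950, 1100, 980, 990, 940, 1080, 970],
})

# Compare the metric segment by segment to see where the change is concentrated.
seg = df.groupby(["period", "platform"])[["shares", "stories_created"]].sum()
seg["share_rate"] = seg["shares"] / seg["stories_created"]
print(seg["share_rate"].unstack("period"))
# Repeat the same cut by region, app version, or user tenure to triangulate.
```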
The internal-versus-external factors framework helps categorize potential causes when investigating problems.
Always check both categories when investigating a metric change. Internal factors are usually easier to verify using your release calendar and internal documentation.
Originally from product management, CIRCLES works well for product improvement and feature design cases.
Having a repeatable approach helps you stay organized under pressure. Here's a five-step method that works for most data science cases.
Before you do anything else, make sure you understand the problem. Ask questions to fill in the gaps.
Repeat the problem back in your own words. This confirms you understood correctly and gives the interviewer a chance to correct any misunderstandings.
Identify what success looks like.
Don't skip this step. Candidates who dive straight into analysis often solve the wrong problem.
Break the problem into manageable pieces.
Start by outlining your approach at a high level.
Organize your approach into a clear framework with buckets and sub-questions. This could be a list of hypotheses to test, a funnel to analyze, or a set of factors to consider. The specific structure depends on the problem.
Share your structure with the interviewer before proceeding. This gives them a chance to redirect you if needed and shows that you think before you act.
Work through your framework systematically.
For each component, explain your reasoning as you go.
Do calculations when required, but don't get lost in arithmetic. Round numbers aggressively and focus on whether your answer makes directional sense.
Connect each piece back to the bigger picture. How does this analysis help answer the original question?
Synthesize your findings into a clear recommendation.
State your answer directly. Don't bury the lead or hedge excessively. Interviewers want to see you take a stance.
Support your recommendation with the key reasons. Two or three strong reasons are better than a laundry list.
Acknowledge limitations and uncertainties. What would you want to investigate further? What could change your recommendation?
Be ready to pivot when the interviewer pushes back or introduces new information.
Follow-up questions are part of the test. The interviewer wants to see how you think on your feet and whether you can incorporate new constraints.
If your approach isn't working, don't be afraid to step back and try something different. Flexibility shows maturity.
Stay engaged and collaborative. The best case interviews feel like a conversation between colleagues working on a problem together.
Let's walk through three data science case interview examples to see how they can be solved.
Case Prompt: You're a data scientist at Instagram. The product team notices that the share rate for Stories has dropped 12% month over month. How would you investigate this?
Step 1: Clarify
Before investigating, I'd want to understand the context better.
I'd also confirm the metric definition. Share rate presumably means the percentage of Stories that get shared, but I want to make sure we're measuring this consistently over time.
Let's assume the interviewer confirms that share rate equals shares divided by Stories created, and that the drop appears broad-based in initial reports.
Step 2: Structure
I'll organize my investigation into three main buckets, each with specific sub-questions to answer.
Metric Decomposition
Segmentation Analysis
Potential Causes
Step 3: Analyze
Let me work through each bucket.
For metric decomposition, let's say the data shows that Stories created is stable, but total shares dropped. This tells us the problem is in the sharing behavior, not content creation.
Looking deeper at the share funnel, imagine we find that share attempts are flat but share completions dropped significantly. Users are trying to share but something is preventing them from finishing.
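A quick back-of-envelope version of that decomposition, with made-up numbers, might look like this:

```python
# Hypothetical numbers to show the decomposition; not real Instagram data.
# share_rate = shares / stories_created, and shares = share_attempts * completion_rate
before = {"stories_created": 1_000_000, "share_attempts": 150_000, "share_completions": 135_000}
after  = {"stories_created": 1_000_000, "share_attempts": 150_000, "share_completions": 118_800}

for label, d in [("before", before), ("after", after)]:
    share_rate = d["share_completions"] / d["stories_created"]
    completion_rate = d["share_completions"] / d["share_attempts"]
    print(f"{label}: share rate {share_rate:.1%}, attempt completion {completion_rate:.1%}")

# Stories created and attempts are flat, so the ~12% drop in share rate
# is explained entirely by the fall in completion rate (90% -> ~79%).
```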
For segmentation analysis, suppose we find the drop is concentrated among Android users and is most severe in emerging markets like India, Brazil, and Indonesia. The drop is minimal on iOS and in North America. This pattern immediately narrows our focus.
For potential causes, given the Android and emerging markets pattern, I'd hypothesize this could be related to a recent app update, network connectivity issues, or increased file sizes making shares slower on low-bandwidth connections.
I'd cross-reference the timing of the drop with our release calendar. If a new Android version rolled out two weeks ago and the drop started then, that's a strong signal.
Step 4: Conclude
Based on this analysis, my hypothesis is that a recent Android update introduced a performance regression that's causing share completions to fail, particularly in regions with slower internet connections.
“I'd recommend three immediate actions.
First, the engineering team should investigate the share flow in the recent Android release, specifically looking at timeout thresholds and performance on low-bandwidth connections.
Second, we should compare share completion rates between users on the new version versus users who haven't updated yet.
Third, if we confirm the issue, we should consider rolling back or hotfixing the problematic change.”
Step 5: Adapt
If the interviewer said the timing doesn't match any releases, I'd pivot to investigating whether the content being shared has changed.
Perhaps Stories now include larger files or different formats that take longer to upload. I'd also look into whether any third-party dependencies like social sharing APIs changed their behavior recently.
If the interviewer said the drop is actually consistent across all platforms, I'd revisit my segmentation analysis and look more carefully at user-level patterns.
Maybe a change to the share UI made the button less visible, or perhaps Instagram changed its content recommendation algorithm in a way that surfaces content users are less likely to share.
Case Prompt: You're a data scientist at a food delivery company. The CEO walks in and says "our conversion rate dropped 8% yesterday. What's going on?" How do you investigate?
Step 1: Clarify
First, I need to understand what we mean by conversion rate.
I'd also want to know if this is outside normal daily variance. An 8% drop sounds large, but if our baseline has high day-to-day variability, it might not be unusual.
Let's assume the interviewer confirms this is visitor-to-order conversion for all users, and that 8% is about three standard deviations below our typical daily average.
Step 2: Structure
I'll organize my investigation into four buckets.
Funnel Breakdown
Segmentation
Internal Factors
External Factors
Step 3: Analyze
Let me work through each bucket.
For the funnel breakdown, suppose we find that the drop is concentrated between checkout start and payment completion. Earlier stages look normal. Users are browsing, selecting restaurants, and adding items at typical rates. They're starting checkout but not completing orders.
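A simple way to locate a leak like this is to compute step-to-step conversion rates. Here is a sketch with hypothetical counts:

```python
# Hypothetical funnel counts for one day, used to locate where conversion is lost.
funnel = [
    ("visit",            1_000_000),
    ("restaurant_view",    620_000),
    ("add_to_cart",        240_000),
    ("checkout_start",     150_000),
    ("payment_complete",    96_000),
]

for (step, count), (next_step, next_count) in zip(funnel, funnel[1:]):
    print(f"{step} -> {next_step}: {next_count / count:.1%}")

print(f"overall conversion: {funnel[-1][1] / funnel[0][1]:.2%}")
# Comparing these step rates day over day (and by platform) shows whether the
# drop sits at payment completion, as in this scenario, or earlier in the flow.
```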
For segmentation, let's say the drop is consistent across user types and geographies but significantly worse on iOS. Android and web show only a small decline.
For internal factors, I'd check our release calendar. Suppose we pushed an iOS update two days ago. I'd also verify that no pricing or fee changes went live.
For external factors, there were no major weather events, holidays, or known competitor actions. But I'd check if Apple made any changes to Apple Pay or if our payment provider reported any issues.
Given the iOS concentration and the payment step, the evidence points to a payment processing issue specific to iOS.
Step 4: Conclude
My leading hypothesis is that there's a payment processing bug in our recent iOS release. The evidence supporting this includes the funnel showing problems at payment completion, the iOS concentration, and the timing aligning with our recent iOS update.
“There are four things that I’d recommend:
First, pull payment error rates by platform over the last 72 hours and look for a spike on iOS starting when the update rolled out.
Second, if error rates confirm the hypothesis, escalate to engineering immediately to investigate the payment integration in the new release.
Third, consider rolling back the iOS update or pushing a hotfix.
Fourth, estimate the revenue impact by calculating orders lost times average order value.”
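The fourth step is simple arithmetic. A rough sketch with assumed numbers:

```python
# Back-of-envelope impact estimate; every number here is assumed for illustration.
expected_daily_orders = 500_000   # what a normal day would look like
conversion_drop = 0.08            # the 8% relative drop in conversion
avg_order_value = 35              # assumed dollars per order

orders_lost_per_day = expected_daily_orders * conversion_drop
revenue_at_risk_per_day = orders_lost_per_day * avg_order_value
print(f"Orders lost per day: {orders_lost_per_day:,.0f}")            # 40,000
print(f"Revenue at risk per day: ${revenue_at_risk_per_day:,.0f}")   # $1,400,000
```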
Step 5: Adapt
If the interviewer revealed that payment error rates are normal, I'd pivot to investigating the checkout experience itself.
Maybe a UI change in the iOS update made the payment button less prominent or introduced a confusing flow. I'd pull up screenshots of the checkout flow from before and after the update to compare.
If the interviewer said the issue is actually consistent across all platforms, I'd reconsider external factors.
Perhaps our primary payment processor had a partial outage, or maybe a large credit card issuer flagged our transactions for some reason. I'd also look at whether we ran any promotions that expired, which might cause users to abandon checkout when they expected a discount that's no longer available.
Case Prompt: You're a data scientist at a bank. The risk team wants to build a model to predict which credit card applicants will default within their first year. How would you approach this?
Step 1: Clarify
I'd want to understand the business context first.
I'd also ask about constraints.
Let's assume the interviewer says we're augmenting human underwriters, that we must be able to explain decisions, and that the cost of a default is about 10x the profit from a good customer.
Step 2: Structure
I'll organize my approach into five buckets covering the full ML lifecycle.
Problem Definition
Data Assessment
Feature Engineering
Model Selection and Training
Evaluation and Deployment
Step 3: Analyze
Let me work through each bucket.
For problem definition, I'd confirm with the risk team that we're predicting 90+ days past due within the first 12 months, evaluated at the point of application before any account behavior exists.
For data assessment, we'd use historical application data including income, employment, credit score, and existing debt. We'd join this with outcome data showing which approved applicants defaulted. A key challenge is selection bias. We only have outcomes for applicants we approved.
Rejected applicants might have been good customers, but we can't know. I'd ask how consistent our approval criteria have been historically and whether we've done any randomized approval experiments.
For feature engineering, I'd start with standard credit features like credit score, debt-to-income ratio, and payment history. I'd also engineer features like credit utilization trend and number of recent credit inquiries. For applicants with thin credit files, I'd consider alternative data sources if permitted.
For model selection, given the explainability requirement, I'd use either logistic regression or a gradient boosted tree model like XGBoost. Both can achieve strong predictive performance while allowing us to explain which factors drove each decision. I'd use time-based cross-validation to prevent leakage from future information.
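As an illustration of the time-based validation point, here is a minimal sketch using scikit-learn's TimeSeriesSplit on synthetic stand-in data. The features, default rate, and model settings are all assumptions, so the scores themselves are meaningless; the point is the split structure.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for application data, assumed sorted by application date.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 8))               # e.g., credit score, DTI, inquiries, ...
y = (rng.random(5000) < 0.08).astype(int)    # ~8% default rate, purely illustrative

# Time-based splits: always train on earlier applications and validate on later
# ones, so information from the future never leaks into training.
aucs = []
for train_idx, valid_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = GradientBoostingClassifier()
    model.fit(X[train_idx], y[train_idx])
    preds = model.predict_proba(X[valid_idx])[:, 1]
    aucs.append(roc_auc_score(y[valid_idx], preds))

print(f"AUC by fold: {[round(a, 3) for a in aucs]}")  # ~0.5 here, since labels are random
```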
For evaluation, I'd focus on metrics that capture the business tradeoff. Given that defaults cost 10x what good customers earn, I'd optimize for a threshold where the cost of false negatives (approving defaults) is balanced against false positives (rejecting good customers). I'd also audit performance across demographic groups to ensure the model doesn't discriminate.
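The 10x cost asymmetry can be turned into an approval threshold by minimizing expected cost. Here is a rough sketch on simulated scores; the cost values, default rate, and score distribution are illustrative assumptions, not the bank's actual economics.

```python
import numpy as np

# Assumed costs per the case: a missed default costs ~10x a lost good customer.
COST_FALSE_NEGATIVE = 10.0   # approve an applicant who defaults
COST_FALSE_POSITIVE = 1.0    # reject an applicant who would have repaid

def expected_cost(y_true, p_default, threshold):
    approve = p_default < threshold
    fn = np.sum(approve & (y_true == 1))     # approved but defaulted
    fp = np.sum(~approve & (y_true == 0))    # rejected but would have repaid
    return fn * COST_FALSE_NEGATIVE + fp * COST_FALSE_POSITIVE

rng = np.random.default_rng(1)
y_true = (rng.random(10_000) < 0.08).astype(int)
# Pretend model scores: defaulters get somewhat higher predicted probabilities.
p_default = np.clip(rng.beta(2, 12, 10_000) + 0.15 * y_true, 0, 1)

thresholds = np.linspace(0.05, 0.5, 46)
costs = [expected_cost(y_true, p_default, t) for t in thresholds]
best = thresholds[int(np.argmin(costs))]
print(f"Cost-minimizing approval threshold: {best:.2f}")
```

Because false negatives are weighted ten times more heavily, the chosen threshold ends up conservative, which matches the recommendation below.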
Step 4: Conclude
“My recommendation has four parts:
First, build a gradient boosted tree model using application features and credit bureau data, with time-based validation to simulate real deployment conditions.
Second, optimize the decision threshold based on the 10x cost asymmetry. This means we should be relatively conservative and accept some false positives to avoid costly defaults.
Third, build an explanation system that shows underwriters which features drove each prediction. This satisfies the explainability requirement and helps underwriters override the model when they have additional context.
Fourth, deploy in shadow mode for 3-6 months, running the model alongside current processes to evaluate real-world performance before making live decisions.”
Step 5: Adapt
If the interviewer asked how I'd handle the selection bias problem, I'd discuss several approaches.
We could use reject inference techniques to estimate outcomes for rejected applicants based on similar approved applicants. We could also weight our training data to account for the selection process. Ideally, we'd advocate for a small randomized experiment where we approve a sample of applicants who would normally be rejected, but I'd acknowledge this has ethical implications.
If the interviewer asked what I'd do if the model performed worse for certain demographic groups, I'd discuss fairness interventions.
We could adjust thresholds by group to equalize false positive or false negative rates. We could remove features that proxy for protected characteristics. We could also use fairness-constrained training algorithms. The right approach depends on which fairness definition the business and regulators prioritize.
Different companies emphasize different aspects of the case interview. Knowing these tendencies helps you prepare more effectively.
Google cases emphasize structured thinking and statistical rigor. You'll often get questions about designing metrics for measuring product quality or interpreting experiment results.
Expect follow-up questions that probe your statistical knowledge.
Google also values clear communication. They want to see that you can explain technical concepts to non-technical stakeholders.
Meta heavily emphasizes product sense and metrics thinking. Their interviews include dedicated "analytical reasoning" rounds focused entirely on case discussions about Facebook, Instagram, WhatsApp, and Messenger.
Common themes include defining success metrics for a product or feature, investigating why an engagement metric moved, and deciding whether to launch a change based on experiment results.
Meta interviewers often push back to see how you handle contradictory evidence.
Amazon cases often involve ML components and tend to be more technical. You might be asked to design a recommendation system, fraud detection model, or demand forecasting solution.
Expect to discuss the full ML lifecycle including data collection, feature engineering, model selection, and deployment considerations. Amazon also incorporates their leadership principles into technical discussions, so be prepared to explain how your approach demonstrates customer obsession or bias for action.
Uber interviews typically include 5-6 rounds with heavy emphasis on SQL and case interviews.
Common case topics center on the marketplace between riders and drivers. Expect at least one dedicated "PM round" focused on product cases. Uber interviewers frequently ask about A/B testing in the context of two-sided marketplaces where network effects complicate experiment design.
Airbnb has a distinctive process that often includes a take-home data challenge. You'll receive a dataset and have 24-48 hours to analyze it and prepare a presentation. The onsite then focuses heavily on presenting and defending your analysis.
Airbnb divides data scientists into three tracks: Analytics, Inference, and Algorithms. The case focus varies by track. Analytics emphasizes business metrics and product sense. Inference emphasizes statistical methodology. Algorithms emphasizes ML systems.
Spotify interviews combine technical assessments with product cases focused on music and podcast consumption. Common case topics include measuring recommendation quality, investigating engagement metrics, and designing A/B tests for audio features.
Spotify often asks about experimentation challenges specific to streaming, like how to measure the success of a personalized playlist when users have different listening patterns.
LinkedIn cases often focus on their core products: feed, messaging, jobs, and learning. Common topics include measuring professional network health, optimizing job recommendations, and investigating engagement metrics.
LinkedIn interviewers frequently ask about metrics that balance user and business value, since their model depends on both free engagement and premium subscriptions.
Netflix interviews emphasize recommendation systems and content analytics. You might be asked how to measure whether a recommendation algorithm is working or how to predict which shows will be successful.
Netflix cases often involve tradeoffs between short-term engagement metrics and long-term subscriber retention. They want data scientists who think about the whole customer lifecycle, not just immediate clicks.
McKinsey, BCG, and Bain have data science and analytics practices that blend traditional case interviews with technical assessments.
Their cases tend to be more business-strategy oriented. You might estimate a market size, analyze whether a client should enter a new market, or recommend how to optimize a supply chain.
The emphasis is on structured problem-solving and clear communication rather than deep technical implementation.
Startup interviews are less standardized and often more practical. You might work through a real problem the company is facing or analyze actual data from their product.
They're looking for scrappy problem solvers who can work independently and make progress with imperfect information. Expect more open-ended cases with less interviewer guidance.
You don't need to be an expert in everything, but certain concepts come up frequently. Here's a quick list of common technical concepts.
Understand hypothesis testing, p-values, confidence intervals, and statistical significance. Know when to use different tests like t-tests, chi-squared tests, and correlation analysis.
Be able to explain the difference between statistical significance and practical significance. A result can be statistically significant but too small to matter for the business.
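A small simulated example of that distinction, assuming a tiny true lift and a very large sample:

```python
import numpy as np
from scipy import stats

# Synthetic example: a tiny lift that is statistically significant at large n
# but may be too small to matter for the business. All numbers are made up.
rng = np.random.default_rng(42)
control   = rng.normal(loc=10.00, scale=4.0, size=200_000)  # e.g., minutes per session
treatment = rng.normal(loc=10.05, scale=4.0, size=200_000)  # ~0.5% true lift

t_stat, p_value = stats.ttest_ind(treatment, control)
lift = treatment.mean() / control.mean() - 1

print(f"p-value: {p_value:.4f}")     # likely "significant" with n this large
print(f"observed lift: {lift:.2%}")  # but is a ~0.5% lift worth shipping for?
```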
Know how to calculate sample size and experiment duration. Understand what affects statistical power and how to run power analysis.
Be familiar with common pitfalls like multiple testing problems, peeking at results early, and interference between treatment and control groups.
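Here is one way to run a sample size and duration calculation with statsmodels; the baseline rate, target lift, and traffic numbers are assumptions for illustration.

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# How many users per arm to detect a 5% -> 5.5% conversion lift?
# 80% power and 5% two-sided significance are conventional defaults, not requirements.
baseline, expected = 0.05, 0.055
effect_size = proportion_effectsize(expected, baseline)

n_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size, power=0.80, alpha=0.05, alternative="two-sided"
)
print(f"Approximate users needed per arm: {n_per_arm:,.0f}")

# Experiment duration then follows from eligible daily traffic, for example:
daily_users_per_arm = 4_000   # assumed traffic split, illustrative
print(f"Approximate days to run: {n_per_arm / daily_users_per_arm:.0f}")
```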
Many case interviews include a SQL component. Be comfortable writing queries for common operations like filtering, joining, aggregating, and window functions.
Practice translating business questions into SQL queries. If the interviewer asks "what percentage of users who signed up last month made a purchase," you should be able to write that query quickly.
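As a sketch, one way that query might look against a hypothetical schema (Postgres-style date functions; the table and column names are assumptions):

```python
# Hypothetical schema: users(user_id, signup_date), orders(user_id, order_date).
# "What percentage of users who signed up last month made a purchase?"
query = """
SELECT
    100.0 * COUNT(DISTINCT o.user_id) / COUNT(DISTINCT u.user_id) AS pct_purchased
FROM users u
LEFT JOIN orders o
    ON o.user_id = u.user_id
    AND o.order_date >= u.signup_date
WHERE u.signup_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
  AND u.signup_date <  DATE_TRUNC('month', CURRENT_DATE)
"""
# Run through any SQL client or driver; adapt date functions to your dialect.
print(query)
```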
Know the major algorithm families: regression, classification, clustering, and recommendation systems. Understand when to use each and their key tradeoffs.
Be familiar with evaluation metrics like accuracy, precision, recall, AUC, and RMSE. Know how to choose the right metric for a given problem.
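A tiny scikit-learn example of computing these metrics on hypothetical labels and scores:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, roc_auc_score

# Hypothetical labels and model scores for a binary classification problem.
y_true  = np.array([0, 0, 1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.1, 0.3, 0.8, 0.2, 0.6, 0.4, 0.05, 0.7, 0.9, 0.15])
y_pred  = (y_score >= 0.5).astype(int)   # the threshold is a choice, not a given

print(f"precision: {precision_score(y_true, y_pred):.2f}")  # of predicted positives, how many are real
print(f"recall:    {recall_score(y_true, y_pred):.2f}")     # of real positives, how many we caught
print(f"AUC:       {roc_auc_score(y_true, y_score):.2f}")   # ranking quality across all thresholds
```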
Understand common product metrics like DAU/MAU, retention, conversion rate, and engagement. Know how these connect to business outcomes like revenue and growth.
Be able to reason about how different metrics relate to each other. If DAU goes up but revenue goes down, what might be happening?
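For instance, a quick pandas sketch of the DAU/MAU "stickiness" ratio on a made-up activity log:

```python
import pandas as pd

# Hypothetical activity log: one row per user per active day.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 1, 2, 4, 1, 5, 2],
    "date": pd.to_datetime([
        "2024-05-01", "2024-05-02", "2024-05-02", "2024-05-03", "2024-05-10",
        "2024-05-12", "2024-05-15", "2024-05-20", "2024-05-22", "2024-05-28",
    ]),
})

dau = events.groupby("date")["user_id"].nunique().mean()   # average daily actives
mau = events["user_id"].nunique()                          # unique users in the month
print(f"DAU/MAU (stickiness): {dau / mau:.2f}")
# A ratio near 1 means monthly users show up almost every day;
# a low ratio means they only touch the product occasionally.
```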
Use these questions to practice before your interviews:
What metrics would you use to measure the success of Instagram Stories?
Daily active users dropped 10% last week while average session duration increased. Why?
How would you test whether a new checkout flow increases conversion?
How many Uber rides happen in your city each day?
How would you predict which credit card applicants will default within their first year?
Preparation makes a real difference for case interviews. Here's a concrete plan you can follow.
Week 1, Days 1-2: Study the company's products
Before any interview, spend 2-3 hours using the company's products as a real user would. If you're interviewing at Airbnb, book a stay and pay attention to every screen in the flow. If it's Spotify, use the app daily and notice what metrics they might track.
Read their engineering blog and any public talks from their data team. Search YouTube for "[Company] data science" to find conference talks. For example, Airbnb's engineering blog has detailed posts about their experimentation platform. Netflix publishes papers about their recommendation systems. These give you real examples to reference.
Days 3-4: Learn the frameworks
Memorize the AARRR funnel and practice applying it to different products. Take any app on your phone and list the key metrics for each stage: Acquisition, Activation, Retention, Referral, Revenue.
Study the Success/Guardrail/Health metric framework. For any product change, practice identifying all three types of metrics you'd track.
Days 5-7: Practice metric definition cases
Pick 5 products you use regularly. For each one, write out what you think their north star metric is and 3-5 supporting metrics. Then research to see if you can find what they actually use.
Practice these specific prompts: "What metrics would you use to measure the success of [Product X]?" Set a 5-minute timer and talk through your answer out loud.
Week 2, Days 1-2: Practice root cause analysis
Find 3-5 real examples of tech companies discussing metric changes. Tech company blogs often share these stories. Practice building investigation frameworks for each.
For each case, write out your complete framework with buckets and sub-questions before looking at how they actually solved it.
Days 3-4: Practice market sizing
Work through 10 market sizing problems with a timer set to 5 minutes each. Focus on building logical structures, not getting exact numbers.
Good practice problems: How many Uber rides happen in your city each day? How many Netflix subscribers are there in the US? How much revenue does Starbucks make from coffee in your state?
Days 5-7: Practice experiment design
Study the basics of A/B testing: sample size calculation, statistical significance, and common pitfalls. Khan Academy has free statistics courses if you need a refresher.
Practice designing experiments for these prompts: "How would you test whether a new checkout flow increases conversion?" "How would you test a new recommendation algorithm?" Focus on defining your hypothesis, metrics, randomization unit, and potential pitfalls.
Week 3, Days 1-3: Solo practice with timer
Set up 30-minute practice sessions. Use a timer and record yourself answering case questions. Talk out loud as if you're in an interview.
After each session, listen to your recording and note where you rambled, got stuck, or could have been clearer. Most people are surprised by how much filler language they use.
Days 4-5: Practice with a partner
Find a friend, classmate, or online practice partner. Pramp and Interviewing.io offer free peer practice. Take turns being interviewer and candidate.
As the interviewer, practice asking probing follow-up questions: "Why that metric instead of another?" "What would you do if the data showed X?" This helps you anticipate what real interviewers will ask.
Days 6-7: Do a full mock interview
Ideally, find someone who has actually conducted data science interviews. If you can't, use a paid mock interview service or find an experienced data scientist on LinkedIn willing to help.
Get specific feedback on your structure, communication, and analytical depth. Ask what parts were confusing and what you should spend more time on.
Ready to stop struggling and start landing data science, business strategy, or consulting offers?
Enroll in my step-by-step case interview course:
Join 3,000+ candidates who've landed offers at McKinsey, BCG, Bain, and other top firms. 82% of my students land consulting offers.
👉 Enroll Now