All data science projects fall within one or more of the following four areas:
- Statistical Inference
- Causal Inference
- Machine Learning
- Descriptive Statistics
Below are short descriptions of each, followed by examples.
Statistical Inference
Goal: measurement, typically optimizing for an unbiased, low-variance estimator.
Examples:
- Metrics based on samples, e.g. having human reviewers label a sample of content to estimate the rate at which users see violative content on the platform
- Hypothesis testing, e.g. testing whether user retention differs across demographics
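As a rough illustration of the sample-based metric above, the sketch below estimates a violation rate from a reviewed sample and attaches a normal-approximation 95% confidence interval. The function name `prevalence_estimate` and the counts are hypothetical, and the interval choice is an assumption: for very rare events or small samples, a Wilson or exact interval may be more appropriate.

```python
import math


def prevalence_estimate(violations: int, sample_size: int, z: float = 1.96):
    """Return (point estimate, lower bound, upper bound) for a sampled rate.

    Uses the normal approximation to the binomial, which is an assumption
    made here for brevity and is only reasonable for large enough samples.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p_hat = violations / sample_size
    std_err = math.sqrt(p_hat * (1 - p_hat) / sample_size)
    # Clamp the bounds so the interval stays inside [0, 1].
    lower = max(0.0, p_hat - z * std_err)
    upper = min(1.0, p_hat + z * std_err)
    return p_hat, lower, upper


# Hypothetical review outcome: 12 violative items among 2,000 sampled views.
estimate, lower, upper = prevalence_estimate(12, 2000)
print(f"rate={estimate:.4f}, 95% CI=({lower:.4f}, {upper:.4f})")
```

The width of the interval is what "low variance" refers to in practice: a larger review sample narrows it, at the cost of more reviewer time.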
Causal Inference
Goal: inferring the causal connection between two events
Examples:
- A/B tests, e.g. does the new UX design outperform the old one on conversion rate?
- Observational studies, i.e. absent a randomized experiment, can we tell if one event caused another?
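One common way to analyze the A/B test example above is a two-proportion z-test on conversion rates. The sketch below is a minimal version using only the standard library; the function name `two_proportion_z_test` and the conversion counts are hypothetical, and it assumes users were randomly assigned and that samples are large enough for the normal approximation to hold.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z statistic, two-sided p-value) comparing variant B to A."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both group sizes must be positive")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: old design 480/10,000 conversions, new design 540/10,000.
z, p_value = two_proportion_z_test(480, 10_000, 540, 10_000)
print(f"z={z:.3f}, p={p_value:.4f}")
```

Randomization is what licenses the causal reading of this result. Running the same calculation on observational data would only measure an association, which is why observational studies need additional adjustment for confounding.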
Machine Learning
Goal: learning prediction functions using data and algorithms
Examples:
- Classifying violative content, e.g. videos, images, comments
- Predicting the product that a customer will purchase next
Descriptive Statistics
Goal: summarize the data, primarily to tell the story of what is happening, or to help generate hypotheses. This is what is commonly meant by the term “analytics”.
Examples:
- Creating a North Star metric for teams to optimize towards
- Calculating the growth rates in different customer segments
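The growth-rate example above can be sketched in a few lines. The segment names and customer counts here are made up for illustration, and `growth_rate` is a hypothetical helper computing simple period-over-period growth.

```python
def growth_rate(previous: float, current: float) -> float:
    """Return period-over-period growth as a fraction (0.15 means 15%)."""
    if previous == 0:
        raise ValueError("growth rate is undefined when the previous value is 0")
    return (current - previous) / previous


# Hypothetical customer counts per segment: (last quarter, this quarter).
customers = {
    "self_serve": (1200, 1380),
    "enterprise": (300, 312),
}

growth = {
    segment: growth_rate(previous, current)
    for segment, (previous, current) in customers.items()
}

for segment, rate in growth.items():
    print(f"{segment}: {rate:.1%}")
```

A summary like this describes what happened without claiming why. Explaining the gap between segments would move the work into one of the inferential areas above.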