Pillars of Data Science

Get familiar with the foundations of data science.

Data science rests on several fundamental building blocks. They help us extract insights and knowledge from data and determine which model suits a given problem. These building blocks include statistics, probability, and linear algebra, along with machine learning (ML) and deep learning (DL) models.

Statistics and probability

Statistics and probability provide a basis for many data processing, feature transformation, visualization, analysis, and evaluation techniques. Statistics helps us collect, organize, and analyze data, for example through descriptive analysis (summarizing data). We can describe where data is centered with the mean, median, and mode. We can measure its spread with the variance and standard deviation, and measure how two variables vary together with the covariance. Quartiles show how the data is distributed around the median and can hint at skewness.

Probability underpins inferential analysis (drawing conclusions about a population from a sample). We can estimate a plausible range for an unknown parameter with a confidence interval, and compare groups with hypothesis or A/B testing. The central limit theorem (CLT) justifies treating sample means as approximately normally distributed when the sample is large enough. With Bayes' theorem, we can combine a prior probability with observed evidence to obtain a posterior probability, which lets us quantify uncertainty, estimate model parameters, and make inferences. Statistics and probability also help us assess the statistical significance of results with parametric tests such as t-tests and analysis of variance (ANOVA), whose outcome is usually summarized by a p-value.
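To make these ideas concrete, here is a minimal sketch using only Python's standard library. The sample values, the function names (`describe`, `mean_confidence_interval`, `bayes_posterior`), and the numbers passed to them are hypothetical and chosen purely for illustration. The confidence interval uses the normal approximation suggested by the CLT; for a sample this small, a t-distribution would usually be the more appropriate choice.

```python
import math
import statistics as st

# Hypothetical sample: page-load times in seconds (made-up values).
sample = [2.1, 2.4, 2.4, 2.8, 3.0, 3.1, 3.5, 4.2, 6.9]


def describe(data):
    """Return basic descriptive statistics for a list of numbers."""
    q1, _, q3 = st.quantiles(data, n=4)  # quartile cut points
    return {
        "mean": st.mean(data),
        "median": st.median(data),
        "mode": st.mode(data),
        "variance": st.variance(data),  # sample variance (divides by n - 1)
        "stdev": st.stdev(data),        # sample standard deviation
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,                 # interquartile range
    }


def mean_confidence_interval(data, z=1.96):
    """Approximate 95% confidence interval for the mean.

    Assumes the normal approximation (z = 1.96) motivated by the CLT.
    """
    m = st.mean(data)
    standard_error = st.stdev(data) / math.sqrt(len(data))
    return (m - z * standard_error, m + z * standard_error)


def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(H | E) for a binary hypothesis H, via Bayes' theorem.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


if __name__ == "__main__":
    summary = describe(sample)
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")

    low, high = mean_confidence_interval(sample)
    print(f"approx. 95% CI for the mean: ({low:.3f}, {high:.3f})")

    # Hypothetical inputs: 1% prior, 90% detection rate, 5% false-positive rate.
    print(f"posterior: {bayes_posterior(0.01, 0.9, 0.05):.3f}")
```

In this made-up sample, the single large value (6.9) pulls the mean above the median, which is the kind of right skew that quartiles and the mean-versus-median comparison help reveal.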
