Course Objectives

Linear algebra for data science

Mathematically, linear algebra, probability, statistics, and optimization form the foundation of data science. Even in advanced models, such as neural networks, the inputs and transformations are expressed in terms of vectors, matrices, and tensors. A working understanding of linear algebra is therefore a must-have for any data scientist. It is also one of the most widely applied branches of mathematics. Luckily for data scientists, it’s also elegant, which makes learning and applying it enjoyable and satisfying.

Key features of this course

This course has several advantages over similar courses on this subject.

Visualizations

This course includes several engaging illustrations, both static images and animations. Each illustration is carefully designed to convey the underlying concept. Almost all of them are produced with Python packages such as manim, which the mathematics community is increasingly adopting.

Linear combinations of a spanning set
For an invertible matrix, the column space is the full space. The transformation maps the standard basis vectors to the columns of the matrix.
Null space transformation
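The invertible-matrix caption above can also be checked numerically. The following is a minimal sketch, assuming an arbitrary invertible 2×2 matrix A and target vector b picked only for illustration (neither comes from the course material). It shows that applying A to the standard basis vectors returns the columns of A, and that any vector can be written as a linear combination of those columns.

```python
import numpy as np

# Hypothetical invertible matrix, chosen only for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# The standard basis vectors are the columns of the identity matrix.
I = np.eye(2)
images = A @ I  # column i is A applied to the i-th standard basis vector

# Each standard basis vector is mapped to the matching column of A.
maps_basis_to_columns = np.allclose(images, A)

# Full rank means the columns of A span the whole plane.
rank = np.linalg.matrix_rank(A)

# Any target vector b is therefore a linear combination of the columns.
b = np.array([5.0, 10.0])
coeffs = np.linalg.solve(A, b)
reconstructed = coeffs[0] * A[:, 0] + coeffs[1] * A[:, 1]
```

Here coeffs holds the weights on the two columns of A, and reconstructed should match b up to floating-point error.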

Programming

Most data science courses fall into one of two seemingly disjoint categories: the mathematical or the programmatic. In the mathematical approach, programming is usually not the focus, which makes it difficult for beginners to relate the material to real-world applications. In contrast, the programming approach focuses primarily on data science frameworks and usually fails to convey how the models work at their core.

We believe that learning mathematics through programming is more effective and more practical. Hence, this course bridges the gap between the two approaches by covering mathematical modeling through programming. We chose Python, a leading language for data science, as our language.

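import numpy as np
from sympy import Matrix

def getStats(mat):
    # Assumption: mat is an augmented matrix [A | b], so the last column is not a variable.
    rref, inds = Matrix(mat).rref()  # reduced row echelon form and pivot column indices
    numVars = mat.shape[1] - 1
    numPivots = sum(1 for i in inds if i < numVars)  # count pivots in the variable columns only
    numFree = numVars - numPivots  # variables without a pivot are free
    return np.array(rref), numPivots, numFree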

Projects on real data

This course contains several mini-projects on real data sets. The project code is ready to run and easy to extend.

The final project of this course is on face recognition using recently published linear algebra techniques.

Prerequisites

The only prerequisite for this course is basic familiarity with Python’s numpy package.

Target audience

The target audience of this course includes the following:

  • Those who want to learn mathematics for data science in an accessible and applied manner.
  • Those who love programming but are intimidated by mathematics.
  • Those who desire a career in data science.

Learning outcomes

After taking this course, you will have:

  • A core understanding of the most applied parts of linear algebra.
  • Familiarity with numpy’s linear algebra module, linalg.
  • The ability to understand and extend data science models.
  • The ability to solve real-world data science problems.
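
As a small taste of the linalg module mentioned in the outcomes above, here is a minimal sketch. The matrix A and right-hand side b are made-up values for illustration, not course data, and the selection of functions is only an assumption about what a first session might touch on.

```python
import numpy as np

# Hypothetical linear system A x = b, chosen only for illustration.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)        # solve the linear system
rank = np.linalg.matrix_rank(A)  # number of linearly independent columns
det = np.linalg.det(A)           # a nonzero determinant means A is invertible
A_inv = np.linalg.inv(A)         # explicit inverse (solve is usually preferred)
norm_x = np.linalg.norm(x)       # Euclidean length of the solution vector
```

In practice, np.linalg.solve is generally preferred over forming the inverse explicitly, since it tends to be both faster and more accurate numerically.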