Get Started with Matrix Algebra

Learn about matrices and how to perform different algebraic operations in R.

Becoming a competent data scientist requires a lot of dedication, hard work, and comfort with matrices.

The matrix is a tidy mathematical object with its own set of rules. Matrix algebra helps us understand the structure of a matrix and how it interacts with other matrices, vectors, and scalars.

This course will improve our ability to interact with matrices by teaching us to compute important matrix algebra operations.

Data is represented in computers as mathematical objects, such as matrices and vectors. Data scientists interact with data through these objects and the computational tools built around them.

This course is a fast-paced, hands-on review of algebraic operations on matrices. We code each operation in R, Rcpp, Armadillo, and Eigen to better understand the matrix object.

Note: This course doesn’t cover matrix algebra theory (theorems and proofs) or numerical methods for matrices (such as numerical stability and program optimization).

In this course, we’ll practice the skills required to loop through the rows and columns of a matrix and to use the classes and methods of linear algebra libraries.
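To give a feel for what looping through rows and columns looks like, here is a minimal sketch in plain C++ that uses only the standard library. The `Matrix` struct and the `multiply` function are hypothetical names introduced for this illustration; they are not part of Rcpp, Armadillo, or Eigen. The sketch stores the matrix column by column, which is the layout R uses for its own matrices, and computes a matrix-vector product with two nested loops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration only: a dense matrix stored
// column by column (column-major), the same layout R uses for matrices.
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<double> data;  // element (i, j) lives at data[i + j * rows]

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) { return data[i + j * rows]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i + j * rows]; }
};

// Matrix-vector product y = A x, written with explicit loops.
std::vector<double> multiply(const Matrix& a, const std::vector<double>& x) {
    assert(x.size() == a.cols);  // dimensions must agree
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t j = 0; j < a.cols; ++j) {      // outer loop over columns
        for (std::size_t i = 0; i < a.rows; ++i) {  // inner loop over rows
            y[i] += a(i, j) * x[j];
        }
    }
    return y;
}
```

The column loop is placed on the outside so that the inner loop walks through memory in storage order. Libraries such as Armadillo and Eigen take care of this kind of detail for us, which is one reason to prefer them over hand-written loops once the underlying operation is understood.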

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide range of operating systems, including Unix, Linux, Windows, and macOS.

Rcpp offers seamless integration of R and C++. Many R data types and objects can be mapped back and forth to C++ equivalents, which makes it easier both to write new code and to integrate third-party libraries.

C++ linear algebra libraries

One of the main rules in programming is never to reinvent the wheel. Therefore, a data scientist needs to be acquainted with the numerical matrix algebra libraries that are available. Armadillo and Eigen are both common choices for C++.

Armadillo is a high-quality linear algebra library for the C++ language, which aims for a good balance of speed and ease of use.

Eigen is a C++ template library for linear algebra. It supports all matrix sizes, from small (fixed-size) matrices to large, dense matrices, and even sparse matrices. It is generally considered fast, reliable, and elegant.

Course organization

Each section of the course covers one operation in matrix algebra and works through it in the following steps:

  • An explanation of the mathematical concept
  • Computation of the operation in R
  • Implementation of the operation in Rcpp
  • Computation with the Armadillo C++ library
  • Computation with the Eigen C++ library

In these sections, we will review the math, experiment with function calls in domain-specific computational tools, understand the complexity of implementing matrix operations, and get a feel for using optimized libraries.

Prerequisites

It’s important to have these skills to fully understand and benefit from the course:

  • Some background knowledge of matrix algebra
  • Some background knowledge of computer science
  • Some familiarity with R and C++