
The Matrix and ndarray

Explore how to create and manipulate n-dimensional arrays using Rust's ndarray crate. Understand array operations, element-wise versus matrix multiplication, and integrating randomness. This lesson equips you to handle multidimensional data efficiently in Rust for data analysis tasks.

What is ndarray and why it’s useful

The Rust crate ndarray is used to work with n-dimensional arrays. It covers the classic uses of an array-handling framework, much as NumPy does for Python.

Some use cases that are not covered by the main crate are covered by corollary crates, such as ndarray-linalg for linear algebra, ndarray-rand to generate randomness, and ndarray-stats for statistics.

ndarray also has some nice extra features. These include support for rayon for parallelization and for BLAS, the popular low-level linear algebra specification, through one of the working backends (using blas-src).

We can add ndarray to our project with the following entry in Cargo.toml:

[dependencies]
ndarray = "0.14.0"

Why use ndarray?

Rust allows for many different types of arrays (or lists), and vector manipulation through powerful iterators, as we have seen in the first part of this course.

The basic Rust language (enhanced by std) is also typically much faster than many more popular languages. It almost seems pointless to have a separate crate to handle arrays if those capabilities are already present in Rust.

So, what do we need ndarray for?

The short answer is that ndarray is specialized to handle n-dimensional arrays with a mathematical end in view, while Rust’s arrays and vectors are convenient for the day-to-day programmer’s needs.

Thus ndarray builds on the power already provided by the language. Indeed, Rust’s power has made it a candidate to become the language of data science in the next few years.

Note: In the following code we touch on concepts related to mathematics. However, knowing how to calculate matrices is not a requirement for this course. We won’t go in-depth on this subject, and the general concepts should still be easy to understand and useful for all learners. Feel free to skip ahead through the math and just look at the code.

Array creation

Let’s start by exploring how to create nd-arrays:

let arr1 = array![1., 2., 3., 4., 5., 6.];
println!("1D array: {}", arr1);

The ndarray crate provides the array! macro, which works out which kind of ArrayBase is needed from the nesting of its arguments. Here it builds a 1D (one-dimensional) array. Notice that the underlying ArrayBase already implements the std::fmt::Display trait, so we can print it with {}.

Let’s compare ndarray to the standard Rust array and Vec:

Rust 1.40.0
use ndarray::prelude::*;
fn main() {
    let arr1 = array![1., 2., 3., 4., 5., 6.];
    println!("1D array: \t{}", arr1);
    let ls1 = [1., 2., 3., 4., 5., 6.];
    println!("1D list: \t{:?}", ls1);
    let vec1 = vec![1., 2., 3., 4., 5., 6.];
    println!("1D vector: \t{:?}", vec1);
}

The output of the code above is almost the same in all three cases. However, ndarray has many more methods that make it more useful for matrix operations.
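To make the difference concrete, here is a minimal sketch of element-wise addition written against plain slices with nothing but std. The helper name add_elementwise is hypothetical and chosen only for illustration. With ndarray, the same result comes from applying the + operator to two arrays of the same shape.

```rust
// Element-wise addition of two slices using only std iterators.
// `add_elementwise` is a hypothetical helper, not part of std or ndarray.
fn add_elementwise(a: &[f64], b: &[f64]) -> Vec<f64> {
    // zip() silently stops at the shorter input,
    // so the lengths are checked explicitly.
    assert_eq!(a.len(), b.len(), "inputs must have the same length");
    a.iter().zip(b.iter()).map(|(x, y)| x + y).collect()
}

fn main() {
    let vec1: Vec<f64> = vec![1., 2., 3., 4., 5., 6.];
    let ones: Vec<f64> = vec![1.; 6];
    let sum = add_elementwise(&vec1, &ones);
    println!("Element-wise sum: {:?}", sum);
}
```

Every arithmetic operation on a Vec needs a helper like this one, which is the kind of boilerplate an array crate removes.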

Operations on arrays

With ndarray, we not only can easily create multidimensional arrays, but we can also execute operations on them in a simple way.

Consider the following code:

Rust 1.40.0
use ndarray::prelude::*;
fn main() {
    let arr1 = array![[1., 2., 3.], [4., 5., 6.]];
    let arr2 = Array::from_elem((2, 1), 1.);
    let arr = arr1.clone() + arr2.clone();
    println!("2D array:\n{}", arr);
    //println!("arr1:\n{}", arr1);
    //println!("arr2:\n{}", arr2);
}

We first create a 2D array, stored in arr1. Then we use Array::from_elem() to create a second 2D array, arr2, filled with 1s. In this case, the array is:

[[1], [1]]

The first parameter of the from_elem() method takes a shape, in this case (2,1). The second parameter is the element with which to fill the array.

When we sum the two arrays, we get a new array, as expected. Notice that we call clone() on both operands of the addition. The + operator takes ownership of its operands, so without the clones arr1 and arr2 would be moved and the commented-out println! lines could not use them. Uncomment those lines in the playground to see the actual data in those arrays.
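The two operands have different shapes, (2, 3) and (2, 1). The addition still works because ndarray broadcasts the single column of arr2 across the three columns of arr1. The following plain-std sketch spells out that arithmetic by hand. The helper add_column is hypothetical and fixed to these shapes; it only illustrates the result and is not how ndarray implements broadcasting.

```rust
// Plain-std sketch of adding a (2, 1) column to a (2, 3) matrix.
// `add_column` is a hypothetical helper used only for illustration.
fn add_column(matrix: &[[f64; 3]; 2], column: &[[f64; 1]; 2]) -> [[f64; 3]; 2] {
    let mut out = *matrix; // arrays of f64 are Copy, so this duplicates the data
    for (row, col_value) in out.iter_mut().zip(column.iter()) {
        for cell in row.iter_mut() {
            // The single value in this row of `column` is reused
            // for every column of the matrix row.
            *cell += col_value[0];
        }
    }
    out
}

fn main() {
    let arr1 = [[1., 2., 3.], [4., 5., 6.]];
    let arr2 = [[1.], [1.]];
    println!("{:?}", add_column(&arr1, &arr2));
}
```

Each row of arr1 gets the single value from the matching row of arr2 added to all of its elements, giving the rows 2, 3, 4 and 5, 6, 7.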

Multiplying arrays works as well.

Rust 1.40.0
use ndarray::prelude::*;
fn main() {
    let arr1 = array![[1., 2., 3.], [4., 5., 6.]];
    let arr2 = Array::<f64, _>::zeros(arr1.raw_dim());
    let arr = arr1 * arr2;
    println!("\n{}", arr);
}

We use Array::zeros() to create arr2, an array filled with zeros, and obtain the required shape from arr1 with the raw_dim() method. We need to tell the type checker what kind of zeros to fill the array with (in this case, f64), but we let the compiler infer the dimension. That is why we use the _ notation in Array::<f64, _>.

Keep in mind that the simple arithmetic operators work in an element-wise fashion, so multiplying by an array of zeros yields an array of zeros. Consider the following code:

Rust 1.40.0
use ndarray::prelude::*;
fn main() {
    let identity: Array2<f64> = Array::eye(3);
    println!("Identity:\n{}", identity);
    let arr1 = array![[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]];
    let mult1 = &arr1 * &identity;
    println!("Multiplication element-wise:\n{}", mult1);
    let mult2 = arr1.dot(&identity);
    println!("Dot multiplication:\n{}", mult2);
}

We use the eye() method to create an identity matrix (a square matrix with the diagonal filled with 1s and the rest with 0s).

Array2<f64> is the type of a 2D array with f64 elements, so annotating identity with it tells the compiler both the dimension and the element type.

Finally, notice the difference between what happens when we multiply element-wise with the * operator (mult1), and when we perform a matrix multiplication with dot() (mult2). In matrix multiplication, the identity matrix works as the number 1 in a scalar multiplication, like so:

A * 1 = A

Accordingly, the result of dot() is an array equal to arr1. The element-wise multiplication gives us a very different result: every off-diagonal element is multiplied by 0, so only the diagonal of arr1 (1, 5, and 9) survives.
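The two kinds of multiplication can be written out by hand to show exactly what each one computes. The sketch below uses only std and fixed 3 x 3 arrays. The alias Matrix3 and the helpers mul_elementwise and mul_matrix are hypothetical names for illustration; ndarray's own implementation is more general and far more optimized.

```rust
// Hypothetical alias for a fixed-size 3 x 3 matrix of f64.
type Matrix3 = [[f64; 3]; 3];

// Element-wise product: out[i][j] = a[i][j] * b[i][j].
fn mul_elementwise(a: &Matrix3, b: &Matrix3) -> Matrix3 {
    let mut out = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            out[i][j] = a[i][j] * b[i][j];
        }
    }
    out
}

// Matrix product: out[i][j] = sum over k of a[i][k] * b[k][j].
fn mul_matrix(a: &Matrix3, b: &Matrix3) -> Matrix3 {
    let mut out = [[0.0; 3]; 3];
    for i in 0..3 {
        for j in 0..3 {
            for k in 0..3 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

fn main() {
    let identity: Matrix3 = [[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]];
    let arr1: Matrix3 = [[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]];
    println!("Element-wise:   {:?}", mul_elementwise(&arr1, &identity));
    println!("Matrix product: {:?}", mul_matrix(&arr1, &identity));
}
```

The element-wise version touches each position once, while the matrix product combines a whole row of the first operand with a whole column of the second. That is why the identity matrix leaves arr1 unchanged only in the second case.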

Randomness in arrays

By using the crate ndarray-rand, we can add the power of the rand crate to the ndarray ecosystem.

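We can add it to Cargo.toml next to ndarray. The two crates must agree on the ndarray version; the ndarray-rand 0.13 series is the one built against ndarray 0.14:

[dependencies]
ndarray = "0.14.0"
ndarray-rand = "0.13.0"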

Now we can add some randomness when needed:

Rust 1.40.0
use ndarray::prelude::*;
use ndarray_rand::{rand_distr::Uniform, RandomExt};
fn main() {
    let arr = Array::random((2, 5), Uniform::new(0., 10.));
    println!("{:5.2}", arr);
}

No matter how many times we execute the above code, we’ll always obtain a 2 x 5 array filled with random numbers from the half-open range between 0.0 (included) and 10.0 (excluded). The values themselves change on every run.

We call Array::random(), which accepts a shape to fill and a distribution from which to sample the random elements. In this case, we use a Uniform distribution.

We can also sample data from an array as shown:

Rust 1.40.0
use ndarray::prelude::*;
use ndarray_rand::{RandomExt, SamplingStrategy};
fn main() {
    let samples = array![1., 2., 3., 4., 5., 6.];
    let arr = samples.sample_axis(
        Axis(0), 2, SamplingStrategy::WithoutReplacement);
    println!("\nSampling from:\t{}\nTwo elements:\t{}", samples, arr);
}

The sample_axis() method samples along the axis given in its first parameter, in this case Axis(0), which is the only axis of a 1D array (a vector). The second parameter indicates the number of samples, and the third parameter sets the sampling strategy.

The sampling strategy is either WithReplacement, which allows the same element to be picked more than once, or WithoutReplacement, which picks each element at most once.
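The difference between the two strategies can be sketched with std alone. Since std has no random number generator, the sketch below uses a tiny hand-rolled linear congruential generator. The Lcg type, its constants, the seed, and both helper functions are assumptions made for illustration; this is not how ndarray-rand or rand work internally.

```rust
// Tiny linear congruential generator, used only to keep this sketch
// dependency-free. It is NOT what ndarray-rand or rand use internally.
struct Lcg(u64);

impl Lcg {
    // Returns a pseudo-random index in 0..bound (bound must be > 0).
    fn next_index(&mut self, bound: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % bound
    }
}

// With replacement: every draw picks from the full pool, so repeats can occur.
fn sample_with_replacement(pool: &[f64], n: usize, rng: &mut Lcg) -> Vec<f64> {
    (0..n).map(|_| pool[rng.next_index(pool.len())]).collect()
}

// Without replacement: a drawn element is removed from the pool,
// so it can never be picked again.
fn sample_without_replacement(pool: &[f64], n: usize, rng: &mut Lcg) -> Vec<f64> {
    assert!(n <= pool.len(), "cannot draw more samples than elements");
    let mut remaining = pool.to_vec();
    (0..n)
        .map(|_| {
            let i = rng.next_index(remaining.len());
            remaining.swap_remove(i)
        })
        .collect()
}

fn main() {
    let samples: [f64; 6] = [1., 2., 3., 4., 5., 6.];
    let mut rng = Lcg(42); // arbitrary seed
    println!("With replacement:    {:?}", sample_with_replacement(&samples, 2, &mut rng));
    println!("Without replacement: {:?}", sample_without_replacement(&samples, 2, &mut rng));
}
```

Without replacement, asking for as many samples as there are elements returns a shuffled copy of the pool, and asking for more is an error. With replacement, any number of samples can be drawn.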