What is the vector data structure in R?
Vectors are ordered sequence containers, typically holding elements of the same type, and are used as data structures in many programming languages. In R, there are two major classes of vectors:
- atomic (homogeneous)
- recursive (heterogeneous)
Atomic vs recursive vectors
In R, a single value is itself a vector of length 1, so every single value printed in the example below is an atomic vector. Toward the end of the example, we create a recursive vector, list("a", 34L, 2:8). Its first element is the character "a", its second element is 34L (integer type), and its third element is 2:8 (an integer vector containing 2, 3, 4, 5, 6, 7, 8).
In R, a numeric literal is a double by default. To make it a (32-bit) integer, append the suffix L to it, as in 10L.
cat("Atomic Vectors: \n")
# Atomic vector of type integer.
print(10L)
# Atomic vector of type double.
print(3.5)
# Atomic vector of type logical.
print(TRUE)
# Atomic vector of type character.
print("xyz")
# Atomic vector of type complex.
print(-3+2i)
# Atomic vector of type raw.
print(charToRaw('edpresso'))
# Recursive vector, i.e. a list.
cat("Recursive Vector: \n")
x <- list("a", 34L, 2:8)
print(x)
length(x)
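The atomic/recursive distinction can also be checked programmatically. The following is a minimal sketch using the base R functions is.atomic(), is.list(), and is.recursive(); the variable names single_value and mixed are illustrative and do not come from the example above.

```r
# Illustrative values (names chosen for this sketch only)
single_value <- 10L
mixed <- list("a", 34L, 2:8)

# A single value is an atomic vector of length 1
is.atomic(single_value)
length(single_value)

# A list is a recursive vector, not an atomic one
is.list(mixed)
is.recursive(mixed)
is.atomic(mixed)

# Each list element keeps its own type
sapply(mixed, typeof)

# [[ ]] extracts the element itself; here, the integer vector 2:8
mixed[[3]]
```

Note that single brackets, as in mixed[3], would return a list of length 1 rather than the integer vector itself.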
Common properties
- Length: You can get the length of any vector using the length() function.
- Attributes: To extract additional arbitrary metadata, use attributes().
- Type: To check the type of a vector, use the typeof() function.
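As a small illustration of these three properties, the sketch below inspects a named vector. The vector scores and its element names are made up for this example.

```r
# A named double vector; the names are stored as an attribute
scores <- c(math = 90, art = 75.5, music = 82)

length(scores)       # number of elements
typeof(scores)       # underlying storage type
attributes(scores)   # a list holding the "names" attribute

# A plain vector without metadata has no attributes (NULL)
attributes(1:3)
```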
Accessing vector elements
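Elements can be accessed by using their indexes and square brackets. Indexes start at position 1, i.e., data[1] is the first element. Providing a negative value in the index drops that element from the result. The logical values TRUE and FALSE can also be used to index a vector. The numbers 0 and 1 are accepted too, but they are treated as positions rather than logical flags: 1 selects the first element and 0 selects nothing.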
Example
The code snippet below first creates the vector data, which contains a month name at each index. It then accesses the vector in three ways:
- Positional indexing: data[c(2,3,6)] passes a vector of index positions and returns the elements at positions 2, 3, and 6.
- Logical indexing: passing a logical vector of the same length as data marks each index as either TRUE or FALSE. Only the elements whose index is marked TRUE are returned.
- Negative indexing: data[c(-2,-5)] passes a vector of negative index values, which excludes the specified positions from the result. The result will not include the values at the “Feb” and “May” indexes.
# Accessing elements using position
data <- c("Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec")
a <- data[c(2,3,6)]
print(a)
# Accessing elements using logical indexing.
b <- data[c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE)]
print(b)
# Accessing elements using negative indexing
# (stored in d rather than c to avoid masking the c() function)
d <- data[c(-2,-5)]
print(d)
# Vector type
cat("Vector type is", typeof(d), "\n")
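A few edge cases of indexing are worth seeing side by side. The sketch below uses a short made-up vector, letters4, to show how numeric 0 and 1 differ from logical FALSE and TRUE, and how a logical index shorter than the vector is recycled.

```r
# Illustrative vector (name chosen for this sketch only)
letters4 <- c("a", "b", "c", "d")

letters4[1]                # first element; positions start at 1
letters4[0]                # position 0 selects nothing: zero-length result
letters4[c(0, 1)]          # numeric 0 and 1 are positions, not flags
letters4[c(TRUE, FALSE)]   # a shorter logical index is recycled
letters4[-1]               # everything except the first element
```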
Declaring and defining vectors
Listed below are some useful ways to create vectors of either recursive or atomic type.
1. Creating vectors with c()
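c() is a generic function that combines its arguments into a vector. If the arguments are all of the same type, the result has that type. If they are of different types, R coerces them to a common type (in the order logical, integer, double, complex, character), so the resulting atomic vector is still homogeneous. In the example below, every value ends up as a character string.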
# Vector built from multiple types; c() coerces them all to character
dataVector <- c('apple', 'red', 5, TRUE)
cat("Vector contains: \n")
print(dataVector)
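The coercion performed by c() can be observed with typeof(). This is a minimal sketch; the variable name mixed is illustrative.

```r
# Mixing types promotes the result to the more general type
typeof(c(TRUE, 2L))        # logical + integer
typeof(c(2L, 3.5))         # integer + double
typeof(c(3.5, "x"))        # double + character

# The same call as in the example above
mixed <- c('apple', 'red', 5, TRUE)
typeof(mixed)
mixed                      # every element is now a character string
```

To keep values of different types unchanged, a list can be used instead, e.g. list('apple', 'red', 5, TRUE).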
2. Creating vectors with seq()
The seq() function accepts from, to, and by (increment) values as arguments. In the example below, seq(5, 9, by = 0.4) generates values from 5 to 9 in increments of 0.4.
# Create a vector with elements from 5 to 9, incrementing by 0.4
cat("Vector using seq() function:\n")
print(seq(5, 9, by = 0.4))
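Besides a fixed step, seq() can also be given the desired number of elements through its length.out argument, and seq_len() is a shorthand for the sequence 1 to n. The sketch below shows both alongside the call from the example above; the variable names are illustrative.

```r
# Fixed step: from 5 to 9 in increments of 0.4
by_step <- seq(5, 9, by = 0.4)
length(by_step)

# Fixed number of elements: 5 evenly spaced values from 0 to 1
by_count <- seq(0, 1, length.out = 5)
by_count

# seq_len(n) yields the integers 1..n
seq_len(4)
```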
3. Creating vectors with the : operator
By using the : operator, we can create a vector of consecutive values that increase in steps of 1. In the example below, each of the three assignments to Vector uses the colon operator to initialize a vector.
# Creating a sequence from 2 to 15
Vector <- 2:15
cat("Vector#1:")
print(Vector)
# Creating a sequence from 5.4 to 13.4
Vector <- 5.4:13.4
cat("Vector#2:")
print(Vector)
# If the final value does not belong to the sequence, it is discarded.
Vector <- 2.8:12.4
cat("Vector#3:")
print(Vector)
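Some properties of the colon operator are easy to verify directly. In this sketch (variable names are illustrative), integer endpoints give an integer vector, a larger left operand gives a descending sequence, and a fractional start stops at the last value that does not exceed the end.

```r
# Integer endpoints produce an integer vector
ints <- 2:15
typeof(ints)

# When from is larger than to, the sequence counts down
countdown <- 5:1
countdown

# Fractional start: steps of 1 from 2.8, stopping before exceeding 12.4
fractional <- 2.8:12.4
typeof(fractional)
length(fractional)
fractional[length(fractional)]   # last value kept in the sequence
```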
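In most cases, cat() and print() can both be used to display a value, but cat() is meant for atomic types, i.e., integer, double, character, raw, etc., and writes them as plain text without quotes or index prefixes. The print() function, on the other hand, also works for non-atomic types, such as a non-empty list, and for any other type of object.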
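The difference in output format can be seen in a short sketch. The names month_names and items are illustrative, and capture.output() (from the utils package, attached by default) is used here only to record what each function writes.

```r
month_names <- c("Jan", "Feb")

print(month_names)        # shows an index prefix and quotes
cat(month_names, "\n")    # plain text, separated by spaces

# print() also handles a non-atomic object such as a list
items <- list("a", 2:4)
print(items)

# Capture the text each function writes for a single string
printed <- capture.output(print("xyz"))
catted <- capture.output(cat("xyz\n"))
printed
catted
```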