Introduction

Working with data is an obvious requirement for data science professionals. The building blocks of working with data include understanding the most common data structures and how they are interrelated. In this guide, you will learn the techniques of programming *matrices*, *lists*, and *arrays* in R.

It's important to understand the concept of vectors before moving ahead.

A vector is the most common data structure in R. It is a sequence of elements of the same basic type. The `vector()` function can be used to create a vector. The default mode is logical, but we can use constructors such as `character()`, `numeric()`, etc., to create a vector of a specific type.
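As a minimal sketch of the constructors described above, the snippet below creates a few empty vectors. The variable names are hypothetical and chosen only for illustration.

```r
# vector() with no mode argument creates a logical vector.
v_default <- vector(length = 3)

# Passing a mode creates a vector of that type, filled with its "empty" value.
v_numeric <- vector("numeric", length = 3)

# Type-specific constructors do the same job more directly.
v_char <- character(2)
v_lgl <- logical(2)

print(v_default)  # [1] FALSE FALSE FALSE
print(v_numeric)  # [1] 0 0 0
print(v_char)     # [1] "" ""
print(v_lgl)      # [1] FALSE FALSE
```

Numeric vectors are initialized with zeros, character vectors with empty strings, and logical vectors with `FALSE`.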

The lines of code below construct a numeric and a logical vector, respectively. A vector can also contain strings, as shown by the vector `s`.

```r
n <- c(1, 2, 5.3, 6, -2, 4)
l <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
s <- c("USA", "UK", "AFRICA", "INDIA", "CHINA")

class(n)
class(l)
class(s)
```

Output:

```
[1] "numeric"
[1] "logical"
[1] "character"
```

It's also possible to do several operations on the vectors, such as combining vectors and performing mathematical operations. With this brief introduction to vectors, you are ready to understand matrices, lists, and arrays.
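The vector operations mentioned above can be sketched as follows. The vectors `a` and `b` are hypothetical examples, not part of the original guide.

```r
# Two small numeric vectors (values chosen only for illustration).
a <- c(1, 2, 3)
b <- c(10, 20, 30)

# c() combines vectors into one longer vector.
combined_vec <- c(a, b)

# Arithmetic operators work element by element.
sum_vec <- a + b
scaled_vec <- a * 2

# Mixing types coerces every element to the most general type.
mixed <- c(1, "two", TRUE)

print(combined_vec)  # [1]  1  2  3 10 20 30
print(sum_vec)       # [1] 11 22 33
print(scaled_vec)    # [1] 2 4 6
print(class(mixed))  # [1] "character"
```

The last line shows why a vector holds a single basic type: when types are mixed, R converts all elements to one common type.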

In R, matrices are an extension of numeric or character vectors. All columns in a matrix must have the same mode and the same length. Also, as is the case with atomic vectors, the elements of a matrix must be of the same data type. The general representation of a matrix is shown in the code below.

The arguments `nrow` and `ncol` denote the number of rows and columns, respectively. The argument `byrow = TRUE` indicates that the matrix should be filled by the rows.

```r
m <- matrix(c(20, 45, 33, 19, 52, 37), nrow = 2, ncol = 3, byrow = TRUE)
print(m)
```

Output:

```
     [,1] [,2] [,3]
[1,]   20   45   33
[2,]   19   52   37
```

It is possible to identify the rows, columns, or elements of a matrix using subscripts. For example, the element at the second row and second column can be accessed using the following command.

```r
m[2, 2]
```

Output:

```
[1] 52
```
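Subscripts can also select whole rows, whole columns, or several columns at once. The snippet below is a small sketch that rebuilds the matrix `m` from above so it runs on its own. The names `first_row`, `second_col`, and `sub_block` are hypothetical.

```r
m <- matrix(c(20, 45, 33, 19, 52, 37), nrow = 2, ncol = 3, byrow = TRUE)

# Leaving a subscript empty selects everything along that dimension.
first_row <- m[1, ]        # all columns of row 1
second_col <- m[, 2]       # all rows of column 2
sub_block <- m[, c(1, 3)]  # columns 1 and 3, kept as a 2 x 2 matrix

print(dim(m))      # [1] 2 3
print(first_row)   # [1] 20 45 33
print(second_col)  # [1] 45 52
print(sub_block)
```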

You can also create a matrix and name its rows and columns with the `dimnames` argument. In the first matrix, `m1`, the elements are arranged sequentially by row. In the second matrix, `m2`, the arrangement is done by columns. The vectors `rownames` and `colnames` hold the row and column names of the matrix; they are passed to the `matrix()` function as a list through `dimnames` while creating the third matrix, `m3`.

```r
m1 <- matrix(c(21:32), nrow = 4, byrow = TRUE)
print(m1)

m2 <- matrix(c(21:32), nrow = 4, byrow = FALSE)
print(m2)

# Define the column and row names.
rownames = c("r1", "r2", "r3", "r4")
colnames = c("c1", "c2", "c3")

m3 <- matrix(c(21:32), nrow = 4, byrow = TRUE, dimnames = list(rownames, colnames))
print(m3)
```

Output:

```
     [,1] [,2] [,3]
[1,]   21   22   23
[2,]   24   25   26
[3,]   27   28   29
[4,]   30   31   32

     [,1] [,2] [,3]
[1,]   21   25   29
[2,]   22   26   30
[3,]   23   27   31
[4,]   24   28   32

   c1 c2 c3
r1 21 22 23
r2 24 25 26
r3 27 28 29
r4 30 31 32
```

You can access the elements of a matrix with row and column indices. For example, the code below prints the element in the first row and third column.

```r
print(m3[1, 3])
```

Output:

```
[1] 23
```

If you want to access only the second row, the code below performs this task.

```r
print(m3[2, ])
```

Output:

```
c1 c2 c3 
24 25 26 
```
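Once a matrix has row and column names, those names can be used as subscripts too. The sketch below rebuilds `m3` so it is self-contained, and also shows `drop = FALSE`, which keeps a single selected row as a one-row matrix instead of a plain vector. The variable names on the left-hand side are hypothetical.

```r
m3 <- matrix(21:32, nrow = 4, byrow = TRUE,
             dimnames = list(c("r1", "r2", "r3", "r4"), c("c1", "c2", "c3")))

# Index by name instead of by position.
by_name <- m3["r2", "c3"]

# Selecting one row normally drops the matrix structure...
row_vec <- m3[2, ]

# ...while drop = FALSE keeps the result as a 1 x 3 matrix.
row_mat <- m3[2, , drop = FALSE]

print(by_name)       # [1] 26
print(dim(row_vec))  # NULL
print(dim(row_mat))  # [1] 1 3
```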

It is possible to perform mathematical operations with matrices. The R operators are used to do this task, and the result is also a matrix, provided the number of rows and columns are the same for the matrices involved.

The code below creates a couple of *two by three* matrices and performs the addition operation. The resulting matrix is named `combined`, as shown below.

```r
score1 <- matrix(c(5, 9, 0, -2, 7, 6), nrow = 2)
score2 <- matrix(c(5, 2, 5, 9, -1, 4), nrow = 2)

combined <- score1 + score2
print(combined)
```

Output:

```
     [,1] [,2] [,3]
[1,]   10    5    6
[2,]   11    7   10
```

In the same manner, you can perform other mathematical operations on matrices, like subtraction, multiplication, and division.
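A short sketch of these other operations follows, reusing the same two score matrices. Note that `*` multiplies element by element, while the algebraic matrix product uses the `%*%` operator. The transpose `t(score2)` is used here only so that the dimensions conform (2 x 3 times 3 x 2).

```r
score1 <- matrix(c(5, 9, 0, -2, 7, 6), nrow = 2)
score2 <- matrix(c(5, 2, 5, 9, -1, 4), nrow = 2)

# Element-wise subtraction and multiplication.
difference <- score1 - score2
elementwise <- score1 * score2

# Algebraic matrix product: (2 x 3) %*% (3 x 2) gives a 2 x 2 matrix.
matrix_product <- score1 %*% t(score2)

print(difference)
print(elementwise)
print(matrix_product)
```

Element-wise division works the same way with `/`, but it yields `Inf` or `NaN` wherever the divisor is zero.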

A list is a generic vector containing a collection of objects (or components). The advantage of a list is that it allows you to store a variety of objects, which may be possibly unrelated, under one name.

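The lines of code below create a list containing copies of three vectors: `name`, `place`, and `age`. Despite its name, `age` holds five logical values here rather than ages in years, which shows that the components of a list can differ in both type and length.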

```r
name <- c("abhi", "ansh", "ajay")
place <- c("delhi", "mumbai", "pune")
age <- c(TRUE, FALSE, TRUE, FALSE, FALSE)

l <- list(name, place, age)
print(l)
```

Output:

```
[[1]]
[1] "abhi" "ansh" "ajay"

[[2]]
[1] "delhi"  "mumbai" "pune"  

[[3]]
[1]  TRUE FALSE  TRUE FALSE FALSE
```
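List components can also be named, which makes them easier to retrieve. The sketch below is an illustration only: the list `person` and its component names (`name`, `place`, `flag`) are hypothetical. It shows the difference between `[[ ]]` or `$`, which return the component itself, and `[ ]`, which returns a shorter list.

```r
person <- list(
  name  = c("abhi", "ansh", "ajay"),
  place = c("delhi", "mumbai", "pune"),
  flag  = c(TRUE, FALSE, TRUE, FALSE, FALSE)
)

# $ and [[ ]] extract the component itself.
names_vec <- person$name
second_place <- person[["place"]][2]

# Single brackets return a list containing the selected component.
sub_list <- person[1]

print(names_vec)         # [1] "abhi" "ansh" "ajay"
print(second_place)      # [1] "mumbai"
print(class(sub_list))   # [1] "list"
print(length(person))    # [1] 3
```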

You can merge several lists into one list as shown below.

```r
l1 <- list(10, 20, 30)
l2 <- list("Jan", "Feb", "March")
merged <- c(l1, l2)
print(merged)
```

Output:

```
[[1]]
[1] 10

[[2]]
[1] 20

[[3]]
[1] 30

[[4]]
[1] "Jan"

[[5]]
[1] "Feb"

[[6]]
[1] "March"
```

For programming purposes, you may be required to convert lists into vectors. This can be done with the `unlist()` function. This allows you to perform mathematical operations.

```r
l1 <- list(10, 20, 30)
l2 <- list(5, 5, 5)

v1 <- unlist(l1)
v2 <- unlist(l2)

print(v1)
print(v2)

addvec <- v1 + v2
print(addvec)
```

Output:

```
[1] 10 20 30
[1] 5 5 5
[1] 15 25 35
```
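One caveat worth sketching: `unlist()` returns an atomic vector, so a list with mixed types is coerced to a single common type, and arithmetic on the result may no longer be possible. The lists below are hypothetical examples.

```r
# A list with mixed types is coerced to character when unlisted.
mixed_list <- list(10, "Jan", TRUE)
mixed_vec <- unlist(mixed_list)

# A named list of numbers becomes a named numeric vector.
numeric_list <- list(a = 1, b = 2)
named_vec <- unlist(numeric_list)

print(mixed_vec)         # [1] "10"   "Jan"  "TRUE"
print(class(mixed_vec))  # [1] "character"
print(named_vec)
print(named_vec * 10)
```

Checking `class()` on the unlisted result before doing arithmetic is a simple way to catch unintended coercion.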

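Arrays are data objects that can store data in more than two dimensions. An array is created using the `array()` function. The lines of code below create an array, `r1`, that takes the vectors `vec1` and `vec2` as inputs and uses the `dim` parameter to set its shape: `dim = c(3, 3, 2)` produces two matrices of three rows and three columns each. Because the two vectors supply only nine values for eighteen cells, R recycles them, so both matrices hold the same values.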

```r
vec1 <- c(50, 20, 40)
vec2 <- c(10, 20, 25, 30, 35, 50)
r1 <- array(c(vec1, vec2), dim = c(3, 3, 2))
print(r1)
```

Output:

```
, , 1

     [,1] [,2] [,3]
[1,]   50   10   30
[2,]   20   20   35
[3,]   40   25   50

, , 2

     [,1] [,2] [,3]
[1,]   50   10   30
[2,]   20   20   35
[3,]   40   25   50
```
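The two matrices in the output are identical because the input supplies fewer values than the array has cells, so `array()` reuses the values from the start. The sketch below checks this directly; the helper variable `values` is hypothetical.

```r
vec1 <- c(50, 20, 40)
vec2 <- c(10, 20, 25, 30, 35, 50)
values <- c(vec1, vec2)

r1 <- array(values, dim = c(3, 3, 2))

print(length(values))                    # [1] 9
print(length(r1))                        # [1] 18
print(dim(r1))                           # [1] 3 3 2
print(identical(r1[, , 1], r1[, , 2]))   # [1] TRUE
```

Supplying eighteen distinct values instead would give the two matrices different contents.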

The `dimnames` parameter can be used to give names to the rows, columns, and matrices in the array, as shown below.

```r
colnames <- c("column1", "column2", "column3")
rownames <- c("row1", "row2", "row3")
matrixnames <- c("matrix1", "matrix2")

# Take these vectors as input to the array.
r2 <- array(c(vec1, vec2), dim = c(3, 3, 2),
            dimnames = list(rownames, colnames, matrixnames))
print(r2)
```

Output:

```
, , matrix1

     column1 column2 column3
row1      50      10      30
row2      20      20      35
row3      40      25      50

, , matrix2

     column1 column2 column3
row1      50      10      30
row2      20      20      35
row3      40      25      50
```

It is easy to program and access array elements. The code below prints the third row of the second matrix of the array.

```r
print(r2[3, , 2])
```

Output:

```
column1 column2 column3 
     40      25      50 
```

Similarly, the code below prints the element in the first row and third column of the first matrix.

```r
print(r2[1, 3, 1])
```

Output:

```
[1] 30
```
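Because `r2` has dimension names, the same elements can be reached by name, and leaving subscripts empty extracts whole slices. The snippet below is a self-contained sketch that rebuilds the array inline. The names `first_matrix`, `one_value`, and `col_across` are hypothetical.

```r
r2 <- array(
  c(50, 20, 40, 10, 20, 25, 30, 35, 50),
  dim = c(3, 3, 2),
  dimnames = list(
    c("row1", "row2", "row3"),
    c("column1", "column2", "column3"),
    c("matrix1", "matrix2")
  )
)

# An entire matrix from the array, selected by name.
first_matrix <- r2[, , "matrix1"]

# A single element, selected by name along all three dimensions.
one_value <- r2["row1", "column3", "matrix1"]

# One column taken across both matrices: a 3 x 2 result.
col_across <- r2[, "column2", ]

print(dim(first_matrix))  # [1] 3 3
print(one_value)          # [1] 30
print(col_across)
```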

In this guide, you learned how to program matrices, lists, and arrays in R. This is of great help in performing data manipulation tasks while dealing with different data structures.
