3 Data structures

In R we have objects which are functions and objects which are data.

Function examples:
- sin()
- integrate()
- plot()
- paste()
Data examples:
- 42
- 1:5
- "R"
- matrix(1:12, nrow=4, ncol=3)
- data.frame(a=1:5, tmt=c("a","b","a","b","a"))
- list(x=2, y="abc", z=1:10)

3.1 Vector

> # Vector of numbers, e.g:
> c(1, 1.2, pi, exp(1))
## [1] 1.000 1.200 3.142 2.718
> 
> # We can have vectors of other things too, e.g:
> c(TRUE, 1 == 2)
## [1]  TRUE FALSE
> c("a", "ab", "abc")
## [1] "a"   "ab"  "abc"
> 
> # But not combinations, e.g:
> c("a", 5, 1 == 2)
## [1] "a"     "5"     "FALSE"
> # Notice that R just turned everything into characters!
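
The coercion seen above follows a fixed order: logical values become numbers, and numbers become character strings, whenever a more general type is present. A minimal sketch in base R (the variable names are only illustrative):

```r
# Mixing types in c() coerces everything to the most general type present:
# logical < numeric < character.
lgl_num <- c(TRUE, FALSE, 2.5)   # logicals become 1 and 0
num_chr <- c(1, "a")             # the number becomes the string "1"

class(lgl_num)
## [1] "numeric"
class(num_chr)
## [1] "character"
lgl_num
## [1] 1.0 0.0 2.5
```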

3.1.1 Constructing vectors

> # Integers from 9 to 17
> x <- 9:17
> x
## [1]  9 10 11 12 13 14 15 16 17
> 
> # A sequence of 11 numbers from 0 to 1
> y <- seq(0, 1, length = 11)
> y
##  [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> 
> # The same number or the same vector several times
> z <- rep(1:2, 5)
> z
##  [1] 1 2 1 2 1 2 1 2 1 2
> 
> # Combine numbers, vectors or both into a new vector
> xz10 <- c(x, z, 10)
> xz10
##  [1]  9 10 11 12 13 14 15 16 17  1  2  1  2  1  2  1  2  1  2 10
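
The functions seq() and rep() take further arguments that are often handy. The sketch below shows a few of them; the particular numbers are arbitrary illustrations, not taken from the notes above:

```r
# seq() with a step size ('by') instead of a length
s <- seq(0, 1, by = 0.25)
s
## [1] 0.00 0.25 0.50 0.75 1.00

# rep() with 'each' repeats every element before moving on to the next
r_each <- rep(1:2, each = 3)
r_each
## [1] 1 1 1 2 2 2

# rep() with one 'times' value per element
r_times <- rep(c("a", "b"), times = c(2, 1))
r_times
## [1] "a" "a" "b"
```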

3.1.2 Index and logical index

> # Define a vector with integers from (-5) to 5 and extract the numbers with
> # absolute value less than 3:
> x <- (-5):5
> x
##  [1] -5 -4 -3 -2 -1  0  1  2  3  4  5
> 
> # by their index in the vector:
> x[4:8]
## [1] -2 -1  0  1  2
> 
> # or, by negative selection (set a minus in front of the indices we don't
> # want):
> x[-c(1:3, 9:11)]
## [1] -2 -1  0  1  2
> 
> # A logical vector can be defined by:
> index <- abs(x) < 3
> index
##  [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
> 
> # Now this vector can be used to extract the wanted numbers:
> x[index]
## [1] -2 -1  0  1  2
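
Two related tools are worth knowing alongside the logical index: which() turns a logical vector into positions, and conditions can be combined element-wise. A short sketch using the same x as above:

```r
x <- (-5):5

# which() converts a logical index into the positions that are TRUE
pos <- which(abs(x) < 3)
pos
## [1] 4 5 6 7 8

# Conditions can be combined element-wise with & (and) and | (or)
outer_vals <- x[x < -3 | x > 3]
outer_vals
## [1] -5 -4  4  5
```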

3.2 Factor

A special kind of vector is a factor. It has a known finite set of levels (options), e.g:

> # gl = generate levels
> gl(2, 10, labels = c("male", "female"))
##  [1] male   male   male   male   male   male   male   male   male   male  
## [11] female female female female female female female female female female
## Levels: male female
> 
> # One could also do:
> as.factor(c(rep("male", 10), rep("female", 10)))
##  [1] male   male   male   male   male   male   male   male   male   male  
## [11] female female female female female female female female female female
## Levels: female male
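
Note that the two results above differ in the order of their levels: gl() keeps the order of the labels, whereas as.factor() sorts the levels alphabetically. If the order matters (for instance for the reference level in a model), it can be set explicitly with factor(). A minimal sketch, with illustrative variable names:

```r
sex_chr <- c(rep("male", 10), rep("female", 10))

# as.factor() sorts the levels alphabetically
levels(as.factor(sex_chr))
## [1] "female" "male"

# factor() with an explicit 'levels' argument fixes the order
sex <- factor(sex_chr, levels = c("male", "female"))
levels(sex)
## [1] "male"   "female"

# table() counts the observations per level, in level order
counts <- table(sex)
```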

3.3 Matrix and array

Similar to vectors we can have matrices of objects of the same type, e.g:

> matrix(c(1, 2, 3, 4, 5, 6) + pi, nrow = 2)
##       [,1]  [,2]  [,3]
## [1,] 4.142 6.142 8.142
## [2,] 5.142 7.142 9.142
> 
> matrix(c(1, 2, 3, 4, 5, 6) + pi, nrow = 2) < 6
##      [,1]  [,2]  [,3]
## [1,] TRUE FALSE FALSE
## [2,] TRUE FALSE FALSE
> 
> # We can create higher order arrays, e.g:
> array(c(1:24), dim = c(4, 3, 2))
## , , 1
## 
##      [,1] [,2] [,3]
## [1,]    1    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
## 
## , , 2
## 
##      [,1] [,2] [,3]
## [1,]   13   17   21
## [2,]   14   18   22
## [3,]   15   19   23
## [4,]   16   20   24

3.3.1 Constructing matrices

> 
> # Combine rows into a matrix
> A <- rbind(1:3, c(1, 1, 2))
> A
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    1    1    2
> 
> # Or columns
> B <- cbind(1:3, c(1, 1, 2))
> B
##      [,1] [,2]
## [1,]    1    1
## [2,]    2    1
## [3,]    3    2
> 
> # Define a matrix from one long vector
> C <- matrix(c(1, 0, 0, 1, 1, 0, 1, 1, 1), nrow = 3, ncol = 3)
> C
##      [,1] [,2] [,3]
## [1,]    1    1    1
## [2,]    0    1    1
## [3,]    0    0    1
> 
> # Can also be done by rows by adding 'byrow=TRUE' before the last parenthesis.
> # Try!
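
For reference, a sketch of the byrow = TRUE variant suggested above, using the same vector as for C (the names v, by_col and by_row are only illustrative):

```r
v <- c(1, 0, 0, 1, 1, 0, 1, 1, 1)

# Default: the vector fills the matrix column by column
by_col <- matrix(v, nrow = 3, ncol = 3)

# byrow = TRUE fills it row by row instead
by_row <- matrix(v, nrow = 3, ncol = 3, byrow = TRUE)
by_row
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    1    1    0
## [3,]    1    1    1

# For a square matrix the two fills are transposes of each other
all(by_row == t(by_col))
## [1] TRUE
```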

3.3.2 Index and logical index

> A <- matrix((-4):5, nrow = 2, ncol = 5)
> A
##      [,1] [,2] [,3] [,4] [,5]
## [1,]   -4   -2    0    2    4
## [2,]   -3   -1    1    3    5
> 
> 
> # Negative values
> A[A < 0]
## [1] -4 -3 -2 -1
> 
> # Assignments
> A[A < 0] <- 0
> A
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    0    0    2    4
## [2,]    0    0    1    3    5
> 
> # Matrix rows can be selected by
> A[2, ]
## [1] 0 0 1 3 5
> 
> # and similarly for columns
> A[, c(2, 4)]
##      [,1] [,2]
## [1,]    0    2
## [2,]    0    3

3.3.3 Properties of vectors and matrices

The R function mode(), applied to a vector or a matrix, reports the type of the elements it stores:

> A <- matrix(rep(c(TRUE, FALSE), 2), nrow = 2)
> 
> B <- rnorm(4)
> 
> C <- matrix(LETTERS[1:9], nrow = 3)
> 
> A
##       [,1]  [,2]
## [1,]  TRUE  TRUE
## [2,] FALSE FALSE
> B
## [1] -0.1343  0.1892 -1.2469 -1.0376
> C
##      [,1] [,2] [,3]
## [1,] "A"  "D"  "G" 
## [2,] "B"  "E"  "H" 
## [3,] "C"  "F"  "I"
> 
> mode(A)
## [1] "logical"
> mode(B)
## [1] "numeric"
> mode(C)
## [1] "character"

Vectors and matrices have a length, which is the number of elements:

> x <- matrix(c(NA, 2:12), ncol = 3)
> x
##      [,1] [,2] [,3]
## [1,]   NA    5    9
## [2,]    2    6   10
## [3,]    3    7   11
## [4,]    4    8   12
> 
> length(x[1, ])
## [1] 3
> 
> length(x)
## [1] 12
> 
> # The dimension of a matrix is the number of rows and columns: The number of
> # columns is the second element:
> dim(x)
## [1] 4 3
> dim(x)[2]
## [1] 3

3.3.4 Naming rows and columns in a matrix

We can add names to a matrix with the colnames() and rownames() functions:

> x <- matrix(rnorm(12), nrow = 4)
> x
##          [,1]    [,2]    [,3]
## [1,] -0.17693 -0.6982 -0.7909
## [2,] -1.82361  1.7129  0.7004
## [3,] -0.03294  1.4788 -1.6718
## [4,] -0.80677  0.4982 -0.8348
> 
> colnames(x) <- paste("data", 1:3, sep = "")
> 
> rownames(x) <- paste("obs", 1:4, sep = "")
> 
> x
##         data1   data2   data3
## obs1 -0.17693 -0.6982 -0.7909
## obs2 -1.82361  1.7129  0.7004
## obs3 -0.03294  1.4788 -1.6718
## obs4 -0.80677  0.4982 -0.8348
> 
> y <- matrix(rnorm(15), nrow = 5)
> y
##         [,1]    [,2]   [,3]
## [1,] -0.9181  1.3811 0.9727
## [2,] -0.2014 -0.2144 1.7128
## [3,]  0.3535 -0.3591 0.4331
## [4,] -0.1364  0.2015 0.8959
## [5,] -1.2797 -1.2802 0.9468
> 
> colnames(y) <- LETTERS[1:ncol(y)]
> 
> rownames(y) <- letters[1:nrow(y)]
> 
> y
##         A       B      C
## a -0.9181  1.3811 0.9727
## b -0.2014 -0.2144 1.7128
## c  0.3535 -0.3591 0.4331
## d -0.1364  0.2015 0.8959
## e -1.2797 -1.2802 0.9468
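
Once a matrix has names, they can be used for indexing in place of numbers. A sketch under the assumption of an arbitrary seed (set only to make the example reproducible; the values will differ from those printed above):

```r
set.seed(1)  # arbitrary seed, only for reproducibility
x <- matrix(rnorm(12), nrow = 4)
colnames(x) <- paste("data", 1:3, sep = "")
rownames(x) <- paste("obs", 1:4, sep = "")

# Names can be used in place of numeric indices
x["obs2", "data3"]                 # same element as x[2, 3]
sub <- x[, c("data1", "data3")]    # two columns selected by name

# dimnames() returns both sets of names as a list (rows first, then columns)
dn <- dimnames(x)
```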

3.3.5 Matrix multiplication

> M <- matrix(rnorm(20), nrow = 4, ncol = 5)
> N <- matrix(rnorm(15), nrow = 5, ncol = 3)
> 
> M %*% N
##         [,1]    [,2]    [,3]
## [1,] -1.3324 -0.3467  0.1803
## [2,]  1.7218  2.0924 -1.2959
## [3,]  1.7792  0.5250  1.9424
## [4,]  0.1583  1.9016 -1.0523
> 
> # Can we compute N %*% M? No! N (5 x 3) and M (4 x 5) are not conformable:
> # N has 3 columns but M has 4 rows. Try to run: N %*% M
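
The rule behind this is that the number of columns of the left matrix must equal the number of rows of the right matrix. A sketch that checks both orders, with an arbitrary seed and with tryCatch() used only to capture the error as a string:

```r
set.seed(1)  # arbitrary seed, only for reproducibility
M <- matrix(rnorm(20), nrow = 4, ncol = 5)
N <- matrix(rnorm(15), nrow = 5, ncol = 3)

# M %*% N works: ncol(M) equals nrow(N); the result is 4 x 3
dim(M %*% N)
## [1] 4 3

# N %*% M fails: ncol(N) is 3 but nrow(M) is 4
res <- tryCatch(N %*% M, error = function(e) "non-conformable")
res
## [1] "non-conformable"

# Note that * is element-wise multiplication, not the matrix product
```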

3.3.6 Additional functions

> M <- matrix(rnorm(16), nrow = 4, ncol = 4)
> 
> dim(M)
## [1] 4 4
> 
> t(M)
##         [,1]    [,2]    [,3]    [,4]
## [1,]  0.8836  0.7216  1.1255 -0.3660
## [2,] -1.4741  0.6280 -0.5720 -0.1200
## [3,] -0.5487  1.0463 -1.2647  0.2757
## [4,]  0.4768 -1.4371 -0.1576 -0.3024
> 
> det(M)
## [1] 1.64
> 
> (invM <- solve(M))
##         [,1]    [,2]     [,3]   [,4]
## [1,]  0.3360  0.4212 -0.10608 -1.417
## [2,] -0.5751  0.0519  0.03683 -1.173
## [3,]  0.5220  0.3749 -0.82252 -0.530
## [4,]  0.2975 -0.1887 -0.63600 -1.610
> 
> eigen(M)
## eigen() decomposition
## $values
## [1]  0.7635+1.456i  0.7635-1.456i -0.9312+0.000i -0.6514+0.000i
## 
## $vectors
##                   [,1]              [,2]       [,3]        [,4]
## [1,] -0.68683+0.00000i -0.68683+0.00000i  0.2619+0i  0.36373+0i
## [2,]  0.08143+0.62184i  0.08143-0.62184i  0.6659+0i  0.63353+0i
## [3,] -0.34794+0.08086i -0.34794-0.08086i -0.4924+0i -0.09702+0i
## [4,]  0.02428-0.08228i  0.02428+0.08228i  0.4954+0i  0.67597+0i
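
These functions can be checked against each other. The sketch below uses a small fixed matrix (hypothetical values, chosen so that the results are reproducible) rather than the random M above:

```r
# A small fixed matrix (hypothetical values)
M <- matrix(c(2, 1, 1, 3), nrow = 2)

invM <- solve(M)

# M %*% invM should be the identity matrix, up to rounding error
all.equal(M %*% invM, diag(2))
## [1] TRUE

# The determinant equals the product of the eigenvalues
ev <- eigen(M)$values
all.equal(det(M), prod(ev))
## [1] TRUE

# solve(M, b) solves the linear system M x = b without forming the inverse
b <- c(1, 2)
sol <- solve(M, b)
sol
## [1] 0.2 0.6
```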

3.4 Data-frame

A special data object is the data frame (class data.frame). We can create data frames by reading data from files or by calling data.frame() on a set of vectors; as.data.frame() converts an existing object, such as a matrix or a list. A data frame is a set of parallel vectors of equal length, where the vectors can be of different types, e.g.:

> MAS <- data.frame(course = c("CTA", "PSP", "RM"), hours = c(39, 65, 52))
> MAS

##   course hours
## 1    CTA    39
## 2    PSP    65
## 3     RM    52

> # Compare to a matrix
> cbind(course = c("CTA", "PSP", "RM"), hours = c(39, 65, 52))

##      course hours
## [1,] "CTA"  "39" 
## [2,] "PSP"  "65" 
## [3,] "RM"   "52"
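
The practical consequence of this difference is sketched below: the data frame keeps hours numeric, so arithmetic works, while the matrix has coerced everything to character. The names col_classes and m are only illustrative:

```r
MAS <- data.frame(course = c("CTA", "PSP", "RM"), hours = c(39, 65, 52))

# Each column of a data frame keeps its own type
col_classes <- sapply(MAS, class)

# so arithmetic on the numeric column works directly
mean(MAS$hours)
## [1] 52

# In the matrix everything was coerced to character
m <- cbind(course = c("CTA", "PSP", "RM"), hours = c(39, 65, 52))
mode(m)
## [1] "character"
```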

3.4.1 Data frames: adding and removing columns

> dat <- data.frame(x = LETTERS[1:3], y = 1:3)
> dat

##   x y
## 1 A 1
## 2 B 2
## 3 C 3

> dat[, 1]

## [1] "A" "B" "C"

> dat$x

## [1] "A" "B" "C"

> # It is simple to add or remove a column:
> 
> dat$z <- dat$y^2
> dat$name <- c("A1", "A2", "A3")
> dat$y <- NULL
> dat

##   x z name
## 1 A 1   A1
## 2 B 4   A2
## 3 C 9   A3

3.4.2 Data frames: merging data frames

> df1 <- data.frame(course = c("CTA", "PSP", "RM"), hours = c(39, 65, 52))
> df1

##   course hours
## 1    CTA    39
## 2    PSP    65
## 3     RM    52

> df2 <- data.frame(course = c("RM", "CTA", "PSP"), credits = c(6, 4, 8))
> df2

##   course credits
## 1     RM       6
## 2    CTA       4
## 3    PSP       8

> # We can merge that information into one data set by:
> 
> df12 <- merge(df1, df2, by = "course")
> df12

##   course hours credits
## 1    CTA    39       4
## 2    PSP    65       8
## 3     RM    52       6
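
In the example above every course appears in both data frames. When that is not the case, merge() by default keeps only the matching rows; the argument all = TRUE keeps everything. A sketch with a hypothetical df3 in which "RM" is missing and "ST" is extra:

```r
# Hypothetical data: "RM" has no credits entry and "ST" has no hours entry
df1 <- data.frame(course = c("CTA", "PSP", "RM"), hours = c(39, 65, 52))
df3 <- data.frame(course = c("CTA", "PSP", "ST"), credits = c(4, 8, 5))

# Default: only courses present in both data frames are kept
inner <- merge(df1, df3, by = "course")
nrow(inner)
## [1] 2

# all = TRUE keeps every course and fills the gaps with NA
full <- merge(df1, df3, by = "course", all = TRUE)
full
##   course hours credits
## 1    CTA    39       4
## 2    PSP    65       8
## 3     RM    52      NA
## 4     ST    NA       5
```

The arguments all.x = TRUE and all.y = TRUE keep the unmatched rows of only the first or only the second data frame.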

3.4.3 Data frames: getting dimension, column info and others

> df <- airquality
> 
> names(df)

## [1] "Ozone"   "Solar.R" "Wind"    "Temp"    "Month"   "Day"

> class(df$Ozone)

## [1] "integer"

> class(df$Wind)

## [1] "numeric"

> dim(df)

## [1] 153   6

> nrow(df)

## [1] 153

> ncol(df)

## [1] 6

> # Get an overview of the object structure:
> 
> str(df)

## 'data.frame':    153 obs. of  6 variables:
##  $ Ozone  : int  41 36 12 18 NA 28 23 19 8 NA ...
##  $ Solar.R: int  190 118 149 313 NA NA 299 99 19 194 ...
##  $ Wind   : num  7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
##  $ Temp   : int  67 72 74 62 56 66 65 59 61 69 ...
##  $ Month  : int  5 5 5 5 5 5 5 5 5 5 ...
##  $ Day    : int  1 2 3 4 5 6 7 8 9 10 ...

> # First rows of a data frame:
> 
> head(airquality, 3)

##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3

> head(airquality, 10)

##    Ozone Solar.R Wind Temp Month Day
## 1     41     190  7.4   67     5   1
## 2     36     118  8.0   72     5   2
## 3     12     149 12.6   74     5   3
## 4     18     313 11.5   62     5   4
## 5     NA      NA 14.3   56     5   5
## 6     28      NA 14.9   66     5   6
## 7     23     299  8.6   65     5   7
## 8     19      99 13.8   59     5   8
## 9      8      19 20.1   61     5   9
## 10    NA     194  8.6   69     5  10

> # Last rows of a data frame:
> 
> tail(airquality, 3)

##     Ozone Solar.R Wind Temp Month Day
## 151    14     191 14.3   75     9  28
## 152    18     131  8.0   76     9  29
## 153    20     223 11.5   68     9  30

> tail(airquality, 9)

##     Ozone Solar.R Wind Temp Month Day
## 145    23      14  9.2   71     9  22
## 146    36     139 10.3   81     9  23
## 147     7      49 10.3   69     9  24
## 148    14      20 16.6   63     9  25
## 149    30     193  6.9   70     9  26
## 150    NA     145 13.2   77     9  27
## 151    14     191 14.3   75     9  28
## 152    18     131  8.0   76     9  29
## 153    20     223 11.5   68     9  30

3.4.4 Data frames: the `subset()` function

Let’s look at the airquality data again:

> head(airquality, 3)

##   Ozone Solar.R Wind Temp Month Day
## 1    41     190  7.4   67     5   1
## 2    36     118  8.0   72     5   2
## 3    12     149 12.6   74     5   3

Logical indexing applies to data frames:

> datA <- airquality[airquality$Temp > 80, c("Ozone", "Temp")]

… but a neat function is built in for making subsets of data:

> (datA <- subset(airquality, Temp > 80, select = c(Ozone, Temp)))

##     Ozone Temp
## 29     45   81
## 35     NA   84
## 36     NA   85
## 38     29   82
## 39     NA   87
## 40     71   90
## 41     39   87
## 42     NA   93
## 43     NA   92
## 44     23   82
## 61     NA   83
## 62    135   84
## 63     49   85
## 64     32   81
## 65     NA   84
## 66     64   83
## 67     40   83
## 68     77   88
## 69     97   92
## 70     97   92
## 71     85   89
## 72     NA   82
## 74     27   81
## 75     NA   91
## 77     48   81
## 78     35   82
## 79     61   84
## 80     79   87
## 81     63   85
## 83     NA   81
## 84     NA   82
## 85     80   86
## 86    108   85
## 87     20   82
## 88     52   86
## 89     82   88
## 90     50   86
## 91     64   83
## 92     59   81
## 93     39   81
## 94      9   81
## 95     16   82
## 96     78   86
## 97     35   85
## 98     66   87
## 99    122   89
## 100    89   90
## 101   110   90
## 102    NA   92
## 103    NA   86
## 104    44   86
## 105    28   82
## 117   168   81
## 118    73   86
## 119    NA   88
## 120    76   97
## 121   118   94
## 122    84   96
## 123    85   94
## 124    96   91
## 125    78   92
## 126    73   93
## 127    91   93
## 128    47   87
## 129    32   84
## 134    44   81
## 143    16   82
## 146    36   81

> (datB <- subset(airquality, Day == 1, select = -Temp))

##     Ozone Solar.R Wind Month Day
## 1      41     190  7.4     5   1
## 32     NA     286  8.6     6   1
## 62    135     269  4.1     7   1
## 93     39      83  6.9     8   1
## 124    96     167  6.9     9   1

> (datC <- subset(airquality, select = Ozone:Wind))

##     Ozone Solar.R Wind
## 1      41     190  7.4
## 2      36     118  8.0
## 3      12     149 12.6
## 4      18     313 11.5
## 5      NA      NA 14.3
## 6      28      NA 14.9
## 7      23     299  8.6
## 8      19      99 13.8
## 9       8      19 20.1
## 10     NA     194  8.6
## 11      7      NA  6.9
## 12     16     256  9.7
## 13     11     290  9.2
## 14     14     274 10.9
## 15     18      65 13.2
## 16     14     334 11.5
## 17     34     307 12.0
## 18      6      78 18.4
## 19     30     322 11.5
## 20     11      44  9.7
## 21      1       8  9.7
## 22     11     320 16.6
## 23      4      25  9.7
## 24     32      92 12.0
## 25     NA      66 16.6
## 26     NA     266 14.9
## 27     NA      NA  8.0
## 28     23      13 12.0
## 29     45     252 14.9
## 30    115     223  5.7
## 31     37     279  7.4
## 32     NA     286  8.6
## 33     NA     287  9.7
## 34     NA     242 16.1
## 35     NA     186  9.2
## 36     NA     220  8.6
## 37     NA     264 14.3
## 38     29     127  9.7
## 39     NA     273  6.9
## 40     71     291 13.8
## 41     39     323 11.5
## 42     NA     259 10.9
## 43     NA     250  9.2
## 44     23     148  8.0
## 45     NA     332 13.8
## 46     NA     322 11.5
## 47     21     191 14.9
## 48     37     284 20.7
## 49     20      37  9.2
## 50     12     120 11.5
## 51     13     137 10.3
## 52     NA     150  6.3
## 53     NA      59  1.7
## 54     NA      91  4.6
## 55     NA     250  6.3
## 56     NA     135  8.0
## 57     NA     127  8.0
## 58     NA      47 10.3
## 59     NA      98 11.5
## 60     NA      31 14.9
## 61     NA     138  8.0
## 62    135     269  4.1
## 63     49     248  9.2
## 64     32     236  9.2
## 65     NA     101 10.9
## 66     64     175  4.6
## 67     40     314 10.9
## 68     77     276  5.1
## 69     97     267  6.3
## 70     97     272  5.7
## 71     85     175  7.4
## 72     NA     139  8.6
## 73     10     264 14.3
## 74     27     175 14.9
## 75     NA     291 14.9
## 76      7      48 14.3
## 77     48     260  6.9
## 78     35     274 10.3
## 79     61     285  6.3
## 80     79     187  5.1
## 81     63     220 11.5
## 82     16       7  6.9
## 83     NA     258  9.7
## 84     NA     295 11.5
## 85     80     294  8.6
## 86    108     223  8.0
## 87     20      81  8.6
## 88     52      82 12.0
## 89     82     213  7.4
## 90     50     275  7.4
## 91     64     253  7.4
## 92     59     254  9.2
## 93     39      83  6.9
## 94      9      24 13.8
## 95     16      77  7.4
## 96     78      NA  6.9
## 97     35      NA  7.4
## 98     66      NA  4.6
## 99    122     255  4.0
## 100    89     229 10.3
## 101   110     207  8.0
## 102    NA     222  8.6
## 103    NA     137 11.5
## 104    44     192 11.5
## 105    28     273 11.5
## 106    65     157  9.7
## 107    NA      64 11.5
## 108    22      71 10.3
## 109    59      51  6.3
## 110    23     115  7.4
## 111    31     244 10.9
## 112    44     190 10.3
## 113    21     259 15.5
## 114     9      36 14.3
## 115    NA     255 12.6
## 116    45     212  9.7
## 117   168     238  3.4
## 118    73     215  8.0
## 119    NA     153  5.7
## 120    76     203  9.7
## 121   118     225  2.3
## 122    84     237  6.3
## 123    85     188  6.3
## 124    96     167  6.9
## 125    78     197  5.1
## 126    73     183  2.8
## 127    91     189  4.6
## 128    47      95  7.4
## 129    32      92 15.5
## 130    20     252 10.9
## 131    23     220 10.3
## 132    21     230 10.9
## 133    24     259  9.7
## 134    44     236 14.9
## 135    21     259 15.5
## 136    28     238  6.3
## 137     9      24 10.9
## 138    13     112 11.5
## 139    46     237  6.9
## 140    18     224 13.8
## 141    13      27 10.3
## 142    24     238 10.3
## 143    16     201  8.0
## 144    13     238 12.6
## 145    23      14  9.2
## 146    36     139 10.3
## 147     7      49 10.3
## 148    14      20 16.6
## 149    30     193  6.9
## 150    NA     145 13.2
## 151    14     191 14.3
## 152    18     131  8.0
## 153    20     223 11.5
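
Logical indexing and subset() gave the same result in the datA example because Temp has no missing values. When the condition involves a column with NAs, the two behave differently. A sketch using Ozone (the threshold 100 is an arbitrary choice for illustration):

```r
# Ozone contains NAs, so the condition Ozone > 100 is NA for those rows
by_index  <- airquality[airquality$Ozone > 100, c("Ozone", "Temp")]
by_subset <- subset(airquality, Ozone > 100, select = c(Ozone, Temp))

# Logical indexing returns an all-NA row for every NA in the condition;
# subset() treats NA as FALSE and drops those rows
nrow(by_index)
## [1] 44
nrow(by_subset)
## [1] 7

# Wrapping the condition in which() makes plain indexing select the same rows
by_which <- airquality[which(airquality$Ozone > 100), c("Ozone", "Temp")]
```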

3.4.5 Data frames: the `summary()` function

The summary() function gives you a range of statistics…

> summary(airquality$Wind)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    1.70    7.40    9.70    9.96   11.50   20.70

… that you could alternatively obtain using the R functions min(), max(), mean(), median(), quantile().
The summary of a data frame gives the summary of each column:

> summary(airquality)
##      Ozone          Solar.R         Wind            Temp          Month     
##  Min.   :  1.0   Min.   :  7   Min.   : 1.70   Min.   :56.0   Min.   :5.00  
##  1st Qu.: 18.0   1st Qu.:116   1st Qu.: 7.40   1st Qu.:72.0   1st Qu.:6.00  
##  Median : 31.5   Median :205   Median : 9.70   Median :79.0   Median :7.00  
##  Mean   : 42.1   Mean   :186   Mean   : 9.96   Mean   :77.9   Mean   :6.99  
##  3rd Qu.: 63.2   3rd Qu.:259   3rd Qu.:11.50   3rd Qu.:85.0   3rd Qu.:8.00  
##  Max.   :168.0   Max.   :334   Max.   :20.70   Max.   :97.0   Max.   :9.00  
##  NA's   :37      NA's   :7                                                  
##       Day      
##  Min.   : 1.0  
##  1st Qu.: 8.0  
##  Median :16.0  
##  Mean   :15.8  
##  3rd Qu.:23.0  
##  Max.   :31.0  
##
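
As a check, the Wind summary shown earlier can be reproduced with the individual functions. A minimal sketch (w is just a shorthand introduced here):

```r
w <- airquality$Wind

# The individual functions give the same numbers as summary(w)
min(w)
## [1] 1.7
max(w)
## [1] 20.7
median(w)
## [1] 9.7

# 1st and 3rd quartiles (7.4 and 11.5 in the summary above)
q <- quantile(w, probs = c(0.25, 0.75))
```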

3.4.6 Data frames: missing values

R uses the special value NA to code missing values.
The result of arithmetic involving NAs becomes NA as well:

> colMeans(airquality)
##   Ozone Solar.R    Wind    Temp   Month     Day 
##      NA      NA   9.958  77.882   6.993  15.804

We need the special function is.na() to detect NAs, because a comparison such as x == NA itself evaluates to NA:

> is.na(NA)
## [1] TRUE

To get rid of NAs in a column we can use:

> s <- subset(airquality, !is.na(Ozone))
> 
> colMeans(s)
##   Ozone Solar.R    Wind    Temp   Month     Day 
##  42.129      NA   9.862  77.871   7.198  15.534

Note that the argument na.rm=TRUE can be passed to most summary functions e.g. sum(), mean(), sd():

> mean(airquality$Ozone, na.rm = TRUE)

## [1] 42.13

> # or
> 
> colMeans(airquality, na.rm = TRUE)

##   Ozone Solar.R    Wind    Temp   Month     Day 
##  42.129 185.932   9.958  77.882   6.993  15.804
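
A few further idioms for missing values, sketched with base R functions only (the variable names are illustrative):

```r
# Comparing with == does not work for missing values: the result is NA
NA == NA
## [1] NA

# is.na() returns a logical vector, so sum() counts the missing values
n_missing <- sum(is.na(airquality$Ozone))
n_missing
## [1] 37

# complete.cases() flags rows without any NA; na.omit() drops the others
complete <- airquality[complete.cases(airquality), ]
nrow(complete) == nrow(na.omit(airquality))
## [1] TRUE
```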

3.5 Lists

A list is the most general data structure. Its elements can be of different types and lengths, e.g.:

> list(a = 1, b = "Lisbon", c = c(1, 2, 3), d = list(e = matrix(1:4, 2), f = function(x) x^2))
## $a
## [1] 1
## 
## $b
## [1] "Lisbon"
## 
## $c
## [1] 1 2 3
## 
## $d
## $d$e
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
## 
## $d$f
## function(x) x^2

The objects returned from many of the built-in functions in R are fairly complicated lists!
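
Elements of a list are extracted with $ or [[ ]], and the same tools work on the lists returned by built-in functions. A sketch using the list from above (stored here under the illustrative name lst) and eigen() as an example of such a function:

```r
lst <- list(a = 1, b = "Lisbon", c = c(1, 2, 3),
            d = list(e = matrix(1:4, 2), f = function(x) x^2))

# $ and [[ ]] extract a single element; [ ] returns a (shorter) list
lst$b
## [1] "Lisbon"
lst[["c"]][2]
## [1] 2
class(lst["a"])
## [1] "list"

# Nested elements are reached by chaining
lst$d$f(3)
## [1] 9

# The result of eigen() is itself a list with named components
e <- eigen(diag(2))
is.list(e)
## [1] TRUE
names(e)
## [1] "values"  "vectors"
```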
