Overview

This tutorial walks through a few helpful initial steps to take before analyzing experience sampling and EMA data (or any data, actually). Specifically, it demonstrates how to read the data in and obtain initial descriptive statistics and plots that are useful when making decisions about analyses.

Our examples make use of The AMIB Data, a multiple time-scale data set that has been shared for teaching purposes.

Outline

This script covers …

A. Loading The AMIB Data

B. Describing some aspects of the data

C. A few data visualizations

Preliminaries

Loading libraries used in this script.

library(psych) # for describing the data
library(plyr) #for data manipulation
library(ggplot2) # for data visualization

A. Loading the AMIB Data

The first step, and often the most frustrating part of data analysis, is … getting the data into the program!

In recent years, people have generally exchanged data files in .csv format, as these files port well and can interface with many different software packages.

Here, we make use of the person-level and interaction-level (EMA-type) AMIB data files (which can be merged easily using the id and day variables).

Loading person-level file (N = 190)

#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/AMIBshare_persons_2019_0501.csv"
#read in the .csv file using the url() function
AMIB_persons <- read.csv(file=url(filepath),header=TRUE)

Loading interaction-level file (T = many, on average 43 per person).

#set filepath for data file
filepath <- "https://quantdev.ssri.psu.edu/sites/qdev/files/AMIBshare_interaction_2019_0501.csv"
#read in the .csv file using the url() function
AMIB_interaction <- read.csv(file=url(filepath),header=TRUE)
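Before moving on, it can help to confirm that a file was read in as expected. The following is a minimal sketch rather than part of the AMIB workflow: it reads a tiny made-up data set (`toy_interaction`, with hypothetical values) from inline text instead of a URL, checks its dimensions and structure, and counts how many rows each person contributes.

```r
# Toy stand-in for an interaction-level file (hypothetical values, not AMIB data)
csv_text <- "id,day,interaction,stress
101,1,1,1
101,1,2,0
101,2,3,2
102,1,1,3
102,2,2,1"

# read.csv() can read inline text through its text= argument
toy_interaction <- read.csv(text = csv_text, header = TRUE)

# dimensions: number of rows (interactions) and columns (variables)
dim(toy_interaction)

# structure: variable names and storage types
str(toy_interaction)

# number of interactions contributed by each person
n_per_person <- table(toy_interaction$id)
n_per_person

# average number of interactions per person
mean(as.vector(n_per_person))
```

The same calls should work on `AMIB_interaction` once it is loaded, for example to see how many interactions each person reported.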

B. Describing The AMIB Data

Once the data are in, we can begin learning about them.

Basic descriptives of person-level data

Subsetting to a few variables.

#subsetting to a few trait variables
AMIB_persons <- AMIB_persons[ ,c("id","sex",
                                 "bfi_o","bfi_c","bfi_e","bfi_a","bfi_n")]

Descriptives of traits using person-level data (sex, personality).

#basic descriptives (using describe() from the psych package)
describe(AMIB_persons)
##       vars   n   mean     sd median trimmed    mad   min max range  skew
## id       1 190 318.29 130.44  321.5  318.99 151.23 101.0 532 431.0 -0.04
## sex      2 190   1.66   0.48    2.0    1.70   0.00   1.0   2   1.0 -0.66
## bfi_o    3 190   3.60   0.96    3.5    3.64   0.74   1.0   5   4.0 -0.34
## bfi_c    4 190   3.76   0.85    4.0    3.77   0.74   1.5   5   3.5 -0.12
## bfi_e    5 190   3.38   1.00    3.5    3.40   0.74   1.0   5   4.0 -0.21
## bfi_a    6 190   3.61   0.88    3.5    3.69   0.74   1.0   5   4.0 -0.72
## bfi_n    7 190   2.98   0.96    3.0    3.00   1.48   1.0   5   4.0 -0.09
##       kurtosis   se
## id       -1.09 9.46
## sex      -1.57 0.03
## bfi_o    -0.48 0.07
## bfi_c    -0.90 0.06
## bfi_e    -0.58 0.07
## bfi_a     0.14 0.06
## bfi_n    -0.82 0.07
#psych::describe(AMIB_persons)

#correlations
cor(AMIB_persons[ ,-1]) #dropping 1st column (id)
##              sex       bfi_o       bfi_c       bfi_e       bfi_a
## sex   1.00000000  0.06161000  0.03394367  0.18873546  0.04422936
## bfi_o 0.06161000  1.00000000  0.03093131  0.13087435  0.03581446
## bfi_c 0.03394367  0.03093131  1.00000000 -0.01870259  0.08472053
## bfi_e 0.18873546  0.13087435 -0.01870259  1.00000000  0.08558411
## bfi_a 0.04422936  0.03581446  0.08472053  0.08558411  1.00000000
## bfi_n 0.18389358 -0.05557229 -0.02838956 -0.16160302 -0.12293379
##             bfi_n
## sex    0.18389358
## bfi_o -0.05557229
## bfi_c -0.02838956
## bfi_e -0.16160302
## bfi_a -0.12293379
## bfi_n  1.00000000
#plot matrix (using pairs.panels() from the psych package)
pairs.panels(AMIB_persons[ ,-1])
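A correlation matrix printed to eight decimals can be hard to scan. As a small sketch, using a made-up data frame (`toy_persons`, hypothetical values) rather than the AMIB data, the matrix can be stored and rounded before printing. Selecting columns by name instead of by position is one way to make the intent explicit.

```r
# Toy person-level data (hypothetical values, not AMIB data)
toy_persons <- data.frame(
  id    = c(101, 102, 103, 104, 105),
  bfi_e = c(3.5, 2.0, 4.5, 3.0, 4.0),
  bfi_n = c(2.0, 4.0, 1.5, 3.5, 2.5)
)

# correlations among the trait variables, selecting columns by name
toy_cor <- cor(toy_persons[ , c("bfi_e", "bfi_n")])

# round to two decimals for easier reading
round(toy_cor, 2)
```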

Note that the person-level descriptives are “cross-sectional”.

Basic descriptives of interaction-level data.

Subsetting to a few variables.
ID and TIME variables: “id”, “day”, “interaction”, “timea”
Outcome variables: “partner_gender”, “agval”, “stress”

#subsetting to a few variables
AMIB_interaction <- AMIB_interaction[ ,c("id","day","interaction", "timea", 
                                         "partner_gender","agval","stress")]

Often, our analyses will make use of both the repeated measures data and the person-level data. For illustration, we merge them here.

Merging person-level data into repeated measures interaction-level data.

#merging repeated measures and person-level data
interaction_long <- merge(AMIB_interaction, AMIB_persons, by="id")

#Look at first few rows of data
head(interaction_long,10)
##     id day interaction timea partner_gender agval stress sex bfi_o bfi_c
## 1  101   1           1   700              0     3      1   2     4     4
## 2  101   1           2  1230              1     8      1   2     4     4
## 3  101   1           3  1245              1     8      1   2     4     4
## 4  101   1           4  1330              1     8      1   2     4     4
## 5  101   1           5  1420              1     6      1   2     4     4
## 6  101   1           6  1445              1     8      1   2     4     4
## 7  101   1           7  1920              1     8      1   2     4     4
## 8  101   1           8  2030              1     8      1   2     4     4
## 9  101   2           9    30              0     9      0   2     4     4
## 10 101   2          10     0              1     8      0   2     4     4
##    bfi_e bfi_a bfi_n
## 1    3.5   1.5     2
## 2    3.5   1.5     2
## 3    3.5   1.5     2
## 4    3.5   1.5     2
## 5    3.5   1.5     2
## 6    3.5   1.5     2
## 7    3.5   1.5     2
## 8    3.5   1.5     2
## 9    3.5   1.5     2
## 10   3.5   1.5     2
#checking number of persons
length(unique(interaction_long$id))
## [1] 184

Note that there are only 184 persons in the data now (vs. N = 190 in the person-level file). In this case, the discrepancy arises because 6 persons completed the baseline questionnaires but did not provide any EMA data.
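One way to see which persons drop out of a merge is to compare the id columns directly. The sketch below uses two small made-up data frames (`toy_persons` and `toy_interaction`, hypothetical values) to show that `merge()` by default keeps only ids present in both inputs, and that `all.y = TRUE` retains persons without repeated measures.

```r
# Toy data (hypothetical values, not AMIB data)
toy_persons <- data.frame(id = c(101, 102, 103), bfi_n = c(2.0, 3.5, 4.0))
toy_interaction <- data.frame(id = c(101, 101, 102), stress = c(1, 0, 3))

# ids in the person-level file that have no repeated measures rows
dropped_ids <- setdiff(toy_persons$id, toy_interaction$id)
dropped_ids

# default merge keeps only ids found in both data frames
toy_inner <- merge(toy_interaction, toy_persons, by = "id")
length(unique(toy_inner$id))

# all.y = TRUE keeps every person; stress is NA for the person without EMA rows
toy_all <- merge(toy_interaction, toy_persons, by = "id", all.y = TRUE)
toy_all
```

Whether to keep persons without repeated measures depends on the analysis. The default used in this tutorial drops them.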

Descriptives of the merged interaction-level and person-level data.

#basic descriptives (using describe() from the psych package)
describe(interaction_long)
##                vars    n    mean     sd median trimmed    mad   min  max
## id                1 7568  330.55 122.47  328.0  334.09 145.29 101.0  532
## day               2 7568    3.97   1.99    4.0    3.96   2.97   1.0    7
## interaction       3 7568   23.31  14.74   22.0   22.56  17.79   1.0   56
## timea             4 7500 1496.82 445.29 1500.0 1495.34 489.26   0.0 2800
## partner_gender    5 6884    0.60   0.49    1.0    0.62   0.00   0.0    1
## agval             6 7553    6.68   1.96    7.0    6.92   1.48   1.0    9
## stress            7 7544    1.31   1.33    1.0    1.14   1.48   0.0    5
## sex               8 7568    1.70   0.46    2.0    1.75   0.00   1.0    2
## bfi_o             9 7568    3.60   0.95    3.5    3.64   0.74   1.0    5
## bfi_c            10 7568    3.77   0.85    4.0    3.79   0.74   1.5    5
## bfi_e            11 7568    3.41   0.97    3.5    3.43   0.74   1.0    5
## bfi_a            12 7568    3.61   0.87    3.5    3.69   0.74   1.0    5
## bfi_n            13 7568    3.00   0.96    3.0    3.00   1.48   1.0    5
##                 range  skew kurtosis   se
## id              431.0 -0.17    -0.91 1.41
## day               6.0  0.02    -1.24 0.02
## interaction      55.0  0.35    -0.90 0.17
## timea          2800.0 -0.10    -0.14 5.14
## partner_gender    1.0 -0.39    -1.85 0.01
## agval             8.0 -0.95     0.33 0.02
## stress            5.0  0.77    -0.32 0.02
## sex               1.0 -0.85    -1.27 0.01
## bfi_o             4.0 -0.41    -0.24 0.01
## bfi_c             3.5 -0.15    -0.86 0.01
## bfi_e             4.0 -0.24    -0.52 0.01
## bfi_a             4.0 -0.65    -0.02 0.01
## bfi_n             4.0 -0.03    -0.76 0.01
#plot matrix (using pairs.panels() from the psych package)
pairs.panels(interaction_long[ ,c("partner_gender","agval","stress")])
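The n column in the describe() output differs across variables (e.g., 6884 for partner_gender vs. 7568 for id), which indicates missing values. With missing values present, `cor()` returns NA under its default settings. A minimal sketch with made-up data (`toy_long`, hypothetical values) shows the default behavior and the `use = "pairwise.complete.obs"` option, which computes each correlation from the rows where both variables are observed.

```r
# Toy repeated measures data with one missing value (hypothetical, not AMIB data)
toy_long <- data.frame(
  agval  = c(3, 8, 8, 6, 9, 7),
  stress = c(1, 1, NA, 2, 0, 1)
)

# number of non-missing observations per variable
colSums(!is.na(toy_long))

# default: any missing value makes the affected correlations NA
cor_default <- cor(toy_long)
cor_default

# pairwise deletion: use all rows where both variables are observed
cor_pairwise <- cor(toy_long, use = "pairwise.complete.obs")
round(cor_pairwise, 2)
```

Keep in mind that correlations computed on the long data pool all occasions from all persons, so they mix between-person and within-person associations.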