R Exercises - Cognitive Psychology
This exercise is based on data from a PhD student in cognitive psychology. In his work, this student records the response times (RT) to two simultaneous tasks and then has to analyse them.
- At \(t_0\), the stimulus S1 is triggered
- At \(t_1=t_0 + SOA\), the stimulus S2 is triggered (SOA is the time between the 2 stimuli)
- At \(t_2=t_0 + RT1\), the subject responds to stimulus S1
- At \(t_3=t_1 + RT2\), the subject responds to stimulus S2
The inter-response interval (IRI) is defined as the time difference between \(t_3\) and \(t_2\), thus \(IRI=SOA+RT2-RT1\). A negative IRI therefore means that the response to S2 is emitted before the response to S1.
For theoretical reasons, one can consider that if \(SOA=1500\) ms, both stimuli are responded to independently. One can thus use the RT1 and RT2 values for SOA=1500 ms to estimate the IRI in the case of independent responses to the stimuli.
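To see where this expression comes from, one can substitute the definitions of \(t_2\) and \(t_3\) given in the list above; the \(t_0\) terms cancel:

\[
IRI = t_3 - t_2 = (t_0 + SOA + RT2) - (t_0 + RT1) = SOA + RT2 - RT1
\]

As an illustration with made-up numbers (not taken from the dataset), \(SOA = 65\) ms, \(RT1 = 600\) ms and \(RT2 = 500\) ms give \(IRI = 65 + 500 - 600 = -35\) ms, i.e. the response to S2 comes 35 ms before the response to S1.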
Data wrangling
Download the data files archive and unzip it, then create an R project in this folder in RStudio.
- Load the tidyverse and patchwork packages.
- Load the .csv file in the Data folder and save it into raw_data. Look into the help of read_csv() to help you get rid of the error.
- Using successive pipe operations, we will now create a dataclean table from raw_data, where we:
  - Filter raw_data so that Procedure[Trial] is only equal to "EssaisDT"
  - Mutate the table by adding a column IRI containing SOAdur + S2Visuel.RT - S1Audio.RT
  - Filter the table so that some extreme values are excluded:
    - S1Audio.RT is smaller than or equal to 2500 and larger than or equal to 100
    - S2Visuel.RT is smaller than or equal to 2500 and larger than or equal to 100
    - S1Audio.ACC, S1response.ACC and S2Visuel.ACC are equal to 1
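The filter, mutate, filter chain described above could be sketched as follows. This is only a minimal illustration on a made-up toy table: the rows, the "Training" procedure label and the numeric values are assumptions, and only the column names come from the exercise. Note that a column name containing brackets, such as Procedure[Trial], has to be wrapped in backticks.

```r
library(dplyr)

# Made-up toy rows standing in for raw_data (values are NOT from the real file;
# "Training" is a hypothetical non-"EssaisDT" label)
raw_data <- tibble(
  Subject            = c(1, 1, 1, 2, 2),
  `Procedure[Trial]` = c("EssaisDT", "EssaisDT", "Training", "EssaisDT", "EssaisDT"),
  SOAdur             = c(15, 1500, 15, 65, 250),
  S1Audio.RT         = c(600, 550, 500, 3000, 700),
  S2Visuel.RT        = c(500, 480, 450, 800, 650),
  S1Audio.ACC        = c(1, 1, 1, 1, 1),
  S1response.ACC     = c(1, 1, 1, 1, 0),
  S2Visuel.ACC       = c(1, 1, 1, 1, 1)
)

dataclean <- raw_data %>%
  # keep only the "EssaisDT" trials
  filter(`Procedure[Trial]` == "EssaisDT") %>%
  # inter-response interval, as defined in the introduction
  mutate(IRI = SOAdur + S2Visuel.RT - S1Audio.RT) %>%
  # between() is inclusive on both ends: 100 <= RT <= 2500
  filter(
    between(S1Audio.RT, 100, 2500),
    between(S2Visuel.RT, 100, 2500),
    S1Audio.ACC == 1,
    S1response.ACC == 1,
    S2Visuel.ACC == 1
  )

dataclean
```

On this toy table, only the first two rows survive: the third is not an "EssaisDT" trial, the fourth has an S1Audio.RT above 2500, and the fifth has S1response.ACC equal to 0.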
- Let’s compute the simulated values in the case where the responses to the stimuli are independent, i.e. using SOA=1500 ms, and save them into a tibble called IRI_sim. Using successive pipe operations and starting from dataclean:
  - Filter rows so that SOAdur is equal to 1500
  - Delete the SOAdur and IRI columns
  - Mutate the table to add 3 columns IRI_sim_xx, where xx = 15, 65 or 250 and IRI_sim_xx = xx + S2Visuel.RT - S1Audio.RT.
  - Using pivot_longer() and the options names_prefix = "IRI_sim_", names_to = "SOAdur", values_to = "IRI_sim", pivot the columns containing "IRI_sim_" into a long table (you need to add the option to select the corresponding columns).
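A possible shape for this pipeline is sketched below on a made-up dataclean table (the three rows are assumptions for illustration only). Here the columns to pivot are selected with starts_with("IRI_sim_"), which is one option among several tidyselect helpers.

```r
library(dplyr)
library(tidyr)

# Made-up toy version of dataclean (values are NOT from the real file)
dataclean <- tibble(
  Subject     = c(1, 1, 2),
  SOAdur      = c(1500, 15, 1500),
  S1Audio.RT  = c(600, 620, 700),
  S2Visuel.RT = c(500, 640, 650),
  IRI         = SOAdur + S2Visuel.RT - S1Audio.RT
)

IRI_sim <- dataclean %>%
  # keep the trials assumed to reflect independent responses
  filter(SOAdur == 1500) %>%
  # drop the observed SOA and IRI, they are replaced by simulated ones
  select(-SOAdur, -IRI) %>%
  # one simulated IRI column per short SOA
  mutate(
    IRI_sim_15  = 15  + S2Visuel.RT - S1Audio.RT,
    IRI_sim_65  = 65  + S2Visuel.RT - S1Audio.RT,
    IRI_sim_250 = 250 + S2Visuel.RT - S1Audio.RT
  ) %>%
  # wide to long: the suffix after "IRI_sim_" becomes the SOAdur value
  pivot_longer(
    cols         = starts_with("IRI_sim_"),
    names_prefix = "IRI_sim_",
    names_to     = "SOAdur",
    values_to    = "IRI_sim"
  )

IRI_sim
```

Because the new SOAdur values come from column names, pivot_longer() stores them as character strings. This is consistent with the `<chr>` type of SOAdur in the expected summary table for the simulated IRI, and with its 15, 250, 65 ordering.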
- We now want to get the average IRI per subject and per SOA, and its standard deviation. Using group_by() and summarise(), store the mean and standard deviation of IRI in a table called stats_obs, starting from dataclean. It should look like this:
# A tibble: 24 × 4
# Groups: Subject [6]
Subject SOAdur mean sd
<dbl> <dbl> <dbl> <dbl>
1 1 15 -64.8 291.
2 1 65 20.8 297.
3 1 250 218. 217.
4 1 1500 1019. 228.
5 2 15 -159. 193.
6 2 65 -156. 236.
7 2 250 4.42 191.
8 2 1500 1184. 198.
9 3 15 -289. 207.
10 3 65 -251. 262.
# ℹ 14 more rows
- We want to do the same for the 3 simulated IRI.
# A tibble: 18 × 4
# Groups: Subject [6]
Subject SOAdur mean sd
<dbl> <chr> <dbl> <dbl>
1 1 15 -466. 228.
2 1 250 -231. 228.
3 1 65 -416. 228.
4 2 15 -301. 198.
5 2 250 -65.9 198.
6 2 65 -251. 198.
7 3 15 -99.0 157.
8 3 250 136. 157.
9 3 65 -49.0 157.
10 4 15 -168. 214.
11 4 250 66.9 214.
12 4 65 -118. 214.
13 5 15 -367. 249.
14 5 250 -132. 249.
15 5 65 -317. 249.
16 6 15 -313. 161.
17 6 250 -77.9 161.
18 6 65 -263. 161.
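The grouped summaries shown above can be sketched on a small made-up table as follows (the six rows are assumptions, not real data). Grouping by two variables and summarising drops the last grouping level by default, which is why the printed tibbles above are still grouped by Subject; the `.groups = "drop_last"` argument only makes that default explicit.

```r
library(dplyr)

# Made-up toy version of dataclean (values are NOT from the real file)
dataclean <- tibble(
  Subject = c(1, 1, 1, 1, 2, 2),
  SOAdur  = c(15, 15, 65, 65, 15, 15),
  IRI     = c(-100, -50, 0, 40, -200, -100)
)

stats_obs <- dataclean %>%
  group_by(Subject, SOAdur) %>%
  summarise(
    mean    = mean(IRI),
    sd      = sd(IRI),
    .groups = "drop_last"  # result stays grouped by Subject only
  )

stats_obs
```

The same two calls should carry over to the simulated table by summarising its IRI_sim column instead of IRI.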
Plotting
- We now want to produce a graph showing the histograms of the observed IRI column using ggplot2.
  - Create a ggplot using the dataclean dataset
  - Set the aesthetics to x = IRI
  - Create the histograms using geom_histogram(), with a fill color depending on SOAdur
  - Arrange the plots on a grid depending on the Subject column using facet_wrap()
  - Add a vertical line marking the average value for each subject using geom_vline(). The data for these lines are stored in the stats_obs dataset.
  - Play with the theme and other ggplot commands to make the plot look like the one below
  - Try plotting a normalized histogram by looking up on the Internet how to do this
- Let’s do the same for the simulated dataset. It should look like this:
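As a starting point for the plotting steps, here is a minimal sketch on randomly generated toy data (the data, bin count, transparency and theme are all assumptions, so the result will not reproduce the reference figures). SOAdur is converted with factor() so that the fill is treated as a discrete variable.

```r
library(dplyr)
library(ggplot2)

# Randomly generated toy data standing in for dataclean (NOT the real data)
set.seed(1)
dataclean <- tibble(
  Subject = rep(1:2, each = 100),
  SOAdur  = rep(c(15, 1500), times = 100),
  IRI     = SOAdur + rnorm(200, mean = -100, sd = 200)
)

stats_obs <- dataclean %>%
  group_by(Subject, SOAdur) %>%
  summarise(mean = mean(IRI), sd = sd(IRI), .groups = "drop_last")

p <- ggplot(dataclean, aes(x = IRI)) +
  # one histogram per SOA, overlaid; for a normalized version one option is
  # to add y = after_stat(density) inside this aes()
  geom_histogram(aes(fill = factor(SOAdur)),
                 bins = 30, alpha = 0.6, position = "identity") +
  # vertical lines at the mean IRI, taken from the summary table
  geom_vline(data = stats_obs,
             aes(xintercept = mean, colour = factor(SOAdur))) +
  # one panel per subject
  facet_wrap(~Subject) +
  labs(fill = "SOA (ms)", colour = "SOA (ms)") +
  theme_bw()

p
```

Passing a separate data argument to geom_vline() lets that layer read from the summary table while the histogram keeps using the trial-level data. For the simulated dataset, the same structure should apply with IRI_sim on the x axis.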