In this section, we are going to learn to plot a bar graph using a package called ggplot2. This package is very popular for R users and that is because it allows you to customize anything on your graph. Start off by installing the package ggplot2 as you have been shown in a past section.
Now, ggplot requires a long format of data rather than a wide format. A wide format is where columns are different conditions and rows are each participant. A long format will instead compile it so that the first column names the conditions, and the second column names the score. This means that if you have 20 people and 3 conditions, the first 20 rows will be the scores for condition 1, the second 20 rows will be condition 2 and so forth. This seems simple right now, but when you have multiple predictors, things begin to get complicated. Let’s say you have two time points, and three conditions per time point. That means that for each participant you need Time1*Condition1, Time2*Condition1, Time1*Condition2, Time2*Condition2, Time1*Condition3, and Time2*Condition3. Luckily, you do not have to arrange your data in this way. Instead, we will use a package called reshape2 which will compile your wide format data into long format data. Install the reshape2 package now.
Unlike past sections, I will not here walk you through each part of the code. The reason for this is that this code is much more elaborate than what you’ve used before. Instead, I have decided to give you working code which has extensive commenting. I suggest you read the comments thoroughly, but also play around with it. The best way to learn what every line of code does is by playing with it. See what happens if you delete a line, or if you change a value. If you don’t do these things, you will not truly understand what everything does.
HERE is the data you will need for this tutorial.
HERE is the code you will need for this tutorial.
Now, load the script into R and the tutorial will continue there.
Now, ggplot requires a long format of data rather than a wide format. A wide format is where columns are different conditions and rows are each participant. A long format will instead compile it so that the first column names the conditions, and the second column names the score. This means that if you have 20 people and 3 conditions, the first 20 rows will be the scores for condition 1, the second 20 rows will be condition 2 and so forth. This seems simple right now, but when you have multiple predictors, things begin to get complicated. Let’s say you have two time points, and three conditions per time point. That means that for each participant you need Time1*Condition1, Time2*Condition1, Time1*Condition2, Time2*Condition2, Time1*Condition3, and Time2*Condition3. Luckily, you do not have to arrange your data in this way. Instead, we will use a package called reshape2 which will compile your wide format data into long format data. Install the reshape2 package now.
Unlike past sections, I will not here walk you through each part of the code. The reason for this is that this code is much more elaborate than what you’ve used before. Instead, I have decided to give you working code which has extensive commenting. I suggest you read the comments thoroughly, but also play around with it. The best way to learn what every line of code does is by playing with it. See what happens if you delete a line, or if you change a value. If you don’t do these things, you will not truly understand what everything does.
HERE is the data you will need for this tutorial.
HERE is the code you will need for this tutorial.
Now, load the script into R and the tutorial will continue there.