PSYC 6802 - Introduction to Psychology Statistics
To start, it’s good to point out that R and RStudio are two different things.
R (https://www.r-project.org/about.html) is a programming language originally designed for statistical computing and data visualization.
Thanks to the contributions of many users, nowadays R is quite similar to Python (https://www.python.org) in what it allows you to do.
Many programming languages exist, and each does some things better than others.
R works pretty well for data analysis and visualization and that’s why we use it 🤷
Whereas R is a programming language, RStudio is an integrated development environment (IDE…a what? 😕)
An IDE is software that facilitates writing code in general. Although RStudio was developed with R in mind, it also supports many other programming languages (e.g., Python, JavaScript, C…).
Likewise, you do not need RStudio to use R. However, RStudio is by far the best IDE for coding in R and it makes the process much more efficient!
The people who make RStudio (https://posit.co/download/rstudio-desktop) have no affiliation with the people who make R as far as I know.
To use R properly you will have to learn how to code (which may sound a bit scary 😟). Coding is like learning a foreign language: There is grammar and there is a logic to how you construct sentences. The exact same is true for programming languages. There is a lot I could say here, but just some advice:
Errors and Mistakes: Your code (my code too!) will almost never work perfectly the first time around. Do not get frustrated; work out why your code does not work. Making mistakes over and over and fixing them is how you learn.
Understand Your Code: Do not copy and paste code without understanding what it does. This may work for some of the assignments in this course, but it will eventually lead to huge mistakes when you are doing research on your own.
Taking Shortcuts: There are many shortcuts you can take to write code (e.g., ChatGPT). I strongly discourage using AI assistance to write code if you are new to coding. AI code may be wrong (or it may not be what you want), and you first need to develop the knowledge to know when and why code is wrong.
If you have code problems (and I am not around to help), I recommend Googling your question and looking for other humans who have answered it (usually on https://stackoverflow.com).
The RStudio interface is divided into 4 panes:

More about the “Console”
You can write and run code directly in the console, but code typed there is not saved (and you should always save your code!). When you run your code from the Source pane, RStudio sends it to the console to be interpreted. All computer code is just plain text; to run code in a given language, you need something that interprets and executes it. The R console is what interprets and runs your code (which is why you need R installed on your computer to use R in RStudio).
Before we can do any coding, we need to open a new R script! You can open a new R script by following the steps in the image on the right, or by pressing Ctrl + Shift + N (Windows) or Cmd + Shift + N (Mac).
A tab named “Untitled1” will appear in your source pane. This is where we are going to write code for today!

As with any other file, you can later save this file anywhere on your computer. It will have the .R extension.

R can perform just about any mathematical operation. At the same time, let’s see how to run some code:
In RStudio, you can either run one or more lines of code at once, or run the whole R script file at once.
The Run button will also run the next runnable line of code relative to your cursor.
You will see your code with output appear in the console.
Output lines start with “[n]”, where n is the position of the first value printed on that line.
Here each of our inputs (the 3 math operations) produces a single value, so every output line starts with [1], but longer output can span several lines.
The # sign represents comments. R will not run commented lines. Comments are good for explaining code to other people reading your code, and more importantly…to the future you!
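As a minimal sketch of what this looks like in practice (the specific numbers are illustrative and not necessarily the ones used in the original slides), the script below runs three math operations. Each comment notes what the console should print.

```r
# Lines that start with # are comments: R skips them entirely
2 + 3    # addition; the console prints [1] 5
10 - 4   # subtraction; the console prints [1] 6
6 * 7    # multiplication; the console prints [1] 42
```

Comments can sit on their own line or after the code on the same line; everything to the right of the # is ignored.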

R “reads” code until it finds the end of a statement (code that produces output), and then expects the following statement to appear on a new line.
Operators are symbols that tell R to perform certain actions. Aside from the math operations, the : operator is a bit unique to R.
| Operator | Description |
|---|---|
| + | addition |
| - | subtraction |
| * | multiplication |
| / | division |
| ^ | exponentiation |
| x:y | sequence from x to y |
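Here is a short, illustrative sketch of the less familiar operators from the table (the numbers are arbitrary examples):

```r
7 / 2       # division: [1] 3.5
2 ^ 3       # exponentiation (2 * 2 * 2): [1] 8
1:5         # sequence from 1 to 5, i.e. 1 2 3 4 5
5:1         # sequences can also count down: 5 4 3 2 1
(1:5) * 2   # math operators apply to every element: 2 4 6 8 10
```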
Just like many other programming languages, R is object-oriented. You can think of objects as containers where information is stored (a very important concept to remember).
To create an object in R, you use “<” + “-”, known as the assignment operator:
The keyboard shortcut for the assignment operator is Alt + - (Win) or Option + - (Mac).
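A minimal sketch of the assignment operator in action, assuming an object called x and an arbitrary value of 5 (both chosen here just for illustration):

```r
# Store the number 5 in an object named x
x <- 5

# Typing the object's name prints what is stored inside it
x          # [1] 5

# Objects can be used in place of the values they hold
x + 2      # [1] 7

# Assigning again overwrites the old contents
x <- x * 10
x          # [1] 50
```

Once created, x also shows up in the Environment pane, which is a handy way to check what your objects currently contain.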
Just like there are different types of containers (boxes, drawers, fridges, etc…), there are different types of R objects!
The x object that we just created is a numeric vector (type of object) of length 1 🤔
A vector is a one-dimensional collection of elements. To create a vector with more than one element we can do the following:
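One common way to do this is the c() function, which combines values into a single vector. The sketch below uses made-up values and hypothetical object names (y and z):

```r
# c() combines several values into one vector
y <- c(4, 8, 15, 16)

y            # prints the four values: 4 8 15 16
length(y)    # number of elements: [1] 4

# Math operations are applied to every element of the vector
y * 2        # 8 16 30 32

# The : operator is another quick way to build a vector
z <- 1:10
length(z)    # [1] 10
```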
The concept of dimensions will become clearer later. In the meantime, can you think of some objects that may have more than 1 dimension? 🤓
So far we have only dealt with numbers, but character objects also come up a lot:
You cannot apply any math operations to character objects.
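A small sketch of character objects, with made-up values and object names. The try() call is an addition of mine so that the script keeps running after the error; in the console you would simply see the error message.

```r
# Character values are written between quotes
greeting <- "hello"
fruits <- c("apple", "banana", "cherry")

class(greeting)   # [1] "character"
length(fruits)    # [1] 3

# A number in quotes is text, not a number
not_a_number <- "5"
class(not_a_number)   # [1] "character"

# Math on character objects fails; try() captures the error instead of stopping
result <- try(not_a_number + 1, silent = TRUE)
inherits(result, "try-error")   # [1] TRUE
```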
A function is something that takes one or more objects as input and produces some output.
R interprets anything that starts with letters and is followed by a ( as a function, after which it executes the function until the matching ).
Functions also have arguments, that allow you to tweak what the function does. Here decreasing = is an argument of the sort() function:
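As an illustrative sketch (the scores vector is made up), this is how sort() behaves with and without the decreasing argument:

```r
scores <- c(3, 9, 1, 7)

# With no extra arguments, sort() orders from smallest to largest
sort(scores)                      # 1 3 7 9

# The decreasing argument flips the order
sort(scores, decreasing = TRUE)   # 9 7 3 1

# Functions can be nested: the inner one runs first
length(sort(scores))              # [1] 4
```

Note that sort() does not change scores itself. To keep the sorted version, you would save the output as a new object with <-.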
Functions are at the core of anything we do in R. We will learn about many more functions as they come up. If you want to get a flavor of some basic R functions, you can find a list here.
Let’s say I ask Google for an R function that sorts vectors and I find the sort() function!…But how do I know about its arguments? How do I know whether it sorts in ascending or descending order by default? How do I know that the function does what I need? 🤔
There’s much more going on here, but notice the {base} after the name of the function. That is the package the function comes from 🧐
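The usual way to answer these questions is the help page, which you can open with ?sort or help("sort"). The sketch below also shows a couple of quick checks you can run from code. The interactive() guard is my own addition so that the help page only opens when you are working in the console.

```r
# Open the help page for sort() (same as typing ?sort in the console)
if (interactive()) {
  help("sort")
}

# args() prints a function's arguments and their default values
args(sort)

# The default for `decreasing` is FALSE, so sort() is ascending by default
formals(sort)$decreasing   # [1] FALSE

# The package a function comes from
environmentName(environment(sort))   # [1] "base"
```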

Usually, the base R functions are not enough for most of the tasks that one needs to accomplish in R. Often people have to create their own custom functions.
A package is simply a collection of functions that other users make for everyone out of the kindness of their heart 🤗
Let’s install a package that makes opening data in R very smooth, the rio package (Becker et al., 2024):
The install.packages() function installs packages from the Comprehensive R Archive Network (CRAN). Among other things, CRAN maintains a library of packages made by users.
The process to get a package on CRAN is a bit lengthy (and sometimes packages get removed), so some people just upload their packages to GitHub.
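The simplest form of the call is install.packages("rio"). The sketch below is a slightly more cautious variant under my own assumptions: it first checks whether the package is already installed, and it only installs when you are working interactively (e.g., typing in the console). Neither precaution is required.

```r
pkg <- "rio"

# Check whether the package is already in your library
already_installed <- pkg %in% rownames(installed.packages())
already_installed

# Install from CRAN only if it is missing and we are working interactively
if (!already_installed && interactive()) {
  install.packages(pkg)
}
```

Package names go in quotes inside install.packages(), and you only need to install a package once per computer.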
To see all of the packages installed in your RStudio, you can navigate to your viewer pane and select “packages”.

We want to open the World_happiness_2024.csv data set with the import() function from the rio package. First we download the data (click here). Then we load the rio package:
# to load the functions from a package you need to run the `library(package)` function first
library(rio)
# rio also suggests installing a few extra packages, so run the line below as well. Packages often use functions from other packages, which is why rio suggests installing them here
install_formats()
Now, we need to tell R how to find the World_happiness_2024.csv file. Here are a couple of ways of doing this:
Use the absolute file path (i.e., the unique address that identifies where a file is located on your computer).
Change your working directory (WD; the default folder where RStudio saves/looks for files) to where the data is (or move the data to your current WD). Your current WD is always displayed at the top of the R console pane next to the R version number.
You can get your current working directory by running the getwd() function. This is where R currently expects to find files.
[1] "C:/Users/fabio/Dropbox/Work/Github repos/Fabio-Setti/static/PSYC6802"
I’ll change my current working directory to my Desktop with the setwd() function. This function takes the path to a location on your computer as input. For the Desktop:
setwd("~/Desktop") on Mac
setwd("C:/Users/fabio/Desktop") on Windows.
Everyone will have something different here.
Afterwards, move the World_happiness_2024.csv file to your Desktop (which should now be your WD).
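Here is a minimal sketch of checking and changing the working directory. To keep it runnable on any computer, it uses tempdir() (a temporary folder that R creates for each session) as a stand-in for your own folder; in practice you would put your own path there, such as the Desktop paths shown above.

```r
# Remember where we started so we can go back afterwards
old_wd <- getwd()

# tempdir() is only a stand-in; replace it with the path to your own folder
target <- tempdir()

setwd(target)   # change the working directory
getwd()         # now reports the new location

# list.files() shows the files R can "see" in the current working directory
list.files()

# Go back to the original working directory
setwd(old_wd)
```

If list.files() does not show your data file, R will not find it by name alone, which is the most common cause of "file not found" errors.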
Now that our working directory is (hopefully) sorted out, we can use the import() function from rio to load our data. Data needs to be saved as a new object, so we use <- to name it:
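So that the idea can be tried without the real download, the sketch below builds a tiny stand-in CSV (three rows, using values as printed and rounded in the str() output in this document) and imports it. The object names toy, toy_path, and dat_toy are hypothetical, and the fallback to base R's read.csv() is my own addition in case rio is not installed.

```r
# Build a tiny stand-in CSV so the example runs without the real file
toy <- data.frame(
  Country_name    = c("Finland", "Denmark", "Iceland"),
  Happiness_score = c(7.74, 7.58, 7.53),
  Freedom         = c(0.859, 0.823, 0.819)
)
toy_path <- file.path(tempdir(), "toy_happiness.csv")
write.csv(toy, toy_path, row.names = FALSE)

# import() works out the file type from the extension;
# fall back to read.csv() if rio is not available
if (requireNamespace("rio", quietly = TRUE)) {
  dat_toy <- rio::import(toy_path)
} else {
  dat_toy <- read.csv(toy_path)
}

nrow(dat_toy)    # [1] 3
names(dat_toy)   # the three column names
```

With the real file in your working directory and rio loaded, the equivalent call would presumably be something like dat <- import("World_happiness_2024.csv").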
This is data from the 2024 world happiness report. The str() function can be used to get a lot of information about objects:
'data.frame': 140 obs. of 9 variables:
$ Country_name : chr "Finland" "Denmark" "Iceland" "Sweden" ...
$ Region : chr "Western Europe" "Western Europe" "Western Europe" "Western Europe" ...
$ Happiness_score : num 7.74 7.58 7.53 7.34 7.34 ...
$ Log_GDP : num 1.84 1.91 1.88 1.88 1.8 ...
$ Social_support : num 1.57 1.52 1.62 1.5 1.51 ...
$ Healthy_life_expectancy: num 0.695 0.699 0.718 0.724 0.74 0.706 0.704 0.708 0.747 0.692 ...
$ Freedom : num 0.859 0.823 0.819 0.838 0.641 0.725 0.835 0.801 0.759 0.756 ...
$ Generosity : num 0.142 0.204 0.258 0.221 0.153 0.247 0.224 0.146 0.173 0.225 ...
$ Corruption : num 0.454 0.452 0.818 0.476 0.807 0.628 0.516 0.568 0.502 0.677 ...
data.frame Objects
Although this information was given to us by the str() function, it is generally useful to first figure out what type of object we are dealing with:
[1] "data.frame"
The dat object is a data.frame. We will come across other types of objects eventually, but here is a list of common ones.
You may have realized that data.frame objects, unlike vectors, have 2 dimensions (2D), rows and columns.
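A quick sketch of how to check an object's type and dimensions, using a small hand-made data.frame (the name toy is hypothetical; the values are the rounded ones printed in the str() output):

```r
toy <- data.frame(
  Country_name    = c("Finland", "Denmark", "Iceland"),
  Happiness_score = c(7.74, 7.58, 7.53)
)

class(toy)   # [1] "data.frame"
dim(toy)     # rows first, then columns: [1] 3 2
nrow(toy)    # [1] 3
ncol(toy)    # [1] 2

# A vector has no second dimension, so dim() returns NULL
dim(c(1, 2, 3))   # NULL
```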
Now, if objects are containers for information, then there must be a way to extract only some of the information stored in those containers 🧐 This is called subsetting (or indexing, depending on the context).
You can subset 2D objects by referring to the indices of their dimensions, in the form object_name[row number, column number]:
Country_name Region Happiness_score Log_GDP Social_support
2 Denmark Western Europe 7.583 1.908 1.52
Healthy_life_expectancy Freedom Generosity Corruption
2 0.699 0.823 0.204 0.452
You can modify specific elements like so:
[1] 139
You can also select non-adjacent elements:
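The three ideas above (subsetting, modifying, and selecting non-adjacent elements) can be sketched on a small hand-made data.frame. The name toy and the replacement value 10 are illustrative choices of mine; the other values are those printed in the str() output.

```r
toy <- data.frame(
  Country_name    = c("Finland", "Denmark", "Iceland", "Sweden"),
  Happiness_score = c(7.74, 7.58, 7.53, 7.34),
  Freedom         = c(0.859, 0.823, 0.819, 0.838),
  stringsAsFactors = FALSE
)

toy[2, ]    # row 2, all columns (the column slot is left empty)
toy[, 2]    # all rows, column 2 (returned as a vector)
toy[2, 1]   # row 2, column 1: [1] "Denmark"

# Modify a single element by assigning to its position
toy[1, 2] <- 10
toy[1, 2]   # [1] 10

# c() selects non-adjacent rows or columns
toy[c(1, 4), c(1, 3)]   # rows 1 and 4, columns 1 and 3
```

Leaving the row or column slot empty means "give me everything along that dimension".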
I often get students telling me that “they prefer SPSS” (my nemesis 🙃). Normally, I would go on a 20-minute rant about this, but some like-minded people have done that in this pretty funny reddit post.
Some other reasons for adopting R:
But wait! One last thing 🫣

Quarto is an “open-source scientific and technical publishing system”. As mentioned on their main page, with quarto, you can:
Overall I am a big fan of quarto because it fosters accessibility, reproducibility, and transparency 😀

Before we can create PDFs with quarto, there are 2 important steps that you need to follow:
First, you need to install the rmarkdown package (Allaire et al., 2024).
R Markdown used to be the main way (and may still be) of creating reports in R. However, the RStudio folks have decided to move to quarto, and it will likely become more popular than R Markdown in the near future.
In general, most of the nice PDFs you see are created with LaTeX. The last thing that we need is to install a LaTeX interpreter (kinda like needing R to run R code!).
To install a LaTeX interpreter that quarto likes, go to the top of your screen and click Tools → Terminal → New Terminal.
A window named “terminal” will appear next to the R console. Go to the “terminal” window and type the following line:
quarto install tinytex
and then press Enter (Win) or Return (Mac).
quarto files have the .qmd extension. We can create a new .qmd file by clicking File → New File → Quarto Document. You should see the window on the right appear. Make sure you select PDF.
Note the Use visual markdown editor check-box. Once you create the document you will have the option to switch between the visual and source editor:
Click on Create to create a .qmd document, which will already have some instructions in it.

Now you can click on Render at the top of the document; you will be asked to save the .qmd file. After you do, you will see a .pdf file appear where you saved your .qmd file.
This is what the .qmd file looks like from the source editor view:

The PDF file output:
The first thing that we see in the .qmd file is some lines enclosed between two ---. That is a YAML header. The YAML header simply gives quarto some instructions to follow once you click the “Render” button.
Try to make the changes in the second code block and render the PDF again!
For this course, I don’t expect you to make any changes to the YAML header beyond modifying the title and adding your name as the author. Here is a comprehensive list of all the YAML options that exist for PDF documents in quarto.
.qmd files have two main parts: plain text and code chunks.
Any text outside code chunks is considered plain text. In the template .qmd file “Quarto enables you to…document.” is plain text. When creating PDF files from .qmd files, plain text accepts both Markdown (see here) and LaTeX syntax.
In this course, LaTeX syntax will only come up if you want to write Greek letters or math symbols (see here).
Anything in plain text between $ signs is interpreted as LaTeX math. So, $\beta$ will look like \(\beta\), or $\sqrt{x}$ will look like \(\sqrt{x}\) (LaTeX looks nice, so give it a try 🫣)
Code chunks are anything that is enclosed between ```{r} and ```
You can create a new code chunk with Ctrl + Alt + I (Windows) or Cmd + Option + I (Mac).
Code chunks can be run in many ways, one of them being the green arrow at their top right.
You can also modify how the chunks behave when rendered with chunk options (e.g., your advisor does not know R, so you can hide the code and just show the output).
1. When quarto renders a .qmd file, it will run R chunks in order, one by one. This means that your code should work in sequence from the first chunk to the last chunk. For example:
If you had this chunk
```{r}
x + 5
```
followed later by this chunk
```{r}
x <- 4
```
Your document will not render because in the x + 5 part, R does not know what x is yet 🧐 I suggest that before you try rendering your document, you run rm(list=ls()) to clear your environment and check that your code runs from start to finish!
2. You only need to install packages once. Do not leave install.packages() calls in your code chunks when trying to render; that will also likely cause issues.
3. You can make your PDF documents look much better by modifying chunk options (e.g., hiding messages and warnings by using #| message: false and #| warning: false in your chunks). You really don’t have to do this, but I would really appreciate it if you spent a tad bit more time improving how your PDFs look (makes grading homework easier 🥺)
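To tie these tips together, here is a sketch of what the body of a single chunk could look like (i.e., what goes between the opening {r} line and the closing fence). The chunk options at the top are ordinary Quarto options; which ones to use is up to you, and echo: false is included only to illustrate hiding code.

```r
#| echo: false
#| message: false
#| warning: false

# Objects must be created before they are used, because chunks run top to bottom
x <- 4
x + 5   # [1] 9
```

Lines starting with #| must come first in the chunk. R itself treats them as comments, so the code still runs normally from the console.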
PSYC 6802 - Lab 1: Introduction to R