The goal of this lab is to introduce you to JASP, which you’ll be using throughout the course both to learn the statistical concepts discussed in the course and to analyze real data and come to informed conclusions.
As the labs progress, you are encouraged to explore beyond what the labs dictate; a willingness to experiment will make you a much better data analyst! Before we get to that stage, however, you need to build some basic fluency with JASP. First, we will explore the fundamental building blocks of JASP, reading in data, and basic commands for working with data in JASP.
You can download JASP from https://jasp-stats.org/. Click download
near the top of the screen, then select the version from the computer you are using.
Go ahead and launch JASP You should see a window that looks like the image shown below.
First we’re going to download a dataset. This data will be downloaded from https://www.openintro.org/data/index.php?data=arbuthnot. Open this window, then select the CSV file to download it to your computer. You may need to right click or control-click then select save file as
in order to download the file instead of viewing it, depending on your operating system and particular web browser. Make sure to note where you download it, since we’ll need to go to that location to load it.
To load the data, click the menu button at the top left part of the window (with the three horizontal lines), then Open
, then Computer
, then click Browse
, and select the file in the location you downloaded it to. You will see a new window open with a display of the data, which should look like this:
You’ll see three columns, one for each variable in the data. This display should feel similar to viewing data in Excel, where you are able to scroll through the dataset to inspect it. You can double click the data to open another editor to change any data that needs fixing, but we don’t need to do that.
Let’s take a peek at the data. The Arbuthnot data set refers to the work of Dr. John Arbuthnot, an 18th century physician, writer, and mathematician. He was interested in the ratio of newborn boys to newborn girls, so he gathered the baptism records for children born in London for every year from 1629 to 1710.
We can see that there are 82 observations and 3 variables in this dataset. The variable names are year
, boys
, and girls
.
When inspecting the data, you should see three columns of numbers and 82 rows. Each row represents a different year that Arbuthnot collected data. There is also a column to the left of the variables which shows a row number. The first variable is the year, and the second and thirds are the numbers of boys and girls baptized that year, respectively. Use the scrollbar on the right side of the window to examine the complete data set.
Note that the row numbers are not part of Arbuthnot’s data. JASP adds these row numbers as part of its printout to help you make visual comparisons. You can think of them as the index that you see on the left side of a spreadsheet. In fact, the comparison of the data to a spreadsheet will generally be helpful.
Just to the left of the name of each variable, you should see an icon of a ruler. If you click this icon, you will see three options. This allows you to specify whether JASP should interpret the variable as a scale, ordinal, or nominal variable. JASP will try to select the correct type when it loads the data. We will leave these variables as scale.
Let’s start to examine the data a little more closely. Click the Descriptives
button near the top of the window. You will see the list of variables in one box. We can move the variable boys
to the box labeled Variables
to see some descriptive statistics for that variable. You can move the variable over by double clicking the variable name, by clicking and dragging the variable name, or by clicking the variable name and then clicking the arrow between the boxes pointed at the Variables
box.
You will then see the results in the right hand side of the window. You should see that there are 82 valid observations, no data is missing, and you will see various other descriptive statistics for that variable. You can add an additional variable to the Variables
box, and you will then see descriptives for each variable in the box. Do this with girls
to see how the two variables compare.
girls
? Are any values missing?Now that we have some analysis, we should save our file in case we want to come back to it. With JASP, you can save a .jasp
file which contains your data and all of your analysis, along with any descriptions you write in yourself. Open the menu, select Save As
and you can save your first JASP file to your computer.
In JASP, we can use the same descriptives section to generate graphics. To create a simple plot of the number of girls baptized per year, we will start by selecting the variables we are interested in. Move the variables so the Variables
box contains year
and girls
. Next, click Plots
below the boxes on the left hand side of the screen to see the plotting options. We’ll make a scatter plot, so check the box next to Scatter Plots
.
JASP adds a number of features to the scatter plot. By default, each variable is being plotted with a density, which is essentially a smoothed histogram. By clicking the radio buttons, you can change this to a histogram, or turn it off. We can also include the regression line (which is turned on by default) or turn it off by unchecking the box next to Add regression line
. Turning the graphs above and to the right of the scatter plot to None
and turning off the regression line will produce a standard scatterplot.
JASP will use the variables in the variable box in order: the first variable will be on the x-axis and the second variable will be on the y-axis. Try switching the order of the variables by clicking and dragging them. See how your plot changes.
Use the plot to address the following question:
In the descriptives window, you can see four buttons. The first allows you to edit the title of the analysis. If you click this buttons, you can change Descriptive Statistics
to a different name. If you click the Descriptives
button again, you will see that you now have a second descriptive statistics box, and two descriptive statistics sections on the right hand side of the window. In this way, you can make a comprehensive list of all of your statistical work.
Next, the button with the plus sign allows you to duplicate an analysis, in case you want to run two similar analyses. Next, the button with the i
provides information on the options in the window, and the x
closes that particular analysis.
We can now create lots of analyses, but we also want to provide written context and interpretations. Next to the title of a section, click the arrow, and select Add Note
. You will see a text box at the beginning of the section and the end of the section, where you can add introductions and conclusions. Similarly, if you click the arrow next to the heading above a scatter plot, you can add a note to describe what is going on in the scatterplot.
We are interested in using a new vector of the total number of baptisms to generate some plots, so we’ll want to add it as a column in our data.
Close the descriptives view to go back to viewing the data. Click the plus button to the right of the girls
variable name. This will bring up a menu for creating a new variable. In the box for name, enter total
. Then click Create Column
. This brings up a menu for creating the new variable. Click boys
, then the plus sign, then girls
, then click Compute column
.
You’ll see that there is now a new column called total
that has been tacked onto the data.
In a similar fashion, once you know the total number of baptisms for boys and girls in 1629, you can compute the ratio of the number of boys to the number of girls baptized.
boy_to_girl_ratio
given by boys
/girls
.You can also compute the proportion of newborns that are boys.
Create a new variable named boy_ratio
to the dataset by dividing boys
by total
. Notice that rather than dividing by boys + girls
we are using the total
variable we created earlier in our calculations!
Now, generate a plot of the proportion of boys born over time. What do you see? Put your description of what you see in a note.
Finally, in addition to simple mathematical operators like subtraction and division, you can ask JASP to make comparisons like greater than, >
, less than, <
, and equality, =
. For example, we can create a new variable called more_boys
that tells us whether the number of births of boys outnumbered that of girls in each year. To do this, create a new variable, name it more_boys
and enter boys
then the greater than sign (>) then girls
. Click Compute column
, and you will see a column with 1’s. Because this inequality is either true or false, JASP will put a 1 in the column when the condition is true, and a 0 when the condition is false.
In the previous few pages, you recreated some of the displays and preliminary analysis of Arbuthnot’s baptism data. Your assignment involves repeating these steps, but for present day birth records in the United States. The data are stored in a dataset called present
, which you can download here.
Answer the following questions with the present
dataset:
What years are included in this data set? How many variables are there? How many observations? What are the variable (column) names?
How do these counts compare to Arbuthnot’s? Are they of a similar magnitude?
Make a plot that displays the proportion of boys born over time. What do you see? Does Arbuthnot’s observation about boys being born in greater proportion than girls hold up in the U.S.? Include the plot in your response.
These data come from reports by the Centers for Disease Control. You can learn more about them here.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.