This section introduces R and describes how it integrates the five main parts of SAS, SPSS and Stata into a powerful, comprehensive system.
The software you’re familiar with comes as a single, complete package. R, however, is downloaded and installed in pieces. This chapter tells you how to find the parts of R that match your current software and how to install them.
SAS Institute, IBM (maker of SPSS) and StataCorp all act as one-stop shops for documentation and support. With R, top-notch documentation and support are also available... if you know where to look! This chapter gives you the best options.
There are many ways to control R, but RStudio is the most popular by far. This brief chapter covers what you need to know to get started.
This section covers the basics of R expressions, assignments and commands.
A whole chapter on data sets? You’re used to software that revolves around “the dataset” but in the world of R, there’s much more flexibility. And yes, flexibility does come at a price: added complexity.
R is a whole work “environment”. This chapter covers R commands that resemble those commonly found in operating systems.
You can control the way analyses are run in ways that are very similar to your current software, or you can use an object-oriented approach that’s unique to R. This section covers the alternatives.
You can’t analyze data until you read it in, so this chapter covers various types of text files as well as how to import datasets from SAS, SPSS and Stata.
Here’s a topic that almost all statistics packages treat in a similar fashion... but not R! This section guides you through the differences.
In other packages, there are just a few ways of selecting variables. This section covers six different ways to select variables in R, including the one that is most like your current software, the dplyr package’s “select” function.
This section covers the two most common ways to select observations in R, and it points out that the way you specify the logic in those selections follows slightly different rules.
The previous chapters discussed the selection of variables and observations. Here, we’ll cover techniques for doing both at the same time.
R is unique in its ability to create new variables from variables stored in multiple datasets at once. This section covers three different ways to specify transformations, pointing out the advantages of each.
R offers multiple packages for graphics. The built-in one is good for many things, but it’s not ideal for displaying the same plot for multiple groups in your data, so this section also covers the popular ggplot2 package. It uses the same approach as SPSS’ Graphics Production Language, but is easier to learn because it uses standard R code to do any data transformations your plot requires.
Writing functions in R is very similar to writing macros in SAS, SPSS and Stata. However, the resulting functions are much more integrated into the package, more like the “procs” or “commands” of other software. The downside, though, is that you need functions to do “by group” processing. This section will guide you through the basic steps of both.
The output of the basic statistical routines built into R is surprisingly sparse. This section points out add-on packages that provide output more like that of your current software.
R’s built-in functions are great if you just need correlations. But if you need output that includes n’s and p-values, you need to turn to add-on packages. Regression in R is done in a multi-step approach that will make Stata users feel right at home, but which will seem alien to SAS and SPSS users at first.
We’ll cover the basic ways to compare groups and why R does not perform the same tests without an add-on package.
R’s output by default looks pretty bad! But don’t worry, there are add-on packages that produce beautiful publication-quality tables. This section shows how they work.
R can be run in many ways, from simple point-and-click user interfaces to deep integration with other stat packages. This section briefly covers the integration of R into Alteryx, Excel, KNIME, R Commander, RapidMiner, Rattle, SAS, SPSS, and Stata.