Welcome!

1. Welcome!

Welcome to string manipulation with stringr! Elements of character vectors are known as "strings".

2. You will learn:

stringr is a package built specifically for manipulating strings. While some of stringr's functionality is available in base R, its functions have consistent syntax, simplified options, and sensible defaults, which combine to make them much easier to learn. To make full use of stringr you'll also learn about regular expressions. Regular expressions are a language for describing patterns in text, like "starts with c", or "a dollar sign followed by one or more digits". You'll use the rebus package to build these concise yet somewhat cryptic expressions from functions with intuitive names. So what will you be able to do with stringr and rebus?

3. IMG_1

4. IMG_1

You'll be able to answer questions about strings of DNA, like how long the genes are,

5. IMG_1

how many A's occur

6. IMG_1

and which strings

7. IMG_1

have a particular sequence.

8. IMG_2

You'll be able to pull out variables from messy strings,

9. IMG_2

like the age and gender

10. IMG_2

from an accident narrative like this one. You'll be able to

11. IMG_3

find and replace parts of strings

12. IMG_3

to anonymize sensitive data. You'll be able to read in text,

13. IMG_4

like the text of the Oscar Wilde play "The Importance of Being Earnest", then leverage your string processing skills

14. IMG_4

to figure out how many lines each character speaks. To kick things off

15. Chapter 1

and prepare you for your first stringr functions in Chapter 2, in this chapter you'll learn a few fundamentals about strings: how to enter strings in R, how to control how numbers are turned into strings, and how to combine strings into sentences or tables. You've probably already

16. Entering strings

entered strings in R. To tell R something is a string you surround it with double quotes. But, what happens if you need to input a string that has double quotes inside it? You can try to use double quotes, but R will return an error. R has interpreted the second double quote as the end of the string, and is then confused that more input followed it.
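As a rough illustration of that error, the sketch below hands the problematic input to `parse()` as text, so the syntax error can be caught and inspected instead of stopping the script. The sentence itself is a made-up example, not one from the slides.

```r
# Hypothetical input: a double-quoted string that itself contains double quotes.
# It is stored inside single quotes here only so we can pass it to parse() as text.
bad_input <- '"She said "hi" to me"'

# parse() reads the text the way the R console would. R treats the second
# double quote as the end of the string, then fails on the input that follows.
result <- tryCatch(
  parse(text = bad_input),
  error = function(e) conditionMessage(e)
)

# Print the error message R produced while parsing
print(result)
```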

17. Entering strings

We can avoid this problem by using single quotes, which R also accepts for defining a string. R won't stop interpreting the input as a string until it hits the closing single quote. Take a close look at the output. R always prints strings with double quotes (even if you used single quotes to define them). The double quotes inside the string have a backslash in front of them.
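A minimal sketch of the single-quote approach, using a made-up sentence rather than the one on the slide:

```r
# Hypothetical example text: single quotes on the outside,
# double quotes used freely on the inside.
line <- 'She said "hi" to me'

# print() displays the string wrapped in double quotes,
# with each inner double quote shown as \"
print(line)
# [1] "She said \"hi\" to me"
```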

18. Entering strings

This is known as an escape sequence: it tells R that this is a literal double quote rather than the end of the string. Escaping double quotes is another way to enter them inside a string. With two equivalent ways to enter this string, which should you use?
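To check that the two ways of entering the string really are equivalent, here is a small sketch with a made-up sentence. It also uses base R's `writeLines()`, which the narration has not introduced at this point, to show that the backslash is part of how the string is typed and printed, not part of its content.

```r
# The same hypothetical text entered two ways
single_quoted <- 'She said "hi" to me'
escaped <- "She said \"hi\" to me"

# Both spellings produce exactly the same string
identical(single_quoted, escaped)
# [1] TRUE

# The backslashes are not stored: the string has 19 characters
nchar(escaped)
# [1] 19

# writeLines() shows the string's actual content, without escapes
writeLines(escaped)
# She said "hi" to me
```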

19. When to use \" vs. '

R doesn't care; the output is the same. But there are some guidelines designed to keep your code readable. If the string's text does not contain any quotes, wrap it in double quotes. If the text contains double quotes but not single quotes, wrap it in single quotes. If the text contains both kinds of quotes, wrap it in double quotes and escape the double quotes in the text.
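The three guidelines might look like this in practice. The example strings are invented for illustration, and `writeLines()` is used only to display their actual content.

```r
# 1. No quotes in the text: wrap it in double quotes
no_quotes <- "hello"

# 2. Double quotes but no single quotes in the text: wrap it in single quotes
double_inside <- 'She said "hi"'

# 3. Both kinds of quotes in the text: wrap it in double quotes
#    and escape the inner double quotes
both_kinds <- "She said \"that's fine\""

# Display the content of each string, one per line
writeLines(c(no_quotes, double_inside, both_kinds))
# hello
# She said "hi"
# She said "that's fine"
```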

20. Let's practice!

I think you're ready to get started entering strings.