Extracting substrings
The str_sub() function in stringr extracts parts of strings based on their location. As with all stringr functions, the first argument, string, is a vector of strings. The arguments start and end specify the boundaries of the piece to extract in characters.
For example, str_sub(x, 1, 4) asks for the substring starting at the first character and running up to the fourth character, or in other words the first four characters. Try it with Batman's name:
str_sub(c("Bruce", "Wayne"), 1, 4)
Both start and end can be negative integers, in which case they count from the end of the string. For example, str_sub(x, -4, -1) asks for the substring starting at the fourth character from the end, up to the first character from the end, i.e. the last four characters. Again, try it with Batman:
str_sub(c("Bruce", "Wayne"), -4, -1)
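As a rough illustration of the two calls above, the following sketch shows what they return for the Batman vector. The variable names (x, first_four, last_four, middle) are chosen here for illustration only, and the last call, which mixes a positive start with a negative end, is an extra example not taken from the exercise.

```r
library(stringr)

x <- c("Bruce", "Wayne")

# Positive positions count from the start of each string
first_four <- str_sub(x, 1, 4)
print(first_four)
# "Bruc" "Wayn"

# Negative positions count from the end of each string
last_four <- str_sub(x, -4, -1)
print(last_four)
# "ruce" "ayne"

# start and end can be mixed: second character up to the second-to-last
middle <- str_sub(x, 2, -2)
print(middle)
# "ruc" "ayn"
```

Because str_sub() is vectorised over string, each call returns one substring per element of x.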
To practice, you'll use str_sub() to look at popular first and last letters for names.
This is a part of the course “String Manipulation with stringr in R”.
Exercise instructions
We've set up the same boy_names and girl_names vectors from the last exercise in your workspace.
- Use str_sub() to extract the first letter of each name in boy_names. Save this to boy_first_letter.
- Use table() on boy_first_letter to count up how many names start with each letter. Can you see which is most popular?
- Repeat these steps, but now look at the last letter for boys' names.
- Again repeat, but now look at the first letter for girls' names.
- Finally, look at the last letter for girls' names.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Extract first letter from boy_names
boy_first_letter <- ___
# Tabulate occurrences of boy_first_letter
___
# Extract the last letter in boy_names, then tabulate
boy_last_letter <- ___
___
# Extract the first letter in girl_names, then tabulate
girl_first_letter <- ___
___
# Extract the last letter in girl_names, then tabulate
girl_last_letter <- ___
___
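The extract-then-tabulate pattern the exercise asks for can be sketched on a small made-up vector. Here demo_names is a hypothetical stand-in, not the boy_names or girl_names data from the workspace, so the blanks above are still yours to fill in.

```r
library(stringr)

# Hypothetical example data (not the course's boy_names / girl_names)
demo_names <- c("Anna", "Alex", "Ben", "Bella", "Carl")

# First letter: start and end both point at position 1
demo_first_letter <- str_sub(demo_names, 1, 1)
first_counts <- table(demo_first_letter)
print(first_counts)

# Last letter: start and end both point at the final position, -1
demo_last_letter <- str_sub(demo_names, -1, -1)
last_counts <- table(demo_last_letter)

# Sorting the table in decreasing order puts the most common letter first
print(sort(last_counts, decreasing = TRUE))
```

With this toy vector, "A" and "B" each start two names, and "a" is the most common final letter. Sorting the table is optional but makes the most popular letter easier to spot in longer vectors.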
This exercise is part of the course
String Manipulation with stringr in R
Learn how to pull character strings apart, put them back together and use the stringr package.
Time to meet stringr! You'll start by learning about some stringr functions that are very similar to some base R functions, then how to detect specific patterns in strings, how to split strings apart and how to find and replace parts of strings.