String length
Our next stringr function is str_length(), which takes a vector of strings as input and returns the number of characters in each string. For example, try finding the number of characters in Batman's name:
str_length(c("Bruce", "Wayne"))
This is very similar to the base R function nchar(), but as you'll see in the exercises, str_length() handles factors in an intuitive way, whereas nchar() will just return an error.
Historically, nchar() was even worse: rather than returning an error when passed a factor, it would return the number of characters in the factor's numeric encoding. Thankfully this behavior has been fixed, but it was one of the original motivations behind str_length().
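The contrast between the two functions can be sketched as follows. This is a minimal illustration, assuming stringr is installed and a reasonably recent version of R; the `hero` factor is made-up data, not part of the course.

```r
library(stringr)

# A small illustrative factor (hypothetical data, not from the course)
hero <- factor(c("Bruce", "Wayne", "Bruce"))

# str_length() measures the labels, not the underlying integer codes
hero_length <- str_length(hero)
hero_length

# nchar() refuses factors in current R; capture the error message
# instead of letting it stop the script
nchar_result <- tryCatch(
  nchar(hero),
  error = function(e) conditionMessage(e)
)
nchar_result

# Converting to character first gives nchar() the same input
# that str_length() effectively works on
nchar(as.character(hero))
```

If you need to stay in base R, the explicit as.character() conversion in the last line is one way to get the same lengths.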
Take your first look at babynames by asking if girls' names are longer than boys' names.
This is a part of the course “String Manipulation with stringr in R”.
Exercise instructions
We've pulled out just the names from 2014 and created the vectors boy_names and girl_names for you. (If you want to learn about the filter() function, take the Data Manipulation in R with dplyr course!)
- Take a look at the boy_names vector. It's long, so use head() to see the first few elements.
- Use str_length() on boy_names to find the length of each name and save the result to boy_length.
- Take a look at the lengths. Again, use head(). Can you see the correspondence with boy_names?
- Find the length of all the girls' names. Call this girl_length.
- Find the difference in mean length between boys' and girls' names by subtracting the mean length of boys' names from that of girls' names.
- Confirm str_length() works on factors by calling it on factor(boy_names). Again, you'll want to just look at the head().
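To make the mean-difference step concrete, here is a small sketch on toy data. The vectors `toy_boys` and `toy_girls` are hypothetical stand-ins, not the real boy_names and girl_names, so the result says nothing about the 2014 data.

```r
library(stringr)

# Hypothetical stand-ins for boy_names and girl_names
toy_boys <- c("Noah", "Liam", "Mason")
toy_girls <- c("Emma", "Olivia", "Sophia")

# One length per name, in the same order as the input
toy_boy_length <- str_length(toy_boys)    # 4 4 5
toy_girl_length <- str_length(toy_girls)  # 4 6 6

# Girls' mean minus boys' mean: positive means girls' names are longer
toy_diff <- mean(toy_girl_length) - mean(toy_boy_length)
toy_diff
```

The subtraction order follows the instruction above, so a positive value means the girls' names in the sample are longer on average.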
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
library(stringr)
library(babynames)
library(dplyr)
# Extracting vectors for boys' and girls' names
babynames_2014 <- filter(babynames, year == 2014)
boy_names <- filter(babynames_2014, sex == "M")$name
girl_names <- filter(babynames_2014, sex == "F")$name
# Take a look at a few boy_names
___
# Find the length of all boy_names
boy_length <- ___
# Take a look at a few lengths
___
# Find the length of all girl_names
girl_length <- ___
# Find the difference in mean length
___
# Confirm str_length() works with factors
head(___)