Let's play with tweets, round 1

Remember how, in the previous chapters, you worked as a data analyst for a web agency? You did a great job, and now you have been assigned another project ;) In this chapter you will analyze a new kind of data: JSON output.

The engineering team has given you the output of a data collection containing the tweets posted during RStudio Conf 2018. Because this dataset is in JSON, you read it into R as a nested list.
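To see why JSON ends up as a nested list, here is a minimal sketch. It assumes the jsonlite package, which this exercise does not mention, and a made-up two-tweet payload; the field names are borrowed from the instructions, and the real rstudioconf data has many more fields.

```r
library(jsonlite)

# Hypothetical payload, invented for illustration only
json_txt <- '[
  {"screen_name": "user_a", "is_retweet": false, "favorite_count": 3},
  {"screen_name": "user_b", "is_retweet": true,  "favorite_count": 0}
]'

# simplifyVector = FALSE keeps JSON arrays and objects as plain lists
# instead of collapsing them into vectors or a data frame
tweets <- fromJSON(json_txt, simplifyVector = FALSE)

class(tweets)            # the outer JSON array becomes a list
length(tweets)           # one element per tweet
names(tweets[[1]])       # each tweet is itself a named list
tweets[[2]]$is_retweet   # fields are reached by drilling into the sublist
```

Each JSON object becomes a named list and the surrounding array becomes an unnamed list of those, which is the shape purrr functions iterate over.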

To get started, you want to do some basic exploration of this dataset, and purrr will come in handy. The package has already been loaded for you, and the rstudioconf dataset is available in your workspace.

Note: do not try to print the entire dataset; it is too large to be printed in the DataCamp console.
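One way to look at a large nested list without printing all of it is to ask for summaries instead. The sketch below uses a hypothetical stand-in list named toy, not the real rstudioconf object, so the field names and values are assumptions.

```r
library(purrr)

# Hypothetical stand-in for rstudioconf, invented for illustration
toy <- list(
  list(screen_name = "user_a", is_retweet = FALSE, favorite_count = 3L),
  list(screen_name = "user_b", is_retweet = TRUE,  favorite_count = 0L)
)

# How many top-level elements (tweets) are there?
n_tweets <- length(toy)

# Which fields does the first element carry?
first_fields <- names(toy[[1]])

# Compact view of a single element, limited to one level of nesting
str(toy[[1]], max.level = 1)

# Number of fields in every element, returned as an integer vector
n_fields <- map_int(toy, length)
```

The same calls should work on any list of lists, so you can swap toy for the object in your workspace.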

Keep in mind that this is real Twitter data and, as such, there is always a risk that it contains profanity or other offensive content (in this exercise and in any later exercise that uses real Twitter data).

This exercise is part of the course

Intermediate Functional Programming with purrr


Exercise instructions

Print the first element of the list to get an overview of its content and structure.
Because you want to focus on original tweets (not retweets), create a sublist of non-retweets using the logical "is_retweet" element contained in each sublist.
Extract the "favorite_count" element from each element of this new sublist using the map_* variant for integers.
Compute the median of the previous result.
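The four steps above can be sketched on a small made-up list. toy_tweets and its values are hypothetical, and this is only one possible approach: it relies on purrr accepting a string as shorthand for "extract the element with this name", both as the predicate of discard() and as the mapper of map_int().

```r
library(purrr)

# Hypothetical data standing in for rstudioconf
toy_tweets <- list(
  list(is_retweet = FALSE, favorite_count = 4L),
  list(is_retweet = TRUE,  favorite_count = 0L),
  list(is_retweet = FALSE, favorite_count = 10L),
  list(is_retweet = FALSE, favorite_count = 1L)
)

# Step 1: look at a single element rather than the whole list
toy_tweets[[1]]

# Step 2: discard() drops every element whose "is_retweet" value is TRUE,
# leaving only the original tweets
non_rt <- discard(toy_tweets, "is_retweet")

# Step 3: map_int() pulls "favorite_count" out of each sublist and
# returns an integer vector (it errors if a value is not an integer)
fav_count <- map_int(non_rt, "favorite_count")

# Step 4: summarise with the median
med_fav <- median(fav_count)
med_fav
```

discard() is the mirror image of keep(): keep() retains the elements where the predicate is TRUE, so it would return the retweets here.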

Hands-on interactive exercise

Try to solve this exercise by completing the sample code.

# Print the first element of the list to the console 


# Create a sublist of non-retweets
non_rt <- ___(___, "is_retweet")

# Extract the favorite count element of each non_rt sublist
fav_count <- ___(___, "favorite_count")

# Get the median of favorite_count for non_rt
___(___)

