1. Understanding Twitter JSON
Great job! Now that you've collected some Twitter data, let's dig into the structure of the data. In the examples which we are working with in this course, the information is returned in Javascript Object Notation, or JSON. JSON is a special data format which is both human-readable and is easily transferred between machines. JSON is structured a lot like Python objects and is composed of a combination of dictionaries and lists.
2. Contents of Twitter JSON
Understanding the Twitter JSON is critical to knowing how to analyze Twitter data. There's a lot of data in a single Twitter JSON. For instance, in a single original tweet -- that is, a tweet that's not a retweet or a quoted tweet, you have foundational information like the text, when it was created, and the unique tweet ID. You also have information like how many retweets or favorites it has at the time of collection, what language it's in, if it's a reply to a tweet and to which tweet, and to which user.
3. Child JSON objects
More importantly, the Twitter JSON contains several important child JSON objects. These are like dictionaries stored in other dictionaries. The important ones which we'll cover in this course are `user`, `place`, and `extended_tweet`. `user` contains all the useful information you'd want to know about the user who tweeted, including their name, their Twitter handle, their Twitter bio, their location, and if they're verified.
4. Places, retweets/quoted tweets, and 140+ tweets
`place` contains information on the geolocation of the tweet, and we'll get into that in chapter 4. `extended_tweet` contains the full text of a tweet which is over 140 characters in length.
When a tweet is a retweet or contains a quoted tweet, the whole of that tweet will be contained with the Twitter JSON. For retweets, that tweet will be stored in `retweeted_status` and for a quoted tweet in `quoted_status`.
5. Accessing JSON
Let's start exploring the Twitter JSON by loading a single tweet. We'll use the open() and read() methods to load the JSON file into a JSON object. Then, we'll use the `json` package and the `loads` method to convert the JSON into a Python dictionary. Lastly, we access the value of interest by using its appropriate key.
6. Child tweet JSON
Child Twitter JSON can be accessed as nested dictionaries. Let's look at the `user` object. To access the user's handle, we use the `user` key, then the `screen_name` key. We can do the same with `name` to show the user's display name, and `created_at` to show when the user created their Twitter account.
7. Let's practice!
Let's practice working with Twitter JSON.