Calculating the centroid
The bounding box can range from a city block to a whole state or even country. For simplicity's sake, one way we can deal with handling these data is by translating the bounding box into what's called a centroid, or the center of the bounding box. The calculation of the centroid is straight forward -- we calculate the midpoints of the lines created by the latitude and longitudes.
numpy has been imported as np.
This exercise is part of the course
Analyzing Social Media Data in Python
Exercise instructions
- Obtain the first set of coordinates from the place JSON.
- Calculate the central longitude by adding up the longitude list and dividing by two.
- Do the same for the latitudes.
- Apply the
calculateCentroid()function to theplacecolumn.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def calculateCentroid(place):
""" Calculates the centroid from a bounding box."""
# Obtain the coordinates from the bounding box.
coordinates = place[____][____][0]
longs = np.unique( [x[0] for x in coordinates] )
lats = np.unique( [x[1] for x in coordinates] )
if len(longs) == 1 and len(lats) == 1:
# return a single coordinate
return (longs[0], lats[0])
elif len(longs) == 2 and len(lats) == 2:
# If we have two longs and lats, we have a box.
central_long = ____.____(____) / ____
central_lat = ____.____(____) / ____
else:
raise ValueError("Non-rectangular polygon not supported: %s" %
",".join(map(lambda x: str(x), coordinates)) )
return (central_long, central_lat)
# Calculate the centroids of place
centroids = tweets_sotu[____].apply(____)