1. Uploading and retrieving files
In the last lesson, we learned how to list, create and delete buckets.
Now, it's time to put stuff in them.
Let's take a look at how objects work.
2. Buckets and objects
The files in S3 buckets are called objects.
An object can be anything: an image, a video file, a CSV, or a log file.
Managing objects is a key component of many data pipelines.
3. Buckets and objects
Objects and buckets in S3 work somewhat like files and folders on our desktop.
Each bucket has a name. Objects' names are called keys.
A bucket name is just a name. An object's key is the full path of the object from the bucket's root.
A bucket's name is unique in all of S3. An object's key is unique in the bucket.
A bucket contains many objects. But an object can only belong to one bucket.
4. Creating the client
First, we create the client and assign it to the s3 variable. Now we can perform operations on our objects and buckets.
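As a minimal sketch of this step, assuming boto3 is installed and credentials are available through boto3's default configuration (environment variables or a shared credentials file), the client could be created like this. The helper name `make_s3_client` and the `us-east-1` region are illustrative assumptions, not something the lesson specifies.

```python
def make_s3_client(region_name="us-east-1"):
    """Return a boto3 S3 client. The default region is an assumption."""
    # Imported inside the function so this file still loads where boto3 is absent.
    import boto3

    return boto3.client("s3", region_name=region_name)


# Usage (requires boto3 and valid AWS credentials):
# s3 = make_s3_client()
```

Keeping the client in a single variable such as `s3` means every later call (upload, list, download, delete) goes through the same object.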
5. Uploading files
Let's upload an object into a bucket.
We upload the file using the client's upload_file method.
Filename is the local file path. The Bucket parameter takes the name of the bucket we are uploading to. Key is the name we want the object to have in S3.
We are not capturing the return value of this method in a variable.
The method doesn't return anything. If there is an error, it raises an exception.
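The upload call described above can be sketched as follows. The wrapper name `upload_to_bucket` is hypothetical, and so are the file name and key in the usage comment; only `upload_file` and its `Filename`, `Bucket` and `Key` arguments come from the lesson.

```python
def upload_to_bucket(s3, local_path, bucket, key):
    """Upload local_path to the given bucket under key."""
    # upload_file returns None; a failed upload raises an exception
    # instead of returning a status value.
    return s3.upload_file(Filename=local_path, Bucket=bucket, Key=key)


# Usage with a real client (file name and key are hypothetical):
# upload_to_bucket(s3, "requests.csv", "gid-requests", "requests.csv")
```

Because nothing is returned, a call that completes without raising is the signal that the upload succeeded.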
6. Uploading files
Whoo! Our file is now on S3!
7. Uploading more objects
I've uploaded a few more objects for us to play with. Let's list them with boto3.
8. Listing objects in a bucket
Call the client's list_objects method, passing `gid-requests` as the Bucket argument.
Optionally, we can limit the response to two objects with the MaxKeys argument. If we omit it, S3 returns at most 1000 objects per call.
Another way to limit the response is the optional Prefix argument. Passing it limits the response to objects whose keys start with the string we provide.
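A small sketch of the listing call with both optional arguments. The helper `list_bucket_objects` is a hypothetical wrapper; it only forwards `MaxKeys` and `Prefix` when the caller supplies them, so omitting them leaves S3's defaults in place.

```python
def list_bucket_objects(s3, bucket, prefix=None, max_keys=None):
    """Call list_objects, passing Prefix and MaxKeys only when given."""
    params = {"Bucket": bucket}
    if prefix is not None:
        params["Prefix"] = prefix  # keep only keys starting with this string
    if max_keys is not None:
        params["MaxKeys"] = max_keys  # cap the number of objects returned
    return s3.list_objects(**params)


# Usage with a real client (the prefix value is hypothetical):
# response = list_bucket_objects(s3, "gid-requests", prefix="2018/", max_keys=2)
```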
9. Listing objects in a bucket
The response dictionary contains the 'Contents' key. This key contains a list of objects and their info.
Each object's dictionary includes its key, under 'Key'.
10. Listing objects in a bucket
Its last-modified date, under 'LastModified'.
11. Listing objects in a bucket
And its size in bytes, under 'Size'.
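To make the shape of the response concrete, here is a sketch that pulls the key, the modified date ('LastModified') and the size ('Size') out of each entry. The function name `summarize_objects` is hypothetical. It uses `.get` because the 'Contents' key is absent when no objects match.

```python
def summarize_objects(response):
    """Return (key, last_modified, size_in_bytes) for each listed object."""
    # 'Contents' is missing from the response when nothing matched.
    return [
        (obj["Key"], obj["LastModified"], obj["Size"])
        for obj in response.get("Contents", [])
    ]


# Usage with a real response:
# for key, modified, size in summarize_objects(response):
#     print(key, modified, size)
```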
12. Getting object metadata
If we want to know these things about a single object, we can use the client's head_object method, passing the bucket name and object key.
13. Getting object metadata
Notice that because we are only working with one object, there is no 'Contents' list. The metadata is directly in the response dictionary.
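A sketch of reading metadata for a single object. The wrapper `object_metadata` and the names of the keys in its return value are hypothetical. Note that `head_object` reports the size under 'ContentLength' rather than 'Size'.

```python
def object_metadata(s3, bucket, key):
    """Fetch an object's size and modified date without downloading it."""
    response = s3.head_object(Bucket=bucket, Key=key)
    # The fields sit at the top level of the response, not under 'Contents'.
    return {
        "size_bytes": response["ContentLength"],
        "last_modified": response["LastModified"],
    }


# Usage with a real client (the key is hypothetical):
# info = object_metadata(s3, "gid-requests", "requests.csv")
```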
14. Downloading files
To download a file, we use the client's download_file method. We pass Filename, the local path we want the file saved to. Then we specify the Bucket and Key of the object we want to download.
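The download mirrors the upload. As a sketch, with a hypothetical wrapper name and hypothetical paths in the usage comment:

```python
def download_from_bucket(s3, bucket, key, local_path):
    """Save the object at bucket/key to local_path."""
    # Like upload_file, download_file returns None and raises on failure.
    return s3.download_file(Filename=local_path, Bucket=bucket, Key=key)


# Usage with a real client (key and local path are hypothetical):
# download_from_bucket(s3, "gid-requests", "requests.csv", "requests_local.csv")
```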
15. Deleting objects
Sometimes, an object has outlived its usefulness and needs to be deleted. Use the client's delete_object method, passing the Bucket name and object key to delete the object.
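A sketch of the delete call, again with a hypothetical wrapper name and a hypothetical key in the usage comment:

```python
def delete_from_bucket(s3, bucket, key):
    """Delete the object at bucket/key and return the service response."""
    return s3.delete_object(Bucket=bucket, Key=key)


# Usage with a real client (the key is hypothetical):
# delete_from_bucket(s3, "gid-requests", "requests.csv")
```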
16. Summary
In this lesson, we learned that buckets are like folders, and objects are like files within them.
We learned that we must create the client before we can do anything else.
We learned how to upload files to a bucket,
how to list objects in a bucket,
how to head an object, that is, get its metadata,
how to download a file from a bucket,
and finally, how to delete an object.
17. Let's make some objects!
Let's help Sam continue working on her pipeline with these new skills!