1. Keeping objects secure
In the previous chapter, we learned how to create and delete buckets and upload objects inside of them.
Often we work with private data or data we want only certain users to see.
This is where AWS's permission system comes in.
2. Why care about permissions?
AWS defaults to denying permission. That means if we upload a file, by default only our key and secret have access to it.
This behavior may sound annoying, but it's much more secure to opt in to public access to our data than to have it be public by default.
If we or a colleague try to download the file with pandas, S3 won't allow it, even from the same script that uploaded it!
3. Why care about permissions?
But if we initialize the S3 client with our credentials and download the file through it, the download works!
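A minimal sketch of that working path, assuming boto3 is installed. The helper names make_client and read_object_bytes and the default region are hypothetical, and the credentials are placeholders we would fill in ourselves.

```python
import io


def make_client(key_id, secret, region="us-east-1"):
    """Create an S3 client that signs requests with our key and secret."""
    import boto3  # imported here so the rest of the sketch loads without it

    return boto3.client(
        "s3",
        region_name=region,  # assumed region; use the bucket's own
        aws_access_key_id=key_id,
        aws_secret_access_key=secret,
    )


def read_object_bytes(s3, bucket, key):
    """Download a private object through an authenticated client."""
    response = s3.get_object(Bucket=bucket, Key=key)
    # 'Body' is a stream; read() returns the object's raw bytes.
    return io.BytesIO(response["Body"].read())
```

With a real client, the returned buffer could be handed to pandas, for example pd.read_csv(read_object_bytes(s3, bucket, key)).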
Let's learn how to protect (and expose) our buckets and objects.
4. AWS Permissions Systems
There are 4 ways we can control permissions in S3.
We can use IAM to control users' access to AWS services, buckets, and objects. We used it in lesson 1 to give permissions by attaching IAM policies to a user. IAM applies across all AWS services.
Bucket policies give us control over a bucket and the objects within it.
ACLs or access control lists let us set permissions on specific objects within a bucket.
Lastly, presigned URLs let us provide temporary access to an object.
5. AWS Permissions Systems
IAM and Bucket Policies are great in multi-user environments. But since Sam isn't managing one, we will focus on ACLs and presigned URLs.
6. ACLs
ACLs are entities attached to objects in S3. We will focus on 2 types of ACL: private and public-read.
7. ACLs
Say we upload a file.
By default, its ACL is 'private'.
Let's set the ACL to 'public-read' using the S3 client's put_object_acl method.
Now anyone in the world can download this file.
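As a rough sketch, the call could be wrapped like this. The helper name make_public is hypothetical, and s3 is assumed to be an already-initialized boto3 S3 client.

```python
def make_public(s3, bucket, key):
    """Switch an existing object's ACL from 'private' to 'public-read'."""
    # put_object_acl changes the ACL of an object that is already in the bucket.
    return s3.put_object_acl(Bucket=bucket, Key=key, ACL="public-read")
```

Passing ACL="private" to the same method would take the object out of public view again.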
8. Setting ACLs on upload
We can also set the ACL to 'public-read' on upload. We pass a dictionary with the key 'ACL' and the value 'public-read' to the ExtraArgs parameter.
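A short sketch of an upload that is public from the start, assuming s3 is an initialized boto3 S3 client. The helper name upload_public is made up for illustration.

```python
def upload_public(s3, filename, bucket, key):
    """Upload a local file and mark it 'public-read' in the same call."""
    s3.upload_file(
        Filename=filename,  # path of the local file
        Bucket=bucket,
        Key=key,  # name the object will have in the bucket
        ExtraArgs={"ACL": "public-read"},
    )
```

This saves a separate put_object_acl call after the upload.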
9. Accessing public objects
Once we have a public object, anyone in the world can access it at a URL of the form https://{bucket}.s3.amazonaws.com/{key}.
A URL for an object with the key 2019/potholes.csv in the bucket gid-requests will look like this: https://gid-requests.s3.amazonaws.com/2019/potholes.csv
10. Generating public object URL
We can use Python's nifty string format method to generate a public URL for an object in S3. This is different from a presigned URL, which we will cover in the next lesson.
We create a template string in which empty curly braces stand in for the bucket name and the object key.
We call the format method, passing two positional arguments: the bucket, gid-requests, and the key, 2019/potholes.csv.
We can now pass this URL to something like pandas with no problems.
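The steps above can be sketched as follows. The variable names are illustrative, and the URL only resolves if the object really has a 'public-read' ACL.

```python
# Empty braces are filled in order: first the bucket, then the key.
url_template = "https://{}.s3.amazonaws.com/{}"
url = url_template.format("gid-requests", "2019/potholes.csv")
print(url)
```

The resulting string could then be given to pandas, for example pd.read_csv(url).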
11. How access is decided
So how does this all work together? When a request comes in, if it uses a presigned URL, S3 allows the download.
12. How access is decided
If it's not presigned, S3 checks the policies to make sure they allow the download. If nothing allows it, AWS's default behavior is to deny access.
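The decision described here can be written down as a toy model. This is only a sketch of the lesson's simplified flow, not AWS's actual evaluation logic; the function and parameter names are invented, and details such as presigned URL expiry are ignored.

```python
def is_download_allowed(is_presigned, policies_allow):
    """Toy model of the lesson's access flow; not AWS's real algorithm."""
    if is_presigned:
        # A presigned URL grants the download on its own.
        return True
    # Otherwise the policies must allow it; with no allow, the default is deny.
    return bool(policies_allow)
```

For example, a plain request for an object that is still 'private' gives is_download_allowed(False, False), which is False.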
13. Review
In this lesson, we learned about AWS permissions systems.
IAM answers "What can this user do in AWS?"
Bucket Policies answer "Who can access this S3 bucket?"
ACLs answer "Who can access this object?"
And presigned URLs let us grant temporary access to objects.
14. Review
We can set ACLs to public-read or private on existing objects.
15. Review
Or we can upload with a specified ACL using ExtraArgs.
16. Review
Lastly, we can generate public URLs for objects with a public-read ACL.
17. Let's practice!
Now that we have a grasp of basic AWS permissions, let's practice!