
Cloud Storage

1. Cloud Storage

Cloud Storage is Google Cloud's object storage service. It allows worldwide storage and retrieval of any amount of data at any time. You can use Cloud Storage for a range of scenarios, including serving website content, storing data for archival and disaster recovery, or distributing large data objects to users via direct download.

Cloud Storage has a few key features: it scales to exabytes of data, the time to first byte is in milliseconds, it has very high availability across all storage classes, and it has a single API across those storage classes.

Some like to think of Cloud Storage as files in a file system, but it's not really a file system. Instead, Cloud Storage is a collection of buckets that you place objects into. You can create directories, so to speak, but a directory is really just another object that points to other objects in the bucket. You can't easily index all of these files the way you would in a file system; instead, each object is accessed through a specific URL.

Cloud Storage has four storage classes: Standard, Nearline, Coldline, and Archive, and each storage class is available in three location types. A multi-region is a large geographic area, such as the United States, that contains two or more geographic places. A dual-region is a specific pair of regions, such as Finland and the Netherlands. A region is a single geographic place, such as London. Objects stored in a multi-region or dual-region are geo-redundant.

Now, let's go over each of the storage classes. Standard Storage is best for data that is frequently accessed ("hot" data) and/or stored for only brief periods of time. It is the most expensive storage class, but it has no minimum storage duration and no retrieval cost. When used in a region, Standard Storage is appropriate for storing data in the same location as the Google Kubernetes Engine clusters or Compute Engine instances that use the data.
Co-locating your resources maximizes performance for data-intensive computations and can reduce network charges. When used in a dual-region, you still get optimized performance when accessing Google Cloud products located in one of the associated regions, plus the improved availability that comes from storing data in geographically separate locations. When used in a multi-region, Standard Storage is appropriate for data that is accessed around the world, such as serving website content, streaming videos, executing interactive workloads, or serving data for mobile and gaming applications.

Nearline Storage is a low-cost, highly durable storage service for infrequently accessed data such as data backups, long-tail multimedia content, and archives. Nearline Storage is a better choice than Standard Storage in scenarios where slightly lower availability, a 30-day minimum storage duration, and costs for data access are acceptable trade-offs for lower at-rest storage costs.

Coldline Storage is a very-low-cost, highly durable storage service for infrequently accessed data. Coldline Storage is a better choice than Standard Storage or Nearline Storage in scenarios where slightly lower availability, a 90-day minimum storage duration, and higher costs for data access are acceptable trade-offs for even lower at-rest storage costs.

Archive Storage is the lowest-cost, highly durable storage service for data archiving, online backup, and disaster recovery. Unlike the so-to-speak "coldest" storage services offered by other cloud providers, your data is available within milliseconds, not hours or days. Archive Storage also has higher costs for data access and operations, as well as a 365-day minimum storage duration. Archive Storage is the best choice for data that you plan to access less than once a year.

Let's focus on durability and availability. All of these storage classes have 11 nines of durability, but what does that mean?
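To make the storage classes concrete, here is a sketch of creating buckets with different default storage classes using the gcloud storage CLI mentioned later in this lesson. The bucket names and locations are placeholders (bucket names must be globally unique), and the commands assume an authenticated gcloud session with an active project:

```shell
# Regional bucket with the Standard class, for "hot" data
# co-located with compute resources (placeholder name/region)
gcloud storage buckets create gs://my-hot-data-bucket \
    --location=europe-west2 \
    --default-storage-class=STANDARD

# Multi-region bucket with the Nearline class, for backups
# accessed less than once a month
gcloud storage buckets create gs://my-backup-bucket \
    --location=US \
    --default-storage-class=NEARLINE
```

Recall that the location type is fixed at creation time, so the choice between region, dual-region, and multi-region has to be made here, even though the default storage class can be changed later.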
Does that mean you have access to your files at all times? No. It means you won't lose data, though you may not always be able to access it. It's like going to your bank: your money is in there, safe and durable, but when the bank is closed you can't get to it. That access is availability, and it is what differs between storage classes and location types.

Cloud Storage is organized around a few core concepts. First, there are buckets, which are required to have a globally unique name and cannot be nested. The data you put into those buckets consists of objects, which inherit the storage class of the bucket; those objects could be text files, doc files, video files, and so on. There is no minimum size for objects, and you can scale as much as you want, as long as your quota allows it. To access the data, you can use the gcloud storage command, or either the JSON or XML APIs.

When you upload an object to a bucket, the object is assigned the bucket's storage class, unless you specify a storage class for the object. You can change the default storage class of a bucket, but you can't change the location type from regional to multi-region/dual-region or vice versa. You can also change the storage class of an object that already exists in your bucket without moving the object to a different bucket or changing its URL. Setting a per-object storage class is useful, for example, when you have objects in your bucket that you want to keep but don't expect to access frequently. In that case, you can minimize costs by changing the storage class of those specific objects to Nearline, Coldline, or Archive Storage. To help manage the classes of objects in your bucket, Cloud Storage offers Object Lifecycle Management. More on that later.

Let's look at access control for the objects and buckets that are part of a project.
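As a preview of Object Lifecycle Management, here is a sketch of a lifecycle configuration that automates the per-object class changes described above: objects move to Nearline after 30 days and to Coldline after 90. The age thresholds are illustrative choices, and you would apply the file to a bucket with something like `gcloud storage buckets update gs://BUCKET --lifecycle-file=lifecycle.json`:

```json
{
  "rule": [
    {
      "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
      "condition": {"age": 30, "matchesStorageClass": ["STANDARD"]}
    },
    {
      "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
      "condition": {"age": 90, "matchesStorageClass": ["NEARLINE"]}
    }
  ]
}
```

Note that the minimum storage durations still apply: moving an object to Nearline starts its 30-day minimum, so rules that demote objects too aggressively can cost more than they save.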
We can use IAM for the project to control which individual user or service account can see the bucket, list the objects in the bucket, view the names of the objects in the bucket, or create new buckets. For most purposes, IAM is sufficient, and roles are inherited from project to bucket to object. Access control lists, or ACLs, offer finer control. For even more detailed control, signed URLs provide a cryptographic key that gives time-limited access to a bucket or object. Finally, a signed policy document further refines the control by determining what kind of file can be uploaded by someone with a signed URL.

Let's take a closer look at ACLs and signed URLs. An ACL is a mechanism you use to define who has access to your buckets and objects, as well as what level of access they have. The maximum number of ACL entries you can create for a bucket or object is 100. Each ACL consists of one or more entries, and each entry consists of two pieces of information: a scope, which defines who can perform the specified actions (for example, a specific user or group of users), and a permission, which defines what actions can be performed (for example, read or write). The allUsers identifier represents anyone who is on the internet, with or without a Google account. The allAuthenticatedUsers identifier, in contrast, represents anyone who is authenticated with a Google account. For more information on ACLs, refer to the links accompanying this video.

For some applications, it is easier and more efficient to grant limited-time access tokens that can be used by any user, instead of using account-based authentication for controlling resource access (for example, when you don't want to require users to have a Google account). Signed URLs allow you to do this for Cloud Storage. You create a URL that grants read or write access to a specific Cloud Storage resource and specifies when the access expires.
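In practice, one way to mint such a URL is with the gcloud CLI. This is a sketch: the bucket, object, and key-file path are placeholders, and it assumes you have downloaded a private key for a service account that has read access to the object:

```shell
# Generate a URL granting read access to a single object for 10 minutes;
# key.json is a placeholder path to a service-account private key file
gcloud storage sign-url gs://my-bucket/report.pdf \
    --duration=10m \
    --private-key-file=key.json
```

Anyone holding the printed URL can fetch the object until the duration elapses, with no Google account required.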
That URL is signed using a private key associated with a service account. When the request is received, Cloud Storage can verify that the access-granting URL was issued on behalf of a trusted security principal, in this case the service account, and delegates its trust of that account to the holder of the URL. After you give out the signed URL, it is out of your control. So you want the signed URL to expire after some reasonable amount of time.
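The trust model behind signed URLs can be sketched in a few lines of Python. This is a conceptual illustration using a plain HMAC, not Cloud Storage's actual V4 signing algorithm (which builds a canonical request and signs it with a service-account key); the secret, resource path, and expiry values below are all made up:

```python
import hashlib
import hmac
import time

SECRET_KEY = b"example-service-account-secret"  # stand-in for a private key


def sign_url(resource: str, expires_at: int) -> str:
    """Return a URL carrying an expiry timestamp and a signature over it."""
    payload = f"{resource}?expires={expires_at}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return f"{resource}?expires={expires_at}&signature={signature}"


def verify_url(url: str, now: int) -> bool:
    """Accept the URL only if the signature matches and it hasn't expired."""
    base, _, signature = url.rpartition("&signature=")
    expected = hmac.new(SECRET_KEY, base.encode(), hashlib.sha256).hexdigest()
    _resource, _, expires = base.rpartition("?expires=")
    return hmac.compare_digest(signature, expected) and now < int(expires)


url = sign_url("/my-bucket/report.pdf", expires_at=int(time.time()) + 600)
print(verify_url(url, now=int(time.time())))         # valid right now
print(verify_url(url, now=int(time.time()) + 3600))  # expired an hour later
```

The server never stores the URL: it re-derives the signature from its private key and checks the embedded expiry, which is exactly why a leaked URL stays usable until it expires and why short expiry windows matter.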

2. Let's practice!
