One of the most fundamental aspects of any storage solution is durability: how well is your data protected against loss or corruption? That can feel especially important in a cloud environment. Cloud Storage has been designed for at least 99.999999999% annual durability, or 11 nines. That means that even with one billion objects, you would likely go 100 years without losing a single one!
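The arithmetic behind that claim can be checked with a short sketch. It assumes, purely for illustration, that the 11 nines target can be read as an independent per-object annual loss probability of 1 in 10^11; the function names are hypothetical and this is not a model of how Cloud Storage actually measures durability.

```python
from fractions import Fraction


def annual_loss_probability(nines: int) -> Fraction:
    """Per-object chance of loss in one year for a durability of `nines` nines."""
    return Fraction(1, 10 ** nines)


def expected_losses_per_year(object_count: int, nines: int) -> Fraction:
    """Expected number of objects lost per year (assumes independent losses)."""
    return object_count * annual_loss_probability(nines)


def years_per_expected_loss(object_count: int, nines: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_losses_per_year(object_count, nines)


if __name__ == "__main__":
    objects = 1_000_000_000  # one billion objects
    print(expected_losses_per_year(objects, 11))  # 1/100 of an object per year
    print(years_per_expected_loss(objects, 11))   # 100 years
```

Exact fractions are used instead of floats so that the tiny loss probability is not distorted by rounding.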

We take achieving our durability targets very seriously. In this post, we’ll explore the top ways we protect Cloud Storage data. At the same time, data protection is ultimately a shared responsibility (the most common cause of data loss is accidental deletion by a user or storage administrator), so we’ll provide best practices to help protect your data against risks like natural disasters and user errors.

Physical durability

Most people think about durability in the context of protecting against network, server, and storage hardware failures.

At Google, our philosophy is that software is ultimately the best way to protect against hardware failures. This allows us to achieve higher reliability at an attractive cost, instead of depending on exotic hardware solutions. We assume hardware will fail all the time, because it does! But that doesn’t mean durability has to suffer.

To store an object in Cloud Storage, we break it up into a number of ‘data chunks’, which we place on different servers with different power sources. We also create a number of ‘code chunks’ for redundancy. In the event of a hardware failure (e.g., server, disk), we use data and code chunks to reconstruct the entire object. This technique is called erasure coding. In addition, we store several copies of the metadata needed to find and read the object, so that if one or more metadata servers fail, we can continue to access the object.
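To make the idea of data chunks and code chunks concrete, here is a deliberately simplified sketch. It uses a single XOR parity chunk, which can repair the loss of only one chunk; this post does not say which encodings Cloud Storage uses, and production systems generally rely on stronger codes with several code chunks. All function names are hypothetical.

```python
from typing import List, Optional


def split_into_chunks(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size data chunks, zero-padding the tail."""
    chunk_size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_size * k, b"\0")
    return [padded[i * chunk_size:(i + 1) * chunk_size] for i in range(k)]


def xor_chunks(chunks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length chunks."""
    result = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, byte in enumerate(chunk):
            result[i] ^= byte
    return bytes(result)


def encode(data: bytes, k: int) -> List[bytes]:
    """Return k data chunks followed by one XOR parity (code) chunk."""
    data_chunks = split_into_chunks(data, k)
    return data_chunks + [xor_chunks(data_chunks)]


def reconstruct(chunks: List[Optional[bytes]], original_length: int) -> bytes:
    """Rebuild the object when at most one chunk (marked None) is lost."""
    missing = [i for i, chunk in enumerate(chunks) if chunk is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair at most one lost chunk")
    repaired = list(chunks)
    if missing:
        survivors = [chunk for chunk in chunks if chunk is not None]
        # XOR of all surviving chunks equals the missing chunk.
        repaired[missing[0]] = xor_chunks(survivors)
    # Drop the parity chunk and the zero padding.
    return b"".join(repaired[:-1])[:original_length]


if __name__ == "__main__":
    payload = b"hello, durable world"
    stored: List[Optional[bytes]] = list(encode(payload, k=4))
    stored[2] = None  # simulate a failed disk holding one chunk
    assert reconstruct(stored, len(payload)) == payload
```

In a real deployment each chunk would live on a different server; here a list index stands in for a server.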

The key requirement here is that we always store data redundantly across multiple availability zones before a write is acknowledged as successful. The encodings we use provide sufficient redundancy to support a target of more than 11 nines of durability against a hardware failure. Once stored, we regularly verify checksums to guard data at rest from certain types of data errors. In the case of a checksum mismatch, data is automatically repaired using the redundancy present in our encodings.

Best practice: use dual-region or multi-region locations

These layers of protection against physical durability risks are well and good, but they may not protect against substantial physical destruction of a region: think acts of war, an asteroid hit, or other large-scale disasters.

Cloud Storage’s 11 nines durability target applies to a single region. To go further and protect against natural disasters that could wipe out an entire region, consider storing your most important data in dual-region or multi-region buckets. These buckets automatically ensure redundancy of your data across geographic regions. Using these buckets requires no additional configuration or API changes to your applications, while providing added durability against very rare, but potentially catastrophic, events. As an added benefit, these location types also come with significantly higher availability SLAs, because we can transparently serve your objects from more than one location if a region is temporarily inaccessible.

Durability in transit

Another class of durability risks concerns corruption of data in transit. This could be data transferred across networks within the Cloud Storage service itself or when uploading or downloading objects to/from Cloud Storage.

To protect against this source of corruption, data in transit within Cloud Storage is designed to be always checksum-protected, without exception. In the case of a checksum-validation error, the request is automatically retried, or an error is returned, depending on the circumstances.

Best practice: use checksums for uploads and downloads

While Google Cloud checksums all Cloud Storage objects that travel within our service, to achieve end-to-end protection, we recommend that you provide checksums when you upload your data to Cloud Storage, and validate those checksums on the client when you download an object.
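A minimal sketch of the client-side half of that recommendation follows. It computes a base64-encoded MD5 digest, the form in which Cloud Storage represents MD5 hashes in object metadata, using only the Python standard library; Cloud Storage also supports CRC32C, which is not in the standard library and is omitted here. The actual upload and download calls are left out, and the helper names are hypothetical.

```python
import base64
import hashlib


def md5_base64(data: bytes) -> str:
    """Base64-encoded MD5 digest of the payload."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def matches_checksum(data: bytes, expected_md5_b64: str) -> bool:
    """True when the received bytes hash to the expected digest."""
    return md5_base64(data) == expected_md5_b64


def verify_download(data: bytes, expected_md5_b64: str) -> bytes:
    """Return the data unchanged, or raise if it was corrupted in transit."""
    if not matches_checksum(data, expected_md5_b64):
        raise ValueError("checksum mismatch: retry the download")
    return data


if __name__ == "__main__":
    original = b"quarterly-report contents"
    digest = md5_base64(original)  # compute before upload and send with the request

    # Later, after downloading: compare against the digest recorded at upload time.
    assert verify_download(original, digest) == original
    assert not matches_checksum(b"quarterly-report c0ntents", digest)
```

The point of computing the digest before the upload is that it then covers the entire path, including the client's own network hop, rather than only the segments inside the service.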

Human-induced durability risks

Arguably the biggest risk of data loss is due to human error: not only errors made by us as developers and operators of the service, but also errors made by Cloud Storage users!

Software bugs are potentially the single biggest risk to data durability. To avoid durability loss from software bugs, we take steps to avoid introducing data-corrupting or data-erasing bugs in the first place. We then maintain safeguards to detect these types of bugs quickly, with the aim of catching them before durability degradation becomes durability loss.

To catch bugs up front, we only release a new version of Cloud Storage to production after it passes a large set of integration tests. These include exercising a variety of edge-case failure scenarios, such as an availability zone going down, and comparing the behaviors of data encoding and placement APIs to previous versions to screen for regressions.

Once a new software release is approved, we roll out upgrades in stages by availability zone, starting with a very limited initial area of impact and slowly ramping up until it’s in widespread use. This allows us to catch issues before they have a large impact and while there are still additional copies of data (or a sufficient number of erasure code chunks) from which to recover, if needed. These software rollouts are monitored closely with plans in place for rapid rollbacks, if necessary.

There’s a lot you can do, too, to protect your data from being lost.

Best practice: turn on object versioning

One of the most common sources of data loss is accidental deletion of data by a storage administrator or end-user. When you turn on object versioning, Cloud Storage preserves deleted objects in case you need to restore them at a later time. By configuring Object Lifecycle Management policies, you can limit how long you keep versioned objects before they are permanently deleted in order to better control your storage costs.
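As a rough illustration of such a policy, the sketch below builds a lifecycle configuration that deletes only noncurrent (already replaced or deleted) object versions, either once enough newer versions exist or once they have been noncurrent for a set number of days. The field names (`isLive`, `numNewerVersions`, `daysSinceNoncurrentTime`) follow the Object Lifecycle Management JSON format as we understand it; treat them, the wrapper layout, and the thresholds as assumptions to check against the current documentation before applying anything to a bucket. The function name is hypothetical.

```python
import json
from typing import Any, Dict


def noncurrent_version_cleanup(max_newer_versions: int,
                               max_noncurrent_days: int) -> Dict[str, Any]:
    """Build a lifecycle configuration that deletes only noncurrent versions."""

    def delete_rule(condition: Dict[str, Any]) -> Dict[str, Any]:
        return {"action": {"type": "Delete"}, "condition": condition}

    return {
        "lifecycle": {
            "rule": [
                # Remove a noncurrent version once enough newer versions exist.
                delete_rule({"isLive": False,
                             "numNewerVersions": max_newer_versions}),
                # Remove a noncurrent version after it has aged out.
                delete_rule({"isLive": False,
                             "daysSinceNoncurrentTime": max_noncurrent_days}),
            ]
        }
    }


if __name__ == "__main__":
    # Example thresholds only: keep at most 3 newer versions, or 30 days.
    print(json.dumps(noncurrent_version_cleanup(3, 30), indent=2))
```

Both rules are restricted to noncurrent versions, so the live copy of an object is never touched by this policy.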

Best practice: back up your data

Cloud Storage’s 11-nines durability target does not obviate the need to back up your data. For example, consider what a malicious hacker might do if they obtained access to your Cloud Storage account. Depending on your goals, a backup may be a second copy of your data in another region or cloud, on-premises, or even physically isolated with an air gap on tape or disk.

Best practice: use data access retention policies and audit logs

For long-term data retention, use the Cloud Storage bucket lock feature to set data retention policies and ensure data is locked for specific periods of time. Doing so prevents accidental modification or deletion and, when combined with data access audit logging, can satisfy regulatory and compliance requirements such as those of FINRA, SEC, and CFTC, as well as certain health care industry retention regulations.

Best practice: use role-based access control policies

You can limit the blast radius of malicious hackers and accidental deletions by ensuring that IAM data access control policies follow the principles of separation of duties and least privilege. For example, separate those with the ability to create buckets from those who can delete projects.

Encryption keys and durability

All Cloud Storage data is designed to always be encrypted at rest and in transit within the cloud. Because objects are unreadable without their encryption keys, the loss of encryption keys is a significant risk to durability. After all, what use is highly durable data if you can’t read it? With Cloud Storage, you have three choices for key management: 1) trust Google to manage the encryption keys for you, 2) use Customer Managed Encryption Keys (CMEK) with Cloud KMS, or 3) use Customer Supplied Encryption Keys (CSEK) with an external key server.

Google takes steps similar to those described earlier (including erasure coding and consistency checking) to protect the durability of the encryption keys under its control.

Best practice: safeguard your encryption keys

By choosing either CMEK or CSEK to manage your keys, you are taking direct control of managing your own keys. It is vital in these cases that you also protect your keys in a manner that provides at least 11 nines of durability. For CSEK, this means maintaining off-site backups of your keys so that you have a path to recovery even if your keys are lost or corrupted in some way. If such precautions are not taken, the durability of the encryption keys will determine the durability of the data.

Going beyond 11 nines

Google Cloud takes the responsibility of protecting your data extremely seriously. In practice, the numerous techniques outlined here have allowed Cloud Storage to exceed 11 nines of annual durability to date. Add to that the best practices we shared in this guide, and you’ll help to ensure that your data is here when you need it, whether that be later today or decades in the future. To get started, check out this comprehensive collection of Cloud Storage how-to guides.


Thanks to Dean Hildebrand, Technical Director, Office of the CTO, who is a coauthor of the document on which this post is based.


