Data lifecycle
The sheer volume of data we produce and consume in the modern digital world is hard to comprehend. Consider the flight tracking service Flightradar24. It maps every operating flight across the globe at any point in time, as shown in Figure 10-10, and updates each one’s position at frequent intervals. To provide this experience, it collects and processes data from multiple satellites, radars, airlines, airports, and other sources.
Figure 10-10. A map of all aircraft in operation globally at a particular time (source:
As your applications are swamped by more data than they can digest, you need policies and adequate processes as part of the data lifecycle, as shown in Figure 10-11. You often hear businesses equate digital data to gold. The key difference between data gold and real gold is that not all digital data remains gold forever. Once the data has been processed and the insights extracted, much of it turns into data dust that is useless to anyone. If you neglect it and don’t deal with it in time, the dust settles and slowly forms data-waste dunes that offer no value. They do, however, have a cost to the business and, crucially, they pose a hazard to a sustainable environment.
Figure 10-11. A typical data lifecycle, from creation to destruction (source: adapted from an image on the Blancco website)
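One way to turn such a policy into something enforceable is to express it as a storage lifecycle rule. The following is a minimal sketch, assuming the data lives in an Amazon S3 bucket; the function name `build_lifecycle_rules`, the rule ID, the prefix `raw-telemetry/`, and the day counts are hypothetical values chosen for illustration. The function only builds the configuration dictionary and does not contact AWS.

```python
def build_lifecycle_rules(prefix, archive_after_days, delete_after_days):
    """Build a dict shaped like an S3 bucket lifecycle configuration.

    Objects under `prefix` move to an archival storage class after
    `archive_after_days` and are deleted after `delete_after_days`.
    """
    if delete_after_days <= archive_after_days:
        raise ValueError("delete_after_days must be greater than archive_after_days")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move aging data to a colder, cheaper storage class.
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                # Destroy the data once it no longer has value.
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


config = build_lifecycle_rules("raw-telemetry/", 30, 365)

# Applying the rule needs boto3 and AWS credentials, so it is left as a comment:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-bucket",  # hypothetical bucket name
#     LifecycleConfiguration=config,
# )
```

The right thresholds depend on how long your data keeps its value. The point of the sketch is that archival and destruction can be automated rather than left to periodic manual cleanup.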
Sustainability patterns for data and storage
As data goes through its lifecycle, you can apply patterns and practices at every phase to aid sustainability in the cloud. The following sections provide some tips.
Select a suitable data store for your data and access patterns. As mentioned in Chapter 1, there are many types of data stores available (object storage, key/value, relational, document, graph, etc.). Each type has its purpose, and choosing the one that best fits your data and access patterns is key. By offering a variety of data storage services, AWS has already optimized their operation for several factors, including sustainability. It is your responsibility to choose the right one and avoid a “square peg in a round hole” situation.
Amazon S3 is a high-performing object store. It is not designed to work efficiently with structured data or to perform the data operations you would normally run in a relational database.
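To make the matching exercise concrete, here is a rough sketch of a lookup from access pattern to data store type. The access-pattern labels and the `suggest_store` helper are hypothetical, and the AWS services named alongside each store type are illustrative examples rather than recommendations from this chapter.

```python
# Hypothetical mapping from a dominant access pattern to a data store type.
STORE_BY_ACCESS_PATTERN = {
    "large_blobs": "object storage (e.g., Amazon S3)",
    "lookup_by_key": "key/value (e.g., Amazon DynamoDB)",
    "joins_and_transactions": "relational (e.g., Amazon RDS)",
    "nested_json_documents": "document (e.g., Amazon DocumentDB)",
    "relationship_traversal": "graph (e.g., Amazon Neptune)",
}


def suggest_store(access_pattern):
    """Return the store type suggested for the given access pattern."""
    try:
        return STORE_BY_ACCESS_PATTERN[access_pattern]
    except KeyError:
        known = ", ".join(sorted(STORE_BY_ACCESS_PATTERN))
        raise ValueError(
            f"Unknown access pattern {access_pattern!r}; expected one of: {known}"
        ) from None


print(suggest_store("joins_and_transactions"))
```

Real workloads rarely fit a single label, so treat a table like this as a starting point for discussion, not as a decision procedure.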