Recently we went through the process of reviewing our storage strategies. A big project we undertook was to archive all our data that fit the “archive” description. We have data that is legally required to be stored but has a low likelihood of being accessed again.
In the past, we used AWS Glacier for its low cost, but the pricing of Azure’s Archive tier has motivated us to move to it as our solution.
We are hoping that this article, and a few more we will be publishing soon, will save you a lot of time if you are looking to do the same.
Azure Blob storage offers three access tiers: hot, cool and archive.
The hot tier is for data that you wish to access more frequently. It has higher storage costs but lower costs to interact with the storage. It also does not have a minimum data retention period, meaning you can delete the data or move it to another tier without paying a penalty.
The cool tier is for data that is accessed less frequently but is not typical archive data. It has a lower storage cost but higher transaction and access costs. This tier also carries a minimum retention period of 30 days.
Azure Archive has the lowest cost. The data is slow to retrieve as it’s classified as “offline”. The retention period for this tier is 180 days.
The cool and archive tiers have 30-day and 180-day retention periods respectively. This means that if you delete, move or otherwise change the data before the retention period has passed, counted from the date of creation or of the last change (for instance a change of tier), you get penalized. You will pay what you would have paid had the retention period been respected. As a real-world example, if you upload data to the cool tier and after 4 days decide to change the tier, delete or move the data, you will be charged for 26 days at the cool tier (the 30-day retention period minus the 4 days already elapsed), plus whatever new costs your action incurs. For archive, this includes accessing the data, which is called rehydrating.
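The arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the retention periods are the ones quoted in this article, and prorating a monthly per-GB price over 30 days is an assumption rather than Azure's exact billing formula.

```python
from datetime import date

# Minimum retention periods in days, as quoted in this article.
MIN_RETENTION_DAYS = {"hot": 0, "cool": 30, "archive": 180}


def early_deletion_days(tier: str, created: date, changed: date) -> int:
    """Return how many days are still billed when data leaves `tier` early."""
    days_stored = (changed - created).days
    if days_stored < 0:
        raise ValueError("changed must not be earlier than created")
    # Once the retention period has passed, nothing extra is owed.
    return max(MIN_RETENTION_DAYS[tier] - days_stored, 0)


def early_deletion_charge(tier: str, created: date, changed: date,
                          size_gb: float, price_per_gb_month: float) -> float:
    """Estimate the penalty, prorating the monthly price over 30 days (assumption)."""
    remaining = early_deletion_days(tier, created, changed)
    return size_gb * price_per_gb_month * remaining / 30
```

Calling `early_deletion_days("cool", date(2024, 1, 1), date(2024, 1, 5))` gives 26, matching the example above. The price passed to `early_deletion_charge` is a placeholder you would supply from the current pricing page.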
It is therefore important to keep this in mind when designing your data storage strategy. The archive tier really should only be used for data that you are pretty sure you will not be accessing within 180 days. Another strategy we employed was to keep the data in smaller retrievable silos. If we only need a 30 MB file, we do not want to retrieve a 1 TB compressed folder.
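One possible way to plan such silos is to group files into bundles that stay under a size limit before compressing and uploading each bundle separately. The sketch below is not the exact method we used; `plan_bundles` and its parameters are hypothetical names, and the greedy grouping is just one simple choice.

```python
def plan_bundles(files, max_bundle_bytes):
    """Greedily group (name, size_in_bytes) pairs into bundles.

    Each bundle stays at or under max_bundle_bytes, except that a single
    file larger than the limit is placed in a bundle of its own.
    """
    bundles = []
    current, current_size = [], 0
    for name, size in files:
        # Start a new bundle when adding this file would exceed the limit.
        if current and current_size + size > max_bundle_bytes:
            bundles.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        bundles.append(current)
    return bundles
```

Each resulting bundle can then be archived as its own blob, so retrieving one small file only requires rehydrating the bundle that contains it.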
At the time of writing this article and using unreserved, publicly available costing as a guide.