S3 Glacier Instant Retrieval deep dive: Which S3 Storage Class is right for me?
S3 Glacier Instant Retrieval acts as an archival solution that has both low costs AND synchronous instant access. Which S3 Storage Class is right for you?
Jun 08, 2023 • 9 Minute Read
In this post, we'll get hands-on with S3 Glacier Instant Retrieval to see how much this new AWS storage class can help you save. Plus, we offer an infographic to help you decide which S3 Storage Class is right for your needs.
AWS re:Invent 2021 came with a whole bunch of news, but one announcement that could easily be missed is the new Amazon S3 Glacier Instant Retrieval storage class.
“We don’t need it now, and we probably won’t for years. But when we need it, we will need it now” is a common user story with storage.
AWS has given us a lot of options over the years, particularly with S3, but the need for immediate access has kept Glacier, and its low costs, out of reach of many companies.
S3 Glacier Instant Retrieval aims to give you the best of both worlds, as an archival solution that has both low costs and synchronous instant access. But is it the right S3 storage class for you?
Test driving S3 Glacier Instant Retrieval
First, let’s see how “instant” S3 Glacier Instant Retrieval truly is.
We’ll begin by creating some new S3 buckets in a few different regions around the world, removing their public access block, and confirming that they’re ready to go:
Now let’s make a random file, and upload two versions to each of the buckets; one using S3 Standard, and one using the new S3 Glacier Instant Retrieval, along with setting the file’s ACL to Public.
Note that aws s3 cp did not support the new storage class at the time; only aws s3api put-object did, and I'm unsure whether that was by omission or intention. If put-object isn't working for you either, you may need to update your AWS CLI.
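As a rough sketch of the upload step described above, the helper below assembles the two aws s3api put-object invocations (one per storage class) as argument lists and prints them as shell lines. The bucket name, object keys, and local file name are placeholders, not the ones used in this test; GLACIER_IR is the value the AWS CLI uses for the S3 Glacier Instant Retrieval storage class.

```python
import shlex


def put_object_command(bucket, key, body, storage_class):
    """Build an `aws s3api put-object` argument list for one upload."""
    return [
        "aws", "s3api", "put-object",
        "--bucket", bucket,
        "--key", key,
        "--body", body,
        "--storage-class", storage_class,
        "--acl", "public-read",  # make the uploaded object publicly readable
    ]


# Placeholder bucket, key, and file names (assumptions for illustration).
commands = [
    put_object_command("example-bucket", "random-standard.bin", "random.bin", "STANDARD"),
    put_object_command("example-bucket", "random-glacier-ir.bin", "random.bin", "GLACIER_IR"),
]

for command in commands:
    # Print a copy-pasteable shell line; pass the list to subprocess.run to execute it.
    print(shlex.join(command))
```

Repeating the pair of commands for each regional bucket gives one Standard object and one Glacier Instant Retrieval object per region.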
And since we’re dealing with a new feature, let’s double-check in the AWS Console to make sure the new storage class was applied as expected:
Now that we have our files in the buckets, let’s pull them back down using curl, and get the total response time from each transfer.
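The measurement here was done with curl. For readers who would rather script it, the sketch below is a rough Python stand-in, not the command used in this post: it times how long it takes to request a URL and read the full body. The function names and the example URLs in the comment are assumptions for illustration.

```python
import time
from urllib.request import urlopen


def time_download(url, timeout=30):
    """Return the seconds taken to request `url` and read the whole body."""
    start = time.perf_counter()
    with urlopen(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def compare(urls):
    """Time each labelled URL once and return {label: seconds}."""
    return {label: time_download(url) for label, url in urls.items()}


# Example with placeholder bucket and key names (not the ones from this post):
# results = compare({
#     "standard":   "https://example-bucket.s3.us-east-1.amazonaws.com/random-standard.bin",
#     "glacier-ir": "https://example-bucket.s3.us-east-1.amazonaws.com/random-glacier-ir.bin",
# })
# for label, seconds in results.items():
#     print(f"{label}: {seconds:.3f}s")
```

A single run per URL is noisy, so averaging a few runs per object gives a fairer comparison between the two storage classes.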
All right, that looks pretty good! Some of the results vary a bit due to my distance from the different regions and normal (Australian) internet variability, but they’re all within the same ballpark.
As for why ap-southeast-2 was noticeably faster, that’s due to my location in Australia, which makes it the nearest region possible (at least until AWS launches the Melbourne region next year).
Cost savings with S3 Glacier Instant Retrieval
Whether you’ll save money using S3 Glacier Instant Retrieval comes down to how you intend to use it.
Let’s do some cost modelling (ooh, exciting!) based purely on storage and retrieval costs to see how that looks. Other costs like data transfer are the same regardless of the storage class. We’ll use the S3 Pricing as of December 2021 in us-east-1.
First, let’s look purely at storage costs, and see how it looks with zero data transfer. We’ll compare with some of the other S3 Storage Classes: S3 Standard, S3 Standard-IA, and S3 Glacier Deep Archive for reference:
Unsurprisingly, Deep Archive remains the cheapest, and Standard the most expensive. But the difference between Standard-IA and Glacier Instant Retrieval is quite remarkable, with a 68% cost reduction, just as the official announcement proclaimed.
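The storage-only comparison can be sketched in a few lines. The per-GB prices below are my assumption of the us-east-1 list prices from December 2021 (they are not quoted in the text above), Standard is priced at its first-tier rate throughout for simplicity, and the one-year period is also an assumption.

```python
# Assumed us-east-1 storage prices in USD per GB-month (December 2021).
# S3 Standard is simplified to its first-tier rate for the whole 100TB.
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.023,
    "S3 Standard-IA": 0.0125,
    "S3 Glacier Instant Retrieval": 0.004,
    "S3 Glacier Deep Archive": 0.00099,
}

STORED_GB = 100 * 1024  # 100TB expressed in GB
MONTHS = 12             # assumed modelling period of one year


def yearly_storage_cost(price_per_gb_month, stored_gb=STORED_GB, months=MONTHS):
    """Storage-only cost: no requests, retrievals, or data transfer."""
    return price_per_gb_month * stored_gb * months


yearly = {name: yearly_storage_cost(price) for name, price in PRICE_PER_GB_MONTH.items()}
for name, cost in yearly.items():
    print(f"{name:<30} ${cost:>10,.2f} per year")

standard_ia = yearly["S3 Standard-IA"]
glacier_ir = yearly["S3 Glacier Instant Retrieval"]
reduction = (standard_ia - glacier_ir) / standard_ia
print(f"Glacier Instant Retrieval vs Standard-IA: {reduction:.0%} cheaper")
```

With these assumed prices, the Standard-IA to Glacier Instant Retrieval reduction works out to the 68% figure mentioned above.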
But the devil is in the details, and the details here are retrieval costs.
Let’s see what this looks like if we retrieve just 1% of our 100TB over a year (broken up as 2,875 files of 1MB each per day), comparing Standard-IA and Glacier Instant Retrieval:
Believe it or not, even pulling down 1TB from our bucket over the course of a year doesn’t change Standard-IA that much (only 0.03%, in fact). Glacier Instant Retrieval, on the other hand, showed a dramatic difference (54.7%).
The retrieval costs for Glacier Instant Retrieval are substantially higher than for Standard-IA, specifically because it’s designed for archival purposes. In fact, retrieval costs per GB are triple the price, and GET requests are a staggering 10 times more expensive.
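To make the two price ratios concrete, here is a minimal sketch. The dollar amounts are my assumption of the us-east-1 list prices from December 2021; only the ratios (triple and 10 times) come from the text above, and the function name is made up for illustration.

```python
# Assumed us-east-1 prices in USD (December 2021).
RETRIEVAL_PER_GB = {"Standard-IA": 0.01, "Glacier Instant Retrieval": 0.03}
GET_PER_1000 = {"Standard-IA": 0.001, "Glacier Instant Retrieval": 0.01}


def retrieval_cost(storage_class, gb_retrieved, get_requests):
    """Per-GB retrieval fee plus GET request fee for one storage class."""
    data_fee = gb_retrieved * RETRIEVAL_PER_GB[storage_class]
    request_fee = get_requests / 1000 * GET_PER_1000[storage_class]
    return data_fee + request_fee


per_gb_ratio = RETRIEVAL_PER_GB["Glacier Instant Retrieval"] / RETRIEVAL_PER_GB["Standard-IA"]
get_ratio = GET_PER_1000["Glacier Instant Retrieval"] / GET_PER_1000["Standard-IA"]
print(f"Per-GB retrieval ratio: {per_gb_ratio:.0f}x")
print(f"GET request ratio:      {get_ratio:.0f}x")

# Worked example: 10GB pulled down as 10,000 separate GET requests.
for storage_class in RETRIEVAL_PER_GB:
    print(f"{storage_class}: ${retrieval_cost(storage_class, 10, 10_000):.2f}")
```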
But even here, downloading such a large amount, we’re still 30% cheaper than Standard-IA! And that’s nothing to sneeze at.
The type of files you’re retrieving also matters. In our last scenario, we were using 5,250 files of 1MB each per day; the size of your average document or PDF. But in another scenario, let’s go with 5 files per day of 1GB in size; the size of a moderate media project. These both equal out to about 1.8TB per year in our 100TB bucket.
We’ll take these two, and compare them again with Standard-IA:
Comparatively, Glacier Instant Retrieval’s retrieval costs scale better with smaller numbers of large files. For the same total transfer sizes here, there’s a difference of nearly 20%. We can also see Glacier Instant Retrieval actually becoming more expensive than Standard-IA.
Again: S3 Glacier Instant Retrieval is an archive, where files aren’t intended for regular access but need to be rapidly available if the need arises. When the objects remain untouched, our cost savings are astounding.
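The reason file size matters is that every object retrieved incurs a GET request fee on top of the per-GB retrieval fee. The sketch below splits the yearly retrieval bill into those two parts for both scenarios. The prices are my assumption of the us-east-1 list prices from December 2021, and this simplified model is not the one behind the charts in this post, so its totals won't necessarily match them.

```python
# Assumed us-east-1 prices in USD (December 2021).
RETRIEVAL_PER_GB = {"Standard-IA": 0.01, "Glacier Instant Retrieval": 0.03}
GET_PER_1000 = {"Standard-IA": 0.001, "Glacier Instant Retrieval": 0.01}
DAYS = 365

# Scenario name -> (files per day, size of each file in GB)
SCENARIOS = {
    "5,250 x 1MB files per day": (5250, 1 / 1024),
    "5 x 1GB files per day": (5, 1.0),
}


def yearly_retrieval_fees(storage_class, files_per_day, gb_per_file, days=DAYS):
    """Return (data_fee, request_fee) in USD for a year of retrievals."""
    requests = files_per_day * days
    data_fee = requests * gb_per_file * RETRIEVAL_PER_GB[storage_class]
    request_fee = requests / 1000 * GET_PER_1000[storage_class]
    return data_fee, request_fee


for scenario, (files_per_day, gb_per_file) in SCENARIOS.items():
    print(scenario)
    for storage_class in RETRIEVAL_PER_GB:
        data_fee, request_fee = yearly_retrieval_fees(storage_class, files_per_day, gb_per_file)
        share = request_fee / (data_fee + request_fee)
        print(f"  {storage_class:<26} data ${data_fee:7.2f}  requests ${request_fee:6.2f}  "
              f"({share:.1%} of retrieval cost)")
```

In this model the request fee is a visible slice of the bill for the many-small-files scenario and close to negligible for the few-large-files one, which is the effect described above.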
Meanwhile, S3 Glacier Deep Archive continues to be its usual astoundingly cheap self, at over 90% off from any of these other solutions, as long as you can afford the delay in retrievals.
The hero we deserve.
What about my use case?
It’s the oldest rule of them all: Understand your business needs.
If you have objects that you need to retain for business reasons, and absolutely have to be on-hand for immediate access, S3 Glacier Instant Retrieval is spot-on for your needs.
If it can wait even a few minutes, S3 Glacier Flexible Retrieval (formerly just S3 Glacier) can continue to save you heaps, and even more if hours or days are an option, where you may use S3 Glacier Deep Archive.
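The rule of thumb in the last two paragraphs can be written down as a tiny helper. This is only a sketch: the function name and the exact cut-offs (one minute and twelve hours) are illustrative assumptions of mine, not AWS guidance, and real decisions should also weigh how often the data is retrieved.

```python
from datetime import timedelta


def suggest_archive_class(tolerable_delay):
    """Map how long a retrieval may take to an S3 Glacier storage class.

    The cut-offs are illustrative assumptions, not official thresholds.
    """
    if tolerable_delay < timedelta(minutes=1):
        return "S3 Glacier Instant Retrieval"   # needed immediately
    if tolerable_delay < timedelta(hours=12):
        return "S3 Glacier Flexible Retrieval"  # can wait minutes to hours
    return "S3 Glacier Deep Archive"            # can wait half a day or more


for delay in (timedelta(0), timedelta(minutes=5), timedelta(days=2)):
    print(f"{str(delay):>16} -> {suggest_archive_class(delay)}")
```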
Sometimes, though, we just have no clue what the actual business needs are, because the business may not know either. And for those cases, we have S3 Intelligent-Tiering.
Chances are AWS knows your data usage patterns better than you do. And we can use this to make our lives easier with S3 Intelligent-Tiering.
This isn’t a new feature, and I won’t go deep into it. But in cases like this, where we’re trying to find the line between using different storage classes like Standard-IA or Glacier Instant Retrieval, we can let AWS manage this for us.
If you’re not quite sure whether your data belongs in Glacier Instant Retrieval, Intelligent-Tiering can take that headache away. And while you do pay a small monthly fee for the convenience ($0.0025 per 1,000 objects), that can easily be offset by the cost savings of having your files in the best storage class.
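To get a feel for how small that fee is, here is a quick sketch using the $0.0025 per 1,000 objects figure above, assuming it is charged per month on every object in the bucket. The function name and the object counts are made up for illustration.

```python
MONITORING_FEE_PER_1000_OBJECTS = 0.0025  # USD, figure quoted in this post


def monthly_monitoring_fee(object_count):
    """Intelligent-Tiering monitoring fee, assuming every object is billed monthly."""
    return object_count / 1000 * MONITORING_FEE_PER_1000_OBJECTS


# Illustrative object counts, not taken from a real bucket.
for count in (100_000, 10_000_000, 1_000_000_000):
    print(f"{count:>13,} objects -> ${monthly_monitoring_fee(count):,.2f} per month")
```

The fee scales with object count rather than with stored bytes, so it matters most for buckets holding very large numbers of small objects.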
Infographic: Which S3 Storage Class is right for me?
No question, S3 Glacier Instant Retrieval can and will save many businesses a lot of money. It does add some new complexity to the question of which storage classes to use. So here’s a handy infographic to help guide that process. (EDIT: Thanks to James Vasanthan on Twitter for pointing out an error in the original!)
One thing that’s also held back the embrace of S3 Glacier is the fear of asynchronous requests, and what that means for our applications. With S3 Glacier Instant Retrieval using the same synchronous APIs as S3 Standard and Standard-IA, no application refactoring is necessary.
For compliance cases, this is a no-brainer. But it’s also excellent for organizations with terabytes of data that they’re confident nobody will access, where the business still insists that “everything ever must be instantly accessible always”, and it saves even more than Standard-IA does.
And that’s a good problem to solve.