
Following AWS Simple Storage Service (S3) best practices

Written by Gary Wilson | Feb 20, 2019 6:30:00 PM

Amazon’s S3 (Simple Storage Service) is a valuable storage option for developers of all backgrounds and experience levels. It is straightforward to use and has many benefits: you can store both large and small amounts of data whenever it is convenient to you, the infrastructure is fully scalable and reliable, and it is a cost-effective choice.

You will get the most out of Amazon’s Simple Storage Service by being aware of, and implementing, the recommended best practices, which we look at here. For further reading, see more about S3 best practices once you have finished this article.

Performance best practice guide

1. Be careful not to overload the system

You can scale the AWS Simple Storage Service as needed, and the system is more than capable of handling extremely high request rates; buckets are partitioned automatically to support this. There are limits, however, so if you expect to sustain more than eight hundred GET requests or more than three hundred PUT/LIST/DELETE requests per second, it is wise to open a support request first. Ignoring this could mean a temporary limit on your request rate.
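If your request rate does get throttled, the practical fix is to retry with backoff rather than hammering the bucket. Below is a minimal sketch (not from the article) using boto3’s retry configuration; the bucket and key names are placeholders.

```python
# Sketch: let boto3 retry throttled S3 requests automatically.
# "standard" retry mode backs off exponentially on SlowDown/throttling errors.
import boto3
from botocore.config import Config

s3 = boto3.client(
    "s3",
    config=Config(retries={"max_attempts": 10, "mode": "standard"}),
)

# Placeholder bucket/key; a throttled PUT will be retried before failing.
s3.put_object(
    Bucket="my-example-bucket",
    Key="logs/2019/02/20/event.json",
    Body=b"{}",
)
```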

2. Have an organized system

If your workload includes many different types of requests, you must pick suitable key names for each object. This matters because S3 maintains an index of object key names in each AWS region, kept in alphabetical order, and each object key is stored in one of many partitions determined by its key name. An organized naming scheme is crucial if anything is to be located easily.
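One common way to keep key names from piling up in a single partition is to spread them with a short hashed prefix. The helper below is a minimal, hypothetical sketch; the key names are purely illustrative.

```python
# Sketch: prefix each key with a few hex characters of its hash so that
# sequential names (dates, counters) spread across S3's key-name partitions.
import hashlib

def partitioned_key(base_key: str) -> str:
    """Return the key prefixed with the first 4 hex chars of its MD5 hash."""
    prefix = hashlib.md5(base_key.encode("utf-8")).hexdigest()[:4]
    return f"{prefix}/{base_key}"

# Prints something like "<4 hex chars>/2019-02-20/uploads/report-0001.csv"
print(partitioned_key("2019-02-20/uploads/report-0001.csv"))
```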

3. Prepare for failure

It’s not the end of the world if anything fails, but it’s important to know how to recover when you need to. For example:

  • Favor multipart uploads for PUTs: parts can be uploaded in parallel, and should anything fail, only that isolated part needs to be uploaded again.
  • GETs benefit from using a Range HTTP header: if a download fails, only the affected byte range needs to be identified and downloaded a second time (see the sketch after this list).
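Here is a minimal sketch of both patterns using boto3; the file, bucket and key names are placeholders, not taken from the article.

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Multipart PUT: upload_file splits files larger than multipart_threshold
# into parts, so a failed part can be retried without re-sending everything.
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,  # 8 MB
    multipart_chunksize=8 * 1024 * 1024,
)
s3.upload_file("backup.tar.gz", "my-example-bucket",
               "backups/backup.tar.gz", Config=config)

# Ranged GET: fetch only the first 1 MB; if it fails, re-request just this
# range instead of downloading the whole object again.
resp = s3.get_object(
    Bucket="my-example-bucket",
    Key="backups/backup.tar.gz",
    Range="bytes=0-1048575",
)
first_chunk = resp["Body"].read()
```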

4. Be up-to-date with security

Reduce the risk of losing data through overwrites or accidental deletions by enabling Versioning, and there are added bonuses to be had: you can retrieve old versions at will, and find and restore anything that was deleted, accidentally or not. However, it is very important to note that buckets themselves must be backed up regularly, as a deleted bucket cannot be recovered, regardless of why it was deleted.
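Turning Versioning on is a one-call operation. The sketch below uses boto3 with a placeholder bucket name and prefix.

```python
# Sketch: enable Versioning on a bucket, then list stored versions of objects.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_versioning(
    Bucket="my-example-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)

# Every overwrite or delete now leaves prior versions (and delete markers)
# that can be listed and restored.
versions = s3.list_object_versions(Bucket="my-example-bucket",
                                   Prefix="reports/")
for v in versions.get("Versions", []):
    print(v["Key"], v["VersionId"], v["IsLatest"])
```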

5. Have a tracking plan in place

It’s easy enough to organize: make use of Event Notifications (so you get a heads-up on PUT or DELETE requests), CloudTrail (which records API calls in log files), and CloudWatch (good for keeping an eye on what your S3 buckets are doing, and for monitoring metrics such as object counts and how many bytes are currently stored).
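As a rough illustration of two of those hooks, the sketch below configures a bucket notification and reads a CloudWatch storage metric with boto3. The bucket name, SNS topic ARN and time window are placeholders, not values from the article.

```python
# Sketch: (1) send an SNS notification on object create/remove events,
# (2) read the daily object-count metric that S3 publishes to CloudWatch.
from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client("s3")
cloudwatch = boto3.client("cloudwatch")

# Event Notification: publish to an SNS topic on PUTs and DELETEs.
s3.put_bucket_notification_configuration(
    Bucket="my-example-bucket",
    NotificationConfiguration={
        "TopicConfigurations": [{
            "TopicArn": "arn:aws:sns:us-east-1:123456789012:s3-events",
            "Events": ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
        }]
    },
)

# CloudWatch storage metric: number of objects in the bucket, per day.
now = datetime.now(timezone.utc)
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/S3",
    MetricName="NumberOfObjects",
    Dimensions=[
        {"Name": "BucketName", "Value": "my-example-bucket"},
        {"Name": "StorageType", "Value": "AllStorageTypes"},
    ],
    StartTime=now - timedelta(days=2),
    EndTime=now,
    Period=86400,
    Statistics=["Average"],
)
print(resp["Datapoints"])
```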

 

Be aware that there are many alternative storage solutions on the market that can help you store and access data quickly for real-time applications or online services. When picking a high-quality storage solution, the best approach is to test different services across multiple CDN instances; this gives you a broad spread of data from which to work out exactly what performs best for your users.