“Programming isn’t that hard, it’s just moving data around.” – real quote from one of my college professors

Here at Equinox, we move data around quite a bit. We use several AWS services, including RDS, EC2 instances running databases, ElastiCache (Redis), DynamoDB, S3 and Redshift, to store our data. We also use some homemade applications (Max, Nessie and Kraken) and AWS Glue to move it around, along with Rundeck to schedule when things are moved and when reports are generated.

But the backbone of our data infrastructure is Amazon Redshift (aka Jarvis). Redshift is a data warehouse that we use for everything from collecting data from RDS instances and S3 data lakes to generating daily reports, data visualization (or data artistry if you will… please don’t) and populating the Redis instances that our APIs use to populate dashboards and websites.

But it is also kind of a big spend. So last year we decided to investigate how we were using what we were paying for and how to improve it. It turns out there are a bunch of ways to save money on Redshift without sacrificing performance, and a bunch of built-in features to help you along the way.

Resizing (Too Much, Too Little, Just Right)

So, this one is obviously going to be the big money saver. Overprovisioning of anything in the cloud tends to be common, and it is never questioned until something breaks (or someone looks at the spend).

Resizing started out simple. Our non-production cluster was running 6 ds2.xlarge nodes and barely went over 30% CPU usage. We could resize to 3 ds2.xlarge nodes, barely go over 60% CPU usage, and spend 50% of what we were currently spending.

Our production cluster was a bit more complicated. We started out with 2 (big) dc2.8xlarge nodes that were using around 60% of their CPU, so we decided to try 32 (small) dc2.large nodes. While this looked good on paper (it matched the CPU and storage of the 2 much larger dc2.8xlarge nodes and was half the price), it did not transfer from paper to the real world. Queries that normally took a minute or two were now taking over 10 minutes. Daily reports that go out at the same time every day were now showing up late.
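The back-of-the-envelope math behind the non-production resize (6 ds2.xlarge nodes at roughly 30% CPU going down to 3 nodes) can be sketched in a few lines of Python. This is only an illustration, not the tooling we actually used: the function and class names are made up for this example, and it assumes that load spreads evenly across nodes of the same type and that cost scales linearly with node count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResizeEstimate:
    projected_cpu_pct: float  # estimated peak CPU per node after the resize
    cost_fraction: float      # new spend as a fraction of the current spend


def estimate_resize(current_nodes: int, peak_cpu_pct: float, target_nodes: int) -> ResizeEstimate:
    """Naive estimate for resizing a cluster within the same node type.

    Assumptions (hypothetical, for illustration only): the workload is
    spread evenly across nodes, and cost is proportional to node count.
    """
    if current_nodes <= 0 or target_nodes <= 0:
        raise ValueError("node counts must be positive")
    # Total work expressed in "node-percent", then redistributed.
    total_load = current_nodes * peak_cpu_pct
    return ResizeEstimate(
        projected_cpu_pct=total_load / target_nodes,
        cost_fraction=target_nodes / current_nodes,
    )


# Non-production numbers from the post: 6 nodes peaking around 30% CPU.
non_prod = estimate_resize(current_nodes=6, peak_cpu_pct=30.0, target_nodes=3)
print(non_prod)
```

Keep in mind that this arithmetic only makes sense when the node type stays the same. The production experiment is the cautionary tale: 32 dc2.large nodes matched the CPU and storage of 2 dc2.8xlarge nodes on paper, and queries still got much slower.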