Cloud computing may seem like a walk in the park: it takes just a few clicks to set up a working application environment. However, things can get difficult quickly once network complexities, firewalls, load balancers, and similar components come into play. Of course, it all depends on your perspective and the project you are working on. Sounds obvious, right?
These complexities often lead to mistakes when working in the cloud. I have identified five common “accidents at work,” and over the years I've developed a set of practices that help me avoid them. Read on to make sure you don't make these mistakes, and if you sometimes do, learn how to deal with them.
Cloud mistake 1: Lack of budget control
As I mainly work with cloud computing projects, I have developed a specific way of doing things and try to stick to the standards developed by my team. One of them is the budget and log configuration for the AWS account. It is usually done at the beginning of the project and adjusted depending on the client's needs and the project size.
Forgetting about the budget is quite common. At the beginning of a project, the development team focuses on building, scaling, and polishing the application's features, and the budget often becomes an afterthought. Why is this unacceptable? At some point, you will have to explain the totals on the invoice, and things will get nasty.
Storing logs in CloudWatch is one of the things that can generate additional costs. A misconfigured development environment that logs too much can cost significantly more than initially assumed, because a large volume of logs is expensive to ingest and store. Always configure logging so that only the necessary information is recorded.
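One way to keep log volume in check is to tie the log level to the environment. The sketch below uses only Python's standard `logging` module; the environment names and the `LOG_LEVELS` mapping are my own assumptions for illustration, not an AWS convention.

```python
import logging

# Hypothetical mapping from environment name to log level; adjust to your project.
LOG_LEVELS = {
    "development": logging.DEBUG,
    "staging": logging.INFO,
    "production": logging.WARNING,
}


def configure_logger(environment: str, name: str = "app") -> logging.Logger:
    """Return a logger whose level depends on the environment."""
    logger = logging.getLogger(name)
    # Unknown environments fall back to WARNING, the quietest level in the mapping.
    logger.setLevel(LOG_LEVELS.get(environment, logging.WARNING))
    return logger


logger = configure_logger("production")
logger.debug("full request payload")          # filtered out, never reaches the log store
logger.warning("payment provider timed out")  # emitted
```

Messages below the configured level are dropped before they leave the application, so they never reach the log store.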
In addition, set up relevant notifications informing the development team whenever they're about to reach the assumed budget threshold.
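As a rough illustration, a budget with an email alert can be described as a plain dictionary shaped like the AWS Budgets `CreateBudget` request. The account ID, budget name, limit, threshold, and email address below are placeholders, and the helper `build_budget_request` is a hypothetical name of mine.

```python
def build_budget_request(account_id, name, monthly_limit_usd, alert_percent, email):
    """Build keyword arguments shaped like the AWS Budgets CreateBudget request."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": str(monthly_limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                # Notify when actual spend exceeds alert_percent of the limit.
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": float(alert_percent),
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
            }
        ],
    }


# Placeholder values for illustration only.
request = build_budget_request("123456789012", "dev-monthly", 200, 80, "team@example.com")
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("budgets").create_budget(**request)
```

Keeping the request as data makes it easy to review the limit and threshold with the client before anything is created on the account.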
Cloud mistake 2: IAM accesses are not verified
IAM is a service that acts as the first layer of security on an AWS account. You should pay close attention to what the policies contain and follow specific standards, especially the principle of least privilege: grant each user or role only the access rights it strictly needs.
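A simple review aid, sketched under my own assumptions, is a script that flags policy statements with wildcards. The function `find_wildcard_statements` and the sample policy (bucket name included) are hypothetical; a real review should also cover conditions, principals, and managed policies.

```python
def find_wildcard_statements(policy: dict) -> list:
    """Return Allow statements that use "*" in Action or as the whole Resource."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    def as_list(value):
        return [value] if isinstance(value, str) else list(value or [])

    flagged = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = as_list(statement.get("Action"))
        resources = as_list(statement.get("Resource"))
        if any("*" in action for action in actions) or any(r == "*" for r in resources):
            flagged.append(statement)
    return flagged


# Hypothetical policy: the first statement is narrow, the second is too broad.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "ReadReports", "Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::example-reports/*"},
        {"Sid": "TooBroad", "Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    ],
}
print([statement["Sid"] for statement in find_wildcard_statements(policy)])  # prints ['TooBroad']
```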
Verifying access rights is especially important when people take days off. Always make sure that the person standing in for a colleague gets appropriate access rights for that period and can configure things (e.g., S3 buckets) in the absence of the account owner. If the stand-in lacks adequate access rights and you need to act quickly, you may end up creating a new resource and configuring it from scratch.
Cloud mistake 3: Sending test emails to unverified addresses
Configuring the email service in the AWS cloud is simple: verify the domain from which you want to send messages, then contact AWS Support, who will change your account status from sandbox to production.
When working on a new application and testing it thoroughly by sending emails, remember not to send them to unverified addresses. A sandbox account has limited functionality and lets you send emails only to verified addresses. Sending messages to made-up test addresses can damage your sender reputation: once you exceed the critical number of “bounces,” the account will be blocked. I also recommend setting up relevant alerts and monitoring.
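A minimal sketch of both safeguards: filter test recipients against a verified list before sending, and classify the bounce rate for alerting. The helper names, the sample addresses, and the 5% and 10% thresholds are my assumptions; check the current figures in the email service documentation before relying on them.

```python
# Assumed thresholds for illustration; verify them against current AWS documentation.
BOUNCE_WARNING_RATE = 0.05
BOUNCE_CRITICAL_RATE = 0.10


def allowed_recipients(recipients, verified):
    """Keep only recipients that are on the verified list (case-insensitive)."""
    verified_lower = {address.lower() for address in verified}
    return [r for r in recipients if r.lower() in verified_lower]


def bounce_status(sent: int, bounced: int) -> str:
    """Classify the bounce rate as 'ok', 'warning', or 'critical'."""
    if sent == 0:
        return "ok"
    rate = bounced / sent
    if rate >= BOUNCE_CRITICAL_RATE:
        return "critical"
    if rate >= BOUNCE_WARNING_RATE:
        return "warning"
    return "ok"


# Hypothetical addresses: only the verified one survives the filter.
to_send = allowed_recipients(["qa@example.com", "random@test.invalid"], ["QA@example.com"])
status = bounce_status(sent=200, bounced=12)  # 6% bounce rate
```

Running the filter in the test suite itself means a typo in a test address fails locally instead of counting as a bounce.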
Cloud mistake 4: Lack of database encryption
I often hear that teams forget to turn on database encryption. It is an optional parameter. When following the infrastructure as code approach, the cloud services you create are described in Terraform code (Terraform is a tool for defining cloud infrastructure as code). Remember to turn on encryption when preparing the database configuration, because it will not turn on automatically.
According to AWS best practices and recommendations, working with encrypted data is not enough: encryption should also be enabled on every newly created database. Unfortunately, this option cannot be enabled on an existing database. Fortunately, there is a workaround.
To “enable” database encryption:
— create a database backup (snapshot),
— start a new database with the encryption option enabled,
— change the name of the original database,
— change the name of the database created from the snapshot to the name of the original database.
With that, services that connect to the database by its original name need no configuration changes and will refer to the new, encrypted database.
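The steps above can be sketched as an ordered plan of API calls. This assumes the database runs on Amazon RDS and that encryption is applied by making an encrypted copy of the snapshot before restoring it, which is one common way to do it rather than the only one. The operation names mirror boto3 RDS client methods; the identifiers, the `-plain`, `-encrypted`, `-new`, and `-old` suffixes, and the KMS key alias are placeholders of mine.

```python
def encryption_migration_plan(db_id: str, kms_key_id: str) -> list:
    """Return (operation, kwargs) pairs named after boto3 RDS client methods."""
    plain_snapshot = f"{db_id}-plain"
    encrypted_snapshot = f"{db_id}-encrypted"
    new_db = f"{db_id}-new"
    return [
        # 1. Back up the existing, unencrypted database.
        ("create_db_snapshot",
         {"DBInstanceIdentifier": db_id, "DBSnapshotIdentifier": plain_snapshot}),
        # 2. Copy the snapshot with a KMS key, which produces an encrypted snapshot.
        ("copy_db_snapshot",
         {"SourceDBSnapshotIdentifier": plain_snapshot,
          "TargetDBSnapshotIdentifier": encrypted_snapshot,
          "KmsKeyId": kms_key_id}),
        # 3. Start a new database from the encrypted snapshot.
        ("restore_db_instance_from_db_snapshot",
         {"DBInstanceIdentifier": new_db, "DBSnapshotIdentifier": encrypted_snapshot}),
        # 4. Rename the original database out of the way.
        ("modify_db_instance",
         {"DBInstanceIdentifier": db_id, "NewDBInstanceIdentifier": f"{db_id}-old",
          "ApplyImmediately": True}),
        # 5. Give the new database the original name.
        ("modify_db_instance",
         {"DBInstanceIdentifier": new_db, "NewDBInstanceIdentifier": db_id,
          "ApplyImmediately": True}),
    ]


# Placeholder identifiers for illustration only.
plan = encryption_migration_plan("orders-db", "alias/aws/rds")
for operation, kwargs in plan:
    print(operation, kwargs)
```

Each step has to finish before the next one starts, so a real script would wait for the snapshot and the instances to become available between calls. The sketch only prints the plan and leaves the waiting and execution out.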
Cloud mistake 5: Insufficient space on the Kubernetes cluster
When scaling your applications, you may face the problem of insufficient resources on the Kubernetes cluster. If you do not monitor the project properly, you may discover that the cluster is too small only once running applications get the Evicted status (i.e., when there are not enough resources to run specific pods).
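A back-of-the-envelope capacity check can catch this earlier. The sketch below compares the resources pods request with what a node can allocate, using Kubernetes-style quantities such as `500m` and `512Mi`. The function names and sample numbers are hypothetical, only a few common suffixes are parsed, and requests are just a proxy for real usage, so treat the result as an early warning rather than a guarantee.

```python
def parse_cpu(value: str) -> float:
    """Convert a Kubernetes CPU quantity ('500m' or '2') to cores."""
    return float(value[:-1]) / 1000 if value.endswith("m") else float(value)


def parse_memory(value: str) -> int:
    """Convert a memory quantity with a Ki/Mi/Gi suffix to bytes."""
    units = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}
    for suffix, factor in units.items():
        if value.endswith(suffix):
            return int(float(value[: -len(suffix)]) * factor)
    return int(value)  # no suffix: plain bytes


def headroom(allocatable: dict, pod_requests: list) -> dict:
    """Return the fraction of allocatable CPU and memory not yet requested."""
    cpu_total = parse_cpu(allocatable["cpu"])
    mem_total = parse_memory(allocatable["memory"])
    cpu_used = sum(parse_cpu(pod["cpu"]) for pod in pod_requests)
    mem_used = sum(parse_memory(pod["memory"]) for pod in pod_requests)
    return {
        "cpu": 1 - cpu_used / cpu_total,
        "memory": 1 - mem_used / mem_total,
    }


# Hypothetical node with 2 cores and 4Gi, running two pods.
sample_pods = [{"cpu": "500m", "memory": "512Mi"}, {"cpu": "1", "memory": "1Gi"}]
free = headroom({"cpu": "2", "memory": "4Gi"}, sample_pods)
print(free)
```

Alerting when either fraction drops below an agreed margin gives the team time to resize the cluster before pods are evicted.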
It may also happen that deployments take extremely long and GitLab reports “timeouts” when applications take up a lot of space. You can increase the timeout so that building and uploading a new version of the application has more time to finish. However, this is not a good solution in the long run. The better fix is to optimize the Docker images you build for your application: remove unnecessary dependencies and duplicates, and reduce the image size. These changes speed up the deployment process and help pods get from Evicted back to Running.
I hope you found my list of mistakes and ways to handle them useful. If you have questions about AWS programming, feel free to send a message to hi@rst.software and we will get back to you with answers.