
CLOUD, TECHNOLOGIES

05.02.2021 - Read in 18 min.

Benefits and challenges connected with serverless architecture

Thinking about using serverless architecture in your current or upcoming project? Maybe you have heard about it or about some other company using it? Regardless of what you know about serverless, this (extended) introduction based on AWS will provide some new information.

This text is inspired by my lecture during RST CodeMeetings #15. If you prefer a video version where I present a sample app step-by-step, here is a YouTube video:
“What your friend forgot to tell you about serverless?”.

What is serverless, and what is it definitely not?

First of all, serverless is a buzzword: a catchy term that seems to describe something, but in a rather peculiar way. After all, how can you have an app or an entire system without a server? Despite what the name suggests, there is a server. The whole serverless idea is that you as a developer can focus on the value delivered by your app. You don’t have to wonder whether and how hosts should be scaled with the app, whether the database can cope with the Black Friday traffic, or whether logs will fill up your disk. So, the “serverless” name basically means that you “care less about the server”.

Contrary to appearances, the concept is not new, and even though most people associate it with the Lambda service introduced by AWS in late 2014, its history begins at least a few years earlier. The name came into circulation around 2012, and the things it describes in fact evolved from Platform-as-a-Service solutions, such as Heroku or Google App Engine (part of Google Cloud Platform), which had gained popularity a few years before that.

Lambda from AWS, which I mentioned earlier, is an example of a Function-as-a-Service solution: you write the code, create a package, send it to your service hosted by the provider, and there you go, your app is deployed. So, serverless is not entirely what the name suggests. You don’t have to bother about the server, but the code still runs in a specific runtime environment: a specific version of Node or Python on a specific system image. Because you’re not responsible for managing it, the provider (e.g. AWS) can surprise you and remove a runtime version they consider obsolete (that’s what they did with Node 8). What’s more, system image updates can also cause issues that negatively impact the app. And if your functions aren’t “warmed up”, i.e. they haven’t been used for some time, restoring or setting up a runtime environment can take half a second on average, depending on the technology. This so-called cold start can also affect end users. Luckily, there are ways to deal with the cold start.
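One common way to soften the cost of a cold start is to do expensive setup once per execution environment instead of on every invocation. The sketch below only illustrates the idea: `createClient`, `getClient` and the event shape are hypothetical names, not taken from the sample app, and the "client" is a stand-in for something like an SDK client.

```typescript
// Hypothetical expensive dependency, e.g. an SDK client or parsed configuration.
interface Client {
  lookup(id: string): string;
}

let initCount = 0; // counts how many times the expensive setup ran
let cachedClient: Client | undefined;

// Stand-in for slow setup work that we want to pay for only once.
function createClient(): Client {
  initCount += 1;
  return { lookup: (id: string) => `status-for-${id}` };
}

// Module-scope state survives between invocations handled by the same
// (warm) execution environment, so the setup runs only after a cold start.
function getClient(): Client {
  if (cachedClient === undefined) {
    cachedClient = createClient();
  }
  return cachedClient;
}

function handler(event: { id: string }): string {
  return getClient().lookup(event.id);
}

const first = handler({ id: "123" }); // simulated cold invocation
const second = handler({ id: "456" }); // simulated warm invocation
```

After both calls the setup has run only once, which is the effect you want when the same environment serves consecutive requests.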

 

Some say that comparing serverless solutions to LEGO bricks is OK… provided that you’re 5 years old.

However, this analogy is not entirely bad, as Function-as-a-Service is not the only serverless service – there’s more, and you can (and should) combine them into blocks even in basic apps.

And so, in AWS we use S3 for storing files. To store data in structures resembling databases, we can use e.g. DynamoDB or Aurora Serverless. To present the app to the world, we use API Gateway. To integrate system components using queues and publish-subscribe mechanisms, SQS and SNS services can be used. If you can’t live without cron, you can use EventBridge, and CloudWatch can take care of logs.

Honestly speaking, neither serverless nor cloud solutions are monopolised by AWS, and similar services can be found in clouds from Google or Microsoft Azure. If you’re not afraid of serverless PHP, you may want to consider Function Compute from Alibaba Cloud, where PHP 7 is a first-class citizen, unlike Lambda from AWS.

Benefits and their price

What is the point of it all? Why learn new technologies and migrate systems to serverless services? There are several benefits at different levels.

  • a niche technology with a growing demand – benefits for a developer,
  • a perfect tool for prototyping (not only for start-ups!),
  • minimisation of time-to-market,
  • minimisation of operational costs (mostly maintenance),
  • focus on values delivered by the software.

As a developer, you’re entering a niche domain. We may have missed the early-adopter stage, but in my opinion we’re at the beginning of the early majority stage. Serverless competences are still a rarity among developers, but some trends indicate that the demand will grow significantly in the upcoming years. Gaining these competences will enhance your CV and help you get promoted, but there’s one more thing: if your dream is to found a start-up or become an independent developer, this will open new possibilities in developing prototypes, and then final applications, for the various side projects that sometimes come to mind. I am currently working on a project that should require as little system maintenance as possible, and serverless fits perfectly here.

From the business perspective, the arguments are even stronger: learning the necessary basics will allow you to achieve a shorter time-to-market for new apps and drastically lower operational costs for existing systems by carving out their less efficient subsystems and processes. Netflix, for instance, uses Lambda functions to help with backups and deployments.

By using ready-made services (e.g. for authenticating users or sending e-mails), you can shift development effort to what makes your business stand out: building a competitive advantage and reacting more quickly to the situation on the market.

Scalability. This is interesting for developers but, sooner or later, it also affects business – whether they like it or not. One of the advantages of serverless is that scalability is inherent in the definition of architecture. Your work may be cut down to adjusting parameters for specific needs.

Of course, the cloud may be quite expensive, but everything depends on the specific solution you use. With serverless this is especially visible, because the pay-as-you-go model, an argument commonly cited by e.g. AWS, is central here. In practice, it means that costs scale along with the resources behind your application, from zero upwards with no built-in ceiling. You don’t pay for the time an instance is running, but for e.g. the number of HTTP requests to API Gateway, the execution time of functions in Lambda, individual writes and reads in DynamoDB, gigabytes transferred from S3, etc.
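The arithmetic behind pay-as-you-go can be sketched in a few lines. The prices below are illustrative placeholders, not actual AWS list prices, and the function covers only request count and execution time; a real estimate would also include storage, data transfer and database operations.

```typescript
// Rough pay-as-you-go estimate. All prices are ILLUSTRATIVE PLACEHOLDERS,
// not real AWS prices; check the provider's pricing pages for actual values.
interface UsagePrices {
  perMillionRequests: number; // price per 1M API requests (hypothetical)
  perGbSecond: number; // price per GB-second of function execution (hypothetical)
}

interface MonthlyUsage {
  requests: number; // number of HTTP requests in the month
  avgDurationMs: number; // average function run time per request
  memoryMb: number; // memory configured for the function
}

function estimateMonthlyCost(usage: MonthlyUsage, prices: UsagePrices): number {
  const requestCost = (usage.requests / 1000000) * prices.perMillionRequests;
  // Execution is billed by time multiplied by configured memory.
  const gbSeconds =
    usage.requests * (usage.avgDurationMs / 1000) * (usage.memoryMb / 1024);
  return requestCost + gbSeconds * prices.perGbSecond;
}

const examplePrices: UsagePrices = { perMillionRequests: 1.0, perGbSecond: 0.00002 };

// No traffic means no usage-based cost at all.
const idleCost = estimateMonthlyCost(
  { requests: 0, avgDurationMs: 200, memoryMb: 512 },
  examplePrices
);

// Two million requests of 200 ms each at 512 MB.
const busyCost = estimateMonthlyCost(
  { requests: 2000000, avgDurationMs: 200, memoryMb: 512 },
  examplePrices
);
```

With these made-up prices the idle month costs nothing and the busy one comes to about 6 currency units, which shows both ends of the "from zero upwards" scale.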

AWS doesn’t just wait for your bill to explode and then rub its hands: you can monitor current costs and set alerts for when spending exceeds a limit, and the scaling mechanisms can also, to some extent, reduce the use of resources if they detect that it may result from an error.

Applications… and restrictions

Serverless is useful in various applications – backend for web and mobile apps, systems that process larger data packages in the background with the use of event streams, and backend for the Internet of Things.

 

Is there something that can’t be done in serverless? I would guess yes, there must be projects in which this approach might be cost-ineffective. It may happen that in your specific case, with your plans, requirements, and limitations, a dedicated server will have to be created and as a result a kitten will die somewhere – there’s nothing wrong with that. Remember to always match tools with the problem, not the other way round.

Despite that, I’m sure that many projects, especially green-field projects, will benefit greatly from designing the architecture as serverless from the very beginning.

 

Examples of use

We should mainly think about serverless as an event-driven architecture. After all, we have a highly distributed environment, so services communicate with one another using events triggered by a change in data, an HTTP request, etc.

It’s best to show it using a concrete example. I have prepared a serverless app to show you basic concepts and initial issues you may stumble across.

It’s a rather basic app, but it’s not Hello World basic. Let’s imagine a large system – a product used daily by tens of thousands of users – that should integrate with a legacy system containing information e.g. whether a given company is financially sound and reliable to cooperate with, or on the contrary – it’s indebted and poses a threat to contractors. There’s a single server, but the app is not scalable, and the rig fights for its life every day when the janitor arrives. You are tasked with creating a REST API to be used by this larger product, and you need to ensure its performance so as not to kill the legacy system. You cannot switch responsibility for data onto the new service, as the other system is the primary working tool for the customer service and legal teams, and many other people. That’s not easy, but after reading this article you’ll at least try to design it as a serverless architecture.

In functional terms, this looks easy. REST API with a single endpoint where each GET for a company with a specific ID receives simple data like “reliable company” or “indebted”, with the last change of its status.

The infrastructure underneath may look e.g. like this:

[Figure: serverless infrastructure diagram]

API Gateway handles the HTTP layer and forwards each GET request to the REST API as an event to Lambda. To avoid excessive response times (our data source is slow and unpredictable), the function communicates exclusively with DynamoDB. For each specific project you need to decide what to do when there is no data about a given company, and how often this data should be updated. For the sake of this example, we can decide that in such a situation a 5xx error is acceptable (e.g. 502 Bad Gateway), but at the same time a message should be put in the SQS queue to be handled asynchronously by another Lambda function, which will fetch data from the external API. Once it has the data, it saves it in DynamoDB along with the fetch date and time. That’s it.

Have a look at the code or watch the video where I discuss and run the solution.

Framework

The very first thing to remember when working with serverless: use a framework. I’m not talking about a framework for the app itself, with controllers, etc., but rather about a framework for managing resources in the cloud.

Managing the infrastructure via the AWS console in your browser (e.g. clicking to create new resources, uploading app code, etc.) makes sense only at the beginning, when you are getting to know a new service or serverless itself. Alternatively, you can manage everything from the command line using the AWS CLI. That is already a step up, as you can write scripts and include them in the CI/CD process.

Nonetheless, a framework is a readymade set of scripts – prepared by somebody else and tested by the community. Here, we have a framework with a hardly original name, Serverless Framework. This is not an official AWS framework – they have their own called SAM, Serverless Application Model, but this one is in theory independent from the cloud provider, and the community keeps creating useful plugins.

There’s a plethora of frameworks to choose from – like Terraform, which can also be used for serverless architecture, but it’s not as comfortable.

 

 

Stack

A group of resources comprising a specific service within an app is called a stack. You can have different services within the app, but it may also turn out that the whole app serves as one big service.

Of course, in case of a cloud the stack is no longer agnostic, because we have to indicate somewhere that we are interested in AWS.

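functions:
  # Mock of the slow legacy data source, exposed over HTTP via API Gateway
  unreliableDataSource:
    handler: functions/dataSource.default
    events:
      - http:
          path: slow-data-source/financial-risk/{id}
          method: get
    timeout: 5  # seconds; the mock sleeps for 2 to 7 s, so some calls time out
  # Endpoint exposed to the integrating system; reads from DynamoDB only
  httpCheckCompany:
    handler: functions/financialRisk.apiCheckCompany
    events:
      - http:
          path: financial-risk/{id}
          method: get
  # Queue consumer: fetches fresh data from the data source and stores it
  asyncCompanyFetch:
    handler: functions/financialRisk.queueFetchCompany
    events:
      - sqs:
          arn: !GetAtt DataFetchQueue.Arn
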

Here, we have our Lambda functions.

There are three of them, even though I previously mentioned two, because I implement the aforementioned external API (the unstable source of data) as a dummy Lambda function hidden behind API Gateway. Lambda functions alone are not available via HTTP and can be invoked only via the AWS API, SDK or CLI. To make them available via HTTP, we first have to set up API Gateway. In Serverless Framework, we do that by declaring that a specific function is triggered by an http event. You only need to specify the request method and the path of the endpoint, and that’s it. The full URL will be known after deployment to the chosen environment.

Since I already have my environment, I put the full URL as an environment variable above. This IS NOT the official way to invoke one lambda with the use of another lambda. However, remember that in this case we want to model the legacy system as a separate API available via HTTP, which is why we’re doing it this way.

For this mock-up, I’ve also set a timeout so that the function is terminated if it runs longer than 5 seconds. This does happen from time to time, because the function code sleeps for a random 2 to 7 seconds.

Then, we have our proper functions – one responsible for endpoint exposed to the system that will integrate with us, and the other triggered by events from SQS queue.

Below, we have resource declarations that the framework will not create by itself – we have to create a table in DynamoDB and a queue in SQS, following the CloudFormation syntax.

 

Let’s get back to the top, where we have our declarations of environment variables. There are different approaches to this, and different plugins to help out and integrate e.g. with the dotenv library that you may already know. In this project I’ve used the basic approach.
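On the function side, it helps to read and validate such variables in one place, so that a missing setting fails loudly instead of surfacing as an obscure error later. A minimal sketch, assuming hypothetical variable names (`COMPANIES_TABLE`, `DATA_FETCH_QUEUE_URL`, `DATA_SOURCE_URL`); the sample app may name them differently.

```typescript
// Configuration the functions need. The variable names used below are
// hypothetical and only illustrate the pattern.
interface AppConfig {
  tableName: string;
  queueUrl: string;
  dataSourceUrl: string;
}

type Env = { [key: string]: string | undefined };

// Returns the value of a required variable or throws a descriptive error.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// The environment is passed in explicitly, which keeps this easy to unit test.
function loadConfig(env: Env): AppConfig {
  return {
    tableName: requireVar(env, "COMPANIES_TABLE"),
    queueUrl: requireVar(env, "DATA_FETCH_QUEUE_URL"),
    dataSourceUrl: requireVar(env, "DATA_SOURCE_URL"),
  };
}
```

In a Lambda handler module this would typically be called once with `process.env`; in a unit test you can pass a plain object instead.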

 

Another extremely important thing is security! This is elementary: do not hard-code secrets in the code, and use AWS roles, preferably one per function. By default, Serverless Framework uses one role for all functions of a service, but there’s a plugin to change that. I just decided not to use it. However, remember that it’s a good idea to follow the “least privilege” principle and give your function access only to the things it needs to operate.

Deployment

OK, let’s assume that after the last change of code I did not conduct deployment. I enter the console and simply type “serverless deploy” or “sls deploy” – the latter is just a shorter version of the same command.

It may take a minute or two, and the fact that I write code in TypeScript, which has to be transpiled to JavaScript, also adds to the delay. For the sake of this recording, I’ll speed things up a bit.

The changes have been deployed. Serverless Framework took care of the things that had to be deleted, added, or left untouched during the deployment, and provided us with, among other things, the HTTP endpoints that trigger our Lambda functions.

Deployment to a different environment? Imagine that this is only a matter of adding the “-s” parameter (as in stage) with a value (e.g. “rc”). That is why in some parts of the service description you’ll find references to the stage value, for instance in table or queue names.

The default environment is indicated in YAML, and even though it can’t be seen, it is there and it has the “dev” value. Similarly, I didn’t indicate any AWS region as it was irrelevant to me, but it can be selected either in serverless.yml or by using a parameter during deployment.

The framework allows us to clean things up and delete the entire stack in a selected stage using a single “remove” command.

Time to run it. I’ll take the address of our only relevant endpoint and I’ll shoot a GET at it. Next, I’ll neatly format the JSON I get in return.

I’ll use the ID 123 as this is one of the two IDs hard coded in our function – a data source mock-up.

Since the DynamoDB table is still empty, at the beginning we’ll receive the 502 error, as per the contract. Meanwhile, the other Lambda requests data from our data source in the background. If this operation succeeds (and we already know that success is not guaranteed), one of the subsequent requests, a few seconds later, will return current data.

 

To check if the function handling queue events was indeed executed successfully, we can use the “sls logs -f asyncCompanyFetch” command.

 

Functions

At this point it’s a good time to have a look at what our functions actually look like. Mind that the code is not entirely correct. That’s how it’s supposed to be.

Have a look at what’s happening here. We identify a function that accepts a specific event type, fetches required input data, and connects with DynamoDB to fetch the data. Additionally, I have added a logic to verify if the data was entered during the last hour. If not, or if there’s no data, we send a message to SQS to schedule the download of fresh data from an external service.

Finally, the function returns a JSON with some response code, depending on the actual situation.
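The decision logic described above can be sketched as a pure function, separate from any AWS calls. This is not the code from the sample app: the record shape, the `decide` name, and the choice to return stale data with a 200 while scheduling a refresh are all assumptions made for illustration.

```typescript
// Assumed shape of an item stored in DynamoDB.
interface CompanyRecord {
  id: string;
  status: string; // e.g. "reliable" or "indebted"
  fetchedAt: string; // ISO 8601 timestamp of the last successful fetch
}

interface Decision {
  statusCode: number;
  body: string; // JSON payload returned to the caller
  scheduleRefresh: boolean; // whether a message should be sent to SQS
}

const MAX_AGE_MS = 60 * 60 * 1000; // data older than one hour counts as stale

function decide(id: string, record: CompanyRecord | undefined, now: Date): Decision {
  if (record === undefined) {
    // No data yet: report an error and ask for a background fetch.
    return {
      statusCode: 502,
      body: JSON.stringify({ id: id, error: "No data available yet" }),
      scheduleRefresh: true,
    };
  }
  const ageMs = now.getTime() - new Date(record.fetchedAt).getTime();
  // Assumption: stale data is still returned, but a refresh is scheduled.
  return {
    statusCode: 200,
    body: JSON.stringify(record),
    scheduleRefresh: ageMs > MAX_AGE_MS,
  };
}
```

The handler itself would then only read the item, call `decide`, send the SQS message when `scheduleRefresh` is true, and return the status code and body.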

 

The other function.

I’ll tell you right away that there’s a bug here. Even though SQS can send a batch of different events to the function, I take the easy way here and use only the first event for the sake of this example.

Why is this a critical bug? By sending a package of messages, SQS expects that either all of them are handled correctly (and can be removed from the queue), or all return to the queue, even if only a single event failed. In this case, I can have several events with different IDs of different companies, but I want to fetch data only for one of the companies. If the request to an external API is successful, the remaining events that I didn’t even take into consideration are lost.

 

The way in which SQS works brings us to yet another very important thing. When writing your functions, you need to ensure their idempotence, which means that invoking them again with the same input must not change the result or the app state beyond what the first invocation did. You have to be prepared for a certain event to occur again, even though it was already handled in the past.
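One way to get idempotent writes is to key them by company ID and apply them only when the incoming data is newer than what is already stored. The sketch below uses an in-memory class as a stand-in for the DynamoDB table; with the real service, a conditional write could play the same role. The record shape and method names are assumptions.

```typescript
interface CompanyRecord {
  id: string;
  status: string;
  fetchedAt: string; // ISO 8601 UTC timestamp, so string comparison orders by time
}

// In-memory stand-in for the DynamoDB table, for illustration only.
class InMemoryTable {
  private items = new Map<string, CompanyRecord>();

  get(id: string): CompanyRecord | undefined {
    return this.items.get(id);
  }

  // Writes the record only if nothing newer (or equally new) is stored.
  // Handling the same event twice therefore leaves the state unchanged.
  putIfNewer(record: CompanyRecord): boolean {
    const existing = this.items.get(record.id);
    if (existing !== undefined && existing.fetchedAt >= record.fetchedAt) {
      return false;
    }
    this.items.set(record.id, record);
    return true;
  }

  size(): number {
    return this.items.size;
  }
}

const table = new InMemoryTable();
const current: CompanyRecord = { id: "123", status: "reliable", fetchedAt: "2021-02-05T12:00:00Z" };
const firstWrite = table.putIfNewer(current); // applied
const duplicateWrite = table.putIfNewer(current); // same event delivered again: ignored
const lateWrite = table.putIfNewer({ id: "123", status: "indebted", fetchedAt: "2021-02-05T11:00:00Z" }); // older event arriving late: ignored
```

The duplicate and the late event both leave the stored record untouched, so a redelivered message cannot overwrite newer data.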

 

Then, we fetch data from the API and check if the request was successful. It’s important to verify that and return an error, as only then such a message can go back to the queue and be re-sent. Depending on the settings, it can be re-sent a few times and then deleted or, instead of being deleted, it can be placed in a separate queue of failed messages, the so-called Dead Letter Queue (DLQ). Messages from this queue can be consumed just like any other messages, but this depends on the developer and the actual deployment.
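A version of the queue handler without the "first event only" shortcut could look roughly like this. It is a sketch, not the code from the repository: the event type is reduced to the fields used here, the message body format (`{"companyId": ...}`) is an assumption, and `fetchCompany` and `save` are hypothetical injected functions standing in for the HTTP call and the DynamoDB write.

```typescript
// Simplified view of the event Lambda receives from SQS.
interface QueueRecord {
  messageId: string;
  body: string;
}
interface QueueEvent {
  Records: QueueRecord[];
}

// Hypothetical dependencies injected into the handler.
interface FetchResult {
  ok: boolean;
  status?: string;
}
type FetchCompany = (companyId: string) => Promise<FetchResult>;
type SaveCompany = (companyId: string, status: string) => Promise<void>;

async function handleBatch(
  event: QueueEvent,
  fetchCompany: FetchCompany,
  save: SaveCompany
): Promise<number> {
  let saved = 0;
  // Every record in the batch is handled, not just the first one.
  for (const record of event.Records) {
    const message = JSON.parse(record.body) as { companyId: string };
    const result = await fetchCompany(message.companyId);
    if (!result.ok || result.status === undefined) {
      // Throwing fails the invocation, so the batch goes back to the queue
      // and can be retried or eventually moved to the DLQ.
      throw new Error(`Fetch failed for company ${message.companyId}`);
    }
    await save(message.companyId, result.status);
    saved += 1;
  }
  return saved;
}
```

Because a failure returns the whole batch, records that were already saved will be processed again on retry, which is one more reason to keep the write idempotent.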

 

Finally, we use putItem to save the data in DynamoDB.

 

The biggest challenge in serverless

Now, if I were on the other side, I’d ask: but what about tests?!

There are no tests because such code cannot be tested. I left it in this form on purpose, as I wanted to talk more about this.

 

In general, tests in serverless are quite a challenge. Any ideas why?

Besides the fact that the code breaches the rules of proper programming, more importantly, it is entirely dependent on the cloud. You can’t run it with no Internet access. You can’t run it without AWS.

 

When I began working with serverless, I asked my colleagues how functions should be tested. Somebody recommended localstack, one of many tools that try to implement AWS services (e.g. DynamoDB) locally or in a Docker container of some sort. There are people who find this approach useful, but it also has its fierce opponents. Not to mention the fact that using such localstacks (e.g. within CI/CD) means having some ops, whereas serverless promised us NoOps in the first place.

What is actually tested this way? Definitely not the things that are run in the cloud later on. Maybe these implementations operate in an identical way to AWS services, or maybe there’s an edge case in which they differ.

And then there’s an even more important thing. Whenever something doesn’t work, it’s usually caused by the lack of access, not by the request sent to DynamoDB. Incorrectly set up roles, missing access rights, or access rights limited by permissions boundary of a given function.

I am not entirely against using such tools – you can play with them locally during the deployment – but reasonable tests should be conducted in a different manner. How?

 

First of all, by dividing the code into smaller parts and testing them in isolation, as you have already seen. Architectural patterns (e.g. hexagonal architecture) as well as the SOLID approach (or the S itself, i.e. Single Responsibility Principle) may be helpful here. I have seen an approach that slightly changes the traditional pyramid of testing for the needs of serverless. Depending on your imagination, it may be a honeycomb or a somewhat odd egg.

Write e2e tests, write unit tests, but most importantly test the integration of the things that communicate with AWS services. How to do that if localstack doesn’t do the trick? Well, you have the cloud, so use it! You have already seen how easy it is to conduct a deployment in a new environment – so create such an environment (e.g. test). Or, if you find it reasonable, you can even create different environments per feature branch or per developer. The goal is to test the integration with AWS in AWS, as this is the only way to capture all the nuances specific to this particular provider.

 

The code you have seen is just a separate branch for discussing problems with testing.
The master – apologies that it’s not main – already contains a bit more code.

 

This is not the only way to break things apart. To me, it was reasonable to use CQRS and split saving and fetching state into commands and queries. My CQRS implementation is rather basic, though: we could use a better message bus than my simple one, plus something to perform dependency injection.

The more important part is to separate the business logic into code units that can be covered by unit tests. Anything that has to communicate with a service can be placed behind a separate interface and mocked. The actual implementation of that interface should be kept separate and integration tested.
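The separation described above can be sketched with two small interfaces and in-memory test doubles. All names here (`CompanyStore`, `RefreshQueue`, `checkCompany`) are hypothetical and not taken from the repository. The interfaces are kept synchronous for brevity, whereas real adapters wrapping the AWS SDK would return Promises.

```typescript
interface CompanyRecord {
  id: string;
  status: string;
}

// Ports: the only things the business logic knows about the outside world.
interface CompanyStore {
  find(id: string): CompanyRecord | undefined;
}
interface RefreshQueue {
  requestRefresh(id: string): void;
}

interface CheckResult {
  statusCode: number;
  status?: string;
}

// Use case: depends on the ports, not on DynamoDB or SQS directly.
function checkCompany(id: string, store: CompanyStore, queue: RefreshQueue): CheckResult {
  const record = store.find(id);
  if (record === undefined) {
    queue.requestRefresh(id);
    return { statusCode: 502 };
  }
  return { statusCode: 200, status: record.status };
}

// Test doubles used in unit tests instead of the cloud services.
class FakeStore implements CompanyStore {
  private records: CompanyRecord[];
  constructor(records: CompanyRecord[]) {
    this.records = records;
  }
  find(id: string): CompanyRecord | undefined {
    for (const record of this.records) {
      if (record.id === id) return record;
    }
    return undefined;
  }
}

class RecordingQueue implements RefreshQueue {
  requested: string[] = [];
  requestRefresh(id: string): void {
    this.requested.push(id);
  }
}
```

A unit test then builds a `FakeStore` and a `RecordingQueue`, calls `checkCompany`, and asserts on the result and on `requested`, with no network access involved. The DynamoDB and SQS implementations of the two interfaces are what the integration tests in a real AWS stage would cover.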

 

Tests are an extensive topic and you may want to read more about it.

 

Tests also mean that you will need debugging. How to debug Lambda code? Locally, you can run functions against your own mock events; both Serverless Framework and AWS SAM have an “invoke” command for that. If you need debugging in the cloud for things that cannot be replicated locally, there are dedicated tools for that: AWS offers X-Ray, but there are also other options, e.g. Lumigo or Thundra.

 

Are there any other challenges while working with serverless apps? Sure, a whole lot of them.

  • Lambda functions have a specified maximum run time – and currently you cannot set the timeout to more than 15 minutes. It seems sufficient, but it also means that you won’t be able to apply those functions in certain scenarios. Function timeout is one thing, but when your Lambda is behind API Gateway, you must know that it also has its own timeout of 30s.
  • When you synchronously invoke one Lambda from another using SDK, remember that the SDK uses HTTP API, and the 200 response code does not mean that the function was invoked correctly. Watch the response field called FunctionError.
  • Cold start – Lambda is run in a certain environment. If a specific function hasn’t been used for some time, the environment is put to sleep. Since it has to be woken up, the first request usually takes longer. To handle that, functions used to be “warmed up”, e.g. by invoking them periodically. Since late 2019, this is often no longer necessary, as AWS introduced provisioned concurrency, i.e. the option to keep a specified number of instances of a given function initialized.
  • You should also consider service limits – for instance, the maximum number of concurrent function invocations or the maximum number of operations in a given time window (e.g. in Textract, used to recognise text in images).
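The point about FunctionError from the list above can be sketched as a small helper. The `InvokeResult` interface is a simplified assumption that mirrors only the relevant fields of the SDK’s Invoke response; in particular, the payload is assumed to be already decoded to a JSON string, which is not how the SDK delivers it.

```typescript
// Simplified view of a synchronous Invoke response (assumed shape).
interface InvokeResult {
  StatusCode?: number;
  FunctionError?: string; // set when the invoked function itself failed
  Payload?: string; // assumed to be already decoded to a JSON string
}

// Returns the parsed payload, or throws if the invocation or the function failed.
function unwrapInvokeResult<T>(result: InvokeResult): T {
  if (result.FunctionError !== undefined) {
    // A 200 status code only means the invoke request was accepted and run.
    const details = result.Payload === undefined ? "" : result.Payload;
    throw new Error(`Invoked function failed (${result.FunctionError}): ${details}`);
  }
  if (result.StatusCode !== 200) {
    throw new Error(`Unexpected status code: ${result.StatusCode}`);
  }
  if (result.Payload === undefined) {
    throw new Error("Invocation returned no payload");
  }
  return JSON.parse(result.Payload) as T;
}
```

Calling it on a response with `StatusCode: 200` and `FunctionError: "Unhandled"` throws, even though the status code alone looks like a success.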

Finally, I should mention the issue of vendor lock-in. AWS, for instance, focuses on retaining its customers, and that’s fine, as it has many useful services. But what if we have to change the provider? In such cases, it matters how the app was written from the beginning and to what extent it depends on specific services and the way they work.

 

 

The future

I believe serverless is the future. It has its entry threshold – you need to select the provider as well as specific services; it has its own challenges during work, and it requires a different mindset, but the speed of value generation is so appealing that you may want to consider it for your project.

Most probably, it won’t replace traditional hosting, but according to my predictions (and others’ predictions as well), companies will increasingly adopt it, and serverless developers will become desirable on the market. As a consequence, a large portion of knowledge acquired along the years (in my case, it’s over 20 years) on the configuration of WWW servers, RabbitMQ, MySQL, or fine-tuning, will become useless and obsolete.

OK, I may be exaggerating. There’s this concept of “Mechanical Sympathy” – for our industry it means that while working on software solutions, we should make sure they’re in symbiosis with the hardware. Apart from reasonably managing resources, we should also understand all the nuances of how things work underneath. Indeed, serverless makes it a bit easier, but knowing what’s what can help us understand and solve specific edge cases. The main difference is in what we can control, and which tools have been taken away from us. Therefore, you should master the underlying technologies – AWS provides extensive documentation. Read and learn, not least to allow the data centers hosting your apps to cut down the levels of CO2 emissions…

 

If you need more information on serverless, you can find lots of training materials and articles by visiting ServerlessPolska or AWS Training. To learn more about software architecture in general, visit my friends at Better Software Design.

 

To share your experiences or ask a question, please add a comment to this article or contact me using my website – marcinkowski.pl. The video recording of this CodeMeetings lecture can be found on YouTube.

May the serverless be with you!

Tomek Marcinkowski

Developer

Developer to the bone who’s not ashamed of that. He tries to use a pragmatic approach in his work to find the right balance between business needs and the pursuit of technical perfection. He claims that software’s goal is to solve user problems, and not to boost developer’s ego. Instead of fighting holy wars between developers, he focuses on finding solutions with the support of... meditation and yoga.
