11.4.2025
But sometimes you just want to borrow it for a second or less.
In ye olden days when the need for more computing power arose, hardware had to be ordered and then you had to wait for it to arrive. Somebody had to install the hardware, operating system, and software, and configure everything.
Then it had to be maintained, and it quickly grew obsolete. You know the drill. Then came virtual machines, which allowed instances to be provisioned in a more fine-grained way, but the underlying problem was still there with the host that ran the virtualization. And finally came the cloud providers, and companies could push the problem one more step out of their own doors.
Servers in your servers
For a long time, the standard solution for cloud computing was to run virtualized server instances. While virtualization was also prevalent in on-premises data centers, the added value the cloud providers brought to the table was ease of use and self-service. Instead of endless bureaucracy with the IT department that might get you a virtual machine running in 6-8 weeks, in the cloud you could get a virtual machine of practically any specification up and running in a couple of minutes.
This was a huge step up from the previous solutions, but it still had two nagging problems. First, the capitalist cloud providers wanted you to pay for your instance even when it wasn’t in use, and then the lazy cloud providers didn’t even patch your operating system, etc. Well, perhaps the latter was for the best, but still, there was a lot of stuff you had to do yourself, even if you “just wanted to run some code”, and sometimes only infrequently at that.
Servers in other people’s servers
Sharing computer resources via virtualization is nothing new; IBM and others have been dabbling with it since the ’60s. But it has always been at the whole-machine level: big physical servers sliced into small virtual servers, each of which had to be maintained as a separate machine. But what if you want to focus on what makes your product great, not on system administration? What if you’re doing something that doesn’t need a server running 24/7? Apparently, the cloud providers were thinking along the same lines. Why not offer customers what they need to run their code, without having to manage the infrastructure around it? Customers will take care of their code, the cloud providers will take care of the servers. No servers for the customer to worry about. And thus, Serverless was born.
Sharing servers in other people’s servers with… other people
Serverless comes in many forms, but here we’re focusing on one, maybe the most game-changing of them all: serverless functions (known as Lambda on AWS and Functions on Azure). There had, of course, been other generally available cloud services, such as S3 and SQS, which could be used without provisioning resources, but Lambdas brought two interesting things with them. For one, users could now architect their applications so that code was executed (and, more importantly, paid for) on demand. For another, they provided a more dynamic glue between the various cloud technologies.
No longer did you have to have an EC2 instance listening 24/7 to an SQS queue that received one message per hour just to do a simple task. When a message arrived, the Lambda would be triggered, and you would pay for only a minuscule amount of compute time. Similar integrations spread throughout the ecosystem as Lambda triggers were added to more and more services. And Lambdas would scale automagically on demand, something EC2 instances struggled with.
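To make that concrete, here is a minimal sketch of what such an SQS-triggered handler can look like in Python. The event shape follows the standard SQS-to-Lambda integration; the message contents and the processing logic are made up for illustration.

```python
import json

def handler(event, context):
    # With the SQS integration, Lambda polls the queue for us and hands
    # over a batch of messages in event["Records"]; no 24/7 listener needed.
    for record in event["Records"]:
        payload = json.loads(record["body"])
        # Hypothetical business logic standing in for the "simple task".
        print(f"Handling message {record['messageId']}: {payload}")
    # Returning normally acknowledges the batch; raising an exception
    # makes the messages visible on the queue again for a retry.
```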
2 Unlimited: No limit. No, wait…
Automatic scaling, no patching of the OS required: the microservice dream come true! So everyone switched to Lambdas and lived happily ever after, right? I mean, this is the silver bullet, and doing anything else is madness, right? Funny you should ask. When an EC2 instance serves a website, the response is immediate, since the instance is already running. But when a request hits a Lambda integration point, there might be no Lambda instance available, so one has to be started up first. This “cold start” can take a few seconds, which the caller sees as a delayed response. If another request comes in shortly after, the response will be quick, since the Lambda is now “warmed up”. But if the gap between requests grows too long, the Lambda might be discarded, and the next request triggers another cold start.
The same thing happens when scaling up and requests outnumber the available Lambdas (one Lambda instance serves one request at a time): the extra requests hit cold starts. The problems can be mitigated in various ways, e.g. by using SnapStart, which spins up a Lambda and takes a memory snapshot of it for quicker startup times (but is a bit more complex); using native binaries instead of e.g. plain Java/Python (longer build times); or having provisioned concurrency, which keeps a number of instances warm (but adds to costs), etc.
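Besides those knobs, a cheap everyday mitigation is to keep expensive initialization out of the handler itself. Below is a minimal sketch, assuming a hypothetical DynamoDB table and event shape: module-level code runs once per cold start and is then reused for as long as the execution environment stays warm.

```python
import boto3

# Module-level code runs once per cold start and is reused across warm
# invocations, so put expensive setup (clients, connections, config) here.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("orders")  # hypothetical table name

def handler(event, context):
    # The warm path executes only this per-invocation work.
    # Assumes the caller passes {"order_id": ...} in the event.
    response = table.get_item(Key={"order_id": event["order_id"]})
    return response.get("Item", {})
```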
Lambdas are also designed for short tasks: there is a hard limit of 15 minutes per execution. You can whip your programmers into producing faster code, but you can’t do everything in 15 minutes. There is also a limit to vertical scaling – Lambdas can’t be given the same oomph as big EC2 instances: memory tops out at 10 GB, and CPU is allocated in proportion to memory rather than cranked up separately.

You pay only for execution time, but that on-demand flexibility comes at a premium. If you constantly have 50 big Lambdas running, it might be worth doing some calculations on whether ECS/Fargate would be a better fit for that section of the architecture. And since Lambdas are inherently stateless, and you can’t rely on two back-to-back executions landing on the same instance, they might not be well suited to workloads that need to load a huge chunk of data just to do a small operation and then store it back somewhere. That takes time – and time is money in the cloud.

I know you never write bugs, but should it ever happen, there is no shell access to a running Lambda, and you’ll find yourself knee-deep in CloudWatch log streams full of red herrings.
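As a rough illustration of that kind of calculation, here is a back-of-the-envelope sketch. The rates below are placeholders, not current AWS list prices; plug in real numbers from the pricing pages or the official cost calculator before trusting any break-even point.

```python
# Back-of-the-envelope: always-on container vs. pay-per-use Lambda.
# Both rates below are illustrative placeholders, NOT current AWS prices.
LAMBDA_PER_GB_SECOND = 0.0000167  # assumed $ per GB-second of Lambda compute
FARGATE_PER_HOUR = 0.05           # assumed $ per hour for a comparable task

def lambda_monthly_cost(invocations, avg_seconds, memory_gb):
    """Pay only while the code runs: cost grows linearly with traffic."""
    return invocations * avg_seconds * memory_gb * LAMBDA_PER_GB_SECOND

def fargate_monthly_cost(hours=730):
    """Pay for the task around the clock: cost is flat regardless of traffic."""
    return hours * FARGATE_PER_HOUR

# A queue processor handling 100k messages a month, 2 s each, at 1 GB:
print(f"Lambda:  ${lambda_monthly_cost(100_000, 2, 1):.2f} / month")
print(f"Fargate: ${fargate_monthly_cost():.2f} / month")
# The Lambda line grows with load while the Fargate line stays flat;
# somewhere in between lies the break-even point worth finding.
```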
There is also a slight risk of vendor lock-in when running Lambdas: containers can usually be run in a more cloud-agnostic way. OTOH, if you’re not leveraging the other services of your cloud provider, you’re probably missing out. Migrating from one cloud provider to another is possible, but if you move for cost alone, there is no guarantee that the competitor won’t follow suit with their prices. Given that most of the major cloud players are US-based, and how things are playing out on the other side of the ocean, I wouldn’t mind a European player joining the game. So, no silver bullet.
All hope abandon not, ye who enter here
But the fun thing with cloud services such as AWS is that you can use cost calculators to see at which point one architecture becomes cheaper to operate than another. Or just spin both up, simulate load for a few hours, and compare. In summary: for many AWS-based integrations, Lambdas are the way to go, but for certain compute-intensive spots in the architecture, calculation and prototyping are in order, so that the bill at the end of the month won’t knock you off your chair.
Personally, I nowadays start with a “serverless-first” mindset when designing architectures in the cloud.
This text might contain various inaccuracies and rounding of corners but hey, it was written by a human. Honest.