Disclaimer: I work for Salesforce, Heroku’s parent organisation.
I have had so many conversations about this with devops managers and with developers who are individual contributors, and at one point the Lambda hype reached frothing levels.
The contradictory requirements of scaling down to zero, scaling up infinitely with no cold starts, being cheap, and avoiding vendor lock-in all seemed to be solved at once by Lambda.
Testability? Framework adoption? Stability? Industry skills? Proven architectures...? These are some of the other question marks I never heard a good answer for.
You’re always locked into your infrastructure. People don’t willy-nilly change their infrastructure once they reach a certain size, any more than companies get rid of their six-figure Oracle infrastructure just because a bushy-tailed developer used the “repository pattern” and avoided using Oracle-specific syntax.
And the “lock-in” with Lambda is exaggerated. If you’re using Lambda to respond to AWS events, you’re already locked in. If you are using it for APIs, just use one of the officially supported packages that let you add a few lines of code and deploy your standard C#/Web API, JavaScript/Node Express, Python/Flask/Django... app as a lambda.
> Testability? Framework adoption? Stability? Industry skills? Proven architectures...? These are some of the other question marks I never heard a good answer for.
If you haven’t heard the “right answers” for those questions you haven’t been listening to the right people.
Lambdas are just as easy to test as your standard Controller action in your framework of choice.
Do you have any resources on testing a Lambda? When I was fooling around with it, the only thing I ran into was the AWS SAM CLI or whatever. Thing looked like an absolute nightmare to get up and running.
You’re doing it wrong (tm). You set up and test a lambda just like you test a controller action in an API.
Your handler should just accept the JSON value and the lambda context, convert the JSON to whatever plain old object your code needs in order to process it, and call your domain logic.
AWS has samples of what the JSON looks like for the different types of events. You can see them by creating a new lambda in the console, clicking on Test, and then browsing the different event templates.
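For a sense of the shape, here is a trimmed-down sketch of an S3 "object created" event (fields abbreviated, values made up; the console templates have the full payload):

    {
      "Records": [
        {
          "eventSource": "aws:s3",
          "eventName": "ObjectCreated:Put",
          "s3": {
            "bucket": { "name": "my-bucket" },
            "object": { "key": "uploads/photo.jpg", "size": 1024 }
          }
        }
      ]
    }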
You can also log the JSON you receive and use that to set up your test harness. It doesn't have to be a formal AAA-style unit test; it can be as simple as a console app that calls your lambda function and passes in the JSON.
For instance, in Python you can wrap your test harness in an
if __name__ == "__main__":
block in the same file as your lambda.
This is the same method that a lot of people use to test API controllers without using something like Postman.
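A rough sketch of that layout (the handler module, event shape, and domain function are all made up for illustration):

    # handler.py: a sketch, not a definitive implementation
    import json

    def process_order(order):
        # plain domain logic, free of anything Lambda-specific
        return {"order_id": order["id"], "status": "processed"}

    def lambda_handler(event, context):
        # convert the raw event into the plain object the domain code expects
        order = json.loads(event["body"]) if "body" in event else event
        result = process_order(order)
        return {"statusCode": 200, "body": json.dumps(result)}

    if __name__ == "__main__":
        # throwaway harness: feed a captured or sample event, no AWS required
        sample_event = {"body": json.dumps({"id": 42})}
        print(lambda_handler(sample_event, None))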
> Thing looked like an absolute nightmare to get up and running.
So you tried it? I don't remember it being hard to set up, at least compared to a DB. Or you can use the underlying Docker images (open source, https://github.com/lambci/docker-lambda) to run your Lambdas in. SAM provides some nice abstractions, e.g. API Gateway "emulation", posting JSON to the function(s), or providing an AWS SDK-compatible interface for invoking the functions via e.g. boto3. This way you can run the same integration tests you would run in prod against a local testing version.
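As a sketch of that last point, assuming "sam local start-lambda" is running on its default port (3001), invoking a function with boto3 might look like this (the function name is a placeholder from the SAM template):

    # invoke a locally running function via the SDK-compatible endpoint
    import json
    import boto3

    client = boto3.client(
        "lambda",
        endpoint_url="http://127.0.0.1:3001",  # SAM's local Lambda endpoint
        region_name="us-east-1",
        aws_access_key_id="dummy",             # the local endpoint ignores credentials
        aws_secret_access_key="dummy",
    )

    response = client.invoke(
        FunctionName="MyFunction",             # hypothetical name from the SAM template
        Payload=json.dumps({"id": 42}),
    )
    print(json.loads(response["Payload"].read()))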
This! To me the only upside to the whole architecture (from a dev's perspective) is that you can deploy these things to production independently of other parts of the system. If you can't do that with confidence because you're attempting to test its functionality in some larger deployed context, you've turned your monolith into a distributed monolith and now you have the worst of both worlds.
The good news: you should be able to accomplish most testing locally, in memory. The bad news: your test code is probably going to be much larger, you're going to have to be very aware of the data model representing the interface to the lambda, and you're going to have to test the different possible states of that data.
I've found unit testing to work fine for Lambdas. The biggest difference between running as a Lambda and running locally is the entry point. With a Lambda you have an event payload that (usually) needs to be parsed and evaluated.
I'll typically write all the core functionality first, test it, then write the lambda_handler function.
You call it just like you call any other function. Your handler takes in either a JSON payload and a context object or a deserialized object depending on the language. You call it from a separate console app, your test harness etc.
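For example, a minimal pytest-style sketch, where the handler module and event shape are hypothetical:

    # test_handler.py: call the handler like any other function
    import json
    from handler import lambda_handler  # hypothetical module under test

    def test_handler_returns_200_for_valid_order():
        event = {"body": json.dumps({"id": 42})}
        response = lambda_handler(event, None)  # context can be None for pure logic
        assert response["statusCode"] == 200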
We deploy to a test stack and run a full integration test of the code running in AWS. I believe it's also possible to self-host locally, but we never really looked into it.
Why do HN folks find this so difficult? It's like putting your shoes on the wrong way around. After you've done it, it's clearly uncomfortable. So don't do it.
Serverless is specifically a stateless paradigm, making testing easier than persistent paradigms.
> Framework adoption?
Generally we use our own frameworks - I do wish people knew there was more than serverless.com. AWS threw up https://arc.codes at re:Invent, which is what I'm using and I generally like it.
> Stability? Industry Skills? Proven Architectures...?
These are all excellent questions. GAE, the original serverless platform, was around 2010 (edit: looks like 2008 https://en.wikipedia.org/wiki/Serverless_computing). Serverless isn't much younger than say, node.js and Rust are. There are patterns (like sharding longer jobs, backgrounding and assuming async operations, keeping lambdas warm without being charged etc) that need more attention. Come ask me to speak at your conference!
> Serverless is specifically a stateless paradigm, making testing easier than persistent paradigms.
No because Lambdas are proprietary which means you can't run it in a CI or locally.
Also, it becomes stateful if it pulls data from a database, S3 or anywhere else on AWS which it almost always does.
> Serverless isn't much younger than say, node.js and Rust are.
AWS Lambdas which I consider to be the first widely used Lambda service was released in April 2015, which is 6 years after the release of Node.js.
Also, Node.js is way more popular and mature than Lambda solutions.
Overall Lambdas are only useful for small, infrequent tasks like calling a remote procedure every day.
Otherwise, things like scheduling, logging, resource usage, volume and cost make Lambdas a bad choice compared to traditional VPSs / EC2.
> No because Lambdas are proprietary which means you can't run it in a CI or locally. Also, it becomes stateful if it pulls data from a database, S3 or anywhere else on AWS which it almost always does.
Lambda is a function call. So it makes no difference if it’s proprietary or not.
Are you saying that it’s difficult to test passing an object to a function and asserting that it’s functioning as intended?
Lambdas are not simple functions, because your environment locally is different from production.
If I run a Node.js function in AWS Lambda, my Node.js version might be different, my dependencies might be different, the OS is different, the filesystem is different, so I or one of my node_modules might be able to write to /tmp but not elsewhere, etc.
It's the reason people started using Docker really.
If you don't have the same environment, you can't call it reproducible or testable for that matter.
Nothing you mentioned has anything to do with the ability to test a Lambda. You’re trying to use limitations and restrictions as friction to back up your inability to test.
There’s a lot of annoying things about lambda. And a lot of stuff I wish was easier to find in documentation. But that doesn’t change the fact that Lambda is more or less passing an event object to your function and executing it.
Writing a function in node 12 and then running it on node 4 and throwing your hands in the air cos it didn’t work isn’t the fault of Lambda.
It's great to see that factual evidence is answered with ad-hominem by the Lambda hype crowd.
In any case, if you have a Node.js module or code with a native C/C++ build, that runs shell commands, that writes to disk (not allowed besides /tmp in Lambda) or makes assumptions about the OS, your "simple" function will absolutely return different results.
e.g: My lambda is called when somebody uploads an image and returns a resized and compressed version of it.
This is done using Node.js and the mozjpeg module which is dependent on cjpeg which is built natively on install.
If I test my function on my machine and in Lambda it's very possible that I get different results.
Also, certain OSs like Alpine, which are heavily used for Docker, don't even use glibc as their C library (they use musl), so again, another difference.
"In any case, if you have a Node.js module or code with a native C/C++ build, that runs shell commands, that writes to disk (not allowed besides /tmp in Lambda) or makes assumptions about the OS, your "simple" function will absolutely return different results."
This is true, but it's not Lambda qua Lambda. That's just normal production vs. testing environment issues, with the same basic solutions.
Lambda may offer some minor additional hindrances vs. something like Docker, but I wouldn't consider that catastrophic.
You are absolutely right that you could recreate a similar environment to Lambda in Docker. But you would first need to reverse engineer Lambda's environment to discover how it is actually configured and the limits that are set.
Even if you did find a way, you would still need to keep it up to date in case AWS decides to update that environment.
Logged in to say that this has actually been done (not by me) and my team has been finding it very helpful for local “serverless” SDLC: https://github.com/lambci/docker-lambda . It’s billed as “A sandboxed local environment that replicates the live AWS Lambda environment almost identically – including installed software and libraries, file structure and permissions, environment variables, context objects and behaviors – even the user and running process are the same.” We test our functions mocked and against local deployments of that lambci container. There are also Lambda “layers” (a way to package shared libraries and custom runtimes for AWS Lambda), but we have not used that feature at this point. Interesting space, with lots of room for improvement in this tool chain, for sure.
I’m not 100% sure, as I didn’t create the image (though I’m evangelizing as someone who has found it truly helpful for daily dev). I believe the creators tarball’d the entire distro/execution environment from a running lambda, so the file system layout and libs likely match Amazon Linux, if that’s the default lambda execution distro image. If not, I assume it matches the default.
At least the Docker image used by AWS SAM CLI is created by AWS.
Also, you compile before packaging, so your dev/CI system already has to be able to compile for Lambda, independently of testing/debugging with Docker.
> > Writing a function in node 12 and then running it on node 4 and throwing your hands in the air cos it didn’t work isn’t the fault of Lambda.
> It's great to see that factual evidence is answered with ad-hominem by the Lambda hype crowd.
I don't think that was a personal attack.
We've answered technical questions with technical answers.
- You have a definition of stateless which includes having no persistence layer, which is at best at odds with the industry.
- You think serverless was created with AWS Lambda which we've been kind about, but most people would say you're simply wrong.
- You're advocating for containers, which are well known for having their own hype as people write their own cloud providers on top of the cloud provider their employer pays for with dubious benefit.
Saying that local dev and Lambda are different is a strawman. How is that harder than developing on a Mac or Windows (or even Linux) and then testing on a different OS and config via CI/CD?
You shouldn't be testing "on your machine" - that's the oldest excuse in the book!
You should build your function in a container based on Amazon Linux, just the same as you should for a Lambda deploy. That guarantees you the same versions of software, packages, libraries, etc. It makes it possible for me to develop Lambda functions on a Mac and test binary-for-binary against the deployed version.
"Nothing you mentioned has anything to do with the ability to test a Lambda" is not ad-hominem, it's a statement of fact.
Why not then have lambda run the same container you can run and test locally?
I don't use lambda but we have our jenkins spin up the same ec2 to run tests that we would spin up to do development so that we never run into this problem.
I'm not sure I understood your question correctly.
If you mean running a Docker container in Lambda, that is, to my knowledge, not possible.
You could schedule Docker tasks in AWS ECS (their managed container service) but it's not meant for anything realtime and more for cron job type tasks.
If you mean emulating the Lambda environment in Docker, then I wrote an answer to another user below about the difficulties of doing that.
No, your lockfile doesn't care about build steps so any post-install script might run differently for the many other reasons listed.
> Agree, but how much does one Linux+systemd differ from another Linux+systemd? How much does the FS?
Plenty. For example filesystem change events are known to have filesystem and OS dependent behaviours and quirks / bugs.
When a Node module runs a shell command, it's possible that you have a BSD vs a GNU flavour of a tool, or maybe a different version altogether.
The Linux user with which you are running the function might also have different rights which could become an issue when accessing the filesystem in any way.
> VMs, docker and having to care about and manage isolation platforms is the reason people started using serverless.
Maybe, but serverless doesn't answer those questions at all. It just hand waves testing and vendor independent infrastructure.
Then you're not talking about dependency versioning, are you? You're talking about install order. In practice it hasn't been an issue; I should find out how deterministic install order is, but I'd only be doing this to win a silly argument rather than anything that has come up in nearly a decade of making serverless apps.
> For example filesystem change events are known to have filesystem and OS dependent behaviours
> When a Node module runs a shell command, it's possible that you have a BSD vs a GNU flavour of a tool
Are you generally proposing it would be common to use an entirely different OS? Or a non-boring extX filesystem?
All your issues seem to come from edge cases. Like, if you decide to run FreeBSD or ReiserFS locally and run a sandbox in it, fine, but know that's going to differ from a Linux / systemd / GNU / extX environment.
> > VMs, docker and having to care about and manage isolation platforms is the reason people started using serverless.
> Maybe, but serverless doesn't answer those questions at all.
Serverless exists precisely to answer the question. I can throw all my MicroVMs in the ocean with no knowledge of dockerfiles, no VM snapshots, no knowledge of cloudinit, no environment knowledge other than 'node 10 on Linux' and get my entire environment back immediately.
> Then you're not talking about versioning, are you? You're talking about install order.
I didn't mean build order but install scripts and native module builds.
The first type can create issues when external resources are downloaded (Puppeteer, Ngrok, etc.), which themselves have different versions or that fail to download and where the Node.js module falls back to another solution that behaves slightly differently.
The second type can occur when you have, say, Alpine Linux, which uses musl, while Amazon Linux uses glibc, or when the native module tries to link with a shared library that is supposed to exist but doesn't.
> Are you generally proposing it would be common to use an entirely different OS? Or a non-boring extX filesystem?
I haven't checked, but Amazon Linux by default uses XFS on EBS disks, so I wouldn't be surprised if Lambdas used the same. So not a boring extX filesystem. ZFS is also relatively common.
> Serverless exists precisely to answer the question.
No, it clearly doesn't, because your function will fail locally and succeed in Lambda, or the reverse, exactly due to the issues I mentioned in my various comments here, and you will be left debugging.
Debugging which starts by finding exactly the differences between the two environments which would have been solved by a VM or Docker.
> I didn't mean build order but install scripts and native module builds.
OK. Then you're still not talking about your dependencies being different. The dependencies are the same, they're just particular modules with very specific behaviour...
> external resources are downloaded (Puppeteer, Ngrok, etc.), which themselves have different versions or that fail to download
That's more a 'heads up when using Puppeteer' than an indictment of serverless and a call to add an environment management layer like we did in 2005-2015.
> Linux by default uses XFS on EBS disks, so I wouldn't be surprised if Lambdas used the same.
That's worth checking out.
> Debugging which starts by finding exactly the differences between the two environments which would have been solved by a VM or Docker.
I see what you're saying, but planning your whole env around something like a given puppeteer module dynamically downloading Chrome (which is very uncommon behaviour) isn't worth the added complexity.
> No, your lockfile doesn't care about build steps so any post-install script might run differently for the many other reasons listed.
You shouldn’t be uploading your node_modules folder to your deployed lambda, so this is an issue of your development environment, not lambda.
> Maybe, but serverless doesn't answer those questions at all.
“Serverless”, or Lambda/Azure Functions, etc., is not a silver bullet that solves every single scenario. Just like Docker doesn’t solve every single scenario, nor do cloud servers or bare metal. It’s just another tool for us to do our job.
I have the latest Windows insiders build on a spare machine. I've found all the aws tooling including SAM works pretty well under WSL 2 now that WSL uses a true Linux kernel.
Docker largely works fine on macOS? At least for testing, haven't run into any issues other than bad filesystem performance if you're writing a lot of data to a mounted FS. Our team uses it daily - a far cry from "unusable".
> No because Lambdas are proprietary which means you can't run it in a CI or locally.
Arc has a full sandbox, so yes you can run a lambda in a CI or locally. You should know this - if you're developing your apps by deploying every lambda as you edit it you're doing it wrong. Most lambdas don't really need you to simulate AWS.
> Also, it becomes stateful if it pulls data from a database, S3 or anywhere else on AWS which it almost always does.
Sure, persistence exists, but when people say 'stateless' they mean 'has no transient state'. They don't mean 'has no data'.
> AWS Lambdas which I consider to be the first widely used Lambda service
OK. You don't consider GAE (2008) widely used. I disagree - mainly because I was using GAE back then and thinking about this new world of no-memory, timeouts, etc - but lambda is definitely more popular.
That's not what I've understood "stateless" to mean. Sure, anything more complicated than a calculator app is going to rely on data stored elsewhere, and in Lambda world that means you're reading from DynamoDB, S3, RDS or whatever. Those are definitely dependencies and that's where the state lies. But the pattern encouraged by Lambda is that your instance contains no state of its own. The disk is used as scratch for each invocation, objects are created for each request and not shared between them, etc. That's what people mean by stateless.
I see your point, but this definition is so lax that it applies almost perfectly to any application I've ever deployed to the cloud. Using external services to save state isn't unique to lambda, any app hosted on a normal ec2 instance needs to follow the same pattern because the instance and its filesystem can go away at any time (unless you attach a volume, but I've always considered that to be bad practice).
Sure, it's a common pattern and the right one in many scenarios. But especially for non-cloud deployments it's very common to store session data in process memory or temp data to local disk, etc.
On-disk state isn't what people mean when they say stateless. * They mean there is no memory state, which is true and which absolutely means serverless functions are easy to test.
* you could argue that 'stateless' should include no long term persistence, but you'd be fighting an uphill battle. Like saying 'serverless' isn't correct because there are servers.
State is saved somewhere, sure. How many useful applications are truly stateless when considered in their entirety?
This isn't the sense in which servers are usually said to be stateless, however. A Lambda that reads and writes state from a database when invoked is not maintaining state outside of the request-response cycle itself, so it can be said to be stateless.
It's funny because the only use case I was considering lambda for was a pure / stateless function. As soon as you add state / side-effects I assume you've greatly increased your complexity and testing (you'd want to have separate tests for all of the inputs / outputs that resemble production... which is when things can get complicated).
I'm probably looking at it wrong I guess. I considered using lambda to unload some CPU intensive calculations / logic and that's it. I figured it would be good for that as long as the latency didn't outweigh any benefits.
Yes, but the parent said "statelessness that makes it easier to TEST". It is not stateless in that sense: purity makes it easier to test. Here you need to mock all interactions with external services, just like you'd do with non-serverless applications.
It is stateless in that sense. To run your lambda, there needs to be zero things in memory, unlike a traditional express/django/rails or other non-serverless apps.
If your lambda involves persistent storage, your testing might want to wipe or prepopulate the DB with some objects first, but that's not hard and doesn't add complexity, and as mentioned you don't need anything in memory.
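A minimal sketch of that, assuming DynamoDB Local is running on its default port (8000); the table name, item shape, and handler module are all placeholders:

    # a test that prepopulates known state before invoking the handler
    import json
    import boto3

    dynamodb = boto3.resource(
        "dynamodb",
        endpoint_url="http://127.0.0.1:8000",  # DynamoDB Local, not AWS
        region_name="us-east-1",
        aws_access_key_id="dummy",             # the local endpoint ignores credentials
        aws_secret_access_key="dummy",
    )
    orders = dynamodb.Table("orders")          # assumed to already exist locally

    def test_handler_sees_prepopulated_order():
        # wipe/prepopulate known state before the invocation
        orders.put_item(Item={"id": "42", "status": "new"})

        # hypothetical handler, assumed to be configured to read the same local table
        from handler import lambda_handler
        event = {"body": json.dumps({"id": "42"})}
        assert lambda_handler(event, None)["statusCode"] == 200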