Is it more secure to run Lambda inside a VPC?
If you ask most security professionals if a resource which does not need access to anything inside the corporate network should be connected to the corporate network, you’ll typically be told that it should not.
Ask that same group of people whether Lambda functions should run inside a VPC, and you’ll typically be told that they must, for the sake of security.
Hopefully you’ve already realised that there appears to be some cognitive dissonance here - people believe in least privilege access, but also in providing Lambda functions with connectivity in excess of that which they need to operate.
In this article I’m going to explore how these two conflicting views became simultaneously popular, and the all-important ‘why’ behind each, to help you decide which is right for your situation.
Why should I run Lambda functions outside VPC if they don’t need VPC access?
As alluded to in the introduction, in order to build secure environments we typically follow the principle of least privilege: give each resource no more permissions than are necessary for it to work.
Following this logic, where possible we should avoid running Lambda functions inside a VPC. If a function has no access to our local network, there is less risk of an exploitable function being leveraged to gain wider access. Note that I say less risk, not no risk, because Lambda functions running outside VPC can still have AWS IAM permissions which could potentially be used in an attack. If a function is only accessing AWS services which run outside VPC (for example, S3 and DynamoDB), then a strong case can be made for running it outside VPC too.
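Since IAM permissions remain the main exposure for a function outside VPC, it is worth keeping them tight. The following is a minimal sketch, not a definitive policy: the bucket name, prefix, table ARN and the choice of actions (`s3:GetObject`, `dynamodb:PutItem`) are assumptions for a hypothetical function that reads input objects from S3 and writes results to DynamoDB.

```python
import json


def least_privilege_policy(bucket: str, prefix: str, table_arn: str) -> dict:
    """Build an IAM policy document scoped to one S3 prefix and one DynamoDB table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read access limited to objects under a single prefix
                "Sid": "ReadInputObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
            {
                # Write access limited to a single table
                "Sid": "WriteResults",
                "Effect": "Allow",
                "Action": ["dynamodb:PutItem"],
                "Resource": [table_arn],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical names, used purely for illustration
    policy = least_privilege_policy(
        "example-input-bucket",
        "incoming/",
        "arn:aws:dynamodb:eu-west-1:123456789012:table/example-results",
    )
    print(json.dumps(policy, indent=2))
```

With a policy like this, a compromised function could read the input objects and write to one table, but could not list other buckets or touch other tables.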
There are also some non-security related reasons for running outside VPC. Historically, Lambda functions running inside VPC paid a heavy ‘cold start’ penalty whilst an Elastic Network Interface (ENI) was created, meaning it could be several seconds before it started running. This has been largely resolved now, but Lambda still uses ENIs and therefore still consumes resources in your VPC. There are soft and hard limits on the number of ENIs and IP addresses usable in a given account/VPC, creating additional potential headaches that could prevent functions from working.
Why should I run Lambda functions inside VPC even if they don’t need VPC access?
If we don’t need to access anything else inside the VPC, why in the world would we ever run a Lambda function in one?
The answer is that running inside VPC provides the ability to limit the connectivity available to a Lambda function to less than that provided to a function running outside VPC. You can limit outbound Internet connectivity, utilizing AWS Network Firewall or an HTTP proxy to limit connectivity to specific pre-approved endpoints only.
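To make the idea of pre-approved endpoints concrete, here is a rough sketch of the kind of hostname check an egress proxy might apply. It is an illustration only and not how AWS Network Firewall or any particular proxy is implemented; the `is_allowed` helper, the leading-dot convention for matching subdomains, and the example hostnames are all assumptions.

```python
from typing import Iterable


def is_allowed(hostname: str, allowlist: Iterable[str]) -> bool:
    """Return True if hostname matches an exact entry or a '.suffix' entry."""
    host = hostname.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("."):
            # '.example.com' matches any subdomain, but not 'example.com' itself
            if host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False


# Hypothetical allowlist for a function that only talks to two AWS services
ALLOWLIST = [
    "dynamodb.eu-west-1.amazonaws.com",
    ".s3.eu-west-1.amazonaws.com",
]

if __name__ == "__main__":
    for name in ("dynamodb.eu-west-1.amazonaws.com", "attacker.example.net"):
        print(name, is_allowed(name, ALLOWLIST))
```

Anything not on the list is refused, so a compromised function cannot simply send data to an endpoint of the attacker’s choosing.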
The possibilities for auditing are also expanded. VPC Flow Logs provide logging of network connections established by Lambda functions. DNS lookups made by functions can also be logged. VPC traffic mirroring can be used to examine the exact data sent and received by your function. In most cases that traffic will be encrypted and therefore unreadable, but mirroring could still be useful in some specific scenarios.
By contrast, a Lambda function running outside VPC always has unrestricted outbound Internet access, and there is no traffic logging available. But these expanded possibilities for auditing and network restriction come with an additional security burden - if security controls are not configured appropriately a Lambda running inside VPC can be a liability, providing a potential entry point for attackers looking to compromise your network. Anyone choosing to force all functions to run inside VPC needs to understand the trade-off they are making, and ensure they have appropriate controls to deal with the potential for an increased attack surface.
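As an illustration of the auditing that VPC Flow Logs make possible, the sketch below parses records in the default (version 2) flow log format and flags traffic from a given source address to destinations outside a set of approved networks. The helper names, the sample addresses and the assumption that you know the private IP of the network interface your function uses are all mine; treat it as a starting point rather than a finished detection tool.

```python
import ipaddress
from typing import Iterable, List, NamedTuple, Optional


class FlowRecord(NamedTuple):
    interface_id: str
    srcaddr: str
    dstaddr: str
    dstport: int
    action: str


def parse_record(line: str) -> Optional[FlowRecord]:
    """Parse one default-format record; return None for headers and no-data lines."""
    parts = line.split()
    # Default format: version account-id interface-id srcaddr dstaddr srcport
    # dstport protocol packets bytes start end action log-status
    if len(parts) != 14 or parts[13] != "OK":
        return None
    return FlowRecord(parts[2], parts[3], parts[4], int(parts[6]), parts[12])


def unexpected_egress(
    lines: Iterable[str], source_ip: str, approved_networks: Iterable[str]
) -> List[FlowRecord]:
    """Return records sent from source_ip to addresses outside approved_networks."""
    nets = [ipaddress.ip_network(n) for n in approved_networks]
    flagged = []
    for line in lines:
        rec = parse_record(line)
        if rec is None or rec.srcaddr != source_ip:
            continue
        dst = ipaddress.ip_address(rec.dstaddr)
        if not any(dst in net for net in nets):
            flagged.append(rec)
    return flagged


if __name__ == "__main__":
    # Made-up sample records for illustration
    sample = [
        "2 123456789012 eni-0abc 10.0.1.25 10.0.2.10 44321 443 6 10 840 1700000000 1700000060 ACCEPT OK",
        "2 123456789012 eni-0abc 10.0.1.25 203.0.113.50 44322 443 6 12 900 1700000000 1700000060 ACCEPT OK",
    ]
    for rec in unexpected_egress(sample, "10.0.1.25", ["10.0.0.0/16"]):
        print(rec)
```

A function that should only ever talk to internal services but suddenly shows connections to an unknown public address is exactly the kind of signal that is unavailable when running outside VPC.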
An aside - are Lambda functions really exploitable anyway?
Put simply - yes. By its nature, the Lambda execution environment presents some challenges to exploitation, but there are still potential vectors an attacker could use to execute arbitrary code.
To give an example, imagine a Lambda function that processes XML files. These XML files are stored in S3, with execution of the Lambda function triggered by an S3 event - there is no way for an attacker to directly invoke the Lambda function.
If there is a vulnerability in the XML parsing library used (and a search of the CVE database will reveal a number of such vulnerabilities in a variety of XML parsing libraries), then if an attacker can control the contents of the XML file to be processed, they could potentially execute arbitrary code with the permissions of the Lambda function.
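The shape of such a function can be sketched as follows. This is a simplified illustration under a few assumptions: `make_handler` and `fetch_object` are hypothetical names (in a real function the fetch would typically be an S3 client call), and `xml.etree.ElementTree` merely stands in for whatever parsing library the function uses; I am not suggesting it contains a particular vulnerability.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List
from urllib.parse import unquote_plus


def make_handler(fetch_object: Callable[[str, str], bytes]):
    """Build a Lambda-style handler; fetch_object(bucket, key) returns the file bytes."""

    def handler(event, context) -> List[str]:
        root_tags = []
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded in S3 event notifications
            key = unquote_plus(record["s3"]["object"]["key"])
            body = fetch_object(bucket, key)
            # This is where attacker-controlled bytes reach the parser: whoever
            # can write the object controls the parser's input, even though
            # they can never invoke the function directly.
            root = ET.fromstring(body)
            root_tags.append(root.tag)
        return root_tags

    return handler


if __name__ == "__main__":
    # Stand-in fetcher returning fixed content, for illustration only
    handler = make_handler(lambda bucket, key: b"<order><id>1</id></order>")
    event = {
        "Records": [
            {"s3": {"bucket": {"name": "example-bucket"},
                    "object": {"key": "incoming/order+1.xml"}}}
        ]
    }
    print(handler(event, None))
```

The point is that the trust boundary is the bucket, not the function’s invocation permissions: anyone who can influence what lands in S3 can influence what the parser sees.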
So what is the right answer?
Both approaches have advantages and disadvantages. Running inside VPC potentially broadens the risk associated with a successful exploit, but it can also make it more likely that a successful exploit will be discovered.
If you have robust segregation between environments and strong egress controls from your VPC, I would suggest that running all functions inside VPC makes sense.
If you don’t have strict egress controls, and/or you have relatively open internal networks, I believe that running outside VPC is lower risk because it limits the potential damage an attacker could do. Whilst there may be a temptation to force the use of Lambda inside VPC to facilitate the use of controls which may be implemented later, I would caution against this approach. Switching Lambda functions to run inside a VPC is straightforward enough, and the effort of doing so later is smaller than the risk of compromise you take on by attaching functions to a VPC before those controls exist.
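To show how small that later change can be, here is a sketch of the arguments you might pass to the Lambda `update_function_configuration` API via boto3. The helper name, function name, subnet IDs and security group ID are placeholders, and the commented-out call assumes boto3 is installed and credentials are configured.

```python
from typing import Dict, Sequence


def vpc_config_update(
    function_name: str,
    subnet_ids: Sequence[str],
    security_group_ids: Sequence[str],
) -> Dict:
    """Build keyword arguments for attaching an existing function to a VPC."""
    if not subnet_ids or not security_group_ids:
        raise ValueError("at least one subnet and one security group are required")
    return {
        "FunctionName": function_name,
        "VpcConfig": {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    }


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only
    kwargs = vpc_config_update(
        "example-function",
        ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"],
        ["sg-0123456789abcdef0"],
    )
    print(kwargs)
    # To apply the change:
    #   import boto3
    #   boto3.client("lambda").update_function_configuration(**kwargs)
```

The function’s execution role also needs permission to manage network interfaces before this will succeed, so in practice the switch involves an IAM change as well as the configuration update.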
What should be clear is that running outside VPC isn’t automatically bad, and security tools that blindly raise an alert for it should be treated with suspicion. Like almost everything else in the security field, the choice is a trade-off, and thought needs to be given to which trade-off is most appropriate for your scenario. A blanket decree that Lambda outside VPC is bad fails to appreciate the nuance.