Defence in Depth: Secure APIs (part 4/7)
Our first three articles were about designing and getting an access token. We also established a model for how we move from identity and scopes to the permissions that we base all further access control on.
In this article, we discuss what you need to do when implementing your API in order to protect your functions and your data.
Strong defense in depth according to the “Least privilege” principle means that we need to limit permissions for each request to the absolute minimum. We implement strong and fine-grained access control in six steps:
- Validate that the request is correct
- Validate that the access token is correct
- Transform the access token into a permissions model
- Validate that the data in the request is correct
- Validate the permission to execute the operation
- Validate the permission to query or change data
We move from a correct, verified access token (e.g. a JWT) to an object that represents our permissions in the system. Using this permission model, we implement strong and fine-grained access control in steps 4, 5 and 6.
Note that in this model, we have left out what you should have in front of your API in the form of infrastructure protection, e.g. a firewall, WAF, API gateway or similar. A WAF, for example, can perform initial input validation of known injection attacks and we might also conduct basic access control based on IP or trusted device.
The two are not mutually exclusive; in fact, they complement one another. It’s important that our API security does not rely solely on upstream layers of protection. We build defense in depth where our API can withstand public exposure, according to the Zero Trust principle.
Similar reasoning applies to steps 4 and 5, whose order can vary depending on, for example, the framework. It is common to perform basic input validation early and deeper validation later in your domain logic. What’s important is that you do it and that you reject incorrect requests early without wasting system resources.
What we want to highlight with this model is that these six steps are crucial if you want strong and fine-grained access control, and that these steps are implemented with a mandatory pattern so that the core, your domain logic, is never exposed without access control.
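As an illustration of such a mandatory pattern, the following is a minimal Python sketch in which every request passes through the steps in order before any domain logic runs. The `Request` type, the `handle` function, the scope name `products.read`, the `org` claim and the in-memory `PRODUCTS` store are all hypothetical. Step 1 and step 2 are assumed to be handled by the web server and the framework, so they only appear as placeholders.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory data store: product id -> owning organization.
PRODUCTS = {1: "org-a", 2: "org-b"}


@dataclass(frozen=True)
class Request:
    path_id: str             # raw, untrusted product id from the URL
    claims: Optional[dict]   # claims of an already verified token, or None


@dataclass(frozen=True)
class Permissions:
    can_read_products: bool
    organization: str


def handle(request: Request) -> int:
    """Return the HTTP status code that the pipeline would produce."""
    # Step 1 (request format and size) is assumed to be done by the web server.
    # Step 2: the framework validated the token; None means it was rejected.
    if request.claims is None:
        return 401
    # Step 3: transform the claims into our own permissions object.
    scopes = request.claims.get("scope", "").split()
    permissions = Permissions(
        can_read_products="products.read" in scopes,
        organization=request.claims.get("org", ""),
    )
    # Step 4: validate the data in the request.
    if not (request.path_id.isascii() and request.path_id.isdigit()):
        return 400
    product_id = int(request.path_id)
    # Step 5: validate permission for the operation, before any lookup.
    if not permissions.can_read_products:
        return 403
    # Step 6: validate permission for the specific data.
    owner = PRODUCTS.get(product_id)
    if owner is None or owner != permissions.organization:
        return 404
    return 200
```

Because the checks live in one place in front of the domain logic, a new endpoint cannot accidentally skip one of them.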
Weaknesses in these steps are among the first things I test whenever I’m doing penetration tests. This often pays off. It is also a vector where I can extract data by going directly to the service instead of taking a detour via another user’s account.
More on penetration tests including an example of such a vulnerability can be found in our article on Offensive application security.
Validate that the request is correct (Step 1)
Validating the HTTP request itself is something you might not think about as a developer, but it is an important part of a secure API. If we choose a solid web server, with secure default settings, we get this for free. One example of verification is that the request is in the correct format and is of a reasonable size. A number of products can conduct more in-depth analyses and reject requests which contain data that could be considered harmful.
Validate that the access token is correct (Step 2)
The access token that is included in the request should be verified using the framework we build our API on. You should only be doing configuration here, not implementing the actual control. Taking the JWT as our example, the framework needs to verify:
- Correct cryptographic signature
- Signed by the correct IdP (usually configured as a URL to our IdP)
- Issued with an audience that is valid for our API
- Correct type, i.e. an access token and nothing else
- Valid, e.g. with regard to time
RFC 8725 (https://tools.ietf.org/html/rfc8725) describes best current practices for validating a JWT. There are also other types of tokens besides JWT. If it is a reference token, for example, then the reference needs to be translated into an access token first by performing a lookup against the IdP (known as “token introspection”).
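To make the list of checks concrete, here is a small Python sketch of what the framework verifies once the signature has been checked. It is not meant to replace your framework's validation. The function name `check_access_token` and its parameters are hypothetical, and the type check assumes that the IdP issues access tokens with the `at+jwt` header type from RFC 9068, which not every IdP does.

```python
import time
from typing import Optional


def check_access_token(header: dict, claims: dict, *, issuer: str,
                       audience: str, now: Optional[float] = None) -> list:
    """Return a list of problems; an empty list means the checks passed.

    Signature verification is deliberately NOT shown: it is assumed to have
    been done already by the framework or a vetted JWT library.
    """
    now = time.time() if now is None else now
    problems = []
    # Signed by the correct IdP.
    if claims.get("iss") != issuer:
        problems.append("unexpected issuer")
    # Issued with an audience that is valid for our API.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        problems.append("audience does not include this API")
    # Correct type (assumes RFC 9068 style access tokens).
    typ = str(header.get("typ", "")).lower()
    if typ not in ("at+jwt", "application/at+jwt"):
        problems.append("not an access token")
    # Valid with regard to time.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        problems.append("token has expired or has no expiry")
    nbf = claims.get("nbf")
    if isinstance(nbf, (int, float)) and now < nbf:
        problems.append("token is not yet valid")
    return problems
```

Any non-empty result would map to a 401 Unauthorized.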
For systems with a higher security level, we often use access tokens that are bound to the client, often via mTLS. If your access token is certificate-bound, this connection to the client’s certificate should be validated.
In order to verify the signature, we need the key material for our IdP. There are different ways we can get this. It’s customary to perform a lookup against the IdP using a “JSON Web Key Set” (JWKS). This is a standard format in which the IdP publishes the public part of its signing keys. You can also opt to install the key material on the same machine that our API is running on.
An important aspect is that the IdP can rotate its key material. The vast majority of IdP products support JWKS, and in practice we find that a good solution is to have our API restart daily and perform a lookup when booting up.
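If a daily restart is not an option, key rotation can also be handled by a small cache that repeats the lookup when the keys are old or when a token refers to an unknown key id. The Python sketch below is one possible shape for this, not a description of any particular framework. `SigningKeyCache` and the injected `fetch_keys` callable (which stands in for the actual lookup against the IdP) are hypothetical names.

```python
import time
from typing import Callable, Dict, Optional


class SigningKeyCache:
    """Caches IdP signing keys by key id (kid) and refreshes on demand."""

    def __init__(self, fetch_keys: Callable[[], Dict[str, str]],
                 max_age_seconds: float = 24 * 60 * 60,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_keys = fetch_keys   # hypothetical: performs the key lookup
        self._max_age = max_age_seconds
        self._clock = clock
        self._keys: Dict[str, str] = {}
        self._fetched_at: Optional[float] = None

    def _refresh(self) -> None:
        self._keys = dict(self._fetch_keys())
        self._fetched_at = self._clock()

    def get(self, kid: str) -> Optional[str]:
        """Return the key for kid, or None if the IdP does not publish it."""
        stale = (self._fetched_at is None
                 or self._clock() - self._fetched_at > self._max_age)
        # Refresh when the cache is old or the IdP has rotated to a new key.
        if stale or kid not in self._keys:
            self._refresh()
        return self._keys.get(kid)
```

Note that this sketch repeats the lookup for every unknown key id. A production version would also need to limit how often a refresh can be triggered, since the key id comes from an untrusted token.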
Normally, your framework should return a 401 Unauthorized if requests are made with an invalid access token.
Transform the access token into a permissions model (Step 3)
Having validated our access token, we can now transform the information it contains into an object of our permissions model. The object contains all the information (permissions) we need in our domain logic to check whether the user has access to the function and the data being requested. From this point onward, we are working according to our permissions object and not with the information from our access token.
See Article 2, Claims-Based Access Control for details on how you can implement a transformation.
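As a rough idea of what such a transformation could look like, the Python sketch below maps the claims of a validated token onto an immutable permissions object. The `Permissions` type, the `SCOPE_TO_OPERATIONS` table, the scope names and the `org` claim are all assumptions made for the example and are not taken from the earlier articles.

```python
from dataclasses import dataclass
from typing import FrozenSet, Mapping


@dataclass(frozen=True)
class Permissions:
    user_id: str
    organization: str
    operations: FrozenSet[str]

    def can(self, operation: str) -> bool:
        return operation in self.operations


# Hypothetical mapping from token scopes to operations in our domain.
SCOPE_TO_OPERATIONS = {
    "products.read": {"read_products"},
    "products.write": {"read_products", "change_products"},
}


def to_permissions(claims: Mapping[str, object]) -> Permissions:
    """Build the permissions object from the claims of a validated token."""
    operations = set()
    for scope in str(claims.get("scope", "")).split():
        # Unknown scopes grant nothing.
        operations |= SCOPE_TO_OPERATIONS.get(scope, set())
    return Permissions(
        user_id=str(claims["sub"]),
        organization=str(claims.get("org", "")),
        operations=frozenset(operations),
    )
```

The domain logic then only asks questions such as `permissions.can("read_products")` and never inspects scopes or claims directly.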
Validate that the data in the request is correct (Step 4)
In this step, we validate the data that the user has sent in their request based on our domain model. One example might be that telephone numbers must be in the correct format. If this validation fails, we always return a 400 Bad Request.
This early validation gives us a valuable opportunity to reject and log incorrect requests. A secure system doesn’t usually generate any violations against this type of input validation. An automated alarm based on logging incorrect requests can capture a lot of attempted attacks and alert you to the fact that someone is attempting to break into your system.
If you implement your API in a strongly typed language, a robust approach is to select a type that minimizes the problem of injection. Instead of declaring parameters in your API as strings, you might use integers, booleans etc. This reduces an attacker’s ability to inject values with the aim of breaking out of the intended function and modifying the meaning of the request.
Note that input validation can never protect against injection attacks entirely. There are many examples of situations where we need to give the user the ability to attach data without restriction. Ultimately, data needs proper output encoding wherever it is going to be used.
For example, consider a request to /api/products/1 where the attacker attempts to inject a string containing an injection attack instead of the value 1. If the product ID parameter in our API is declared as an integer, the framework handles validation for us. If the product ID is instead declared as a string, we need to verify ourselves that the value comprises only digits, for example.
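When the framework does not do this conversion for us, the check can be as small as the following Python sketch. The function name `parse_product_id` and the limit of nine digits are assumptions for the example.

```python
import re

# Only ASCII digits, with an upper bound on the length.
_PRODUCT_ID = re.compile(r"[0-9]{1,9}")


def parse_product_id(raw: str) -> int:
    """Convert the raw path segment to an integer or raise ValueError."""
    # fullmatch rejects anything that is not entirely made up of digits.
    if not _PRODUCT_ID.fullmatch(raw):
        raise ValueError("product id must be 1 to 9 digits")
    return int(raw)
```

A `ValueError` here would be translated into a 400 Bad Request, and everything after this point works with an integer rather than an attacker-controlled string.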
Input validation is one layer of our defense in depth. A strong model for minimizing attack vectors is to transition to domain primitives as early as possible so that your domain logic works only with domain-specific types.
A good example of such in-depth input validation is customer ID and quantity. The customer ID is not an unlimited string that can contain as many characters as it likes. In your domain, your customer ID might comprise three capital letters and three numbers. Equally, the quantity is not an unlimited integer and is instead a number from 1 to 100. This makes it impossible to place an order for a minus number or for several thousand items, for example.
This approach is taken from the book Secure By Design (https://www.manning.com/books/secure-by-design). It also helps prevent congestion or errors resulting from corrupt data. Security is also quality and availability.
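The customer ID and quantity examples could be expressed as domain primitives roughly as follows. This is a Python sketch of the idea and not code from the book. The class names are our own, and the rules are the ones described above: three capital letters followed by three digits, and a quantity from 1 to 100.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerId:
    value: str

    def __post_init__(self):
        # Three capital letters followed by three digits, nothing else.
        if not isinstance(self.value, str) or not re.fullmatch(
                r"[A-Z]{3}[0-9]{3}", self.value):
            raise ValueError("invalid customer id")


@dataclass(frozen=True)
class Quantity:
    value: int

    def __post_init__(self):
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(self.value, bool) or not isinstance(self.value, int):
            raise ValueError("quantity must be an integer")
        if not 1 <= self.value <= 100:
            raise ValueError("quantity must be between 1 and 100")
```

Since an invalid value cannot be constructed, domain logic that accepts a `CustomerId` or a `Quantity` does not need to repeat the checks.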
Validate the permission to execute the operation (Step 5)
In this step, we validate whether the user has permission to execute the operation, without retrieving data or using other resources that put load on the system. If this requirement is not fulfilled, we reject the request immediately with a 403 Forbidden.
For example, consider a request to /api/products/1 that requires that the user has permission to read products and is included in “users”. We can return a 403 Forbidden early on if either of these requirements is not met, without having to perform any sort of lookup in the product database.
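One way to make this check hard to forget is to attach it to the operation itself, so that it runs before the function body. The Python sketch below does this with a decorator. The names `require`, `Forbidden` and `get_product` are hypothetical, the permissions are represented as a plain dictionary to keep the example short, and “users” is treated as a role purely for illustration.

```python
from functools import wraps


class Forbidden(Exception):
    """Mapped to 403 Forbidden by the (hypothetical) API layer."""


def require(operation: str, role: str):
    """Reject the call before the decorated function body runs."""
    def decorator(func):
        @wraps(func)
        def wrapper(permissions, *args, **kwargs):
            if (operation not in permissions["operations"]
                    or role not in permissions["roles"]):
                raise Forbidden(operation)
            return func(permissions, *args, **kwargs)
        return wrapper
    return decorator


# Records database access so the example can show that none happened.
LOOKUPS = []


@require("read_products", role="users")
def get_product(permissions, product_id: int) -> dict:
    LOOKUPS.append(product_id)  # stands in for the product database lookup
    return {"id": product_id}
```

A caller without the required operation or role gets a `Forbidden` error, and `LOOKUPS` stays untouched because the lookup is never reached.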
Validate the permission to access the data (Step 6)
In the final step, we have normally performed a lookup for the data that the request refers to, either to return it, or to modify or remove it. For example, in this layer we can verify that the data that is to be returned belongs to the user’s organization or role.
Sometimes, determining whether the user has permissions to the data being requested is easy. Other times, it requires complex domain logic. One example is search functions where we might have to perform access control late on in the process after the data has been read.
If this verification fails, we ordinarily have two return codes to choose from. 404 Not Found is appropriate if we do not want to reveal that the user has requested data that belongs to another organization, for example. 403 Forbidden might be more appropriate if we want to alert the client to the fact that the data exists but cannot be accessed because of their permission set.
The actual data lookup in a database, for example, should be performed using a database account that limits access to data as much as possible. If an API needs to read parts of all the data there is in a database, it should be given an account that is restricted to just this purpose. “Least privilege” applies to both user accounts and system accounts.
For example, consider a request to /api/products/1 where the user does not have permissions to the specific product in question. In this situation, we can choose to return a 404 Not Found.
Note also that access control on data still applies when a command is to be executed and no data is returned. A simple example is a command against /api/products/1, where we also need to verify that the user has permissions for the specific product in question.
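A minimal Python sketch of this data-level check is shown below, covering both a read and a command. It uses the 404 Not Found option, so the response is the same whether the product is missing or belongs to someone else. The in-memory `PRODUCTS` store, the organization-based ownership rule and the function names are assumptions for the example.

```python
class NotFound(Exception):
    """Mapped to 404 Not Found by the (hypothetical) API layer."""


# Hypothetical store: product id -> record with its owning organization.
PRODUCTS = {
    1: {"id": 1, "org": "org-a"},
    2: {"id": 2, "org": "org-b"},
}


def load_product(user_org: str, product_id: int) -> dict:
    product = PRODUCTS.get(product_id)
    # Same error for "missing" and "belongs to someone else", so the
    # response does not reveal whether the product exists.
    if product is None or product["org"] != user_org:
        raise NotFound(product_id)
    return product


def delete_product(user_org: str, product_id: int) -> None:
    # Commands go through the same data-level check as reads.
    load_product(user_org, product_id)
    del PRODUCTS[product_id]
```

Routing every read and every command through one function such as `load_product` makes it harder to forget the check on a new endpoint.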
In our experience, a very common mistake is not verifying that the user has permissions to the data that is to be returned or that is being referred to in a command.
I often see GUID being used as an ID with the argument that it provides sufficient protection since guessing a GUID is tricky. This may be true, but a GUID is rarely cryptographically secure and often the value is a direct reference to the object. If you share the value it can’t be revoked, and access to the object is no longer controlled.
Logging and managing errors
Attackers will often exploit unexpected behavior in your system. Usually, these vulnerabilities are the result of inadequacies in error management with the result that the system can be put into an undefined state. For a secure API, it’s important that we have full control over how the application works, part of which involves having a clear and consistent approach to managing and logging errors.
Logging errors centrally in the system, combined with using the correct return status, is fundamental for being able to detect and act on attempted break-ins. A system which during normal operation does not create errors and returns HTTP status codes in the 200 series gives us the foundations for configuring automated alarms for unexpected events.
For a system that consists of multiple APIs, things become simpler if this logging is also centralized and correlated. This means that there is a central location where we can view the logs for the entire system, and where we can follow the trail of a request through the chain whenever one API makes a request to another.
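Correlation usually relies on an identifier that follows the request from one API to the next and is attached to every log line. The Python sketch below shows one way to do this with the standard `logging` and `contextvars` modules. The names `correlation_id`, `CorrelationFilter` and `start_request` are hypothetical, as is the idea that the identifier arrives in a request header.

```python
import contextvars
import logging
import uuid
from typing import Optional

# Holds the id of the request currently being processed.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Adds the current correlation id to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True


def start_request(incoming_id: Optional[str] = None) -> str:
    """Reuse the caller's id if one was sent, otherwise create a new one."""
    # An incoming id is untrusted input and should be validated (step 4)
    # before it is passed in here.
    value = incoming_id or uuid.uuid4().hex
    correlation_id.set(value)
    return value
```

With the filter attached to a handler, a format string such as `"%(correlation_id)s %(message)s"` puts the identifier on every line, and the same value is sent along when this API calls the next one.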
There are currently a lot of good centralized logging products out there on the market. One important feature that some of these products offer is anomaly detection, i.e. automated alerts any time we have deviations from the normal pattern in terms of how our system functions. This can be hard to build yourself and gives you solid foundations for detecting attempted break-ins.
While we should always avoid logging sensitive data, such as personal ID numbers or access tokens, logs often constitute sensitive data sources and so these should be protected in the same way as our business data. There is a very high risk that personal data, for example, might end up in our logs one way or the other.
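As one extra safety net against access tokens ending up in the logs, a filter can redact them before the message is written. The following Python sketch uses the standard `logging` module. The class name `RedactTokens` and the regular expression are our own, and the sketch only catches tokens that appear as `Bearer <token>`, so it does not remove the need to avoid logging sensitive data in the first place.

```python
import logging
import re

# Matches "Bearer <token>" so that access tokens never reach the log output.
_BEARER = re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/=-]+")


class RedactTokens(logging.Filter):
    """Replaces bearer tokens in log messages with a fixed marker."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        record.msg = _BEARER.sub("Bearer [REDACTED]", message)
        record.args = ()  # the message is already fully formatted
        return True
```

The filter would be attached to the handler that ships logs to the central log system, for example with `handler.addFilter(RedactTokens())`.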
If you are performing an independent penetration test on your system, make sure that at the same time you also verify that your log system detects attempted break-ins!
In penetration tests, it is common to find applications that leak sensitive information in the event of an error, e.g. stack traces or even connection strings.
Remember to log for the right purpose and at the right level. Otherwise we risk missing important logs. What you as a developer need during development might not be the same as what’s required for operational monitoring.
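A common way to avoid this kind of leak is to catch unexpected errors in one place, write the details to the log, and return only a generic message together with an identifier. The Python sketch below illustrates the idea. The function `safe_call`, the logger name and the shape of the response body are assumptions for the example.

```python
import logging
import uuid

logger = logging.getLogger("api.errors")


def safe_call(handler, *args):
    """Run a handler; on unexpected errors, log details and hide them."""
    try:
        return 200, handler(*args)
    except Exception:
        error_id = uuid.uuid4().hex
        # The full stack trace goes to the log only, keyed by the error id.
        logger.exception("unhandled error %s", error_id)
        # The client only gets a generic message and the id for support.
        return 500, {"error": "internal error", "error_id": error_id}
```

The error id lets operations find the matching log entry without any internal detail being exposed to the client.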
So far, this article series has discussed how to design your system in order to implement strong access control. We have also highlighted the importance of input validation and centralized logging.
In our next article, we will take a look at infrastructure such as transport layer security and data storage.
We look at secure APIs more closely in the additional article Secure APIs by design.
See Defence in Depth for additional reading materials and code examples.
More in this series:
- Defence in Depth: Identity modelling (part 1/7)
- Defence in Depth: Claims-based access control (part 2/7)
- Defence in Depth: Clients and sessions (part 3/7)
- Defence in Depth: Secure APIs (part 4/7)
- Defence in Depth: Infrastructure and data storage (part 5/7)
- Defence in Depth: Web browsers (part 6/7)
- Defence in Depth: Summary (part 7/7)