API Security: The Complete Guide

Swwapnil Pawar
The Security Chef
17 min read · Oct 31, 2022


Originally published at https://www.pingidentity.com.

Web API security is the application of security best practices to web APIs, which are prevalent in modern applications. It includes API access control and privacy, as well as the detection and remediation of attacks on APIs carried out through API reverse engineering and the exploitation of API vulnerabilities, as described in the OWASP API Security Top 10.

Whether an application is targeting consumers, employees, partners or otherwise, the client side of an application (e.g., a mobile app, a web app) interacts with the server side of an application via an Application Programming Interface (API). Simply put, APIs make it easy for a developer to create a client-side app. Microservice architectures are also made possible by APIs.

Because they’re often available over public networks (accessible from anywhere), APIs are typically well documented or easily reverse-engineered. They are also highly sensitive to denial-of-service (DoS and DDoS) incidents. Both traits make APIs attractive targets for bad actors.

An attack might include bypassing the client-side application in an attempt to disrupt the functioning of an application for other users or to breach private information. API security is focused on securing this application layer and addressing what can happen if a malicious hacker were to interact with the API directly.

API development has increased astronomically in the past few years, fueled by digital transformation and the central role APIs play in both mobile apps and IoT. This growth is making API security a top concern.

In its How to Build an Effective API Security Strategy report, Gartner predicts that “by 2022, API abuses will be the most-frequent attack vector resulting in data breaches for enterprise web applications.” To protect yourself against API attacks, Gartner recommends adopting “a continuous approach to API security across the API development and delivery cycle, designing security [directly] into APIs.”

Given the critical role they play in digital transformation and the access to sensitive data and systems they provide, APIs warrant a dedicated approach to security and compliance.

Because you only control your own APIs, API security centers on securing the APIs you expose either directly or indirectly. API security is less focused on the APIs you consume that are provided by other parties, though analyzing outgoing API traffic can also reveal valuable insights and should be applied whenever possible.

It’s also important to note that API security as a practice overlaps various teams and systems. API security encompasses network security concepts such as rate limiting and throttling, as well as concepts from data security, identity-based security and monitoring/analytics.

APIs take many forms and come in many styles. Sometimes, the style of an API affects how security is applied to it. For example, before web APIs, the standard style in use was SOAP Web Services (WS). During the service-oriented architecture (SOA) WS era from 2000–2010, XML was ubiquitous, and a rich set of formal security specifications was widely recognized under WS-Security/WS-*.

The SOAP style of security is applied at the message level using digital signatures and encrypted parts within the XML message itself. Decoupled from the transport layer, it has the advantage of being portable between network protocols (e.g., switching from HTTP to JMS). But this type of message-level security has fallen out of favor and is mostly encountered only with legacy web services that have survived without evolving.

Representational state transfer (REST) became the more common API style over the past decade. REST is often assumed by default when the term “web API” is used. A fundamental convention of the REST style of APIs is that resources are uniquely identified by HTTP URIs. This predictable aspect of REST APIs inspired a generation of access control methodologies in which rules are associated with the URI (resource) being accessed, or at least with the pattern of the URI being accessed.

Access control rules are often based on a combination of the HTTP verb (GET/PUT/POST/DELETE) and HTTP URI (resource identifier) patterns. Identifying which data is being accessed through the URI means that rules can be applied without visibility into the payload of these API transactions and, most importantly, without any ability to understand it. This has been practical, in particular, for middleware security solutions that enforce access control rules decoupled from the web API implementations themselves, either by sitting in front of them (e.g., gateways) or by acting as agents (e.g., service filters).
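To make the verb-plus-URI approach concrete, here is a minimal sketch in Python. The rule table, paths and scope names are hypothetical and chosen only for illustration; a real gateway would load its rules from policy configuration rather than hard-code them.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    methods: frozenset   # HTTP verbs the rule covers
    pattern: str         # regular expression matched against the URI path
    required_scope: str  # scope the caller's token must carry


# Hypothetical rule table: paths and scope names are made up for this example.
RULES = [
    Rule(frozenset({"GET"}), r"^/accounts/[^/]+$", "accounts:read"),
    Rule(frozenset({"PUT", "DELETE"}), r"^/accounts/[^/]+$", "accounts:write"),
]


def is_allowed(method: str, path: str, scopes: set) -> bool:
    """Decide using only the verb and the URI; the payload is never inspected."""
    for rule in RULES:
        if method in rule.methods and re.match(rule.pattern, path):
            return rule.required_scope in scopes
    return False  # default deny when no rule matches
```

Note that the decision never looks at the request body, which is exactly why this style of rule fits middleware that sits in front of the API.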

Yet another API style is GraphQL, an emerging open-source API standard project. GraphQL is popular with front-end developers because it puts them in control. They’re no longer restricted to a fixed set of API methods and URI patterns but instead get to customize their queries in whichever ways best suit their applications and context. Because of this added control, along with additional benefits like non-breaking version upgrades and performance optimizations, GraphQL is on its way to becoming omnipresent among web APIs.

While GraphQL isn’t a substitute for REST, and both API styles will continue to co-exist, it’s an increasingly common choice. In fact, its popularity is threatening to disrupt a decade of web API access control infrastructure. This disruption centers on one major divergence from the popular REST pattern: GraphQL requests do not identify the data being accessed via the HTTP URI. Rather, GraphQL identifies the data requested using its own query language, typically embedded inside an HTTP POST body.

In a GraphQL API, all resources are accessed through a single URI (e.g., /graphql). Existing web API access control systems and infrastructure often are not designed for this type of API traffic. Access control rules for GraphQL are more likely to require access to the structured data in the API payloads, and the enforcement layer must be able to interpret this structured data for the purposes of access control. Suffice it to say that API providers need to consider what will be best suited to each new set of requirements when choosing their approach.
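The sketch below illustrates why the payload has to be interpreted: the URI is always the same, so the rule has to look at which fields the query asks for. The tokenizer is a deliberately simplified assumption that handles only plain queries (no aliases, fragments or directives); a production system would more likely rely on a complete GraphQL parser. The function names and the allow-list are hypothetical.

```python
import json
import re

# Braces, parentheses, string literals and names are the only tokens we care about.
TOKEN = re.compile(r'[{}()]|"(?:\\.|[^"\\])*"|[A-Za-z_][A-Za-z0-9_]*')


def top_level_fields(query: str) -> list:
    """Collect the field names at selection depth 1 of a simple GraphQL query."""
    fields, depth, parens = [], 0, 0
    for tok in TOKEN.findall(query):
        if tok == "(":
            parens += 1
        elif tok == ")":
            parens -= 1
        elif parens:
            continue  # skip everything inside argument or variable lists
        elif tok == "{":
            depth += 1
        elif tok == "}":
            depth -= 1
        elif depth == 1 and not tok.startswith('"'):
            fields.append(tok)
    return fields


def authorize(body: str, allowed_fields: set) -> bool:
    """Allow the POST body only if every requested top-level field is permitted."""
    fields = top_level_fields(json.loads(body)["query"])
    return bool(fields) and set(fields) <= allowed_fields
```

For example, a body whose query selects `user` and `orders` would be rejected for a caller that is only permitted to read `user`, even though both requests hit the same /graphql URI.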

Advances in technology like cloud services, API gateways and integration platforms let API providers secure APIs in unique ways. The technology stack you choose to build your APIs affects how you secure them. For example, in large organizations, different departments may develop their own applications with their own APIs. Also, through mergers and acquisitions, large organizations end up with multiple API stacks or API silos.

When all of your APIs are in a single silo, API security requirements can be mapped directly to this silo’s technology. Even so, these security configurations should be portable enough to be extracted and mapped to another technology in the future.

For heterogeneous environments, however, the defining of API security rules typically benefits from API security-specific infrastructure which operates across these API silos. This connectivity between API silos and API security infrastructure can take the form of sidecars, sideband agents and, of course, APIs which are integrated between cloud and on-premises deployments.

As discussed above, the scope of API security is wide. Many layers, each focusing on its own part of API security, are required to achieve a strong level of protection.

API Discovery

You can’t secure what you’re not aware of. The obstacles that prevent security operatives from having full visibility into all APIs exposed by their organization are many. First, you have API silos as described in the previous section. API silos affect API visibility by having partial lists of APIs, under disconnected governance.

Another obstacle to API visibility is the rogue or shadow API. Shadow APIs happen when an API is developed as part of an application but the API itself is considered an implementation detail of the application and is only known by a close-knit group of developers. Shadow APIs are not on the radar of security operatives because they don’t have visibility into the implementation details.

Finally, APIs go through their own lifecycle. An API evolves over time and new versions come up. An API may also be deprecated but continue to operate for a temporary period for backward compatibility, and then be forgotten or gradually fall off the radar because it receives very little traffic.

API discovery is a race between API providers and hackers, who will readily exploit APIs once they find them. To discover your APIs before attackers do, you can mine your API traffic metadata. This data is extracted from API gateways, load balancers or directly inline from network traffic and then fed to a specialized engine. The engine reports an effective list of APIs, which can then be compared with the catalogs of APIs available via an API management layer.
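The comparison step described above can be sketched as a set difference between what is observed in traffic and what the catalog lists. This is a minimal illustration, assuming traffic metadata has already been reduced to (method, path) pairs; the normalization rule and the sample paths are hypothetical.

```python
import re


def normalize(path: str) -> str:
    """Collapse numeric path segments so /orders/17 and /orders/99 count as one API."""
    return re.sub(r"/\d+(?=/|$)", "/{id}", path)


def find_shadow_apis(observed, catalog):
    """Return (method, path) pairs seen in traffic but missing from the catalog."""
    seen = {(method, normalize(path)) for method, path in observed}
    return sorted(seen - set(catalog))


# Hypothetical traffic sample and catalog.
observed = [("GET", "/orders/17"), ("GET", "/orders/99"), ("POST", "/internal/export")]
catalog = {("GET", "/orders/{id}")}
shadow = find_shadow_apis(observed, catalog)
```

Here `shadow` contains only the undocumented export endpoint, which is the kind of entry a security team would want to review first.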

OAuth and API Access Control

In order to restrict API resources to only the users who should be allowed to access them, the user (and potentially the application acting on behalf of the user) needs to be identified. This is typically achieved by requiring client-side applications to include a token in the API calls that they make to the service, which can then validate that token and get the user information from it. OAuth is the standard which describes how a client-side application obtains an access token in the first place. OAuth defines many different grant types to accommodate various flows and user experiences. For more information on OAuth 2, this developer guide describes these various OAuth flows in detail.

Based on an incoming token, access control rules can be applied. For example, a rule can be used to determine if the application or user should be allowed to make this particular API call.

The definition and management of the rules is done via policy definition tools, and a policy enforcement layer needs to be able to apply these rules at runtime. These rules take into consideration characteristics like:

  • The user identity and its associated attributes or claims
  • The application and OAuth scopes that are associated with the token
  • The resources being accessed or the query being made
  • The privacy preferences of the user
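As a rough illustration of how these characteristics combine into one decision, the sketch below evaluates a single hypothetical rule ("may this caller read this profile?"). The `TokenInfo` structure, the scope name, the `role` claim and the preference keys are all assumptions made for the example, not part of OAuth or of any product.

```python
from dataclasses import dataclass, field


@dataclass
class TokenInfo:
    subject: str                                 # user identity from the validated token
    client_id: str                               # application acting on the user's behalf
    scopes: set = field(default_factory=set)     # OAuth scopes granted to the application
    claims: dict = field(default_factory=dict)   # additional identity attributes


def may_read_profile(token: TokenInfo, owner_id: str, owner_prefs: dict) -> bool:
    """Combine scope, application, identity and privacy preferences into one decision."""
    if "profile:read" not in token.scopes:
        return False  # the application was never granted this scope
    if token.client_id in owner_prefs.get("blocked_clients", ()):
        return False  # the profile owner opted out of this application
    if token.subject == owner_id:
        return True   # users may read their own profile
    if token.claims.get("role") == "support":
        return bool(owner_prefs.get("share_with_support", False))
    return False      # default deny
```

In practice such rules would live in a policy definition tool rather than in application code, but the inputs to the decision are the same.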

In a heterogeneous environment, controlling access in a consistent way across API silos requires processes and integration.

API Data Governance and Privacy Enforcement

Leaks happen through APIs because data flows through APIs. For this reason, API security must also include looking at the structured data coming into and going out through your APIs and enforcing rules at the data layer.

Because data is structured in a predictable way in your API traffic, inspecting that traffic is a very effective way to enforce data security. In addition to yes/no rules, API data governance lets you transform the structured data in your API traffic in real time for redaction purposes. A typical example of this pattern is to redact specific fields that a user’s privacy settings dictate should remain hidden from the requesting application. As discussed earlier, enforcing data-level access control also helps you support GraphQL, which does not identify resources via URIs.
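The redaction pattern can be sketched in a few lines. This example assumes the response is a JSON object already parsed into a Python dictionary and that privacy settings are expressed as dotted field paths; both the representation and the field names are hypothetical.

```python
import copy


def redact(record: dict, hidden_paths: list) -> dict:
    """Return a copy of record with each dotted path in hidden_paths removed."""
    result = copy.deepcopy(record)  # never mutate the upstream response
    for path in hidden_paths:
        *parents, leaf = path.split(".")
        node = result
        for key in parents:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break  # the path does not exist in this record
        if isinstance(node, dict):
            node.pop(leaf, None)
    return result


# Hypothetical response body and privacy settings.
profile = {"name": "Ana", "contact": {"email": "ana@example.com", "city": "Lisbon"}}
visible = redact(profile, ["contact.email", "ssn"])
```

The requesting application receives `visible`, which still has the name and city but no email address, while the original response object stays untouched.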

The decoupling of privacy preference management and enforcement from a GraphQL service implementation offers a number of benefits. Home-grown software comes with a high cost of ownership and can be slow to adapt. The Node.js developer and the person responsible for enforcing privacy regulations seldom intersect. But empowering security architects and business analysts with their own tool to implement this level of access control accelerates digital transformation. In addition, this decoupling future-proofs the investment in GraphQL services and REST APIs by making them more resilient to changes as they relate to fine-grain data governance.

API Threat Detection

API threat detection inherits from general threat protection measures. For example, APIs are often behind a firewall which offers some baseline protection. APIs are also sometimes behind a web application firewall (WAF). A WAF might scan API traffic for signature-based threat detection, looking for things like SQL injections and other injection attacks. API gateways also play a role in threat detection from an API-specific angle. A gateway might enforce a strict schema on incoming requests along with general input sanitization. It will look for deep nesting patterns and XML bombs and apply rate limits, in addition to acting as a policy enforcement point.
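As one example of the structural checks mentioned above, the following sketch rejects JSON payloads that are nested too deeply. The limit of 10 levels is an arbitrary assumption; the depth walk is iterative so that a hostile payload cannot exhaust the interpreter’s recursion limit during the check itself.

```python
import json


def max_depth(value) -> int:
    """Return the deepest level of object/array nesting (0 for a scalar)."""
    deepest, stack = 0, [(value, 1)]
    while stack:
        node, depth = stack.pop()
        if isinstance(node, dict):
            children = node.values()
        elif isinstance(node, list):
            children = node
        else:
            continue
        deepest = max(deepest, depth)
        stack.extend((child, depth + 1) for child in children)
    return deepest


def accept_payload(raw: str, limit: int = 10) -> bool:
    """Reject payloads that are not valid JSON or that nest deeper than limit."""
    try:
        parsed = json.loads(raw)
    except (ValueError, RecursionError):
        return False  # malformed input, or nesting too deep even to parse
    return max_depth(parsed) <= limit
```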

API Behavior and Analytics

Using API traffic metadata, an AI engine can build models for what normal API traffic is like and leverage this model to look for anomalous behavior. These anomalies can help identify attacks in progress but can also point to system misbehaviors and other forms of non-malicious disruption to your service such as friendly fire, API flaws or a partner misusing or abusing an API. By analyzing API traffic metadata, such a layer can pinpoint the source of this attack or misbehavior and this information can then be leveraged to stop the incident in progress, fix the API or address the issue with the partner.

Rule-based security is only as good as the rules themselves and the security achieved through rule-based systems is limited by the operators of such technology. The reason so many API security breaches persist despite the availability of sophisticated security technology is that not enough experts are available to define those rules in the first place. Humans also make mistakes and may miss important rules that should be defined in these systems.

By contrast, an AI engine learning about your APIs by analyzing your API traffic requires no rules; it just crunches the numbers. As a result, this additional layer is a powerful protection against security gaps in other layers and against the clever ways hackers work around access control rules and threat protection layers.
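To give a feel for what “learning normal behavior” means, here is a deliberately simple statistical stand-in, not a machine learning model: it flags a client whose current error rate sits far above its own historical baseline. The threshold of three standard deviations and the sample numbers are assumptions for illustration only.

```python
from statistics import mean, pstdev


def is_anomalous(history: list, current: float, threshold: float = 3.0) -> bool:
    """Flag current if it lies more than threshold standard deviations above the mean."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return current > mu  # perfectly flat baseline: any increase stands out
    return (current - mu) / sigma > threshold


# Hypothetical hourly error rates for one API client.
baseline = [0.010, 0.020, 0.015, 0.012, 0.018]
```

With this baseline, an hour with a 30% error rate is flagged while an hour at 1.6% is not. No rule about specific endpoints or payloads had to be written; the baseline came from the traffic itself.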

Attaching identity information to API traffic metadata and analyzing it over time reveals patterns around API consumption and identity. For example:

  • Which users consume which APIs
  • Which sequences of API calls are common, rare or never seen
  • What error rate each API client is generating

Identity-based security for APIs goes beyond access control. By analyzing API traffic metadata augmented with identity information, the ability to pinpoint the source of an attack is improved.

A common understanding of the specific threats that enterprises need to defend against is essential. Long known and relied upon for its original OWASP Top 10 list of web security vulnerabilities, OWASP recently launched an API-specific list: the OWASP API Security Top 10.

OWASP API Security Top 10

From this Top 10 list, you’ll see that four of the items (including the top two) relate directly to a lack of access control rules and a lack of strong authentication. This is a reflection of the most common source of error in security incidents involving APIs. Number three on the list is caused by a lack of data governance. As you go through each item in the list, you quickly see how wide the scope of API security is and how many layers of security may be required to address the threats.

Any API provider would be wise to not take for granted their coverage for the 10 threats identified in the OWASP list. It provides a great starting point for assessing your current API security. Going back to this list should also be baked into ongoing security testing.

Additional API Security Threats

Beyond the OWASP API Security Top 10, there are additional API security risks to consider, including:

  • Hackers are users, too
    Sophisticated access control rules can create a false sense of security, because to those rules the hacker may look like a valid user. The hacker may be an insider or may have signed up to the application using a fake email address or a social media account.
  • Valid account, valid credentials
    Attackers have many ways to get access to valid credentials, from credential stuffing to buying them on the dark web. Because they know users reuse passwords, hackers can take over legitimate accounts, effectively bypassing the first layer of access control rules.
  • Stolen token
    OAuth tokens can be leaked through phishing, public repos on GitHub and other ways. Since the vast majority of tokens are lightweight bearer tokens, a leaked token can be used from anywhere and by anyone until it expires.
  • Outside-the-app scenarios
    By bypassing the client-side app, hackers poke around to find hidden vulnerabilities in your API. These vulnerabilities are hidden to the API provider as well.

These are persistent API security risks. While they may be reduced by tightening security procedures, the risk never really goes away. The key to mitigating these risks is to leverage AI to detect anomalies as described earlier.

In the end, security is everybody’s job. APIs touch backend services, databases and IAM, and all of this infrastructure needs to be properly secured. This starts at the transport level with using TLS (HTTPS) and enforcing TLS 1.2 or later (SSL and older versions of TLS should be disabled). You also need to get rid of things like HTTP basic authentication.
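On the client side, enforcing a minimum TLS version can look like the following sketch, which uses Python’s standard `ssl` module. The function name is hypothetical; server-side enforcement is configured separately in your gateway, load balancer or web server.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a TLS context that verifies certificates and refuses old protocol versions."""
    context = ssl.create_default_context()             # certificate and hostname checks on
    context.minimum_version = ssl.TLSVersion.TLSv1_2   # reject anything older than TLS 1.2
    return context
```

The resulting context can be passed to HTTP clients that accept an `SSLContext`, for example `urllib.request.urlopen(url, context=make_client_context())`.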

When it comes to the API layer, there are several best practices you can apply to make a secure API.

API Inventorying

Digital transformation initiatives accelerate the development of new APIs, so you need to review new APIs for the appropriate security measures. But you can’t secure what you don’t know about.

By analyzing API traffic metadata, an AI engine will discover APIs that may not have been on the radar of security practitioners. This level of API discovery ensures that you minimize blind spots from rogue APIs. When new APIs are discovered in this way, the same API security checklist can be applied to them. The same API traffic metadata analysis that allows for this API discovery can also be put to use for threat detection as described below.

API Access Control

Using standards like OAuth and JWT to authenticate API traffic, you can define access control rules that determine which identities, group memberships, identity attributes, and roles are required to access specific API resources.

If your API transactions go through multiple network boundaries, you can apply Zero Trust security principles and propagate identity to allow for each layer to make its own decisions. Application security can also leverage these propagated identities.

Additional access control best practices include:

  • Mapping between token formats as appropriate when crossing boundaries, such as an opaque token on the public side and a signed token on the private side
  • Enforcing authorization rules at each API silo
  • Enabling access control rules for third-party applications acting on behalf of users and controlling the scope granted for each application
  • Enabling the definition and enforcement of user privacy preferences and general data governance
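The first practice in this list, mapping an opaque token on the public side to a signed token on the private side, can be sketched as follows. This is a toy format signed with HMAC purely for illustration; the key, the token store and the function names are hypothetical, and a real deployment would more likely use a standard format such as JWT through an established library, with proper key management and expiry.

```python
import base64
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-not-for-production"  # hypothetical secret shared by internal services
OPAQUE_STORE = {"a1b2c3": {"sub": "user-17", "scope": "orders:read"}}  # hypothetical token store


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _sign(payload: str) -> str:
    return _b64(hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).digest())


def exchange(opaque: str) -> str:
    """At the boundary: look up the opaque token and emit a signed, self-contained one."""
    claims = OPAQUE_STORE[opaque]  # raises KeyError for unknown tokens
    payload = _b64(json.dumps(claims, sort_keys=True).encode())
    return f"{payload}.{_sign(payload)}"


def verify(signed: str) -> dict:
    """Inside the boundary: check the signature locally, without calling the token store."""
    payload, signature = signed.split(".")
    if not hmac.compare_digest(signature, _sign(payload)):
        raise ValueError("bad signature")
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

The design point is that internal services can validate the signed token on their own, while the public side never exposes claims to the client.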

API Threat Detection

Combine real-time and out-of-band threat detection. Real-time threat detection involves an API gateway, a WAF or an agent applying a set of validation rules. Each API request and response is subjected to this set of rules and is only allowed through if the rules are passed:

  • Look for signature-based threat detection such as SQL injections
  • Validate incoming messages against API definition contracts using JSON schemas and JSON paths. The tighter these rules, the harder it becomes for attackers to abuse your API.
  • Apply rate limits to protect your API backends
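Rate limiting, the last item above, is commonly implemented with a token bucket. The sketch below is one minimal version, assuming one bucket per API client; the class name and parameters are illustrative, and the injectable clock exists only to make the behavior easy to test.

```python
import time


class TokenBucket:
    """Allow bursts up to capacity, refilled at rate tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should answer with HTTP 429
```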

There is a limit to the layers of real-time security that can be applied in sequential mode before latency is negatively affected. Out-of-band analysis of API traffic should be offloaded to a dedicated AI engine decoupled from the API traffic path. In this AI engine, capture API traffic metadata to build ML models for each API and track error rates, API sequences and API grouping across tokens, API keys, IP addresses, cookies, etc. When the AI engine detects an anomaly, instruct the API gateway or load balancer to start blocking the API client.

API Security Testing

Continuously test security and look at it through an API lens. Design test cases that skip the client-side application, as a hacker would when attacking your API. Your security testers should be using tools like Postman and scripting languages like JavaScript. Try calling the API in ways that the application does not, and attempt to trick the API into returning data to which the requester should not have access.
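A test of this kind can be as simple as trying every caller against every resource and reporting any case where someone reads data that is not theirs. The sketch below runs against an in-memory stand-in for an endpoint; `get_order`, the sample orders and the user names are hypothetical, and in a real test the inner call would be an HTTP request made with each user’s token.

```python
ORDERS = {
    "o-1": {"owner": "alice", "total": 40},
    "o-2": {"owner": "bob", "total": 15},
}


def get_order(caller: str, order_id: str):
    """Stand-in for an API endpoint; returns (status, body)."""
    order = ORDERS.get(order_id)
    if order is None:
        return 404, {}
    if order["owner"] != caller:
        return 403, {}  # object-level authorization check
    return 200, order


def probe_cross_user_access(callers, order_ids):
    """Call the endpoint directly for every pair and collect reads of someone else's data."""
    leaks = []
    for caller in callers:
        for order_id in order_ids:
            status, body = get_order(caller, order_id)
            if status == 200 and body.get("owner") != caller:
                leaks.append((caller, order_id))
    return leaks
```

An empty result means no cross-user read succeeded; any entry in the list is a finding worth investigating.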

Monitoring and Analytics Across API Silos

Monitor your API traffic from the inside. Feed API traffic metadata into a centralized AI engine and correlate identity from API traffic. You should be able to break traffic down per user, per IP, per token and per API across your API silos. Integrate your API monitoring and threat detection with your existing security information and event management (SIEM) systems. Review detected anomalies at regular intervals and tweak models as needed. By having visibility into your API traffic at all times and broken down across any factor, you gain a better understanding of what is going on with your APIs, including whether you are experiencing an attack or a malfunction.

Figure: API traffic analysis broken down for a user identity
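Breaking traffic down along any one dimension is, at its core, a grouping operation over traffic metadata. The following sketch assumes each event is a dictionary with hypothetical keys such as `user`, `ip`, `token`, `api` and `status`; real metadata would come from your gateways or load balancers.

```python
from collections import Counter


def breakdown(events: list, key: str) -> dict:
    """Count requests and compute the error rate per value of key (e.g. 'user' or 'ip')."""
    totals, errors = Counter(), Counter()
    for event in events:
        totals[event[key]] += 1
        if event["status"] >= 400:
            errors[event[key]] += 1
    return {k: {"requests": n, "error_rate": errors[k] / n} for k, n in totals.items()}


# Hypothetical traffic metadata.
events = [
    {"user": "alice", "ip": "10.0.0.1", "api": "/orders", "status": 200},
    {"user": "alice", "ip": "10.0.0.2", "api": "/orders", "status": 403},
    {"user": "bob", "ip": "10.0.0.1", "api": "/users", "status": 200},
]
per_user = breakdown(events, "user")
per_ip = breakdown(events, "ip")
```

The same events yield a per-user view and a per-IP view without any change to the collection pipeline, which is what makes centralized metadata useful across silos.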

Auditing and Incident Response

Detecting and stopping a breach is only part of the response to a security incident. By recording detailed information about historical API traffic, you can generate forensic reports for a given token, API key, user identity or IP address. Conduct forensics reporting to gain a complete picture of the activity that occurred during an incident. This facilitates compliance and investigations and can help you repair the damage that occurred prior to the automatic detection and blocking of a breach.

Figure: Example of forensics reporting for a specific token
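A forensic report for a given token, API key, user identity or IP address amounts to filtering recorded traffic on that value and summarizing what it touched. The sketch below assumes events are dictionaries with hypothetical `time`, `token`, `api` and `ip` keys; the report fields are likewise only an illustration.

```python
def forensic_report(events: list, field: str, value: str) -> dict:
    """Summarize all recorded activity where event[field] equals value."""
    matching = sorted(
        (e for e in events if e.get(field) == value),
        key=lambda e: e["time"],
    )
    return {
        "count": len(matching),
        "first_seen": matching[0]["time"] if matching else None,
        "last_seen": matching[-1]["time"] if matching else None,
        "apis": sorted({e["api"] for e in matching}),
        "ips": sorted({e["ip"] for e in matching}),
    }


# Hypothetical recorded traffic.
history = [
    {"time": 3, "token": "t-9", "api": "/export", "ip": "203.0.113.7"},
    {"time": 1, "token": "t-9", "api": "/orders", "ip": "10.0.0.1"},
    {"time": 2, "token": "t-1", "api": "/orders", "ip": "10.0.0.1"},
]
report = forensic_report(history, "token", "t-9")
```

In this sample, the report for token `t-9` shows it being used from two different IP addresses, the kind of detail that helps reconstruct what happened before a breach was blocked.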

Since the early days of web APIs, API developers and security practitioners have leveraged Ping Identity’s thought leadership and tools. Ping has been a contributor to the OAuth standard for nearly a decade and was an early implementer of the OAuth authorization server. To this day, Ping team members are helping to define key API security standards such as JWT, token revocation, token introspection, dynamic client registration, financial-grade APIs and myriad other relevant specifications.

Ping Identity’s intelligent identity solutions include an industry-leading OAuth server, strong authentication, MFA, API access control, API-based consent and privacy enforcement, and API cybersecurity based on AI.

PingFederate

Issuing and managing OAuth tokens is a core concern of API security. The top OAuth authorization server technology for both protocol support and market presence, PingFederate enables token issuing to your API consumers. Leveraging a rich set of standard and custom flows, PingFederate helps you deliver a great experience for your end users. It’s also used by the API server to validate tokens and retrieve attributes that are used in API access control decisions.

PingAccess

With its out-of-box OAuth policies for token/scope validation and attribute-based access control rules definition, PingAccess lets you define and enforce API access control rules. For advanced rule definition, you can feed scripts into this rules engine. Deployed as a sidecar or inline, PingAccess works across your API silos.

PingAuthorize

Attached to your APIs inline or via an API gateway, PingAuthorize provides policy-based, fine-grained access controls for attribute-by-attribute data protection and filtering, ideal for regulatory compliance and consent management. It has a graphical user interface for business users to collaboratively build, test and enforce access control policies to data across user directories and APIs. It also provides a centralized solution to authorize and filter API calls in real time, which is a huge benefit when managing and enforcing customer consent and data privacy.

PingIntelligence for APIs

PingIntelligence for APIs analyzes your API traffic metadata to discover and protect your APIs. It also gives you rich insights into your API traffic by associating API traffic metadata with identity information to deliver a single pane of glass from which you can monitor your API activities across all gateways, data centers, and clouds. This allows you to report on API traffic across silos, broken down across users, tokens, IP addresses, cookies, etc. PingIntelligence for APIs uses machine learning to build models on your API traffic and spot deviations that point to anomalies and attacks with no rules to write and maintain. These models track a rich set of API traffic metadata including transaction rates, error rates, sequences, user identity, resources being accessed, action taken, volumes, latencies, network location, and more. Through its out-of-box integrations with all common API gateways and load-balancers, PingIntelligence for APIs can identify API design flaws and bugs in production, flag partners that are misusing or abusing your APIs, and detect and block hackers working on your APIs to breach your organization.

Already an attractive target for bad actors, APIs are predicted to soon become the top attack vector. Given the critical role they play in digital transformation and the access to internal sensitive data and systems they provide, APIs warrant a dedicated approach to security and compliance.

To gain a deeper understanding of how to build a solid defense against API security threats and vulnerabilities, read the white paper.


Swwapnil Pawar
The Security Chef

Entrepreneur, Cloud Evangelist, AWS/Google Certified Architect, Building Cool Things With Serverless. Avid Reader