August 07, 2024
Perimeter is an egress traffic controller for distributed systems that ensures all outgoing requests comply with the rate limits of external systems. This article covers why we needed a rate-limiting solution, the architecture of Perimeter, and how it is used at WareIQ to observe and manage outgoing API traffic.
WareIQ is a logistics middleware that seamlessly connects merchants to various sales channels, warehouse management systems and delivery partners. The WareIQ platform interacts with these third-party systems via APIs to maintain a single source of truth and provide data consistency for our customers' e-commerce order details.
Currently, WareIQ makes over 1 million API requests daily across our partners, and this number is climbing swiftly as we onboard more clients and partners. The WareIQ platform comprises multiple micro-services. Because each micro-service can call the same external partner simultaneously for different functionalities, we noticed an increasing number of request failures caused by hitting rate limits on these external partners.
This led us to realize that rate limiting solely at the individual micro-service level is insufficient. We needed a centralized traffic controller that shapes egress traffic being generated across all our micro-services, so that we stay within the bounds of each external partner’s rate limits.
Suppose we have a service that fetches orders from Shopify, which has a rate limit of 2 requests per second. If we have 10 instances of this service running, the combined request rate could reach 20 requests per second in the worst case. However, the actual limit we must adhere to is still 2 requests per second across all instances. This is where Perimeter comes into play: it ensures that the total rate limit is not exceeded across all instances, maintaining smooth and efficient operations.
By implementing Perimeter, we centralized rate limiting, which lets us manage our API requests effectively and stay within the limits set by our external partners.
Addressing rate limiting at the network level would be efficient, but it presents certain challenges. We could not afford to drop any request that hit a rate limit: a request that exceeds the limit must be queued and processed as soon as the limit allows, ensuring no loss of requests.
In our research into network-layer proxies for rate limiting, we considered Envoy Proxy and Nginx. However, neither solution met our requirements: both Envoy and Nginx drop requests that exceed the rate limit instead of queuing them for later processing. This behavior does not align with our need to honor all outgoing requests.
Another requirement was a rate limiter whose configured limits were dynamic and could be updated on the fly without restarting the service. The service should also react to configuration changes in real time.
Meeting this requirement at the network layer would require a lot of custom code on top of the existing solutions, making the setup complex and difficult to maintain.
Since an application-layer solution best fits these crucial requirements, we decided to build Perimeter as an application-layer rate limiter.
WareIQ uses a microservices architecture, and these services are managed in a Kubernetes cluster. All key services are written in Python. Perimeter sits between these microservices and the external systems.
Since Perimeter is a critical component, there were two questions we needed to answer before we started building it: which algorithm should enforce the rate limits, and how should Perimeter be deployed so that it is both correct and highly available?
We implemented a token bucket algorithm to enforce rate limits. The token bucket is a widely used rate-limiting algorithm: tokens are added to a bucket at a fixed rate, and each incoming request removes one token. If the bucket is empty, the request waits until a token becomes available. This enforces the rate limit correctly while ensuring requests are processed in a timely manner.
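As a rough sketch of the idea (simplified, not Perimeter's actual code; the rates and names here are illustrative), a token bucket in Go can be built around a counter that is topped up based on elapsed time:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket is a minimal token bucket: tokens accrue at a fixed
// rate up to a maximum capacity, and each request consumes one token.
type tokenBucket struct {
	mu       sync.Mutex
	tokens   float64   // current token count
	capacity float64   // bucket size (burst allowance)
	rate     float64   // tokens added per second
	last     time.Time // last refill timestamp
}

func newTokenBucket(rate, capacity float64) *tokenBucket {
	return &tokenBucket{tokens: capacity, capacity: capacity, rate: rate, last: time.Now()}
}

// allow refills the bucket based on elapsed time, then tries to take a token.
func (b *tokenBucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	// A partner that allows 2 requests per second, with a burst of 2.
	bucket := newTokenBucket(2, 2)
	allowed := 0
	for i := 0; i < 10; i++ {
		if bucket.allow() {
			allowed++
		}
	}
	// Only the initial burst passes; later requests find the bucket empty.
	fmt.Println("allowed:", allowed) // prints "allowed: 2"
}
```

This version rejects requests when the bucket is empty; Perimeter instead blocks the caller until a token is available, so nothing is dropped.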
Tokens are added to the bucket based on configurations saved in a PostgreSQL database. This allows us to change the rate limits on the fly without restarting the service.
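A minimal sketch of hot-reloadable limits (the fetch function stands in for a query against the PostgreSQL table; it is stubbed here so the example runs standalone, and all names are illustrative):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// limitStore holds per-partner rate limits and supports hot reloads.
type limitStore struct {
	mu     sync.RWMutex
	limits map[string]float64 // partner -> requests per second
}

func (s *limitStore) get(partner string) float64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.limits[partner]
}

// reload replaces the limit table atomically; callers pick up the
// new values on their next lookup, with no restart required.
func (s *limitStore) reload(fetch func() map[string]float64) {
	fresh := fetch()
	s.mu.Lock()
	s.limits = fresh
	s.mu.Unlock()
}

// poll refreshes the limits on a fixed interval until stop is closed.
func (s *limitStore) poll(interval time.Duration, fetch func() map[string]float64, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			s.reload(fetch)
		case <-stop:
			return
		}
	}
}

func main() {
	store := &limitStore{limits: map[string]float64{"shopify": 2}}
	fmt.Println(store.get("shopify")) // prints "2"

	// Simulate an operator raising the limit in the database.
	store.reload(func() map[string]float64 {
		return map[string]float64{"shopify": 4}
	})
	fmt.Println(store.get("shopify")) // prints "4"
}
```

The read-write lock keeps lookups cheap on the hot path while reloads happen in the background.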
All our services run in a Kubernetes cluster, and Perimeter is deployed as another service in the cluster. Even though multiple replicas of Perimeter would be desirable for high availability, they bring up a new challenge: how do we ensure that rate limits are enforced correctly across all replicas? We decided to park this problem for the future and make the single-replica Perimeter service as robust and fault-tolerant as possible.
Perimeter is a single central service deployed as a Kubernetes Deployment with one replica. This ensures that all requests pass through the same instance of Perimeter and that rate limits are enforced correctly. An instance of Perimeter is expected to be available at all times; if it goes down, the Kubernetes Deployment spins up a new instance immediately.
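Such a deployment can be expressed as a standard Kubernetes manifest. This is an illustrative sketch (the names and image are hypothetical, not our actual manifest); one reasonable choice is a `Recreate` strategy so that two instances never run side by side during a rollout:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: perimeter
spec:
  replicas: 1            # single instance so all egress shares one limiter
  strategy:
    type: Recreate       # avoid two live replicas during a rollout
  selector:
    matchLabels:
      app: perimeter
  template:
    metadata:
      labels:
        app: perimeter
    spec:
      containers:
        - name: perimeter
          image: wareiq/perimeter:latest   # illustrative image name
```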
Additionally, as a fallback, we have added a retry mechanism to all our services for rate-limit errors. This ensures that even if Perimeter goes down, the services will continue to function, albeit with a higher failure rate.
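The idea behind the fallback is retry with exponential backoff and jitter. Sketched here in Go for consistency with the other examples (our services are in Python, and this is an illustration of the pattern, not their actual code):

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

var errRateLimited = errors.New("rate limited") // stand-in for an HTTP 429

// withRetry retries fn with exponential backoff plus jitter whenever it
// reports a rate-limit error, giving up after maxAttempts.
func withRetry(maxAttempts int, fn func() error) error {
	backoff := 100 * time.Millisecond
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = fn(); !errors.Is(err, errRateLimited) {
			return err // success, or a non-retryable error
		}
		if attempt == maxAttempts {
			break
		}
		// Jitter spreads out retries so instances don't stampede together.
		jitter := time.Duration(rand.Int63n(int64(backoff / 2)))
		time.Sleep(backoff + jitter)
		backoff *= 2
	}
	return err
}

func main() {
	calls := 0
	err := withRetry(5, func() error {
		calls++
		if calls < 3 {
			return errRateLimited // fail twice, then succeed
		}
		return nil
	})
	fmt.Println(calls, err) // prints "3 <nil>"
}
```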
Since Perimeter runs as a single instance, the service itself needs to be fast and lightweight. We chose to write Perimeter in Golang, which is known for its speed and efficiency in handling concurrent requests.
Perimeter has four main components.
Perimeter utilizes the blocking behavior of Go channels to queue requests when the rate limit is exceeded. This ensures that no requests are dropped and all requests are processed as soon as the rate limit allows.
To ensure Perimeter can handle the requirements of our system, we conducted a variety of tests to evaluate its performance and reliability. We built our own load simulator using a simple Python script that sends requests to Perimeter at a configurable rate or load. Additionally, we incorporated our own custom services, each with different rate limits, to test Perimeter's flexibility and enforcement capabilities. We spun up Docker containers corresponding to some of our micro-services and triggers, enabling us to simulate various network conditions and configurations. This comprehensive testing approach ensured that Perimeter could handle the load and enforce rate limits correctly across diverse scenarios.
Perimeter improves the consistency of egress traffic and is now used to monitor and manage outgoing API traffic at WareIQ. It has enabled us to identify and resolve several previously undetected issues related to external APIs. We have also set up alerts on different metrics, helping us respond to issues quickly.
Perimeter is a critical component in ensuring that WareIQ’s egress traffic adheres to the rate limits of external partners, enhancing system reliability and performance. By implementing Perimeter, we have centralized rate limiting, effectively managed our API requests, and ensured compliance with external rate limits. As we continue to develop Perimeter, we are focused on adding new features and improving existing capabilities to meet our growing needs.
Supercharge your fulfilment with WareIQ now, contact our team.