[javascript] Dogfooding our own rate-limited API

3 Answers

If this is causing you a problem, it will cause your putative ecosystem of developers a problem (e.g. when they try to develop an alternative UI). If you are really eating your own dog food, make the API (and the rate limiting) work for your application. Here are some suggestions:

  • Do not rate limit by IP address. Rather, rate limit by something associated with the user, e.g. their user ID. Apply the rate limit at the authentication stage.

  • Design your API so that users do not need to call it continuously (e.g. provide a list call that returns many results, rather than a call that must be repeated to return one item each time).

  • Design your web app with the same constraints you expect your developer ecosystem to have, i.e. ensure you can design it within reasonable throttling rates.

  • Ensure your back end is scalable (horizontally preferably) so you don't need to impose throttling at levels so low it actually causes a problem to a UI.

  • Ensure your throttling has the ability to cope with bursts, as well as limiting longer term abuse.

  • Ensure your throttling takes sensible actions tailored to the abuse you are trying to prevent. For instance, consider queuing or delaying mild abusers rather than refusing the connection. Most web front ends open only four simultaneous connections. If you delay an attempt to open a fifth, you will only affect users who are running a CLI at the same time as the web client (or two web clients). If you delay the n-th back-to-back API call rather than failing it, the end user will see things slow down rather than break. If you combine this with queuing at most N API calls at once, you will only affect people who parallelise large numbers of API calls, which is probably behaviour you want to discourage anyway: 100 simultaneous API calls followed by an hour's gap is normally far worse than 100 sequential API calls spread over an hour.
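The burst, delay, and queue-cap suggestions in the list above can be sketched as a per-user token bucket. This is only an illustration under assumed numbers: `TokenBucket`, `delayFor`, the capacity of 5, the refill rate of 1 per second, and the queue cap of 3 are all hypothetical and not taken from any particular library.

```javascript
// Hypothetical per-user token bucket. `capacity` allows bursts,
// `refillPerSec` caps the sustained rate, `maxQueued` caps how many
// calls may be waiting (delayed) at once.
class TokenBucket {
  constructor(capacity, refillPerSec, maxQueued, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.maxQueued = maxQueued;
    this.tokens = capacity;
    this.last = now;
  }

  // Returns the delay in milliseconds to apply before serving the call
  // (0 means serve immediately), or null if the queue is full and the
  // call should be refused.
  take(now = Date.now()) {
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.last = now;
    if (this.tokens - 1 < -this.maxQueued) return null; // too many queued
    this.tokens -= 1;
    if (this.tokens >= 0) return 0;
    // Negative balance: wait until the "debt" has been refilled.
    return Math.ceil((-this.tokens / this.refillPerSec) * 1000);
  }
}

const buckets = new Map(); // keyed by user ID, not by IP address

function delayFor(userId, now = Date.now()) {
  if (!buckets.has(userId)) buckets.set(userId, new TokenBucket(5, 1, 3, now));
  return buckets.get(userId).take(now);
}
```

With these assumed settings, a user's first five back-to-back calls go through immediately, the next three are delayed by one, two, and three seconds, and only the ninth is refused. A mild abuser sees a slowdown; only heavy parallelism hits a hard failure.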

Did this not answer your question? Well, if you really need to do what you are asking, rate-limit at the authentication stage and apply a different rate limit based on the group your user fits into. If a request uses one particular set of credentials (those used by your devs and QA team), it gets a higher rate limit. But you can immediately see why this will inevitably lead to your ecosystem seeing issues that your dev and QA team do not see.
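A minimal sketch of that group-based limit, applied after authentication, might look like the following. The group names, the `user.group` and `user.id` fields, and the anonymous figure of 60 are assumptions made for illustration; only the 1000 calls per hour for paid customers comes from the question.

```javascript
// Hypothetical tiers. Only the paid figure is from the question;
// the rest are placeholders.
const LIMITS_PER_HOUR = { anonymous: 60, paid: 1000, internal: Infinity };

// `user` is whatever the authentication stage produced (null if anonymous).
function limitFor(user) {
  if (!user) return LIMITS_PER_HOUR.anonymous;
  return LIMITS_PER_HOUR[user.group] ?? LIMITS_PER_HOUR.anonymous;
}

const counters = new Map(); // user key -> { hour, count }

// Fixed one-hour window counter. Simplification: all anonymous callers
// share a single counter here.
function allow(user, now = Date.now()) {
  const key = user ? user.id : 'anonymous';
  const hour = Math.floor(now / 3600000);
  let c = counters.get(key);
  if (!c || c.hour !== hour) {
    c = { hour, count: 0 };
    counters.set(key, c);
  }
  if (c.count >= limitFor(user)) return false;
  c.count += 1;
  return true;
}
```

The `internal` tier is exactly the shortcut the answer warns about: once your own team bypasses the limit, it stops experiencing what outside developers experience.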



My company has developed a rate-limited API. Our goal is twofold:

  • A: Create a strong developer ecosystem around our product.
  • B: Demonstrate the power of our API by using it to drive our own application.

Clarification: Why rate-limit at all?

We rate limit our API because we sell it as an addition to our product. Anonymous access to our API has a very low threshold for API calls per hour, whereas our paid customers are permitted upwards of 1000 calls per hour.

The Problem:

Our rate-limited API is great for the developer ecosystem, but in order for us to dogfood it, our own application can't be subject to the same rate limiting. The front end of our API is all JavaScript, making direct Ajax calls to the API.

So the question is:

How do you secure an API so that rate limiting can be removed for our own application, without the mechanism for removing it being easily spoofed?

Explored Solutions (and why they didn't work)

  1. Verify the referrer against the host header. -- Flawed because the referrer is easily faked.

  2. Use an HMAC to create a signature based off the request and a shared secret, then verify the request on the server. -- Flawed because the secret and algorithm would be easily determined by looking into the front end JavaScript.

  3. Proxy the request and sign the request in the proxy -- Still flawed, as the proxy itself exposes the API.
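For reference, the HMAC idea in option 2 could look roughly like this in Node.js, using the built-in `crypto` module. The function names and the choice of fields being signed (method, path, timestamp) are assumptions for illustration. The `sign` half would have to run in the browser, which is exactly why the secret leaks.

```javascript
const crypto = require('crypto');

// Sign a request with a shared secret. In a pure-JavaScript front end
// this function, and the secret, would be visible to anyone.
function sign(secret, method, path, timestamp) {
  return crypto
    .createHmac('sha256', secret)
    .update(`${method}\n${path}\n${timestamp}`)
    .digest('hex');
}

// Server-side check: recompute the signature and compare in constant time.
function verify(secret, method, path, timestamp, signature) {
  const expected = Buffer.from(sign(secret, method, path, timestamp), 'hex');
  const given = Buffer.from(String(signature), 'hex');
  return given.length === expected.length && crypto.timingSafeEqual(given, expected);
}
```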

The Question:

I am looking to the brilliant minds on Stack Overflow to present alternate solutions. How would you solve this problem?

Unfortunately, there is no perfect solution to this.

The general approach typically has three parts. First, provide a spoofable way for clients to identify themselves (e.g. an identifier, version, and API key). Second, have clients register information about themselves that can be used to limit access (e.g. the client is a server in a given IP address range, so only allow callers in that range; or the client is JavaScript, but delivered only to a specific category of browser, so only allow HTTP requests that specify certain user agent strings). Third, use machine learning or pattern recognition to detect anomalous usage that is likely to come from a spoofed client, and then reject that traffic (or confirm with the client that the usage is indeed not coming from the legitimate client, replace their spoofable credentials, and disallow further traffic using the older, spoofed credentials).

You can make it slightly more difficult to spoof by using multiple layers of keys. For example, you give out a longer-lived credential that lives on a server (and that can only be used from a limited set of IP address ranges). The server uses it to make an API call that records information about the client (e.g. the user agent) and returns a shorter-lived client-side key, which is embedded in the JavaScript delivered to the client and used for client-side API requests. This, too, is imperfect (a spoofer could issue the same server call to get the credential), but it will be more difficult if the returned API key is included in obfuscated (and frequently changing) JavaScript or HTML, which would make it difficult to reliably extract from the response. It also makes spoofing easier to detect: the client-side key is now tied to a particular client (e.g. a specific user agent, perhaps even a specific cookie jar), so reuse in another client stands out, and the expiration limits how long a spoofed key can be reused.
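One way to sketch the short-lived, client-bound key is as an expiry timestamp plus an HMAC over the expiry and the user agent, computed with a secret that never leaves the server. Everything here is hypothetical: `SERVER_SECRET`, `issueClientKey`, `checkClientKey`, the key format, and the five-minute lifetime are illustrative choices, not part of the answer.

```javascript
const crypto = require('crypto');

// Hypothetical server-side secret; it is never sent to the browser.
const SERVER_SECRET = 'replace-me';

function mac(payload) {
  return crypto.createHmac('sha256', SERVER_SECRET).update(payload).digest('hex');
}

// Issue a key that expires after ttlMs and is bound to one user agent.
function issueClientKey(userAgent, now = Date.now(), ttlMs = 5 * 60 * 1000) {
  const expires = now + ttlMs;
  return `${expires}.${mac(`${expires}|${userAgent}`)}`;
}

// Accept the key only if it is unexpired and was issued for this user agent.
function checkClientKey(key, userAgent, now = Date.now()) {
  const [expiresStr, sig] = String(key).split('.');
  const expires = Number(expiresStr);
  if (!sig || !Number.isFinite(expires) || now > expires) return false;
  const expected = Buffer.from(mac(`${expires}|${userAgent}`), 'hex');
  const given = Buffer.from(sig, 'hex');
  return given.length === expected.length && crypto.timingSafeEqual(given, expected);
}
```

A key copied into a client with a different user agent, or replayed after it expires, fails the check. As the answer notes, a spoofer who copies the user agent as well will still get through until the key expires.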

Can you stand up a separate instance of the UI and throttle-free API, and then restrict access to IP addresses coming from your organisation?

E.g., deploy the whole thing behind your corporate firewall, and attach the application to the same database as the public-facing instance if you need to share data between instances.

  • Whitelist source IP addresses
  • Use a VPN, whitelist VPN members
  • A proxy or browser add-on that adds HTTP headers should be fine if you can secure the proxy and aren't concerned about MITM attacks sniffing the traffic
  • Any solution involving secrets can mitigate the impact of leaks by rotating secrets on a daily basis
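The source-IP whitelist could be checked along these lines. This is a sketch that handles IPv4 only; the ranges in `OFFICE_RANGES` are placeholder addresses, and `isThrottleExempt` is a hypothetical helper name.

```javascript
// Convert a dotted IPv4 address to an unsigned integer; null if malformed.
function ipv4ToInt(ip) {
  const parts = String(ip).split('.');
  if (parts.length !== 4) return null;
  let n = 0;
  for (const p of parts) {
    if (!/^\d{1,3}$/.test(p) || Number(p) > 255) return null;
    n = n * 256 + Number(p);
  }
  return n;
}

// True if `ip` falls inside the CIDR block, e.g. "203.0.113.0/24".
function inCidr(ip, cidr) {
  const [base, bitsStr] = String(cidr).split('/');
  const bits = Number(bitsStr);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  if (!Number.isInteger(bits) || bits < 0 || bits > 32) return false;
  const blockSize = 2 ** (32 - bits); // number of addresses in the block
  return Math.floor(ipInt / blockSize) === Math.floor(baseInt / blockSize);
}

// Placeholder ranges standing in for your organisation's networks or VPN.
const OFFICE_RANGES = ['203.0.113.0/24', '10.8.0.0/16'];

function isThrottleExempt(ip) {
  return OFFICE_RANGES.some((range) => inCidr(ip, range));
}
```

If the API sits behind a load balancer or reverse proxy, the address you check must be the one that proxy reports in a trustworthy way. A client-supplied header is as easy to fake as the referrer.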