Hyperliquid API Rate Limits and Best Practices
For any quant strategy, trading bot, or market data tool built on the Hyperliquid API, rate limits are not a minor implementation detail. They are a core part of system design. Hitting a limit can interrupt execution, delay order placement or cancellation, and become especially costly during urgent risk-management moments such as closing a position.
This guide explains how to think about Hyperliquid API rate limits and lays out practical patterns for building a more reliable production client: when to use REST, when to use WebSocket, how to handle 429 errors, how to batch requests, how to control concurrency, and how to keep signing keys secure. Where relevant, OneKey Perps is the recommended workflow for users who want a practical, security-conscious way to trade Hyperliquid perps.
Why rate limits matter
Rate limits exist to protect server resources and keep the platform usable for everyone. On an on-chain trading venue like Hyperliquid, backend stability directly affects the trading experience of many concurrent users.
From a strategy developer’s perspective, rate limits are a hard constraint. They define how quickly your system can fetch data and how quickly it can submit trading instructions. If you do not account for them during architecture design, 429 errors in production can quickly turn into missed signals, delayed cancels, stale data, or broken execution logic.
Good API design is not about sending the maximum possible number of requests. It is about sending the minimum number of well-timed requests needed for the strategy to operate safely.
Rate limits on the Info endpoint
The Info endpoint (/info) is used for read-only data queries. Its limits are generally more permissive than write endpoints, but the exact figures should always be checked against the latest Hyperliquid official documentation. This article does not quote unverified numeric limits.
As a rule of thumb, the Info endpoint can support reasonable market monitoring, but aggressive polling can still cause problems. For example, polling an order book many times per second is usually the wrong design. If you need live data, WebSocket subscriptions are typically a better fit: they reduce REST pressure and usually provide lower latency.
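To make the polling-versus-streaming trade-off concrete, the sketch below shows a minimal WebSocket consumer for an order book feed. The URL and subscription message shape follow the commonly documented Hyperliquid WebSocket format, but should be verified against the current official docs before use; the `websockets` package is a third-party dependency.

```python
import asyncio
import json

WS_URL = "wss://api.hyperliquid.xyz/ws"

def build_subscription(channel: str, coin: str) -> dict:
    # Message schema assumed from the Hyperliquid WebSocket docs;
    # confirm the exact fields against the current documentation.
    return {"method": "subscribe", "subscription": {"type": channel, "coin": coin}}

async def stream_l2_book(coin: str) -> None:
    # Requires the third-party `websockets` package.
    import websockets
    async with websockets.connect(WS_URL) as ws:
        await ws.send(json.dumps(build_subscription("l2Book", coin)))
        async for raw in ws:
            message = json.loads(raw)
            # Hand off to a processing queue rather than doing heavy work here,
            # so slow processing never backs up the socket.
            print(message.get("channel"))

# Usage (commented out so the sketch stays side-effect free):
# asyncio.run(stream_l2_book("BTC"))
```

One persistent subscription like this replaces what would otherwise be many REST polls per second.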
Common Info endpoint query patterns include market metadata lookups, order book snapshots, account state checks, and historical candle queries. Each should be polled at a cadence matched to how quickly the underlying data actually changes, with anything genuinely real-time moved to WebSocket.
Rate limits on the Exchange endpoint
The Exchange endpoint (/exchange) handles write operations, including order placement, order cancellation, and related trading actions. These limits are usually stricter than Info endpoint limits, which makes sense: high-frequency order writes consume more server resources and can affect the matching experience for other users.
Be especially careful with strategy patterns that modify orders frequently. Repeated amend/cancel/replace behavior can be more demanding than simple order placement or cancellation. Before building this kind of system, confirm whether the design really needs such high-frequency order updates and review the relevant Hyperliquid documentation for the latest operation-specific rules.
A practical approach is to separate your rate budget into two categories:
- Data budget: reads from the Info endpoint and WebSocket streams.
- Trading budget: writes through the Exchange endpoint.
This separation helps prevent market data logic from competing with execution logic inside your own client.
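One way to enforce that separation client-side is a token bucket per budget. The capacities and refill rates below are illustrative placeholders, not official Hyperliquid figures:

```python
import time

class TokenBucket:
    """Client-side token bucket. Numbers here are illustrative,
    not official Hyperliquid limits."""
    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def try_acquire(self, cost: float = 1.0) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# Separate buckets so data polling can never starve order flow.
data_budget = TokenBucket(capacity=20, refill_per_sec=10)
trading_budget = TokenBucket(capacity=5, refill_per_sec=2)
```

Before each request, the relevant module calls `try_acquire` on its own bucket; a failed acquire means the call is deferred rather than sent.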
WebSocket connections and subscription limits
WebSocket limits are different from REST limits. Instead of only thinking about request frequency, you also need to think about concurrent connections and the number of subscriptions per connection.
Subscribing to many L2 books across many markets can consume significant server resources. A cleaner design is to:
- Consolidate subscriptions into as few WebSocket connections as practical.
- Unsubscribe from channels you no longer need.
- Avoid opening many concurrent connections for the same account.
- Monitor reconnect frequency and message backlog.
If your strategy depends on real-time market data, WebSocket should generally be the first choice. REST polling is better suited for occasional snapshots, configuration data, and fallback checks.
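A small subscription manager helps keep all channels on one connection and avoids sending duplicate subscribe messages. The message shapes mirror the commonly documented Hyperliquid WebSocket format and should be checked against the current docs:

```python
from typing import Optional

class SubscriptionManager:
    """Track desired channels so a single connection carries all
    subscriptions and duplicates are never sent twice."""
    def __init__(self):
        self.active = set()  # set of (channel, coin) pairs

    def subscribe(self, channel: str, coin: str) -> Optional[dict]:
        key = (channel, coin)
        if key in self.active:
            return None  # already subscribed; nothing to send
        self.active.add(key)
        return {"method": "subscribe",
                "subscription": {"type": channel, "coin": coin}}

    def unsubscribe(self, channel: str, coin: str) -> Optional[dict]:
        key = (channel, coin)
        if key not in self.active:
            return None
        self.active.discard(key)
        return {"method": "unsubscribe",
                "subscription": {"type": channel, "coin": coin}}
```

On reconnect, the `active` set doubles as the list of subscriptions to replay.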
How to handle 429 errors correctly
HTTP 429, or Too Many Requests, is the standard response when a rate limit has been triggered. The worst response is to retry immediately in a tight loop. That usually creates more 429s and may extend the time before your client can recover.
A safer flow looks like this:
- Pause the affected request type immediately. If one Info request hits a limit, pause similar Info requests as well. Do not only pause the single failed call.
- Check the `Retry-After` response header. If the server provides it, treat it as authoritative and wait for the specified number of seconds.
- Use exponential backoff when `Retry-After` is unavailable. Start with a short delay, such as 1 second, then double it after each failed attempt: 1s, 2s, 4s, 8s, and so on, up to a maximum such as 60 seconds.
- Log and alert. A 429 should not be silently ignored. Frequent rate-limit events usually mean the strategy’s request pattern needs to be redesigned.
Example Python pattern:
```python
import time
import requests

def request_with_backoff(url, payload, max_retries=5):
    wait_time = 1
    for attempt in range(max_retries):
        response = requests.post(url, json=payload)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            retry_after = response.headers.get("Retry-After")
            sleep_duration = int(retry_after) if retry_after else wait_time
            print(f"Rate limited. Waiting {sleep_duration}s before retry.")
            time.sleep(sleep_duration)
            wait_time = min(wait_time * 2, 60)  # exponential backoff, capped at 60s
        else:
            response.raise_for_status()
    raise Exception("Max retries exceeded")
```
This is only a basic pattern. In production, you should also classify request types, track retry counts by endpoint, and make sure emergency risk actions do not get stuck behind non-critical requests.
Batch requests where possible
Batching is one of the most effective ways to reduce request frequency. Instead of sending many small independent calls, combine them when the API supports it.
For data queries, check whether the API supports batch-style parameters for the data you need. If you need information for multiple accounts or markets, one consolidated request may be better than many separate requests.
For trading operations, Hyperliquid supports batch orders, allowing multiple orders to be submitted in a single Exchange request. For strategies that need to open, adjust, or stage multiple positions at once, this can reduce latency and lower the total request count. Always refer to the official documentation for the exact batch order format and current limits.
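A simple helper for staying under a per-request batch cap is to chunk the order list before submission. The cap here is a placeholder; the real per-batch limit must come from the official documentation:

```python
def chunk_orders(orders: list, max_batch_size: int) -> list:
    """Split order payloads into batches of at most max_batch_size,
    so each batch fits in a single Exchange request.
    max_batch_size is illustrative, not an official limit."""
    return [orders[i:i + max_batch_size]
            for i in range(0, len(orders), max_batch_size)]
```

Five orders with a (hypothetical) cap of two would then go out as three requests instead of five.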
Connection pooling and concurrency control
In multi-threaded or async systems, you need client-side concurrency control. Without it, a burst of internal tasks can accidentally create a burst of API requests, even if each individual module appears reasonable in isolation.
A common pattern is to use a semaphore to limit concurrent requests:
```python
import asyncio
import aiohttp

# Limit maximum concurrent requests to 10
semaphore = asyncio.Semaphore(10)

async def rate_limited_request(session, payload):
    async with semaphore:
        async with session.post(
            "https://api.hyperliquid.xyz/info",
            json=payload,
        ) as response:
            return await response.json()
```
Connection reuse also matters. In Python, use requests.Session or aiohttp.ClientSession instead of creating a new connection object for every request. Reusing HTTP connections through Keep-Alive reduces TCP handshake overhead and can improve both latency and resource efficiency.
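Session reuse can be combined with the retry logic described earlier by mounting a retry-aware adapter, using the standard `requests`/`urllib3` APIs. The specific retry settings below are illustrative defaults, not official guidance:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_session() -> requests.Session:
    """Reusable Keep-Alive session with automatic 429 backoff.
    Retry numbers are illustrative, not official guidance."""
    session = requests.Session()
    retry = Retry(
        total=5,
        backoff_factor=1,        # 1s, 2s, 4s, ... between attempts
        status_forcelist=[429],  # retry only on rate-limit responses
        allowed_methods=["GET", "POST"],
        respect_retry_after_header=True,  # honor Retry-After when present
    )
    adapter = HTTPAdapter(max_retries=retry,
                          pool_connections=10, pool_maxsize=10)
    session.mount("https://", adapter)
    return session
```

Every module then shares one `make_session()` instance instead of constructing its own connections.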
Use local caching to avoid unnecessary calls
Not every piece of data needs to be fetched from the API every time. Smart caching can significantly reduce avoidable traffic.
Practical examples:
- Market metadata: contract lists, maximum leverage, and similar fields change slowly. Cache them in memory for hours where appropriate.
- Funding data: cache until the next relevant funding interval or settlement cycle.
- Account state: maintain a local account-state cache and update it through WebSocket account events instead of constant polling.
- Risk parameters: load at startup and refresh only on a controlled schedule, unless your strategy specifically requires more frequent updates.
Caching does introduce its own risk: stale data. For production systems, periodically reconcile cached values against API responses and define what should happen if a mismatch is detected.
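A minimal in-memory cache with per-entry expiry captures both halves of this trade-off: avoidable calls are skipped while the entry is fresh, and expired entries force a re-fetch instead of being served forever. The TTL values are up to the strategy:

```python
import time

class TTLCache:
    """Minimal in-memory cache with per-entry expiry, so stale
    values are re-fetched rather than served indefinitely."""
    def __init__(self):
        self._store = {}  # key -> (expiry_time, value)

    def set(self, key: str, value, ttl_seconds: float) -> None:
        self._store[key] = (time.monotonic() + ttl_seconds, value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: caller must fetch fresh data
            return None
        return value

# Example: cache slow-moving market metadata for an hour.
# cache.set("meta", fetch_meta(), ttl_seconds=3600)
```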
Monitor API health in production
A production trading client should have systematic API monitoring. At minimum, track:
- Success rate by endpoint and request type.
- Response time percentiles.
- 429 frequency and alert thresholds.
- WebSocket disconnect and reconnect frequency.
- Message queue backlog for WebSocket consumers.
- Differences between local cache state and API-returned state.
Rate-limit monitoring is not just about uptime. It helps you understand whether your strategy is becoming less efficient over time, whether market volatility is causing request bursts, and whether your system needs better backpressure controls.
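The 429-frequency item above can be implemented with a sliding-window counter; the window length and alert threshold below are illustrative and should be tuned to the strategy:

```python
import time
from collections import deque

class RateLimitMonitor:
    """Track 429 events over a sliding window and flag when the
    rate crosses a threshold. Numbers are illustrative defaults."""
    def __init__(self, window_seconds: float = 60.0, alert_threshold: int = 5):
        self.window = window_seconds
        self.threshold = alert_threshold
        self.events = deque()  # monotonic timestamps of 429s

    def record_429(self) -> None:
        self.events.append(time.monotonic())

    def should_alert(self) -> bool:
        # Drop events that have aged out of the window, then compare.
        cutoff = time.monotonic() - self.window
        while self.events and self.events[0] < cutoff:
            self.events.popleft()
        return len(self.events) >= self.threshold
```

Calling `record_429()` from the backoff handler and checking `should_alert()` on a timer turns silent rate-limit churn into an actionable signal.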
Best practices summary
- Prefer WebSocket subscriptions over REST polling for live market data.
- Handle 429s with the Retry-After header and capped exponential backoff, never tight retry loops.
- Batch data queries and orders where the API supports it.
- Enforce client-side concurrency limits and reuse HTTP connections.
- Cache slow-moving data locally and reconcile it against the API on a schedule.
- Monitor success rates, latency percentiles, 429 frequency, and WebSocket health in production.
Private key security and the role of OneKey
API efficiency matters, but private key security is non-negotiable. Every Exchange request must be signed, which means any automated trading setup needs access to signing authority. If that authority is poorly protected, a server compromise can quickly become a funds compromise.
A more secure operating model is to use OneKey to generate and manage a dedicated API signing key, keeping it separate from your main funds. Hardware-backed key management helps reduce the risk of private keys being exposed in plaintext to software environments. Even if a strategy server is compromised, the attacker should not be able to simply extract the core signing key.
OneKey’s open-source codebase also gives developers a useful reference for wallet signing integrations. WalletConnect support provides a standardized way for wallets and dApps to communicate, which can be helpful when designing safer signing flows.
For users trading perps rather than building a full custom stack, OneKey Perps offers a practical workflow for accessing Hyperliquid perpetuals while keeping wallet security central to the process. It is a good starting point if you want a cleaner trading setup without managing every API detail yourself.
You can download OneKey from the official OneKey website and try OneKey Perps as a practical way to trade Hyperliquid perps with a security-first wallet workflow.
FAQ
Q1: Will my account be banned if I trigger a rate limit?
A short rate-limit event usually results in rejected requests and a 429 response, not an immediate account ban. However, persistent abusive traffic may lead to more serious consequences. The exact policy should be checked in the latest Hyperliquid official documentation. Good rate-limit management is both a technical requirement and part of responsible platform usage.
Q2: What if my WebSocket consumer cannot keep up with incoming messages?
If your consumer is too slow, messages will queue up and your local view of the market will become increasingly stale. Possible fixes include optimizing message processing, separating message ingestion from processing, using a queue-based architecture, and reducing unnecessary subscriptions. If the system still cannot keep up after optimization, the strategy design may need to be reconsidered.
Q3: How should multiple strategy instances share one rate-limit budget?
Use a centralized request proxy or rate-limit middleware instead of letting each strategy instance calculate limits independently. For distributed deployments, an in-memory store such as Redis can be used to implement a shared counter across processes or machines. This prevents several “safe” instances from collectively exceeding the limit.
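The shared-counter idea can be sketched as a fixed-window limiter built on Redis-style `INCR`/`EXPIRE` semantics. To keep the sketch self-contained, an in-memory stand-in replaces the real Redis client (`redis.Redis` exposes `incr` and `expire` with matching signatures); the limit and window are placeholders:

```python
import time

class FixedWindowLimiter:
    """Fixed-window counter over a store with Redis-style incr/expire.
    In production the store would be a shared Redis instance so all
    strategy processes draw from one budget. Limits are placeholders."""
    def __init__(self, store, limit: int, window_seconds: int):
        self.store = store
        self.limit = limit
        self.window = window_seconds

    def allow(self, bucket: str) -> bool:
        # One key per bucket per time window, e.g. "rl:info:28350411".
        key = f"rl:{bucket}:{int(time.time()) // self.window}"
        count = self.store.incr(key)
        if count == 1:
            self.store.expire(key, self.window)  # first hit sets the TTL
        return count <= self.limit

class InMemoryStore:
    """Stand-in implementing the two commands used above."""
    def __init__(self):
        self.data = {}
    def incr(self, key: str) -> int:
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]
    def expire(self, key: str, seconds: int) -> None:
        pass  # TTL handling omitted in the stand-in
```

Swapping `InMemoryStore` for a real Redis connection gives every instance the same view of how much budget remains.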
Q4: How do I know whether a 429 is really caused by rate limiting?
A standard 429 response usually includes an error message in the response body and may include a Retry-After header. If the response clearly indicates a rate-limit event, use backoff logic. If the issue is something else, such as invalid order parameters or insufficient balance, retrying will not help; you need to fix the underlying request.
Q5: Are Info and Exchange endpoint limits calculated separately?
According to the official documentation, the limits for these endpoint categories are separate. That means high-frequency data queries should not consume your order-placement budget, and order requests should not consume your read-only query budget. In practice, you should still budget and monitor both categories independently.
Conclusion
Rate-limit management is easy to overlook, but it is one of the most important parts of building a reliable Hyperliquid API client. Choosing WebSocket instead of polling, implementing proper exponential backoff, batching requests, controlling concurrency, caching stable data, and monitoring production behavior all directly affect strategy stability.
Above the engineering layer, do not ignore key security. OneKey Perps combines a practical Hyperliquid perps trading workflow with OneKey’s wallet security model, helping users participate without treating private-key management as an afterthought. If you are building your first on-chain quant system or upgrading the security standards of an existing workflow, downloading OneKey and trying OneKey Perps is a sensible place to start.
---Disclaimer---
This article is for technical reference only and does not constitute investment, legal, financial, or tax advice. API trading involves risks including software bugs, network interruption, exchange-side errors, stale data, and unintended automated execution. Perpetual futures are high-risk leveraged products and can result in significant losses. Make sure you understand the risks and only trade with capital you can afford to lose.