AINOGAP Technical Whitepaper
Technical documentation covering system architecture, metering protocol, smart contract design, and security model.
System Overview
Design Goals
AINOGAP is an API routing and settlement system that provides unified request forwarding, token metering, and fund settlement between multiple independent upstream LLM providers and downstream callers.
Core design constraint: the request forwarding layer and the fund settlement layer are separated. The request layer uses a traditional Web2 stack (NestJS + MySQL) for low-latency forwarding and precise metering; the settlement layer uses Base chain (EVM L2) smart contracts for on-chain auditability of fund custody and distribution.
Tech Stack
| Layer | Technology | Responsibility |
|---|---|---|
| API Gateway | NestJS 10 / Node.js | Auth, route resolution, request forwarding, metering |
| Proxy Forwarding | LiteLLM Proxy | Anthropic SDK format conversion, multi-provider endpoint management |
| Data Layer | MySQL 8 | Ledger (request logs, balances, earnings) |
| Cache/Locks | Redis 7 | Distributed locks, session storage |
| On-chain Settlement | Base mainnet (EVM L2) | USDC custody, provider distribution, revenue withdrawal |
| Frontend | Next.js 15 (App Router) | Provider/buyer consoles, admin panel |
| Contract Language | Solidity 0.8.34 | TreasuryVault + PayoutVault |
System Architecture
Request Data Flow
A complete API request traverses the following path:
Client (Anthropic SDK)
  | POST /<routeAlias>/v1/messages
  | Headers: x-api-key, anthropic-version
  v
NestJS Router (RoutingService)
  | 1. Validate API Key -> find Buyer
  | 2. Resolve routeAlias -> candidate provider list
  | 3. Sort and select optimal node
  v
LiteLLM Proxy
  | 4. Rewrite base_url + api_key to target upstream provider
  | 5. Forward request, stream response through
  v
Upstream Provider (Anthropic API)
  |
  v  After response completes
MeteringService
  | 6. Parse usage fields -> 5-bucket token split
  | 7. Calculate buyer charge + provider earnings
  | 8. Atomic MySQL write (balance debit + earnings credit + request log)
  v
Return response to Client
Routing Sort Algorithm
RoutingService.resolveProvider() sorts candidate providers by the following rules, in order:
1. healthStatus: healthy=0, unknown=1, unhealthy=2 (ascending)
2. priority: lower value = higher priority (provider-configurable)
3. cursor-based round-robin: global cursor mod candidate count for even distribution
When an upstream request returns 5xx or times out, the node is immediately marked unhealthy. ProbeService polls all active providers' configured models every 60 seconds; recovered nodes are re-marked healthy.
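A minimal TypeScript sketch of the sort rules above may help make the ordering concrete. The Candidate shape and the selectProvider name are hypothetical, and the sketch assumes the round-robin cursor rotates only among candidates tied for the best (healthStatus, priority) rank; the source does not spell out how the cursor interacts with the first two rules.

```typescript
// Hypothetical types: field names follow the text, the shapes are assumed.
type HealthStatus = "healthy" | "unknown" | "unhealthy";

interface Candidate {
  providerId: string;
  healthStatus: HealthStatus;
  priority: number; // lower value = higher priority
}

const HEALTH_RANK: Record<HealthStatus, number> = {
  healthy: 0,
  unknown: 1,
  unhealthy: 2,
};

// Rule 1 then rule 2: health rank ascending, then priority ascending.
function compareCandidates(a: Candidate, b: Candidate): number {
  const byHealth = HEALTH_RANK[a.healthStatus] - HEALTH_RANK[b.healthStatus];
  return byHealth !== 0 ? byHealth : a.priority - b.priority;
}

// Rule 3 (assumed interpretation): rotate the cursor among the candidates
// that tie with the best-ranked one.
function selectProvider(
  candidates: Candidate[],
  cursor: number
): Candidate | undefined {
  const sorted = [...candidates].sort(compareCandidates);
  const top = sorted[0];
  if (top === undefined) return undefined;
  const best = sorted.filter((c) => compareCandidates(c, top) === 0);
  return best[cursor % best.length];
}
```

With two healthy providers at the same priority, successive cursor values alternate between them, while an unhealthy provider is only reached when no better-ranked candidate exists.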
Metering System
5-Bucket Token Model
The Anthropic API usage response contains multiple token count fields. The system normalizes them into five independent billing buckets:
| Bucket | Source Field | Description |
|---|---|---|
| uncachedInput | usage.input_tokens | Uncached input tokens (Anthropic returns the uncached portion directly) |
| cacheWrite5m | usage.cache_creation.ephemeral_5m_input_tokens | Tokens written to 5-minute ephemeral cache |
| cacheWrite1h | usage.cache_creation.ephemeral_1h_input_tokens | Tokens written to 1-hour ephemeral cache |
| cacheRead | usage.cache_read_input_tokens | Cache-hit read tokens |
| output | usage.output_tokens | Model-generated output tokens |
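The mapping in the table can be sketched as a small normalization function. This is an illustration only: the AnthropicUsage and TokenBuckets interfaces and the toBuckets name are hypothetical, and treating missing or null fields as zero is an assumption rather than documented behavior.

```typescript
// Subset of the usage object, limited to the fields listed in the table.
interface AnthropicUsage {
  input_tokens?: number | null;
  output_tokens?: number | null;
  cache_read_input_tokens?: number | null;
  cache_creation?: {
    ephemeral_5m_input_tokens?: number | null;
    ephemeral_1h_input_tokens?: number | null;
  } | null;
}

interface TokenBuckets {
  uncachedInput: number;
  cacheWrite5m: number;
  cacheWrite1h: number;
  cacheRead: number;
  output: number;
}

// Assumption: an absent or null field counts as 0 tokens.
function toBuckets(usage: AnthropicUsage): TokenBuckets {
  return {
    uncachedInput: usage.input_tokens ?? 0,
    cacheWrite5m: usage.cache_creation?.ephemeral_5m_input_tokens ?? 0,
    cacheWrite1h: usage.cache_creation?.ephemeral_1h_input_tokens ?? 0,
    cacheRead: usage.cache_read_input_tokens ?? 0,
    output: usage.output_tokens ?? 0,
  };
}
```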
Billing Formula
Each bucket is priced independently. For token billing mode:
cost_micros = sum(bucket_tokens[i] * price_per_token_micros[i]) for i in [0..4]
Where price_per_token_micros is in USD micros/token (1 micro = $0.000001). Buyer price = ceil(provider price / 0.70), ensuring the platform retains a 30% margin.
For requests billing mode: cost = requestUnits * usdMicrosPerRequest. requestUnits map by model type (opus/sonnet/haiku each have independent multipliers).
Atomicity Guarantees
MeteringService performs the following operations within a single MySQL transaction:
1. INSERT request_log (full request record + 5-bucket token counts + costs)
2. UPDATE buyer SET balance = balance - buyerCost WHERE balance >= buyerCost (insufficient balance rolls back entire transaction, returns 402)
3. UPDATE provider_earnings SET pending = pending + providerEarnings
Transaction isolation level is READ COMMITTED. Balance deduction uses row-level locks (SELECT ... FOR UPDATE) to prevent concurrent over-deduction.
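To tie the token-mode billing formula to the all-or-nothing write described above, here is a hedged in-memory sketch. It is not the MeteringService code: costMicros, buyerPriceMicros, settleRequest and the Ledger shape are hypothetical names, prices are assumed to be non-negative integers, and the ceiling is computed as ceil(p * 10 / 7) in integer arithmetic (my choice, to sidestep floating-point rounding in p / 0.70). A real implementation would rely on the MySQL transaction rather than on immutable objects.

```typescript
const BUCKETS = [
  "uncachedInput",
  "cacheWrite5m",
  "cacheWrite1h",
  "cacheRead",
  "output",
] as const;
type Bucket = (typeof BUCKETS)[number];
type PerBucket = Record<Bucket, number>;

// cost_micros = sum(bucket_tokens[i] * price_per_token_micros[i]) for i in [0..4]
function costMicros(tokens: PerBucket, pricePerTokenMicros: PerBucket): number {
  return BUCKETS.reduce((sum, b) => sum + tokens[b] * pricePerTokenMicros[b], 0);
}

// Buyer price = ceil(provider price / 0.70), for non-negative integer input.
function buyerPriceMicros(providerPriceMicros: number): number {
  return Math.floor((providerPriceMicros * 10 + 6) / 7);
}

interface Ledger {
  buyerBalance: number;
  providerPending: number;
  requestLog: string[];
}

// In-memory stand-in for the single transaction: either the log entry,
// the debit, and the earnings credit all apply, or the ledger is unchanged.
function settleRequest(
  ledger: Ledger,
  requestId: string,
  buyerCost: number,
  providerEarnings: number
): { status: 200 | 402; ledger: Ledger } {
  if (ledger.buyerBalance < buyerCost) {
    return { status: 402, ledger }; // mirrors WHERE balance >= buyerCost failing
  }
  return {
    status: 200,
    ledger: {
      buyerBalance: ledger.buyerBalance - buyerCost,
      providerPending: ledger.providerPending + providerEarnings,
      requestLog: [...ledger.requestLog, requestId],
    },
  };
}
```

For example, with made-up prices of 3, 1 and 15 micros per token for uncachedInput, cacheRead and output, a request with 1000, 2000 and 500 tokens in those buckets costs 3000 + 2000 + 7500 = 12500 micros.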
Smart Contracts
Contract Architecture
On-chain settlement consists of two independent contracts deployed on Base mainnet (Chain ID 8453). Source code is verified on Basescan.
| Contract | Address | Responsibility |
|---|---|---|
| TreasuryVault | 0xb8d0c9b8f4222d7a4865d7035a0ab66f6720f072 | Holds all user-deposited USDC; Operator can allocate funds to PayoutVault |
| PayoutVault | 0x55B6417fe1718c9aaB4dAcbA87baCd8e9f382dD5 | Executes idempotent payouts, one per batchId, for provider earnings and buyer withdrawals |
| USDC | 0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913 | Base native USDC (Circle official) |
TreasuryVault
TreasuryVault is the fund pool contract holding all platform USDC. Key design:
1. deposit(uint256 amount): Users transfer USDC into the contract, updating balances[msg.sender] and totalDeposited
2. withdraw(uint256 amount): Users withdraw their own balance (not a platform operation)
3. fundPayoutVault(address, uint256): Only callable by treasuryOperator; transfers specified amount to whitelisted PayoutVault
4. approvedPayoutVault: Whitelist mechanism; fundPayoutVault can only transfer to the approved PayoutVault
5. Accounting invariant: totalDeposited >= totalWithdrawn + totalFundedOut
6. Reentrancy protection: Custom nonReentrant modifier (no OpenZeppelin dependency)
7. Operator change timelock: 24-hour timelock + 2-step confirmation (setTreasuryOperator -> acceptTreasuryOperator)
8. Admin transfer: 2-step (transferAdmin -> acceptAdmin)
9. Emergency withdrawal: emergencyWithdraw callable even when paused
PayoutVault
PayoutVault is the payment execution contract. Key design:
1. payout(address to, uint256 amount, bytes32 batchId): Only callable by payoutOperator
2. Idempotency: executedBatchIds[batchId] mapping; executed batchIds cannot be replayed (double-spend protection)
3. batchId generation: keccak256(stringToHex(batchId string)), format e.g. provider_payout_{providerId}_{timestamp} or buyer_withdraw_{buyerId}_{timestamp}
4. Operator change: 24-hour timelock + 2-step confirmation
5. Admin transfer: 2-step
6. Pause mechanism: admin can setPaused(true) to freeze all payouts
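The replay protection described in this list can be modeled off-chain in a few lines. The TypeScript class below is a hypothetical in-memory model, not the Solidity source: PayoutVaultModel, the explicit caller argument, the order of the checks, and the error strings are all assumptions made for illustration, and batchId is kept as a plain string rather than a bytes32 hash.

```typescript
// In-memory model of the payout guardrails; not the deployed contract.
class PayoutVaultModel {
  private executedBatchIds: Record<string, boolean> = {};
  private paused = false;
  private payoutOperator: string;
  private balance: number;

  constructor(payoutOperator: string, initialBalance: number) {
    this.payoutOperator = payoutOperator;
    this.balance = initialBalance;
  }

  setPaused(value: boolean): void {
    this.paused = value;
  }

  getBalance(): number {
    return this.balance;
  }

  // `caller` stands in for msg.sender.
  payout(caller: string, to: string, amount: number, batchId: string): void {
    if (caller !== this.payoutOperator) throw new Error("not payoutOperator");
    if (this.paused) throw new Error("paused");
    if (this.executedBatchIds[batchId] === true) {
      throw new Error("batchId already executed");
    }
    if (amount > this.balance) throw new Error("insufficient vault balance");
    // Mark the batch before moving funds so a replay can never pay twice.
    this.executedBatchIds[batchId] = true;
    this.balance -= amount; // the transfer to `to` is only modeled as a debit
  }
}
```

Because the batchId is recorded in the same call that moves funds, retrying a payout whose confirmation was lost is safe: the second attempt fails instead of paying twice.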
Settlement Flow
Provider payout trigger conditions: pending earnings >= thresholdUsdtMicros AND time since last payout >= minPayoutIntervalSeconds.
1. Admin triggers executeProviderPayout(providerId)
2. ChainService calculates payable amount = pending earnings (converted to USDC units)
3. Check PayoutVault existing balance; if insufficient, fundPayoutVault(shortfall)
4. Wait 3s for RPC propagation delay
5. Call PayoutVault.payout(providerWallet, amount, batchId)
6. Wait for on-chain confirmation -> update MySQL settlement record
Buyer withdrawal follows a similar flow: atomically debit MySQL balance (debitBuyerBalanceAtomic), then execute on-chain payout. On-chain failure triggers automatic creditBalance rollback.
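The trigger condition and the shortfall check in step 3 are simple enough to sketch as pure functions. The names isPayoutDue, fundingShortfall and PayoutPolicy are hypothetical; the sketch also assumes that timestamps are in seconds and that the payable amount and the vault balance are already expressed in the same unit.

```typescript
interface PayoutPolicy {
  thresholdUsdtMicros: number;
  minPayoutIntervalSeconds: number;
}

// Both conditions from the text must hold: enough pending earnings AND
// enough time elapsed since the last payout.
function isPayoutDue(
  pendingMicros: number,
  lastPayoutAtSec: number,
  nowSec: number,
  policy: PayoutPolicy
): boolean {
  return (
    pendingMicros >= policy.thresholdUsdtMicros &&
    nowSec - lastPayoutAtSec >= policy.minPayoutIntervalSeconds
  );
}

// Step 3: how much fundPayoutVault would need to move so that the
// PayoutVault can cover the payout; 0 when the balance already suffices.
function fundingShortfall(payable: number, payoutVaultBalance: number): number {
  return Math.max(0, payable - payoutVaultBalance);
}
```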
Compiler Parameters
| Parameter | Value |
|---|---|
| Solidity Version | 0.8.34+commit.80d5c536 |
| Optimizer | enabled, 200 runs |
| EVM Target | cancun (default) |
| License | MIT |
Authentication
SIWE Authentication Flow
Providers and buyers authenticate via EIP-4361 (Sign-In with Ethereum) style wallet signatures:
1. Client -> POST /auth/challenge { walletAddress, role? }
2. Server generates challengeId + EIP-4361 message (with nonce, domain, uri, chainId)
3. Client signs message with wallet
4. Client -> POST /auth/verify { challengeId, signature }
5. Server verifies signature with viem.verifyMessage -> matches wallet address -> finds provider/buyer
6. Issues session token (session_{random}) -> stored in Redis (TTL 7d) + HttpOnly Cookie
Properties of the flow:
1. Challenge single-use: deleted immediately after successful verify, prevents replay
2. Challenge TTL: 5 minutes (Redis TTL)
3. Role binding: walletAddress -> providerId or buyerId (database lookup)
4. Unregistered wallet addresses automatically create a new buyer account
API Key Authentication
Buyers calling /v1/messages use API Keys (x-api-key header):
1. Format: rk_ prefix + random string
2. RoutingService.resolveBuyer(apiKey) looks up corresponding buyer -> returns buyerId + walletAddress
3. API Key can be rotated at any time (POST /buyer/rotate-api-key)
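Returning to the SIWE flow earlier in this section, the single-use and 5-minute TTL properties of a challenge can be modeled with a small in-memory store. This is a sketch under stated assumptions: ChallengeStore and its methods are hypothetical, the real system keeps challenges in Redis, the signature check is reduced to a boolean supplied by the caller, and leaving a challenge in place after a failed verification is an assumption (the source only says it is deleted after a successful one).

```typescript
interface Challenge {
  walletAddress: string;
  message: string;
  expiresAtMs: number;
}

const CHALLENGE_TTL_MS = 5 * 60 * 1000; // 5-minute TTL from the text

class ChallengeStore {
  private challenges: Record<string, Challenge> = {};

  issue(
    challengeId: string,
    walletAddress: string,
    message: string,
    nowMs: number
  ): void {
    this.challenges[challengeId] = {
      walletAddress,
      message,
      expiresAtMs: nowMs + CHALLENGE_TTL_MS,
    };
  }

  // Returns the challenge at most once: it is deleted on success, so a
  // replayed verify with the same challengeId finds nothing.
  consume(
    challengeId: string,
    signatureValid: boolean,
    nowMs: number
  ): Challenge | undefined {
    const challenge = this.challenges[challengeId];
    if (challenge === undefined) return undefined;
    if (nowMs >= challenge.expiresAtMs) {
      delete this.challenges[challengeId]; // expired, as Redis TTL would do
      return undefined;
    }
    if (!signatureValid) return undefined; // assumed: kept until it expires
    delete this.challenges[challengeId];
    return challenge;
  }
}
```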
Health Probing
Probe Mechanism
ProbeService executes health checks at 60-second intervals for all active providers across all configured models:
1. Sends minimal request per (provider, model) combination (max_tokens=1, short prompt)
2. Timeout threshold: 30 seconds
3. Result status: ok / fail / timeout
4. Records latency (ms) and error message
5. Results written to probe_results table (persistent history)
6. Latest status cached in memory (ProbeStatusMap) for routing sort use
Model Support Detection
Providers declare upstream mappings for each model via modelRedirects. Empty string means the provider does not support that model. ProbeService skips (provider, model) pairs with empty mappings — no fail recorded.
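The skip rule for empty mappings can be illustrated with a short sketch that expands provider configurations into the list of (provider, model) pairs to probe. ProviderConfig, ProbeTarget and buildProbeTargets are hypothetical names, and modelRedirects is assumed to be a plain string-to-string map.

```typescript
interface ProviderConfig {
  providerId: string;
  // model name -> upstream model name; "" means "not supported"
  modelRedirects: Record<string, string>;
}

interface ProbeTarget {
  providerId: string;
  model: string;
  upstreamModel: string;
}

// Expand every provider into its probeable (provider, model) pairs,
// skipping empty mappings so that no fail result is recorded for them.
function buildProbeTargets(providers: ProviderConfig[]): ProbeTarget[] {
  const targets: ProbeTarget[] = [];
  for (const provider of providers) {
    for (const model of Object.keys(provider.modelRedirects)) {
      const upstreamModel = provider.modelRedirects[model];
      if (upstreamModel === undefined || upstreamModel === "") continue;
      targets.push({ providerId: provider.providerId, model, upstreamModel });
    }
  }
  return targets;
}
```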
Security Model
Attack Surface & Mitigations
| Attack Surface | Mitigation |
|---|---|
| Balance over-deduction | MySQL row-level lock + WHERE balance >= cost atomic deduction |
| Double-spend (on-chain) | PayoutVault.executedBatchIds mapping + batchId uniqueness |
| Replay attack (auth) | Challenge single-use + 5-minute TTL |
| API Key leakage | Provider API Keys encrypted at rest (AES-256-GCM); injected by LiteLLM at forward time, never in responses |
| Reentrancy attack (contract) | Custom nonReentrant modifier |
| Operator key compromise | 24-hour timelock + 2-step confirmation; admin can emergency pause |
| Direct backend access | Next.js Server Actions proxy; NestJS API port not publicly exposed |
| LiteLLM parameter injection | api_key/api_base/custom_llm_provider fields forcibly stripped from request body |
Contract Permission Model
1. admin: Deployer. Can pause/unpause, set approvedPayoutVault, initiate operator changes, transfer admin
2. treasuryOperator: Only address that can call fundPayoutVault. Changes require 24h timelock
3. payoutOperator: Only address that can call payout. Changes require 24h timelock
4. Regular users: Can deposit/withdraw their own balance (TreasuryVault)
API Specification
Core Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /<routeAlias>/v1/messages | Anthropic Messages API compatible endpoint (streaming/non-streaming) |
| GET | /buyer/balance | Query buyer balance and profile |
| GET | /buyer/requests | Buyer request history (paginated) |
| POST | /buyer/recharge | Verify on-chain deposit tx and credit balance |
| POST | /buyer/withdraw | Buyer balance withdrawal to on-chain wallet |
| GET | /provider/earnings | Provider earnings detail |
| GET | /provider/marketplace | Public provider marketplace data |
| GET | /probe/status | Latest probe status (public) |
| GET | /probe/history | Probe history records (public) |
| POST | /auth/challenge | Initiate SIWE authentication challenge |
| POST | /auth/verify | Verify wallet signature |
Request Forwarding Protocol
Clients send standard Anthropic Messages API requests. The system injects/modifies the following before forwarding:
1. x-api-key -> replaced with provider's upstream API Key
2. base_url -> replaced with provider's configured upstreamBaseUrl
3. model -> mapped via modelRedirects to provider's actual upstream model name
4. stream -> preserves client's original setting, chunks transparently forwarded when streaming
5. Forbidden fields in request body (api_key, api_base, custom_llm_provider) are forcibly removed
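The body-level part of these rewrites (stripping forbidden fields and mapping the model name) can be sketched as follows. prepareUpstreamBody is a hypothetical helper, not the gateway's actual function, and rejecting a model with an empty or missing mapping by throwing is an assumption. Header and base URL replacement are left out because they happen outside the request body.

```typescript
const FORBIDDEN_FIELDS = ["api_key", "api_base", "custom_llm_provider"];

// Returns a new body; the client's original object is not mutated.
function prepareUpstreamBody(
  body: Record<string, unknown>,
  modelRedirects: Record<string, string>
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(body)) {
    // Drop fields that could redirect the proxy or override credentials.
    if (FORBIDDEN_FIELDS.indexOf(key) === -1) out[key] = body[key];
  }
  const requested = out["model"];
  if (typeof requested === "string") {
    const upstream = modelRedirects[requested];
    if (upstream === undefined || upstream === "") {
      throw new Error(`model not supported by provider: ${requested}`);
    }
    out["model"] = upstream; // provider's actual upstream model name
  }
  return out; // other fields, including stream, pass through untouched
}
```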
Deployment Architecture
Production Environment
1. Single-node deployment (OVH VPS)
2. TLS certificate: Let's Encrypt auto-renewal
3. Domain: web3.meowai.net
4. Chain RPC: https://mainnet.base.org (Base official public RPC)
| Component | Runtime | Port |
|---|---|---|
| NestJS API | Docker container (web3-router) | 4000 (internal) |
| Next.js Web | Same container | 3000 (internal) |
| LiteLLM Proxy | Separate Docker container (litellm) | 4000 (internal) |
| MySQL 8 | 1Panel-managed container | 3306 (internal) |
| Redis 7 | 1Panel-managed container | 6379 (internal) |
| Nginx | Reverse proxy | 443 (public) |