How UsageGuard Works

UsageGuard provides access to a wide range of open-source, 3rd party (e.g., OpenAI, Meta, Mistral, Anthropic), through a unified inference API.

This allows you to interact with multiple models using a single API endpoint with the same request/response format. UsageGuard manages the authentication, policy enforcement, and request transformation for each model, ensuring a consistent and secure experience.

You can configure different connections for each use case or application, specifying the required policies. this unified approach simplifies the integration process, allowing you to switch between models without changing your application code.

This page explains the API request/response flow and key components of the UsageGuard system.

  • Summary:
    • Single API endpoint for multiple LLM providers
    • Unified inference API request/response format
    • Cost-effective inference
    • Centralized management of policies
    • Centralized logging and monitoring

UsageGuard Flow

The UsageGuard API seamlessly intercepts and processes requests between your application and LLM providers. Here's a step-by-step breakdown of how it works:

  1. Client Request: Your application sends a request to the UsageGuard API endpoint https://api.usageguard.com/v1/inference/chat

This request includes your UsageGuard API key and connection ID in the headers.

  1. Authentication & Connection: UsageGuard authenticates your request using the provided API key and identifies the appropriate connection based on the connection ID. This connection contains the configuration and credentials for the target LLM provider.

  2. Policy Application: Before forwarding the request, UsageGuard applies the policies configured for your connection. This may include:

    • Content moderation (e.g., filtering prohibited content, NSFW detection)
    • PII detection and management
    • Token usage limits
    • End-user tracking (if enabled)
  3. Request Transformation: If necessary, UsageGuard transforms the request to match the target LLM provider's format while preserving compatibility with your original request structure.

  4. Provider Communication: UsageGuard forwards the processed request to the LLM provider or run inference directly on depending on the mode of operation.

  5. Response Processing: Upon receiving the response from the LLM provider, UsageGuard applies any necessary post-processing steps, such as:

    • Logging for audit purposes
    • Applying output moderation policies
    • Tracking token usage
  6. Client Response:

    • Proxy Mode: Finally, UsageGuard sends the processed response back to your application, maintaining the expected format from the original LLM provider.
    • Direct Mode: UsageGuard sends the response in unified inference format to the client.

Data Flow

UsageGuard Data Flow

Key Components

Connections

Connections are at the heart of UsageGuard. Each connection represents a configuration for accessing a one or more models. Connections store:

  • Model(s) configuration
  • Spending limit
  • Policy configurations
  • Logging preferences

You can manage multiple connections to work with different providers or to have distinct configurations for various use cases or environments.

Policies

Policies are rules and configurations applied to your requests and responses. They help ensure safe and compliant use of LLM services. Key policy types include:

  • Content Moderation: Filter out prohibited content, detect NSFW material, and manage sensitive topics.
  • PII Detection: Identify and handle personally identifiable information in requests and responses.
  • Usage Limits: Set constraints on token usage, request frequency, and other usage parameters.
  • Word Lists: Custom lists of words or phrases to detect, redact, or audit in the content.

Audit Logging

UsageGuard provides comprehensive logging capabilities to help you monitor usage, debug issues, and maintain compliance. Logging options include:

  • Request and response body logging (configurable)
  • Policy application results
  • Token usage tracking
  • End-user attribution (when enabled)

Next Steps

Now that you understand how UsageGuard works, you're ready to start integrating it into your application:

Was this page helpful?