Cost and Usage Management

Cost and usage management is a critical aspect of maintaining the efficiency and cost-effectiveness of your API requests and resources. It involves monitoring and controlling the consumption of resources to prevent unexpected expenses and optimize performance.

UsageGuard helps you to manage and optimize the cost and usage of your API requests and resources with two key features:

Cost Control: Set usage limits and monitor usage to prevent unexpected expenses.
Usage Optimization: Analyze usage patterns and identify opportunities for cost savings at the connection level, the end-user level, or both.

End-User Tracking

End-user tracking allows you to monitor and analyze the behavior of users within your application. This can help you understand user interactions, improve user experience, ensure compliance with various regulations, and understand the costs associated with each user.

End-user tracking can be beneficial in various scenarios, including:

Security and Compliance: Monitor user activities to detect suspicious behavior and ensure compliance with data protection regulations.
User Behavior Analysis: Understand how users interact with your application, which features they use the most, and where they encounter issues.
Personalization: Tailor the user experience based on individual user behavior and preferences.
Customer Support: Provide better customer support by understanding user actions leading up to an issue or support request.
Feature Adoption: Track the adoption rate of new features and gather insights on how users are engaging with them.
A/B Testing: Conduct A/B testing to compare different versions of your application and determine which one performs better based on user interactions.
Retention Analysis: Analyze user retention rates and identify patterns that contribute to long-term user engagement and loyalty.
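As an illustration of the per-user cost view mentioned above, the following minimal sketch sums token usage by user and converts it to an estimated cost. The record fields (user_id, tokens), the sample values, and the price are all hypothetical and chosen only for the example; they do not describe the format of any UsageGuard export.

```python
from collections import defaultdict

# Hypothetical usage records; field names and values are illustrative only.
records = [
    {"user_id": "user-a", "tokens": 1200},
    {"user_id": "user-b", "tokens": 300},
    {"user_id": "user-a", "tokens": 800},
]

PRICE_PER_1K_TOKENS = 0.002  # assumed price in USD, not a real rate


def cost_per_user(records, price_per_1k):
    """Sum tokens per user and convert each total to an estimated cost."""
    totals = defaultdict(int)
    for record in records:
        totals[record["user_id"]] += record["tokens"]
    return {user: tokens / 1000 * price_per_1k for user, tokens in totals.items()}


costs = cost_per_user(records, PRICE_PER_1K_TOKENS)
for user, cost in sorted(costs.items()):
    print(f"{user}: ~${cost:.4f}")
```

A breakdown like this makes it easier to spot which users account for most of the spend on a connection.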

Enabling End-User Tracking

To enable end-user tracking on a connection, follow these steps:

In your dashboard, go to Connection -> Select a connection -> Edit Policies.
Locate the "End-User Tracking" option and toggle it to "Enabled".
Save your changes.

Maximum Tokens per Request

You can set the maximum number of tokens allowed per request by configuring the Max Tokens Per Request option in the connection settings. This value overrides the corresponding maximum token value set in individual requests.

Understanding Tokens

Before diving into the settings, it's important to understand the difference between tokens and characters:

Characters: These are the individual letters, numbers, punctuation marks, and spaces in a text.
Tokens: In the context of language models, tokens are the units of text that the model processes. A token can be as short as one character or as long as one word. For example, "hello" is typically one token, while "don't" may be split into two tokens ("don" and "'t").

As a rule of thumb, one token corresponds to roughly 4 characters of English text. However, this can vary depending on the specific text and the tokenization method used by the model.
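The 4-characters-per-token rule of thumb can be turned into a quick estimate, for example to check an input before sending it. This is only a rough sketch: the function names are made up for illustration, and an exact count requires the tokenizer of the model you are calling.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb for English text; real tokenizers vary


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of a string from its character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_limit(text: str, max_tokens: int) -> bool:
    """Return True if the estimated token count is within max_tokens."""
    return estimate_tokens(text) <= max_tokens


prompt = "Summarize the following support ticket in two sentences."
print(estimate_tokens(prompt), fits_limit(prompt, max_tokens=50))
```

Because the estimate can be off in either direction, treat it as an early warning rather than a guarantee that a request will pass the configured limit.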

Setting Maximum Tokens

In the Dashboard

In your dashboard, go to Connection -> Select a connection -> Edit Policies.
Look for the "Maximum Characters per Request" or "Maximum Tokens per Request" option.
Enter the desired maximum value.
Save your changes.

Use Cases

Setting a maximum token limit can be beneficial in various scenarios:

Cost Control: Limit the amount of text processed to manage API usage costs.
Performance Optimization: Ensure that requests don't become too large, which could slow down processing times.
Security: Prevent potential abuse by limiting the size of user inputs.
Compliance: Adhere to API provider limits or internal policies on data processing.

Benefits

Implementing token limits offers several advantages:

Predictable Costs: By capping the maximum input size, you can better predict and control API usage costs.
Improved User Experience: Prevent timeouts or slow responses due to overly large requests.
Enhanced Security: Mitigate risks associated with excessively large inputs, such as denial-of-service attacks.
Compliance: Ensure your application stays within the guidelines set by API providers or internal policies.
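To make the "Predictable Costs" point concrete, a per-request cap gives an upper bound on the spend for the capped tokens. The sketch below works through that arithmetic. The request volume and price are assumed numbers for illustration, not real rates, and the bound only covers tokens that the limit actually applies to.

```python
def worst_case_cost(max_tokens_per_request, requests_per_day, price_per_1k_tokens):
    """Upper bound on daily cost if every request uses the full token limit."""
    total_tokens = max_tokens_per_request * requests_per_day
    return total_tokens * price_per_1k_tokens / 1000


# Assumed values: a 1,000-token cap, 5,000 requests per day, $0.002 per 1K tokens.
daily_cap = worst_case_cost(1000, 5000, 0.002)
print(f"Worst-case daily cost: ~${daily_cap:.2f}")
```

Halving the token limit halves this bound, which is what makes the limit a practical budgeting lever.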

Best Practices

When setting maximum token limits:

Choose the Right Metric: Decide whether to use characters or tokens based on your specific use case and the models you're using.
Balance Restrictions and Functionality: Set limits that protect your application without overly restricting legitimate use cases.
Communicate Limits to Users: If applicable, make sure users are aware of input limits to avoid frustration.
Monitor and Adjust: Regularly review your limits and adjust as needed based on usage patterns and feedback.

Remember that setting limits too low may restrict functionality, while setting them too high could lead to unexpected costs or performance issues. Find the right balance for your specific use case.
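One way to find that balance is to derive the limit from observed request sizes: pick a high percentile of real usage and add some headroom. The sketch below is one possible heuristic, not a UsageGuard feature. The sample token counts, the 90th percentile, and the 20% headroom are all assumptions you would replace with your own data and tolerance.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_limit(observed_tokens, pct=90, headroom_pct=20):
    """Suggest a token limit: the chosen percentile plus a headroom percentage."""
    base = percentile(observed_tokens, pct)
    return math.ceil(base * (100 + headroom_pct) / 100)


# Hypothetical per-request token counts, including one unusually large request.
observed = [120, 250, 300, 310, 450, 500, 620, 800, 950, 4000]
print(suggest_limit(observed))
```

With this sample, the single 4,000-token outlier does not drive the suggestion, while typical requests still fit comfortably under the limit.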

By carefully configuring these settings, you can optimize your application's performance, security, and cost-effectiveness when working with language models.