/chat/completions
endpoint, specifically the p50 and p95 of generation time and tokens per second,
as well as the total prompt and completion tokens processed within the interval.
The user id and total request count within the interval are also returned.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Query Parameters
Timestamp of the earliest query to aggregate. Format is YYYY-MM-DD hh:mm:ss.
Timestamp of the latest query to aggregate. Format is YYYY-MM-DD hh:mm:ss.
Models to fetch metrics from, given as a comma-separated list of strings, e.g. gpt-3.5-turbo,gpt-4o
Providers to fetch metrics from, given as a comma-separated list of strings, e.g. openai,together-ai
Number of seconds in the aggregation interval.
Secondary user id. The secondary user id will match any string previously sent in the user attribute of /chat/completions.
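The formats described above (the timestamp layout, the comma-separated lists, and the Bearer header) can be sketched as a small request builder. This is only an illustration: the endpoint URL and every query parameter name used here (start_time, end_time, models, providers, interval, secondary_user_id) are hypothetical placeholders, since this section does not state them; substitute the names from the actual API reference. The request is built but not sent.

```python
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholders: the real endpoint URL and parameter names are
# not given in this section, so replace them with the documented ones.
BASE_URL = "https://api.example.com/metrics"
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # renders as YYYY-MM-DD hh:mm:ss


def build_metrics_request(token, start, end, models, providers,
                          interval_seconds, secondary_user_id=None):
    """Build (but do not send) a GET request for aggregated metrics."""
    params = {
        "start_time": start.strftime(TIME_FORMAT),   # earliest query to aggregate
        "end_time": end.strftime(TIME_FORMAT),       # latest query to aggregate
        "models": ",".join(models),                  # comma-separated strings
        "providers": ",".join(providers),            # comma-separated strings
        "interval": str(interval_seconds),           # aggregation interval in seconds
    }
    # Only include the secondary user id when the caller supplies one.
    if secondary_user_id is not None:
        params["secondary_user_id"] = secondary_user_id

    url = f"{BASE_URL}?{urlencode(params)}"
    # Bearer authentication header of the form "Bearer <token>".
    return Request(url, headers={"Authorization": f"Bearer {token}"})


request = build_metrics_request(
    token="YOUR_AUTH_TOKEN",
    start=datetime(2024, 1, 1, 0, 0, 0),
    end=datetime(2024, 1, 2, 0, 0, 0),
    models=["gpt-3.5-turbo", "gpt-4o"],
    providers=["openai", "together-ai"],
    interval_seconds=3600,
)
print(request.full_url)
```

Note that urlencode percent-encodes the commas, colons, and spaces in the values; a standard HTTP server decodes them back to the literal strings before parsing.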