Chronicle
- Provides a proxy that can be called instead of the provider's normal URL; it forwards the request to the LLM provider and returns the response. Chronicle can be embedded directly into a Rust application or run as a standalone server.
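A minimal sketch of the drop-in idea, assuming a hypothetical local proxy URL and an OpenAI-style chat body (the actual listen address and endpoint path depend on how the server is configured):

```python
import json
import urllib.request

# Hypothetical proxy address; replace with wherever Chronicle is listening.
PROXY_URL = "http://localhost:9782/chat"

def build_proxied_request(messages, model):
    """Build a request aimed at the proxy instead of the provider's URL.

    The body mirrors the common chat-completion shape; the proxy forwards
    it to the configured LLM provider and returns the provider's response.
    """
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        PROXY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_proxied_request(
    [{"role": "user", "content": "Hello"}], model="gpt-4o-mini"
)
```

The only change from calling the provider directly is the base URL; everything else about the request stays the same.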
Task List
Up Next
- Support tool use fields
Soon
- Javascript client
  - This should be a normal client that can also wrap other common clients such as the OpenAI SDK
- Python client
  - This should be a normal client that can also wrap other common clients such as the OpenAI SDK
- Easy way to set up clients from JS and Python to call the proxy with appropriate headers
  - This is most useful in concert with other libraries that make LLM calls, like dspy and instructor
- Extra providers
  - Fireworks
  - Together
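One possible shape for the clients-with-appropriate-headers idea above. The wrapper API and the header names are invented for illustration; the actual metadata headers are still an open question (see the task about submitting request metadata via HTTP headers):

```python
import json
import urllib.request

class ProxyClient:
    """Thin client that attaches per-request metadata headers.

    Header names below are hypothetical, not a settled Chronicle API.
    """

    def __init__(self, base_url, user_id=None, workflow=None):
        self.base_url = base_url.rstrip("/")
        self.metadata_headers = {}
        if user_id:
            self.metadata_headers["X-Chronicle-User-Id"] = user_id
        if workflow:
            self.metadata_headers["X-Chronicle-Workflow"] = workflow

    def build_request(self, path, payload):
        # Merge the fixed metadata headers into every outgoing request.
        headers = {"Content-Type": "application/json", **self.metadata_headers}
        return urllib.request.Request(
            f"{self.base_url}{path}",
            data=json.dumps(payload).encode(),
            headers=headers,
            method="POST",
        )

client = ProxyClient("http://localhost:9782", user_id="u1", workflow="summarize")
req = client.build_request("/chat", {"model": "gpt-4o-mini", "messages": []})
```

Libraries like dspy or instructor would never see the metadata: the wrapper injects it on every call, so the caller only configures it once.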
Later/Maybe
- Send logged data to arbitrary HTTP endpoint
  - This should be done in a way that it can be sent to something like Elasticsearch without custom code
- Global rate limiting
- When looping around providers with retries, omit providers that had an unrecoverable error.
- Analysis
  - Visualize by arbitrary metadata
  - Ability to create database indexes on arbitrary metadata, even in JSON fields
- Price Tracking
  - Associate each provider and its calls with a pricing plan
  - Fetch and update prices for each provider
- Support binary upload APIs like Deepgram as well
- Support streaming responses
- Submit request metadata (org/user/workflow id) via HTTP headers or cookies?
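The provider-retry idea above can be sketched as follows. The error taxonomy and provider shape are assumptions, not Chronicle's actual types:

```python
class Retryable(Exception):
    """Transient failure, e.g. a rate limit; try again later."""

class Unrecoverable(Exception):
    """Permanent failure, e.g. a bad API key; drop the provider."""

def call_with_fallback(providers, attempt, max_rounds=3):
    """Try each provider in order, looping up to max_rounds times.

    Providers that raise Unrecoverable are removed from the rotation;
    Retryable errors leave the provider in place for later rounds.
    """
    active = list(providers)
    last_error = None
    for _ in range(max_rounds):
        for name in list(active):
            try:
                return attempt(name)
            except Unrecoverable as e:
                active.remove(name)  # never ask this provider again
                last_error = e
            except Retryable as e:
                last_error = e
        if not active:
            break
    raise last_error or RuntimeError("no providers configured")

# Usage with a fake attempt function: "a" always fails permanently,
# "b" is rate-limited once and then succeeds on the second round.
calls = {"b": 0}

def attempt(name):
    if name == "a":
        raise Unrecoverable("bad key")
    calls["b"] += 1
    if calls["b"] < 2:
        raise Retryable("rate limited")
    return f"ok from {name}"

result = call_with_fallback(["a", "b"], attempt)
```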
Done
- API should default to doing everything without authorization
  - Do this by not only setting up a default user, but also adding it as the anonymous fallback
- Testing — Apr 26th, 2024
- For API mode, add data tables as Filigree models instead of using the built-in tables
- When multiple providers are in use, keep retrying even on normally un-retryable errors
- Allow configuring fallback provider and model on retry. — Apr 24th, 2024
  - This is part of the model alias configuration: instead of a single provider and model, there is an array of provider/model/API key tuples
- Support model/provider aliases — Apr 23rd, 2024
- Support API keys — Apr 23rd, 2024
  - These can only be referenced by aliases
- Save metadata into SQLite or Postgres — Apr 22nd, 2024
- Load model and provider definitions from a configuration file — Apr 22nd, 2024
- Store and load model and provider definitions from the database — Apr 22nd, 2024
- Configurable user agent for HTTP client — Apr 21st, 2024
- Link requests to internal users/orgs/projects — Apr 21st, 2024
- Configurable timeout — Apr 20th, 2024
- Common format chat messages and responses — Apr 19th, 2024
- Automatic retry with rate-limit support — Apr 19th, 2024
- Endpoint that proxies the call — Apr 19th, 2024
- Send all relevant metadata as Otel traces — Apr 19th, 2024
- Probably take some code from Promptbox and change that to use this as a library, since it already has some of the needed functionality
- Maintain a price sheet with input/output token price per provider and model
  - Each price sheet entry has an active flag
  - When prices are updated for a model, add a new entry and mark it active
  - In the future, have a scraper or other mechanism for getting the latest price data for each model
- Support multiple methods of output:
  - Record in a Postgres table
  - Output OpenTelemetry
- Consider allowing metadata such as org and user ID to be sent in a cookie or in HTTP headers, in addition to the body. Not sure how useful this is, though.
- For each entry, record:
  - Org ID
  - User ID
  - Run ID (an ID linking related prompt calls together)
  - Workflow name
  - Workflow step
  - Arbitrary other metadata
  - Endpoint called
  - Provider and model used
  - Input text
  - Output text
  - Input token count
  - Output token count
  - Which price sheet row was used
  - Response time
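The price-sheet flow described above (append a new entry and flip the active flag rather than mutating in place) might look like this. The field names and the list-of-dicts shape are assumptions for illustration; the real data lives in SQLite or Postgres:

```python
def update_price(sheet, provider, model, input_price, output_price):
    """Append a new active price entry and deactivate any old ones.

    Old rows are kept (with active=False) so past calls can still be
    associated with the price sheet row that was in effect at the time.
    """
    for entry in sheet:
        if entry["provider"] == provider and entry["model"] == model:
            entry["active"] = False
    sheet.append({
        "provider": provider,
        "model": model,
        "input_price": input_price,    # price per input token
        "output_price": output_price,  # price per output token
        "active": True,
    })

# Usage: a second update appends a row and deactivates the first.
sheet = []
update_price(sheet, "openai", "gpt-4o", 5e-6, 15e-6)
update_price(sheet, "openai", "gpt-4o", 2.5e-6, 10e-6)
```

Keeping superseded rows around is what lets the "which price sheet row was used" field on each logged entry stay meaningful after prices change.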