Paid Models
Pricing varies by provider and by model, and the total cost depends on how much you use them. Here's an overview of providers and models for both large language models and embeddings:
Embedding models convert text into numerical representations called embeddings, which capture the semantic meaning of the text and the relationships between different pieces of it. When the `documents/` directory is updated, its content is sent to the embedding API. The resulting embeddings are saved to the `cache/` directory, so the AI can find relevant context efficiently without reprocessing the documents or sending them to the embedding API again for each query.
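The caching behaviour described above can be sketched roughly as follows. This is only an illustration, not ServerAssistantAI's actual implementation: `toy_embed` is a hypothetical stand-in for a real embedding API call, and the one-JSON-file-per-document cache layout, the `*.txt` filter, and the function names are all assumptions made for the example.

```python
import hashlib
import json
import math
from pathlib import Path


def toy_embed(text):
    """Hypothetical stand-in for an embedding API: a letter-frequency vector (a-z)."""
    lowered = text.lower()
    return [float(lowered.count(chr(ord("a") + i))) for i in range(26)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def refresh_cache(documents_dir, cache_dir, embed=toy_embed):
    """Embed new or changed documents only; return how many embed calls were made."""
    documents_dir, cache_dir = Path(documents_dir), Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    embed_calls = 0
    for doc in sorted(documents_dir.glob("*.txt")):
        text = doc.read_text(encoding="utf-8")
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        entry_path = cache_dir / f"{doc.stem}.json"
        if entry_path.exists():
            entry = json.loads(entry_path.read_text(encoding="utf-8"))
            if entry["sha256"] == digest:
                continue  # content unchanged, reuse the cached embedding
        entry = {"sha256": digest, "text": text, "embedding": embed(text)}
        entry_path.write_text(json.dumps(entry), encoding="utf-8")
        embed_calls += 1
    return embed_calls


def most_relevant(query_embedding, cache_dir):
    """Return the name of the cached document most similar to the query vector."""
    best_name, best_score = None, -1.0
    for entry_path in sorted(Path(cache_dir).glob("*.json")):
        entry = json.loads(entry_path.read_text(encoding="utf-8"))
        score = cosine(query_embedding, entry["embedding"])
        if score > best_score:
            best_name, best_score = entry_path.stem, score
    return best_name
```

Calling `refresh_cache("documents", "cache")` twice in a row would make embed calls only the first time, since the second run finds matching content hashes in the cache. Hashing the file content is one simple way to detect updates; a real setup might track modification times or chunk documents instead.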
| Platform | Model Name | MTEB | Embedding Dimensions | API Price (per 1K tokens) |
|---|---|---|---|---|
| | | 64.59 | 3072 | $0.00013 |
| | | 62.26 | 1536 | $0.00002 |
| | | 60.99 | 1536 | $0.0001 |

Large Language Models (LLMs) are powerful AI models that can understand and generate human-like text based on the input they receive. In ServerAssistantAI, when a user asks a question, the system retrieves relevant cached context from the embedding API results. This context, along with the user's question, is sent to the LLM to generate accurate and context-aware responses.

| Platform | Model Name | ELO | Speed (tokens per second) | API Price (per 1K tokens) |
|---|---|---|---|---|
| | | 1337 | 49.97 | Input: $0.005, Output: $0.015 |
| | | 1272 | 166 | Input: $0.00015, Output: $0.0006 |
| | | 1269 | 113.28 | Input: $0.003, Output: $0.015 |
| | | 1260 | 57.2 | Input: $0.0035, Output: $0.0105 |
| | | 1257 | 38.90 | Input: $0.03, Output: $0.06 |
| | | 1248 | 25.51 | Input: $0.015, Output: $0.075 |
| | | 1227 | 133.3 | Input: $0.00035, Output: $0.00105 |
| | | 1179 | 214.95 | Input: $0.00025, Output: $0.00125 |
| | | 1106 | 53.07 | Input: $0.0005, Output: $0.0015 |
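
Since prices are quoted per 1K tokens, the cost of a request is the token count divided by 1,000 and multiplied by the listed price, with input and output tokens billed separately for LLMs. Here is a minimal sketch of that arithmetic. The function names and the token counts are hypothetical; the prices are taken from the rows above.

```python
from decimal import Decimal


def token_cost(tokens, price_per_1k):
    """Cost in dollars for `tokens` tokens at a per-1K-token price (given as a string)."""
    return Decimal(tokens) / Decimal(1000) * Decimal(price_per_1k)


def chat_request_cost(input_tokens, output_tokens, input_price_per_1k, output_price_per_1k):
    """Cost of one LLM request: input and output tokens are priced separately."""
    return (token_cost(input_tokens, input_price_per_1k)
            + token_cost(output_tokens, output_price_per_1k))


# Hypothetical request: 1,200 input tokens (question plus retrieved context)
# and 300 output tokens, at Input: $0.005 / Output: $0.015 per 1K tokens.
llm_cost = chat_request_cost(1200, 300, "0.005", "0.015")

# Hypothetical embedding job: 50,000 tokens of documents at $0.00002 per 1K tokens.
embedding_cost = token_cost(50_000, "0.00002")

print(f"LLM request: ${llm_cost}")
print(f"Embedding job: ${embedding_cost}")
```

`Decimal` is used here instead of floats so that very small per-token prices add up without rounding noise. Actual token counts depend on each provider's tokenizer, so treat any estimate like this as approximate.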