2 Jan 2024

2024 Pricing Comparison of Text-Based Commercial LLMs - GPT, Gemini, Cohere, Mistral...

This is the AI age. AI is now included in most of the software that we use. AI-powered apps and software have a competitive edge and also simplify the user experience. Incorporating AI into your product in 2024 is mostly an easy affair, and with the growing number of AI models there is more choice.

What is an LLM?

"LLM" stands for "Large Language Model." It's a type of artificial intelligence model that is trained on vast amounts of text data to understand and generate human-like text. These models are based on deep learning techniques and neural networks, specifically a type called Transformer models.

What is the Context Window of an LLM?

The "context window" in a Large Language Model (LLM) like GPT-4 refers to the amount of text that the model can consider at any given time when generating a response. This concept is crucial for understanding the capabilities and limitations of these models. It is like the horsepower of an engine: the larger the context window, the longer the documents and conversations the model can work with in a single request.
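As a rough illustration of why the window size matters, the sketch below checks whether a request fits. It assumes the window is shared between the prompt and the generated reply, which is how many providers count it but should be checked for each model. The function name `fits_in_context` and the token counts are hypothetical, chosen only for this example.

```python
def fits_in_context(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved reply length fits in the window.

    Assumption: prompt and reply share one window; check your provider's docs.
    """
    return prompt_tokens + max_output_tokens <= context_window


# Context window sizes quoted in the OpenAI section of this article.
GPT_4_TURBO_WINDOW = 128_000
GPT_35_TURBO_WINDOW = 16_385

# A hypothetical 20,000-token document with 1,000 tokens reserved for the reply.
print(fits_in_context(20_000, 1_000, GPT_4_TURBO_WINDOW))
print(fits_in_context(20_000, 1_000, GPT_35_TURBO_WINDOW))
```

In this example the request fits GPT-4 Turbo but not GPT-3.5 Turbo, so the document would have to be shortened or split for the smaller model.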

What is a token?

In language models, text is broken down into smaller pieces called tokens. A token can be a word, part of a word (like a syllable or a subword), or even punctuation. The way text is tokenized depends on the specific language model and the tokenization algorithm it uses.
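To make the idea concrete, here is a toy tokenizer that splits text into words and punctuation marks. It is only an illustration: real LLM tokenizers use learned subword vocabularies, so the pieces and the counts they produce will differ from what this sketch prints. The name `toy_tokenize` is made up for this example.

```python
import re


def toy_tokenize(text: str) -> list[str]:
    """Split text into runs of word characters and single punctuation marks.

    Toy illustration only; it does not reproduce any provider's tokenizer.
    """
    return re.findall(r"\w+|[^\w\s]", text)


tokens = toy_tokenize("Hello, world! Tokens aren't always whole words.")
print(tokens)
print(len(tokens), "tokens")
```

Note how "aren't" is split into three pieces here. Since providers bill per token, the token count, not the word count, is what drives cost.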

Commercial vs open-source LLM

There are commercial LLM providers, who price based on data or tokens, and open-source LLMs, where the cost comes from servers and maintenance. The advantages of a commercial LLM over an open-source one include better support, less maintenance, automatic updates and scalable APIs. However, it comes with a usage-based cost. This article compares different commercial LLM options along with their pricing to help you make a better decision.

Text-based Commercial LLMs

LLMs with token-based pricing for input and output text, i.e. you use their API to send a prompt and get a response as text (structured or unstructured), are classified here as text-based LLMs. Because they are similar in function and pricing, we compare only LLMs of this kind.

1. OpenAI GPT Pricing

OpenAI is the pioneer, the leader and the most popular LLM provider, and it offers multiple AI models through its API. Let us go through the pricing for its latest AI models:

1a) GPT-4 Turbo

The most powerful OpenAI model as of now. It also has GPT vision capabilities, which can understand image-based inputs.

Context window: 128,000 tokens

Pricing	USD	Tokens
Input	$0.01	1K
Output	$0.03	1K

1b) GPT-4

Best for natural language responses and has broad knowledge. Enough for most use cases. Also the most expensive.

Context window: 8,192 tokens to 32,768 tokens

Pricing	USD	Tokens
Input	$0.03	1K
Output	$0.06	1K

1c) GPT-3.5 Turbo

The best model of the previous generation. Very cost-effective and useful, although its training data is old (up to Sep 2021).

Context window: 16,385 tokens

Pricing	USD	Tokens
Input	$0.0010	1K
Output	$0.0020	1K

When it comes to OpenAI, we recommend choosing between GPT-4 Turbo and GPT-3.5 Turbo based on your requirements: GPT-4 Turbo is the more powerful of the two and GPT-3.5 Turbo the more cost-effective.
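To see what these prices mean for a single API call, here is a minimal sketch that computes the cost of one request from the per-1K prices in the tables above. The function name `request_cost` and the request size (2,000 prompt tokens, 500 reply tokens) are hypothetical.

```python
# Prices in USD per 1K tokens, copied from the OpenAI tables above.
OPENAI_PRICES = {
    "GPT-4 Turbo": {"input": 0.01, "output": 0.03},
    "GPT-4": {"input": 0.03, "output": 0.06},
    "GPT-3.5 Turbo": {"input": 0.0010, "output": 0.0020},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request; input and output are billed separately per 1K tokens."""
    price = OPENAI_PRICES[model]
    return (input_tokens / 1000) * price["input"] + (output_tokens / 1000) * price["output"]


# Hypothetical request: 2,000 prompt tokens and 500 reply tokens.
for model in OPENAI_PRICES:
    print(f"{model}: ${request_cost(model, 2000, 500):.4f}")
```

For this request size the costs work out to $0.0350 for GPT-4 Turbo, $0.0900 for GPT-4 and $0.0030 for GPT-3.5 Turbo.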

For more information and latest pricing visit OpenAI.com

2. Google Gemini Pricing

Gemini is Google's latest family of large language models. According to Google, it is highly advanced and is on par with GPT-4 and also exceeds it in certain scenarios.

However, it is not yet available to everyone, so these claims are yet to be independently verified.

There will be three models of Gemini - Ultra, Pro and Nano. Pricing is available only on Gemini Pro for now.

Gemini Pro

Context window: 32,000 tokens

Pricing	USD	Tokens
Input	$0.00025	1K
Output	$0.0005	1K

For more information and latest pricing visit Google.com

3. Anthropic Pricing

Anthropic PBC is an American artificial intelligence startup founded by former members of OpenAI. Anthropic develops general AI systems and large language models. Compared with most OpenAI models, its models offer larger context windows for large inputs, and they are also easier to personalize.

3a) Claude Instant

Best for low-latency, high-throughput use cases like casual dialogue and text summarization.

Context window: 100,000 tokens

Pricing	USD	Tokens
Input	$0.0008	1K
Output	$0.0024	1K

3b) Claude 2.1

Best for tasks that require complex reasoning, like coding and complex dialogue.

Context window: 200,000 tokens

Pricing	USD	Tokens
Input	$0.008	1K
Output	$0.024	1K

Note: Anthropic lists its prices per million tokens. The figures above are converted to per 1K tokens for comparison.
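Because some providers quote prices per million tokens while the tables here use per 1K tokens, a small conversion helps keep the units straight. This is a minimal sketch; `per_million_to_per_1k` is a hypothetical helper, and the per-million inputs are simply the Claude 2.1 figures above multiplied by 1,000.

```python
def per_million_to_per_1k(price_per_million: float) -> float:
    """Convert a USD price per 1,000,000 tokens to a USD price per 1,000 tokens."""
    return price_per_million / 1000


# Claude 2.1 per-million prices implied by the table above (per-1K price x 1,000).
print(per_million_to_per_1k(8.00))   # input price per 1K tokens
print(per_million_to_per_1k(24.00))  # output price per 1K tokens
```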

For more information and latest pricing visit Anthropic.com

4. Cohere Pricing

Cohere develops AI for large companies and enterprises for solving real-world business use cases. Its main LLM is called Command and it stands out by delivering special AI features like Chat, AI Search and summarization.

4a) Command Light

command-light is the smaller and faster version of command. Use command-light if you are optimizing for latency.

Context window: Unknown

Pricing	USD	Tokens
Input	$0.0003	1K
Output	$0.0006	1K

4b) Command

command is the flagship model of Cohere with top-of-the-line features.

Context window: Unknown

Pricing	USD	Tokens
Input	$0.001	1K
Output	$0.002	1K

Note: Cohere lists its prices per million tokens. The figures above are converted to per 1K tokens for comparison.

For more information and latest pricing visit Cohere.com

5. Mistral AI Pricing

Mistral AI is a French artificial intelligence company. It was founded in April 2023 by researchers previously employed by Meta and Google. It focuses on providing open-source LLMs. However, it recently released API pricing for its cloud-hosted models. The pricing is based on the size of the model.

5a) Mistral Tiny

Context window: Unknown

Pricing	USD	Tokens
Input	$0.00015	1K
Output	$0.00046	1K

5b) Mistral Small

Context window: Unknown

Pricing	USD	Tokens
Input	$0.00066	1K
Output	$0.00197	1K

5c) Mistral Medium

Context window: Unknown

Pricing	USD	Tokens
Input	$0.00274	1K
Output	$0.00821	1K

Note: Mistral AI lists its prices per million tokens. The figures above are converted to per 1K tokens for comparison.

For more information and latest pricing visit Mistral.ai

The BIG LLM pricing comparison table

[Screenshots: LLM pricing comparison table]
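The per-provider figures from the sections above can also be gathered into one table programmatically. The sketch below is one way to do it. It ranks models by the cost of 1K input tokens plus 1K output tokens, which is an assumption made for this example. Real workloads rarely have equal input and output volumes, so adjust the weighting to your own usage.

```python
# (provider, model, input USD per 1K tokens, output USD per 1K tokens),
# copied from the per-provider tables in this article.
PRICES = [
    ("OpenAI", "GPT-4 Turbo", 0.01, 0.03),
    ("OpenAI", "GPT-4", 0.03, 0.06),
    ("OpenAI", "GPT-3.5 Turbo", 0.0010, 0.0020),
    ("Google", "Gemini Pro", 0.00025, 0.0005),
    ("Anthropic", "Claude Instant", 0.0008, 0.0024),
    ("Anthropic", "Claude 2.1", 0.008, 0.024),
    ("Cohere", "Command Light", 0.0003, 0.0006),
    ("Cohere", "Command", 0.001, 0.002),
    ("Mistral AI", "Mistral Tiny", 0.00015, 0.00046),
    ("Mistral AI", "Mistral Small", 0.00066, 0.00197),
    ("Mistral AI", "Mistral Medium", 0.00274, 0.00821),
]


def cheapest_first(rows):
    """Sort rows by the combined price of 1K input tokens and 1K output tokens."""
    return sorted(rows, key=lambda row: row[2] + row[3])


print(f"{'Provider':<12}{'Model':<16}{'Input/1K':>10}{'Output/1K':>11}")
for provider, model, price_in, price_out in cheapest_first(PRICES):
    print(f"{provider:<12}{model:<16}{price_in:>10.5f}{price_out:>11.5f}")
```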

Our Conclusion

Best price/quality ratio

If you are looking for a cost-effective option without sacrificing the quality of AI outputs, we recommend:

✅️ Winner: GPT-3.5 Turbo

Top quality for price

If you focus only on quality and pricing is secondary, the winner is:

✅️ Winner: GPT-4 Turbo

Inexpensive LLM

If cost is your most important criterion for choosing an LLM, we recommend at least trying Mistral, as they have both cloud and open-source options.

✅️ Winner: Mistral AI

It is also worth mentioning that Google is pricing its LLM aggressively to compete with OpenAI while adding newer features like multimodal capabilities.

LLMs are getting increasingly competitive, so we expect better pricing and more choices in 2024. Also, stay tuned for our pricing comparison of text-to-image models like Stable Diffusion and Midjourney.

At SubPage.app, we use a combination of GPT-3.5 Turbo and GPT-4 Turbo based on requirements. This helps us keep our costs down while delivering optimum quality output to our customers, whether that is generating a blog article, writing a release note, or compiling a business policy.
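As a rough sketch of why mixing models saves money, the example below routes each job to one of the two models and totals the cost. The routing rule, the function names and the workload are all hypothetical and are not SubPage's actual logic. Only the prices come from the OpenAI tables in this article.

```python
# USD per 1K tokens as (input, output), from the OpenAI tables in this article.
PRICES = {
    "GPT-3.5 Turbo": (0.0010, 0.0020),
    "GPT-4 Turbo": (0.01, 0.03),
}


def pick_model(needs_top_quality: bool) -> str:
    """Hypothetical routing rule: pay for GPT-4 Turbo only when quality demands it."""
    return "GPT-4 Turbo" if needs_top_quality else "GPT-3.5 Turbo"


def total_cost(jobs) -> float:
    """Sum the USD cost of (needs_top_quality, input_tokens, output_tokens) jobs."""
    total = 0.0
    for needs_top_quality, input_tokens, output_tokens in jobs:
        price_in, price_out = PRICES[pick_model(needs_top_quality)]
        total += input_tokens / 1000 * price_in + output_tokens / 1000 * price_out
    return total


# Hypothetical workload: one demanding job and four routine ones, 1K tokens each way.
jobs = [(True, 1000, 1000)] + [(False, 1000, 1000)] * 4
all_gpt4_turbo = [(True, tokens_in, tokens_out) for _, tokens_in, tokens_out in jobs]

print(f"Mixed routing:   ${total_cost(jobs):.4f}")
print(f"All GPT-4 Turbo: ${total_cost(all_gpt4_turbo):.4f}")
```

With this made-up workload, mixed routing costs about $0.052 against $0.20 for sending everything to GPT-4 Turbo.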
