Available Models
Current Models
Chat AI provides a large assortment of state-of-the-art open-weight Large Language Models (LLMs), hosted on our platform with the highest standards of data protection. The data sent to these models, including prompts and message contents, is never stored anywhere on our systems. Additionally, Chat AI offers externally hosted models such as OpenAI’s o1, GPT-4o, and GPT-4.
Available models are regularly upgraded as newer, more capable ones are released. We choose models based on their performance across various LLM benchmarks, such as HumanEval, MATH, HellaSwag, and MMLU. Some models are better suited to specific tasks or specific settings, as described below.
Organization | Model | Open | Knowledge cutoff | Context window in tokens | Advantages | Limitations | Recommended settings |
---|---|---|---|---|---|---|---|
🇺🇸 Meta | Llama 3.1 8B Instruct | yes | Dec 2023 | 128k | Fastest overall performance | - | default |
🇨🇳 OpenGVLab | InternVL2.5 8B MPO | yes | Sep 2021 | 32k | Vision, lightweight and fast | - | default |
🇨🇳 DeepSeek | DeepSeek R1 | yes | Dec 2023 | 32k | Great overall performance, reasoning and problem-solving | Political bias | default |
🇨🇳 DeepSeek | DeepSeek R1 Distill Llama 70B | yes | Dec 2023 | 32k | Good overall performance, faster than R1 | Political bias | default or temp=0.7, top_p=0.8 |
🇺🇸 Meta | Llama 3.3 70B Instruct | yes | Dec 2023 | 128k | Good overall performance, reasoning and creative writing | - | default or temp=0.7, top_p=0.8 |
🇩🇪 VAGOsolutions x Meta | Llama 3.1 SauerkrautLM 70B Instruct | yes | Dec 2023 | 128k | German language skills | - | default |
🇺🇸 NVIDIA x Meta | Llama 3.1 Nemotron 70B Instruct | yes | Dec 2023 | 128k | Good overall performance | - | default |
🇫🇷 Mistral | Mistral Large Instruct | yes | Jul 2024 | 128k | Good overall performance, coding and multilingual reasoning | - | default |
🇫🇷 Mistral | Codestral 22B | yes | Late 2021 | 32k | Coding tasks | - | temp=0.2, top_p=0.1 or temp=0.6, top_p=0.7 |
🇺🇸 intfloat x Mistral | E5 Mistral 7B Instruct | yes | - | 4096 | Embeddings | API only | - |
🇨🇳 Alibaba Cloud | Qwen 2.5 72B Instruct | yes | Sep 2024 | 128k | Good overall performance, multilingual, global affairs, logic | - | default or temp=0.2, top_p=0.1 |
🇨🇳 Alibaba Cloud | Qwen 2.5 VL 72B Instruct | yes | Sep 2024 | 90k | Vision, multilingual | - | default |
🇨🇳 Alibaba Cloud | Qwen 2.5 Coder 32B Instruct | yes | Sep 2024 | 128k | Coding tasks | - | default or temp=0.2, top_p=0.1 |
🇺🇸 OpenAI | o1 | no | Oct 2023 | 128k | Good overall performance, reasoning | No streaming | default |
🇺🇸 OpenAI | o1-mini | no | Oct 2023 | 128k | Fast overall performance, reasoning | No streaming | default |
🇺🇸 OpenAI | GPT-4o | no | Oct 2023 | 128k | Good overall performance, vision | - | default |
🇺🇸 OpenAI | GPT-4o Mini | no | Oct 2023 | 128k | Fast overall performance, vision | - | default |
🇺🇸 OpenAI | GPT-4 | no | Sep 2021 | 8k | - | Outdated | - |
🇺🇸 OpenAI | GPT-3.5 | no | Sep 2021 | 16k | - | Outdated | - |
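The recommended settings in the table map onto the standard sampling parameters of an OpenAI-compatible chat-completions request. The sketch below shows how the temp=0.7, top_p=0.8 recommendation for Llama 3.3 70B Instruct would be passed; the endpoint URL and model identifier here are placeholders, not the service's real values — consult the Chat AI API documentation for those.

```python
import json

# Placeholder endpoint and model identifier -- substitute the real values
# from the Chat AI API documentation.
API_URL = "https://chat-ai.example/v1/chat/completions"  # assumed URL
MODEL = "llama-3.3-70b-instruct"                         # assumed identifier

payload = {
    "model": MODEL,
    "messages": [
        {"role": "user", "content": "Summarise the plot of Faust in two sentences."}
    ],
    # Recommended sampling settings for Llama 3.3 70B Instruct (see table above)
    "temperature": 0.7,
    "top_p": 0.8,
}

# The body is ordinary JSON; send it with any HTTP client, e.g.
# requests.post(API_URL, json=payload, headers={"Authorization": "Bearer <key>"})
print(json.dumps(payload, indent=2))
```

Omitting `temperature` and `top_p` entirely uses the model's server-side defaults, which is what "default" in the table refers to.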
Meta Llama 3.1 8B Instruct
The standard model we recommend. It is the most lightweight with the fastest performance and good results across all benchmarks. It is sufficient for general conversations and assistance.
Meta Llama 3.3 70B Instruct
Achieves good overall performance, on par with GPT-4, but with a much larger context window and a more recent knowledge cutoff. It excels at English comprehension and linguistic tasks such as translation, understanding dialects, slang, and colloquialisms, as well as creative writing.
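Context window sizes in the table are measured in tokens, not characters. A rough stdlib-only sketch for checking whether a prompt is likely to fit — the four-characters-per-token ratio is a common rule of thumb for English text, not an exact tokenizer:

```python
def fits_context(text: str, context_tokens: int = 128_000,
                 chars_per_token: float = 4.0) -> bool:
    """Rough check: estimate token count from character length.

    The service counts exact tokenizer tokens; this heuristic only
    flags obviously oversized prompts before sending them.
    """
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens <= context_tokens

fits_context("hello " * 100)   # short prompt, fits easily in 128k tokens
fits_context("x" * 1_000_000)  # ~250k estimated tokens, exceeds 128k
```

For models with smaller windows (e.g. 32k for DeepSeek R1 or Codestral), pass the corresponding limit as `context_tokens`.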
DeepSeek R1
Developed by the Chinese company DeepSeek (深度求索), DeepSeek R1 is the first highly capable open-weight reasoning model to be released. It delivers the best overall performance among open models, on par with OpenAI’s GPT-4o or even o1.
DeepSeek models, including R1, have been reported to produce biased responses in favor of the Chinese government.
DeepSeek R1 Distill Llama 70B
Developed by the Chinese company DeepSeek (深度求索), DeepSeek R1 Distill Llama 70B is a dense model distilled from DeepSeek R1 but based on Llama 3.3 70B, fitting much of R1’s capability and performance into a 70B-parameter model.
Llama 3.1 SauerkrautLM 70B Instruct
SauerkrautLM is trained by VAGOsolutions on Llama 3.1 70B specifically for prompts in German.
Llama 3.1 Nemotron 70B Instruct
Nemotron is based on Llama 3.1 70B and fine-tuned by NVIDIA to improve overall performance.
Mistral Large Instruct
Developed by Mistral AI, Mistral Large Instruct 2407 is a dense language model with 123B parameters. It achieves great benchmarking scores in general performance, code and reasoning, and instruction following. It is also multi-lingual and supports many European and Asian languages.
Codestral 22B
Codestral 22B was developed by Mistral AI specifically for code completion. It was trained on more than 80 programming languages, including Python, SQL, Bash, C++, Java, and PHP. Its 32k context window lets it work with large amounts of code, and the model fits on a single GPU of our cluster.
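The table lists two sampling presets for Codestral: a low-temperature pair for precise, reproducible completions and a higher pair for more varied suggestions. A minimal sketch of switching between them — the preset names and the model identifier are our own illustrative labels, not values defined by any API:

```python
# Preset names ("precise"/"creative") are illustrative, not API values.
CODESTRAL_PRESETS = {
    "precise": {"temperature": 0.2, "top_p": 0.1},   # deterministic completion
    "creative": {"temperature": 0.6, "top_p": 0.7},  # more varied suggestions
}

def build_request(prompt: str, preset: str = "precise") -> dict:
    """Assemble an OpenAI-style chat request body for Codestral 22B.

    The model identifier below is an assumption; use the one published
    in the service's API documentation.
    """
    return {
        "model": "codestral-22b",  # assumed identifier
        "messages": [{"role": "user", "content": prompt}],
        **CODESTRAL_PRESETS[preset],
    }

req = build_request("Write a Python function that reverses a linked list.")
```

The low-temperature preset is usually the better starting point for completion tasks, since it keeps the model close to the most likely continuation of the code.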
Qwen 2.5 72B Instruct
Built by Alibaba Cloud, Qwen 2.5 72B Instruct is another large model, with benchmark scores slightly higher than Mistral Large Instruct. Qwen is trained on more recent data and is thus better suited for questions about global affairs. It is great for multilingual prompts, and it also performs remarkably well in mathematics and logic.
Qwen 2.5 VL 72B Instruct
A powerful Vision Language Model (VLM) with competitive performance in both language and image comprehension tasks.
Qwen 2.5 Coder 32B Instruct
Qwen 2.5 Coder 32B Instruct is a code-specific LLM based on Qwen 2.5. It has one of the highest scores on code-related tasks, on par with OpenAI’s GPT-4o, and is recommended for code generation, code reasoning and code fixing.
InternVL2.5 8B MPO
A lightweight, fast and powerful Vision Language Model (VLM), developed by OpenGVLab. It builds upon InternVL2.5 8B and Mixed Preference Optimization (MPO).
OpenAI o1
OpenAI’s o1-class models were developed to perform complex reasoning tasks. These models have an iterative thought process, similar to DeepSeek R1, and therefore take time to process internally before responding to the user. Unlike DeepSeek R1, the thought process of o1 models is not shown to the user.
OpenAI o1 Mini
This was developed as a more cost-effective and faster alternative to o1.
OpenAI GPT-4o
GPT-4o (“o” for “omni”) is a general-purpose model developed by OpenAI. It improves on the older GPT-4 and also supports vision (image input).
OpenAI GPT-4o Mini
This was developed as a more cost-effective and faster alternative to GPT-4o.
OpenAI GPT-3.5 and GPT-4
These models are outdated and not recommended anymore.
OpenAI models are provided by Microsoft, and Chat AI only relays the contents of your messages to their servers. Microsoft adheres to GDPR and is contractually bound not to use this data for training or marketing purposes, but it stores messages for up to 30 days. We therefore recommend the open-weight models, hosted by us, to ensure the highest security and data privacy.