Available Models

Current Models

Chat AI provides a large assortment of state-of-the-art open-weight Large Language Models (LLMs), which are hosted on our platform with the highest standards of data protection. The data sent to these models, including prompts and message contents, is never stored anywhere on our systems. Additionally, Chat AI offers externally hosted models such as OpenAI’s o1, GPT-4o, and GPT-4.

Available models are regularly upgraded as newer, more capable ones are released. We choose models based on their performance across various LLM benchmarks, such as HumanEval, MATH, HellaSwag, and MMLU. Certain models are better suited to specific tasks and specific settings, as described below.

| Organization | Model | Open | Knowledge cutoff | Context window in tokens | Advantages | Limitations | Recommended settings |
|---|---|---|---|---|---|---|---|
| πŸ‡ΊπŸ‡Έ Meta | Llama 3.1 8B Instruct | yes | Dec 2023 | 128k | Fastest overall performance | - | default |
| πŸ‡¨πŸ‡³ OpenGVLab | InternVL2.5 8B MPO | yes | Sep 2021 | 32k | Vision, lightweight and fast | - | default |
| πŸ‡¨πŸ‡³ DeepSeek | DeepSeek R1 | yes | Dec 2023 | 32k | Great overall performance, reasoning and problem-solving | Political bias | default |
| πŸ‡¨πŸ‡³ DeepSeek | DeepSeek R1 Distill Llama 70B | yes | Dec 2023 | 32k | Good overall performance, faster than R1 | Political bias | default; temp=0.7, top_p=0.8 |
| πŸ‡ΊπŸ‡Έ Meta | Llama 3.3 70B Instruct | yes | Dec 2023 | 128k | Good overall performance, reasoning and creative writing | - | default; temp=0.7, top_p=0.8 |
| πŸ‡©πŸ‡ͺ VAGOsolutions x Meta | Llama 3.1 SauerkrautLM 70B Instruct | yes | Dec 2023 | 128k | German language skills | - | default |
| πŸ‡ΊπŸ‡Έ NVIDIA x Meta | Llama 3.1 Nemotron 70B Instruct | yes | Dec 2023 | 128k | Good overall performance | - | default |
| πŸ‡«πŸ‡· Mistral | Mistral Large Instruct | yes | Jul 2024 | 128k | Good overall performance, coding and multilingual reasoning | - | default |
| πŸ‡«πŸ‡· Mistral | Codestral 22B | yes | Late 2021 | 32k | Coding tasks | - | temp=0.2, top_p=0.1; temp=0.6, top_p=0.7 |
| πŸ‡ΊπŸ‡Έ intfloat x Mistral | E5 Mistral 7B Instruct | yes | - | 4096 | Embeddings | API only | - |
| πŸ‡¨πŸ‡³ Alibaba Cloud | Qwen 2.5 72B Instruct | yes | Sep 2024 | 128k | Good overall performance, multilingual, global affairs, logic | - | default; temp=0.2, top_p=0.1 |
| πŸ‡¨πŸ‡³ Alibaba Cloud | Qwen 2.5 VL 72B Instruct | yes | Sep 2024 | 90k | Vision, multilingual | - | default |
| πŸ‡¨πŸ‡³ Alibaba Cloud | Qwen 2.5 Coder 32B Instruct | yes | Sep 2024 | 128k | Coding tasks | - | default; temp=0.2, top_p=0.1 |
| πŸ‡ΊπŸ‡Έ OpenAI | o1 | no | Oct 2023 | 128k | Good overall performance, reasoning | no streaming | default |
| πŸ‡ΊπŸ‡Έ OpenAI | o1-mini | no | Oct 2023 | 128k | Fast overall performance, reasoning | no streaming | default |
| πŸ‡ΊπŸ‡Έ OpenAI | GPT-4o | no | Oct 2023 | 128k | Good overall performance, vision | - | default |
| πŸ‡ΊπŸ‡Έ OpenAI | GPT-4o Mini | no | Oct 2023 | 128k | Fast overall performance, vision | - | default |
| πŸ‡ΊπŸ‡Έ OpenAI | GPT-4 | no | Sep 2021 | 8k | - | outdated | - |
| πŸ‡ΊπŸ‡Έ OpenAI | GPT-3.5 | no | Sep 2021 | 16k | - | outdated | - |
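
As a rough illustration of how the recommended settings map onto API parameters, the sketch below uses the OpenAI-compatible Python client. The base URL, API key variable, and model identifiers are assumptions and may differ from your actual Chat AI configuration; "default" in the table presumably means leaving the sampling parameters unset.

```python
# Minimal sketch: calling a Chat AI model through an OpenAI-compatible API.
# The base URL, API key variable, and model identifier are assumptions and
# may need to be adapted to your actual Chat AI setup.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://chat-ai.academiccloud.de/v1",  # assumed endpoint
    api_key=os.environ["CHAT_AI_API_KEY"],           # hypothetical environment variable
)

# The "Recommended settings" column maps directly onto the standard sampling
# parameters of a chat completions request.
response = client.chat.completions.create(
    model="llama-3.3-70b-instruct",  # hypothetical model identifier
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarise the plot of Faust in three sentences."},
    ],
    temperature=0.7,  # value suggested in the table for Llama 3.3 70B Instruct
    top_p=0.8,
)
print(response.choices[0].message.content)
```

The E5 Mistral 7B Instruct model is listed as API-only and serves embeddings rather than chat; a corresponding request might look like the following sketch, reusing the client from above (the model identifier is again an assumption).

```python
# Sketch of an embeddings request for the API-only E5 Mistral 7B Instruct model.
embedding = client.embeddings.create(
    model="e5-mistral-7b-instruct",  # hypothetical model identifier
    input="GΓΆttingen is a university town in Lower Saxony.",
)
print(len(embedding.data[0].embedding))  # dimensionality of the returned vector
```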

Meta Llama 3.1 8B Instruct

This is the standard model we recommend. It is the most lightweight model, with the fastest performance and good results across all benchmarks, and is sufficient for general conversations and assistance.

Meta Llama 3.3 70B Instruct

Achieves good overall performance, on par with GPT-4, but with a much larger context window and a more recent knowledge cutoff. It is strongest in English comprehension and broader linguistic tasks such as translation, understanding dialects, slang, and colloquialisms, as well as creative writing.

DeepSeek R1

Developed by the Chinese company DeepSeek (深度求紒), DeepSeek R1 is the first highly capable open-weight reasoning model to be released. It holds the record for the best overall performance among open models, on par with OpenAI’s GPT-4o or even o1.

Warning

DeepSeek models, including R1, have been reported to produce biased responses in favor of the Chinese government.
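
When R1 is called through the API rather than the web interface, its intermediate reasoning is typically returned inline with the answer, commonly wrapped in <think>...</think> tags. Whether and how Chat AI exposes this reasoning is an assumption here, so treat the following as a hedged sketch for separating the reasoning from the final answer.

```python
# Hedged sketch: splitting DeepSeek R1 output into reasoning and final answer,
# assuming the reasoning is returned inline inside <think>...</think> tags.
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block is present."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer

reasoning, answer = split_reasoning(
    "<think>The user asks for 12 * 12, which is 144.</think>12 * 12 = 144."
)
print(answer)  # "12 * 12 = 144."
```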

DeepSeek R1 Distill Llama 70B

Developed by the Chinese company DeepSeek (深度求紒), DeepSeek R1 Distill Llama 70B is a dense model distilled from DeepSeek R1 onto a Llama 3.3 70B base, fitting much of R1’s capability and performance into a 70B-parameter model.

Llama 3.1 SauerkrautLM 70B Instruct

SauerkrautLM is trained by VAGOsolutions on Llama 3.1 70B specifically for prompts in German.

Llama 3.1 Nemotron 70B Instruct

Nemotron is based on the Llama 3.1 70B model and was fine-tuned by NVIDIA to improve overall performance.

Mistral Large Instruct

Developed by Mistral AI, Mistral Large Instruct 2407 is a dense language model with 123B parameters. It achieves great benchmark scores in general performance, code and reasoning, and instruction following. It is also multilingual and supports many European and Asian languages.

Codestral 22B

Codestral 22B was developed by Mistral AI specifically for code completion. It was trained on more than 80 programming languages, including Python, SQL, Bash, C++, Java, and PHP. Its 32k context window lets it work with larger code bases, and it fits on a single GPU of our cluster.

Qwen 2.5 72B Instruct

Built by Alibaba Cloud, Qwen 2.5 72B Instruct is another large model, with benchmark scores slightly higher than Mistral Large Instruct. Qwen is trained on more recent data and is thus also better suited for questions about global affairs. It is great for multilingual prompts and also shows remarkable performance in mathematics and logic.

Qwen 2.5 VL 72B Instruct

A powerful Vision Language Model (VLM) with competitive performance in both language and image comprehension tasks.
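
For vision models such as Qwen 2.5 VL 72B Instruct or InternVL2.5 8B MPO, images can be passed alongside text using the standard multimodal message format of OpenAI-compatible APIs. As in the earlier sketch, the client setup and model identifier below are assumptions.

```python
# Sketch of a multimodal request to a vision model (VLM) via an
# OpenAI-compatible chat completions API. Endpoint, API key variable,
# and model identifier are assumptions.
import base64
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://chat-ai.academiccloud.de/v1",  # assumed endpoint
    api_key=os.environ["CHAT_AI_API_KEY"],           # hypothetical environment variable
)

# Encode a local image as a data URL so it can be embedded in the request.
with open("diagram.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="qwen-2.5-vl-72b-instruct",  # hypothetical model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what this diagram shows."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```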

Qwen 2.5 Coder 32B Instruct

Qwen 2.5 Coder 32B Instruct is a code-specific LLM based on Qwen 2.5. It has one of the highest scores on code-related tasks, on par with OpenAI’s GPT-4o, and is recommended for code generation, code reasoning and code fixing.

InternVL2.5 8B MPO

A lightweight, fast, and powerful Vision Language Model (VLM) developed by OpenGVLab. It builds on InternVL2.5 8B and is further trained with Mixed Preference Optimization (MPO).

OpenAI o1

OpenAI’s o1-class models were developed to perform complex reasoning tasks. These models have an iterative thought process, similar to DeepSeek R1, and therefore take time to process internally before responding to the user. Unlike with DeepSeek R1, the thought process of o1 models is not shown to the user.

OpenAI o1 Mini

This was developed as a more cost-effective and faster alternative to o1.

OpenAI GPT-4o

GPT-4o (β€œo” for β€œomni”) is a general-purpose model developed by OpenAI. It improves on the older GPT-4 and also supports vision (image input).

OpenAI GPT-4o Mini

This was developed as a more cost-effective and faster alternative to GPT-4o.

OpenAI GPT-3.5 and GPT-4

These models are outdated and not recommended anymore.

Warning

OpenAI models are provided by Microsoft, and Chat AI only relays the contents of your messages to their servers. Microsoft adheres to the GDPR and is contractually bound not to use this data for training or marketing purposes, but it stores messages for up to 30 days. We therefore recommend the open-weight models hosted by us to ensure the highest security and data privacy.