Available Models
Current Models
Chat AI currently hosts a large assortment of high-quality, open-source models. All models except the external ones, such as ChatGPT, are self-hosted with a guarantee of the highest standards of data protection: they run entirely on our hardware and do not store any user data.
Models requiring more GPUs may take longer to allocate resources and begin generating responses. Models are updated on demand as newer versions are released. All models are chosen for their high performance across multiple or specific LLM benchmarks, such as HumanEval, MATH, HellaSwag and MMLU. According to these benchmarks, certain models are better suited to certain tasks, and a model performs best at a particular task when its completion options are configured accordingly (see the sketch after the table below).
Model Name | Developer | Open source | Knowledge cutoff | Context window | Advantages | Possible config. for advantages |
---|---|---|---|---|---|---|
Llama 3.1 8B Instruct | Meta | yes | December 2023 | 128k tokens | Fastest general usage | default (Temp: 0.5; top_P: 0.5) |
Llama 3.1 70B Instruct | Meta | yes | December 2023 | 128k tokens | Great overall performance / multilingual reasoning and creative writing | default / Temp: 0.7; top_P: 0.8 |
Llama 3.1 SauerkrautLM 70B Instruct | VAGOsolutions x Meta | yes | December 2023 | 128k tokens | German overall usage | default |
Llama 3.1 Nemotron 70B Instruct | NVIDIA x Meta | yes | December 2023 | 128k tokens | Overall improvements over Llama 3.1 70B | default |
Mistral Large Instruct | Mistral | yes | July 2024 | 128k tokens | Great overall performance / coding and multilingual reasoning | default |
Codestral 22B Instruct | Mistral | yes | Late 2021 | 33k tokens | Writing, editing, fixing and commenting code / exploratory programming | Temp: 0.2; top_P: 0.1 / Temp: 0.6; top_P: 0.7 |
E5 Mistral 7B Instruct | Intfloat x Mistral | yes | - | 4096 tokens | Embeddings (API only) | - |
Qwen 2.5 72B Instruct | Alibaba Cloud | yes | September 2024 | 128k tokens | Global affairs, Chinese, overall usage / mathematics and logic | default / Temp: 0.2; top_P: 0.1 |
Qwen 2 VL 72B Instruct | Alibaba Cloud | yes | June 2023 | 32k tokens | VLM, Chinese, overall usage | default |
InternVL2 8B | OpenGVLab | yes | September 2021 | 32k tokens | VLM, small and fast | default |
ChatGPT 3.5 | OpenAI | no | September 2021 | 16k tokens | Large compute | default |
ChatGPT 4 | OpenAI | no | September 2021 | 8k tokens | Large compute | default |
ChatGPT 4o-Mini | OpenAI | no | September 2021 | 128k tokens | VLM, cost effective | default |

Where a row lists several advantages separated by "/", the configurations in the last column correspond to them in the same order.
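As an illustration of configuring completion options, the sketch below sets the temperature and top_P values from the table on a single request. It is a minimal sketch assuming an OpenAI-compatible chat-completions API; the base URL, API key and model identifier shown are placeholders rather than the service's actual values.

```python
# Minimal sketch: setting completion options (temperature, top_P) per request.
# Assumes an OpenAI-compatible chat-completions API; the base URL, API key and
# model identifier below are placeholders, not the service's confirmed values.
from openai import OpenAI

client = OpenAI(
    base_url="https://chat-ai.example.org/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",                     # placeholder key
)

response = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",  # placeholder model identifier
    messages=[{"role": "user", "content": "Explain the MMLU benchmark in two sentences."}],
    temperature=0.5,  # "default" values from the table above
    top_p=0.5,
)
print(response.choices[0].message.content)
```

The same pattern applies to the other models; only the model identifier and the temperature/top_P values from the last column of the table change.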
Meta
Llama 3.1-8B Instruct
The standard model we recommend: it is the most lightweight, gives the fastest responses, and achieves good results across all benchmarks. It is sufficient for general conversations and assistance.
Llama 3.1-70B Instruct
Trained the same way as the model above, but with many more parameters. It achieves the best overall performance of our models, on par with ChatGPT 4, but with a much larger context window and a more recent knowledge cutoff. It is the strongest at English comprehension and broader linguistic tasks such as translation, understanding dialects, slang and colloquialisms, and creative writing.
Llama 3.1 SauerkrautLM 70B Instruct
SauerkrautLM is fine-tuned by VAGOsolutions on top of the above model, specifically for prompts in German.
Llama 3.1 Nemotron 70B Instruct
NVIDIA has finetuned the Llama 3.1 70B model and achieved overall improvements on many LLM benchmarks.
Mistral AI
Mistral Large Instruct
Mistral Large Instruct 2407 is a dense language model with 123B parameters. It achieves state-of-the-art benchmark scores in general performance, coding and reasoning, and instruction following. It is also multilingual, supporting many European and Asian languages.
Codestral 22B
Codestral 22B is developed specifically for code completion. It was trained on more than 80 programming languages, including Python, SQL, Bash, C++, Java, and PHP. Its 33k-token context window lets it work with large amounts of code, and it fits on a single GPU of our cluster. We recommend using Codestral for all coding purposes.
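As a sketch of the two Codestral configurations suggested in the table above (a low temperature for precise code writing and fixing, a higher one for exploratory programming), the snippet below wraps both profiles in a small helper. It again assumes an OpenAI-compatible endpoint; the base URL and model identifier are placeholders.

```python
# Sketch: the two Codestral configurations suggested in the table above.
# Low temperature/top_P for precise code writing and fixing, higher values for
# exploratory programming. Endpoint and model identifier are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://chat-ai.example.org/v1", api_key="YOUR_API_KEY")

PROFILES = {
    "precise":     {"temperature": 0.2, "top_p": 0.1},  # writing, editing, fixing, commenting code
    "exploratory": {"temperature": 0.6, "top_p": 0.7},  # sketching approaches, prototyping
}

def ask_codestral(prompt: str, profile: str = "precise") -> str:
    """Send a coding prompt to Codestral using one of the suggested profiles."""
    response = client.chat.completions.create(
        model="codestral-22b",  # placeholder model identifier
        messages=[{"role": "user", "content": prompt}],
        **PROFILES[profile],
    )
    return response.choices[0].message.content

print(ask_codestral("Fix this function: def mean(xs): return sum(xs) / len(x)"))
```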
Alibaba Cloud
Qwen 2.5 72B Instruct
Another large model, with benchmark scores slightly above Mistral Large Instruct. However, Qwen is trained on more recent data and is therefore better suited for questions about global affairs. It is the best model for prompts in Chinese and also performs exceptionally well in mathematics and logic.
Qwen 2 VL 72B Instruct
A powerful VLM with competitive performance in both language and image comprehension tasks.
OpenGVLab
InternVL2 8B
A small but powerful VLM from the Chinese research lab OpenGVLab.
OpenAI
ChatGPT 3.5 and 4
These models should be the most familiar to users. The external OpenAI models are hosted by Microsoft and have some limitations. While Microsoft is contractually bound not to use user data for training or marketing purposes and adheres to GDPR, they do store user inquiries for up to 30 days.
Despite these limitations, the external OpenAI models have their benefits. ChatGPT 3.5 has benchmark results similar to Llama 3.1 8B Instruct, and ChatGPT 4 has results similar to our larger models. They have a large amount of compute dedicated to them, making them potentially more accessible during periods of high demand. However, they may respond with a small delay due to the need for an external connection. We highly recommend the self-hosted, open-source models to ensure maximum security and data privacy.
ChatGPT 4o Mini
ChatGPT 4o Mini is designed as a more cost-effective and faster alternative to the other ChatGPT models. Moreover, it is a VLM and accepts both text and image inputs.
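Because it accepts image inputs, a request can combine text and an image in a single message. The sketch below uses the standard OpenAI image_url content format; whether the Chat AI endpoint exposes image inputs in exactly this way is an assumption here, and the base URL and model identifier are placeholders.

```python
# Sketch: sending a text + image prompt to a vision-language model (VLM).
# Uses the standard OpenAI image_url content format; whether the service
# exposes it this way is an assumption, and the identifiers are placeholders.
from openai import OpenAI

client = OpenAI(base_url="https://chat-ai.example.org/v1", api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder identifier; the self-hosted VLMs work analogously
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is shown in this figure."},
            {"type": "image_url", "image_url": {"url": "https://example.org/figure.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```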