Available Models

Chat AI provides a large assortment of state-of-the-art open-weight Large Language Models (LLMs), hosted on our platform with the highest standards of data protection. The data sent to these models, including prompts and message contents, is never stored anywhere on our systems. Additionally, Chat AI offers externally hosted models such as OpenAI’s GPT-5, GPT-4o, and o3.

Available models are regularly upgraded as newer, more capable ones are released. We select models for our services based on user demand, cost, and performance on benchmarks such as HumanEval, MATH, HellaSwag, and MMLU. Certain models are more capable at specific tasks and with specific settings, which are described below to the best of our knowledge.


List of open-weight models, hosted by GWDG

| Organization | Model | Open | Knowledge cutoff | Context window in tokens | Advantages | Limitations | Recommended settings |
|---|---|---|---|---|---|---|---|
| 🇨🇭 Swiss AI | Apertus 70B Instruct 2509 | yes | Apr 2024 | 65k | Fully open-source, multilingual | - | temp=0.8, top_p=0.9 |
| 🇨🇳 DeepSeek | DeepSeek R1 0528 | yes | Dec 2023 | 32k | Great overall performance, reasoning and problem-solving | Censorship, political bias | default |
| 🇨🇳 DeepSeek | DeepSeek R1 Distill Llama 70B | yes | Dec 2023 | 32k | Good overall performance, faster than R1 | Censorship, political bias | default (temp=0.7, top_p=0.8) |
| 🇺🇸 Google | Gemma 3 27B Instruct | yes | Mar 2024 | 128k | Vision, great overall performance | - | default |
| 🇸🇬 Z.ai | GLM-4.7 | yes | Apr 2025 | 200k | Great performance | - | temp=1.0, top_p=0.95 |
| 🇨🇳 OpenGVLab | InternVL 3.5 30B A3B | yes | Aug 2025 | 40k | Vision, lightweight and fast | - | default |
| 🇩🇪 VAGOsolutions x Meta | Llama 3.1 SauerkrautLM 70B Instruct | yes | Dec 2023 | 128k | German language skills | - | default |
| 🇺🇸 Google | MedGemma 27B Instruct | yes | Mar 2024 | 128k | Vision, medical knowledge | - | default |
| 🇺🇸 Meta | Llama 3.1 8B Instruct | yes | Dec 2023 | 128k | Fast overall performance | - | default |
| 🇺🇸 Meta | Llama 3.3 70B Instruct | yes | Dec 2023 | 128k | Good overall performance, reasoning and creative writing | - | default (temp=0.7, top_p=0.8) |
| 🇫🇷 Mistral | Mistral Large Instruct | yes | Jul 2024 | 128k | Good overall performance, coding and multilingual reasoning | - | default |
| 🇺🇸 OpenAI | GPT OSS 120B | yes | Jun 2024 | 128k | Great overall performance, fast | - | default |
| 🇨🇳 Alibaba Cloud | Qwen 3 235B A22B Thinking 2507 | yes | Apr 2025 | 222k | Great overall performance, reasoning | - | temp=0.6, top_p=0.95 |
| 🇨🇳 Alibaba Cloud | Qwen 3 30B A3B Instruct 2507 | yes | Mar 2025 | 256k | Good performance, fast | - | temp=0.6, top_p=0.95 |
| 🇨🇳 Alibaba Cloud | Qwen 3 30B A3B Thinking 2507 | yes | Mar 2025 | 256k | Good performance, reasoning | - | temp=0.6, top_p=0.95 |
| 🇨🇳 Alibaba Cloud | Qwen 3 32B | yes | Sep 2024 | 32k | Good overall performance, multilingual, logic | - | default |
| 🇨🇳 Alibaba Cloud | Qwen 3 Coder 30B A3B Instruct | yes | Mar 2025 | 256k | Coding | - | temp=0.7, top_p=0.8 |
| 🇨🇳 Alibaba Cloud | Qwen 3 Omni 30B A3B Instruct | yes | Mar 2025 | 256k | Multimodal | - | default |
| 🇨🇳 Alibaba Cloud | Qwen 3 VL 30B A3B Instruct | yes | Mar 2025 | 262k | Vision | - | default |
| 🇩🇪 OpenGPT-X | Teuken 7B Instruct Research | yes | Sep 2024 | 128k | European languages | - | default |
| 🇺🇸 intfloat x Mistral | E5 Mistral 7B Instruct | yes | - | 4096 | Embeddings | API only | - |
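
All models above are served through a single API. The following is a minimal sketch of how the table's recommended settings map onto request parameters, assuming an OpenAI-compatible endpoint; the base URL and model identifiers below are illustrative placeholders, not the service's actual values.

```python
# Hedged sketch: assumes an OpenAI-compatible API. The endpoint URL and
# model names are hypothetical placeholders; consult the service docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://chat-ai.example.org/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

# Chat completion with the recommended sampling settings from the table,
# e.g. temp=0.6, top_p=0.95 for the Qwen 3 thinking models.
response = client.chat.completions.create(
    model="qwen3-235b-a22b-thinking-2507",  # placeholder identifier
    messages=[{"role": "user", "content": "Summarize mixture-of-experts."}],
    temperature=0.6,
    top_p=0.95,
)
print(response.choices[0].message.content)

# E5 Mistral 7B Instruct is an embedding model ("API only" in the table),
# so it is called through the embeddings endpoint rather than chat.
embedding = client.embeddings.create(
    model="e5-mistral-7b-instruct",  # placeholder identifier
    input="A sentence to embed",
)
print(len(embedding.data[0].embedding))
```

Rows marked "default" correspond to simply omitting temperature and top_p, letting the server apply its defaults.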

List of external models, hosted by external providers

| Organization | Model | Open | Knowledge cutoff | Context window in tokens | Advantages | Limitations | Recommended settings |
|---|---|---|---|---|---|---|---|
| 🇺🇸 OpenAI | GPT-5.2 Chat | no | Aug 2025 | 400k | Great overall performance | - | default |
| 🇺🇸 OpenAI | GPT-5.2 | no | Aug 2025 | 400k | Great overall performance | - | default |
| 🇺🇸 OpenAI | GPT-5.1 Chat | no | Sep 2024 | 400k | Great overall performance | - | default |
| 🇺🇸 OpenAI | GPT-5.1 | no | Sep 2024 | 400k | Great overall performance | - | default |
| 🇺🇸 OpenAI | GPT-5 Chat | no | Jun 2024 | 400k | Good overall performance | - | default |
| 🇺🇸 OpenAI | GPT-5 | no | Jun 2024 | 400k | Good overall performance, reasoning | - | default |
| 🇺🇸 OpenAI | GPT-5 Mini | no | Jun 2024 | 400k | Fast overall performance | - | default |
| 🇺🇸 OpenAI | GPT-5 Nano | no | Jun 2024 | 400k | Fastest overall performance | - | default |
| 🇺🇸 OpenAI | o3 | no | Oct 2023 | 200k | - | Outdated | default |
| 🇺🇸 OpenAI | o3-mini | no | Oct 2023 | 200k | - | Outdated | default |
| 🇺🇸 OpenAI | GPT-4o | no | Oct 2023 | 128k | - | Outdated | default |
| 🇺🇸 OpenAI | GPT-4o Mini | no | Oct 2023 | 128k | - | Outdated | default |
| 🇺🇸 OpenAI | GPT-4.1 | no | Jun 2024 | 1M | - | Outdated | default |
| 🇺🇸 OpenAI | GPT-4.1 Mini | no | Jun 2024 | 1M | - | Outdated | default |

Open-weight models, hosted by GWDG

The models listed in this section are hosted on our platform with the highest standards of data protection. The data sent to these models, including prompts and message contents, is never stored anywhere on our systems.

Apertus 70B Instruct

Apertus is a fully open language model designed to push the boundaries of transparent and compliant AI. It supports over 1,800 languages and a context window of up to 65,536 tokens, and was trained exclusively on fully compliant, open data, respecting the opt-out consent of data owners. The model achieves performance comparable to closed-source models and was pretrained on 15T tokens with a staged curriculum of web, code, and math data.

DeepSeek R1 0528

Developed by the Chinese company DeepSeek (深度求索), DeepSeek R1 was the first highly capable open-weight reasoning model to be released. The latest update, DeepSeek R1 0528, further deepens its reasoning and inference capabilities. Although very large and quite slow, it achieves one of the best overall performances among open models.

Warning

DeepSeek models, including R1, have been reported to produce politically biased responses and to censor topics that are sensitive for the Chinese government.
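
R1-style reasoning models typically emit their chain of thought inline before the final answer. Below is a minimal sketch, assuming the common convention that the reasoning trace is wrapped in `<think>` tags inside the message content; some servers return it in a separate field instead.

```python
import re


def split_reasoning(text: str) -> tuple[str, str]:
    """Split an R1-style response into (reasoning, answer).

    Assumes the reasoning trace is wrapped in <think>...</think>;
    returns an empty trace if no such tags are present.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[: match.start()] + text[match.end() :]).strip()
    return reasoning, answer


# Example usage with a mocked model response:
trace, answer = split_reasoning(
    "<think>The user asks for a capital city...</think>The capital is Paris."
)
print(answer)  # -> The capital is Paris.
```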

DeepSeek R1 Distill Llama 70B

Developed by the Chinese company DeepSeek (深度求索), DeepSeek R1 Distill Llama 70B is a dense model distilled from DeepSeek R1 onto a Llama 3.3 70B base, fitting much of R1's capability and performance into a 70B-parameter model.

Google Gemma 3 27B Instruct

Gemma is Google’s family of lightweight, open-weight models built on the same research as its commercial Gemini series. Gemma 3 27B Instruct is quite fast and, thanks to its vision support (image input), a great choice for all sorts of conversations.
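
As a hedged illustration of how image input typically works with OpenAI-compatible chat endpoints, the image is embedded in the message content as a base64 data URL. The endpoint URL and model identifier below are placeholders.

```python
import base64

from openai import OpenAI

client = OpenAI(
    base_url="https://chat-ai.example.org/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

# Read a local image and encode it as a data URL.
with open("chart.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gemma-3-27b-instruct",  # placeholder identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What does this chart show?"},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{image_b64}"},
            },
        ],
    }],
)
print(response.choices[0].message.content)
```

The same message format applies to the other vision-capable models in the table, such as MedGemma, InternVL, and Qwen 3 VL.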

GLM-4.7

GLM-4.7 is a coding-focused model that delivers significant improvements over its predecessor in multilingual agentic coding and terminal-based tasks. It achieves strong performance on SWE-bench, SWE-bench Multilingual, and Terminal Bench 2.0. GLM-4.7 also excels at tool use, web browsing, and mathematical reasoning, with notable gains on benchmarks like HLE and τ²-Bench.

InternVL 3.5 30B-A3B

InternVL 3.5 30B-A3B is a lightweight, fast, and powerful multimodal model developed by OpenGVLab. It significantly advances versatility, reasoning capability, and efficiency by featuring a Visual Resolution Router (ViR) for dynamic visual token adjustment and Decoupled Vision-Language Deployment (DvD) for efficient GPU load balancing, achieving up to 4× inference speedup over its predecessor. The model excels at multimodal reasoning, OCR, document understanding, multi-image comprehension, video understanding, GUI tasks, and embodied agent tasks.

Llama 3.1 SauerkrautLM 70B Instruct

SauerkrautLM is trained by VAGOsolutions on top of Llama 3.1 70B specifically for German-language prompts.

Google MedGemma 27B Instruct

MedGemma 27B Instruct is a variant of Gemma 3 suitable for medical text and image comprehension. It has been trained on a variety of medical image data, including chest X-rays, dermatology images, ophthalmology images, and histopathology slides, as well as medical text, such as medical question-answer pairs, and FHIR-based electronic health record data. MedGemma variants have been evaluated on a range of clinically relevant benchmarks to illustrate their baseline performance.

Meta Llama 3.1 8B Instruct

The standard model we recommend: the most lightweight of our offerings, with the fastest responses and good results across all benchmarks. It is sufficient for general conversations and assistance.

Meta Llama 3.3 70B Instruct

Achieves good overall performance, on par with GPT-4, but with a much larger context window and a more recent knowledge cutoff. It is strongest at English comprehension and broader linguistic tasks such as translation, understanding dialects, slang, and colloquialisms, and at creative writing.

Mistral Large Instruct

Developed by Mistral AI, Mistral Large Instruct 2407 is a dense language model with 123B parameters. It achieves great benchmark scores in general performance, code and reasoning, and instruction following. It is also multilingual, supporting many European and Asian languages.

OpenAI GPT OSS 120B

In August 2025, OpenAI released the gpt-oss model series, consisting of two open-weight LLMs that are optimized for faster inference with state-of-the-art performance across many domains, including reasoning and tool use. According to OpenAI, the gpt-oss-120b model achieves near-parity with OpenAI o4-mini on core reasoning benchmarks.

Qwen 3 235B A22B Thinking 2507

Expanding on Qwen 3 235B A22B, one of the best-performing models of the Qwen 3 series, Qwen 3 235B A22B Thinking 2507 offers significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, and academic benchmarks. It is an MoE model with 235B total and 22B activated parameters, and achieves state-of-the-art results among open-weight thinking models.

Qwen 3 30B A3B Instruct 2507

This MoE model features 30.5B total parameters with 3.3B activated parameters for efficient inference. It delivers significant improvements in instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage, with better alignment for subjective and open-ended tasks. The model supports a 256K native context length and operates in non-thinking mode, achieving strong performance across knowledge, reasoning, coding, and multilingual benchmarks.

Qwen 3 30B A3B Thinking 2507

The Thinking variant of Qwen 3 30B A3B is optimized for complex reasoning tasks. It excels at mathematical problem-solving (AIME25, HMMT25), logical reasoning (ZebraLogic), and coding challenges, while maintaining strong performance on knowledge benchmarks like MMLU-Pro and GPQA. This model also demonstrates strong alignment capabilities, scoring high on IFEval, Arena-Hard v2, and creative writing benchmarks.

Qwen 3 32B

Qwen 3 32B is a large dense model developed by Alibaba Cloud and released in April 2025. It supports reasoning and outperforms, or is at least on par with, other strong reasoning models such as OpenAI o1 and DeepSeek R1.

Qwen 3 Coder 30B A3B Instruct

Qwen 3 Coder 30B A3B Instruct is a specialized coding model that achieves strong performance on agentic coding, browser-use, and other foundational coding tasks among open models.

Qwen 3 Omni 30B A3B Instruct

Qwen3 Omni is a natively multilingual omni-modal foundation model that processes text, images, audio, and video. It achieves state-of-the-art performance on many audio/video benchmarks, with ASR, audio understanding, and voice conversation performance comparable to Gemini 2.5 Pro. The model features a novel MoE-based Thinker–Talker architecture with AuT pretraining, supports 119 text languages and 19 speech input languages, and enables low-latency interaction with flexible control via system prompts.
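
For audio input, OpenAI-compatible endpoints commonly accept `input_audio` content parts in chat messages; whether the hosted deployment enables this for Qwen3 Omni is an assumption to verify, and the endpoint and model names below are placeholders.

```python
import base64

from openai import OpenAI

client = OpenAI(
    base_url="https://chat-ai.example.org/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

# Encode a local WAV clip as base64 for the request body.
with open("question.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="qwen3-omni-30b-a3b-instruct",  # placeholder identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Transcribe and summarize this clip."},
            {
                "type": "input_audio",
                "input_audio": {"data": audio_b64, "format": "wav"},
            },
        ],
    }],
)
print(response.choices[0].message.content)
```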

Qwen 3 VL 30B A3B Instruct

Qwen3 VL is the most powerful vision-language model in the Qwen series to date, featuring comprehensive upgrades across visual perception, reasoning, and agent capabilities. This MoE model (30B total, 3B active) excels as a visual agent that can operate PC/mobile GUIs, generate code from images/videos (Draw.io/HTML/CSS/JS), and perform advanced spatial reasoning with 2D and 3D grounding. It supports native 256K context for long-form video understanding, recognizes a wide range of entities including celebrities, anime, products, and landmarks, and offers robust OCR across 32 languages. Text understanding is on par with pure LLMs, enabling seamless text-vision fusion.

OpenGPT-X Teuken 7B Instruct Research

OpenGPT-X is a research project funded by the German Federal Ministry for Economic Affairs and Climate Action (BMWK) and led by Fraunhofer, Forschungszentrum Jülich, TU Dresden, and DFKI. Teuken 7B Instruct Research v0.4 is an instruction-tuned 7B-parameter multilingual LLM pre-trained on 4T tokens, covering all 24 official EU languages and reflecting European values.


External models, hosted by external providers

Warning

These OpenAI models are hosted on Microsoft Azure; Chat AI only relays the contents of your messages to their servers. Microsoft adheres to the GDPR and is contractually bound not to use this data for training or marketing purposes, but it may store messages for up to 30 days. We therefore recommend the open-weight models hosted by us to ensure the highest security and data privacy.

OpenAI GPT-5, 5.1, and 5.2 Series

OpenAI’s GPT-5, 5.1, and 5.2 series models achieve state-of-the-art performance across various benchmarks. The series consists of the following models along with their intended use cases:

  • OpenAI GPT-5/5.1/5.2 Chat: Non-reasoning model. Designed for advanced, natural, multimodal, and context-aware conversations.
  • OpenAI GPT-5/5.1/5.2: Reasoning model. Designed for logic-heavy and multi-step tasks.
  • OpenAI GPT-5 Mini: A lightweight variant of GPT-5 for cost-sensitive applications.
  • OpenAI GPT-5 Nano: A highly optimized variant of GPT-5. Ideal for applications requiring low latency.

OpenAI GPT-4o

GPT-4o (“o” for “omni”) is a general-purpose model developed by OpenAI. It improves on the older GPT-4 and also supports vision (image input).

OpenAI GPT-4o Mini

This was developed as a more cost-effective and faster alternative to GPT-4o.

OpenAI GPT-4.1

OpenAI’s GPT-4.1-class models improve on the older GPT-4 series. These models also outperform GPT-4o and GPT-4o Mini, especially in coding and instruction following. They have a large context window size of 1M tokens, with improved long-context comprehension, and an updated knowledge cutoff of June 2024.

OpenAI GPT-4.1 Mini

This was developed as a more cost-effective and faster alternative to GPT-4.1.

OpenAI o1 and o1 Mini

OpenAI’s o1-class models were developed to perform complex reasoning tasks. These models have since been superseded by the o3 series and are therefore no longer recommended.

OpenAI o3

Released in April 2025, OpenAI’s o3-class models were developed to perform complex reasoning tasks across the domains of coding, math, science, visual perception, and more. These models follow an iterative thought process and therefore take time to deliberate internally before responding. The thought process of o3 models is not shown to the user.

OpenAI o3 Mini

This was developed as a more cost-effective and faster alternative to o3.