Commercial Models

GWDG offers access to selected commercial foundation models (e.g. OpenAI GPT, Anthropic Claude) via Microsoft Azure. For AI model access via SAIA, including locally hosted and external models, see SAIA. This section documents access to the commercial offering and related Agentic Coding Tools .

Warning

Commercial models are external cloud services. Do not assume the same data locality guarantees as for locally hosted services such as SAIA. AI systems can hallucinate, and sensitive or confidential data should only be processed if that is permitted for your use case.

Info

Commercial access is billed usage. The initial monthly budget communicated during onboarding is not a hard spending limit. There are currently no technical safeguards that prevent costs from exceeding that amount, so users are responsible for monitoring their own usage. For institutional contracts, optional access to external models, and limiting access, see Institutional Access to AI Services.

Access Options

The commercial model offering consists of three complementary access options. A1 has two variants: locally hosted open-weight models and externally hosted models through the Chat AI and SAIA service layer.

OptionAccess pathAPI featuresTypical use
A1 - Chat AI and SAIA ecosystemChat AI and SAIAOpenAI-compatible v1 APIs, including Chat CompletionsA1.OS for locally hosted open-weight models; A1.EM for external commercial models through the GWDG-controlled service layer
A2 - Microsoft Foundry endpoint (e.g. for Codex, Claude Code)Direct commercial endpoint provisioned for the user, project, or institutionResponses APIAPI-based tools that require Responses API support, especially Agentic Coding; not recommended for sensitive data
A3 - license managementVendor team or enterprise licenses procured through GWDGDepends on the vendor licenseNative vendor applications, including desktop or mobile apps

SAIA provides OpenAI-compatible v1 endpoints such as /v1/chat/completions. It does not provide the Responses API. Tools that require the Responses API need A2 access instead.

How to get Access?

To request access to commercial or externally hosted models, contact support@gwdg.de. Please include whether you need A1, A2, or A3 access.

Current Models

The current commercial portfolio includes the following model families:

  • OpenAI GPT-5.5
  • Anthropic Claude Sonnet 4.6
  • Anthropic Claude Opus 4.7

For models available through Chat AI and SAIA, including locally hosted open-weight models and external models, see Available Models.

Data Protection Profile

For A1.OS, requests are processed with locally hosted open-weight models in GWDG infrastructure. For A1.EM, requests are routed through the GWDG-controlled Chat AI and SAIA service layer to external providers. SAIA itself does not provide persistent storage of request contents through the Responses API.

For A2, requests go directly to the Microsoft Foundry endpoint. The Responses API can keep conversation state server-side for a limited time. Azure OpenAI Responses API response data is retained for 30 days by default. Extended prompt cache retention can keep cached prefixes active for up to 24 hours. See the Microsoft references on Responses API conversation storage and prompt caching. Because of the Cloud Act, it cannot be excluded that the manufacturers may use the data or be required to provide it to the US government. This access path is therefore intended for use cases where this trade-off is accepted, for example agentic coding on non-sensitive code bases.

For A3, the data protection profile depends on the selected vendor license and the vendor’s product terms.

Administration and Activation for A1

The administration of A1 and A2 is based on the GWDG Identity Management (IdM). Both access permissions and token limits are controlled through the IdM. Because the IdM is multi-tenant, institutes can manage their own users, groups, and quotas independently.

Role Model

  • Users consume their assigned quotas and request top-ups when needed.
  • Users can review their own consumption.
  • Users can create API keys with their own sub-limits via self-service.
  • Institute administrators set and adjust limits via the IdM, review and approve top-up requests, assign consumption to cost centres, and see the consumption of their own tenant group.
  • GWDG operates the platform.

Delegation and Multi-Tenancy

The multi-tenant design lets institutes pass resources on to individuals or projects and request additional resources when needed β€” for example, for a specific project. Top-ups can be performed by authorised administrators.

Administration and Activation for A2

Administration of A2 follows the same model as A1, with the following differences:

  • A2 uses soft limits that, for example, trigger an email to the user at 50 % or 100 % consumption. The reason is that hard cost limits per request are not always available from the providers (see Quotas and Budgeting).
  • Self-service for creating API keys with sub-limits is not yet available for A2; this feature is currently restricted to A1. A web portal for self-service is planned.
  • A2 is not enabled by default but is unlocked for specific use cases where the data-protection trade-off of the Responses API (see Data Protection Profile) is consciously accepted.

Quotas and Budgeting

Users receive a base quota that can be tied to an IdM group. The quota is not free of charge β€” it is procured at the institute level and made available to users. The granularity defaults to monthly budgets, but other intervals or one-off quotas can be configured at the institute’s discretion.

When a user reaches their hard or soft limit, additional quota can be assigned by an institute administrator without immediately triggering a formal top-up. Justifications for over-limit requests can be reviewed by administrators to track which use cases are driving demand.

Users can create multiple API keys. For A1, each key can carry its own sub-limit up to the user’s assigned quota. This separates tools cleanly β€” for example one key for a production script and one for an experiment β€” and protects against runaway consumption from a single application. It also makes per-project cost attribution possible.

Payment Models

The payment model is intentionally flexible to cover both project-funded usage and small-scale needs.

  • Pay-per-use (default) β€” billing is based on consumed API tokens, mapping the actual provider costs directly to usage. Institutes or users top up an account, from which consumption is debited; once the credit is used up, it can be topped up again. This works analogously to the familiar print credit model at universities.
  • Subscription β€” a fixed monthly or yearly quota, suitable for working groups with regular and stable usage.
  • Cost-centre booking β€” via the IdM’s multi-tenancy, consumption can be booked to cost centres, including arrangements where an institute settles consumption first and bills internally afterwards.
  • Micro-payment β€” small top-ups on demand without administrative overhead, intended for users with small budgets, especially students.

Agentic Coding