SAIA
SAIA is our Scalable Artificial Intelligence (AI) Accelerator that hosts our AI services. Such services include Chat AI and CoCo AI, with more to be added soon. SAIA API (application programming interface) keys can be requested and used to access the services from within your code. API keys are not necessary to use the Chat AI web interface.
API Request
If a user has an API key, they can use the available models from within their terminal or Python scripts. To get an API key, go to the KISSKI LLM Service page and click on “Book”. There you will find a form to fill out with your credentials and your intended use of the API key. Please use the same email address that is assigned to your AcademicCloud account. Once you receive your API key, DO NOT share it with other users!
API Usage
The API service is compatible with the OpenAI API standard. We provide the following endpoints:
/chat/completions
/completions
/embeddings
API Minimal Example
You can use your API key to access Chat AI directly from your terminal. Here is an example of how to do text completion with the API.
curl -i -X POST \
  --url https://chat-ai.academiccloud.de/v1/completions \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <api_key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "meta-llama-3.1-8b-instruct",
    "prompt": "San Francisco is a",
    "max_tokens": 7,
    "temperature": 0
  }'
Be sure to replace <api_key> with your own API key.
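The same completion request can also be sent from Python with the official openai package; here is a minimal sketch, assuming the package is installed (pip install openai):
from openai import OpenAI

# API configuration
client = OpenAI(
    api_key="<api_key>",  # Replace with your API key
    base_url="https://chat-ai.academiccloud.de/v1",
)

# Text completion, equivalent to the curl request above
completion = client.completions.create(
    model="meta-llama-3.1-8b-instruct",
    prompt="San Francisco is a",
    max_tokens=7,
    temperature=0,
)

print(completion.choices[0].text)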
API Model Names
For more information on the respective models, see the model list.
Model Name | Capabilities
---|---
'meta-llama-3.1-8b-instruct' | text
'meta-llama-3.1-70b-instruct' | text
'llama-3.1-sauerkrautlm-70b-instruct' | text
'llama-3.1-nemotron-70b-instruct' | text
'codestral-22b' | text
'mistral-large-instruct' | text
'qwen-2.5-72b-instruct' | text
'qwen-2-vl-72b-instruct' | text, image
'internvl2-8b' | text, image
'e5-mistral-7b-instruct' | embeddings
The list of available models can also be retrieved via the following command.
curl -X GET \
--url https://chat-ai.academiccloud.de/v1/models \
--header 'Accept: application/json' \
--header 'Authorization: Bearer <api_key>' \
--header 'Content-Type: application/json'
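The model list can also be fetched in Python; a minimal sketch using the openai client, assuming the package is installed:
from openai import OpenAI

client = OpenAI(
    api_key="<api_key>",  # Replace with your API key
    base_url="https://chat-ai.academiccloud.de/v1",
)

# List the models currently served by the API
for model in client.models.list():
    print(model.id)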
API Usage
The OpenAI GPT 3.5 and GPT 4 models are not available for API usage. For configuring your own requests in greater detail, such as setting the frequency_penalty, seed, max_tokens and more, refer to the OpenAI API reference page.
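For illustration, a chat request with some of these parameters set explicitly might look as follows; the parameter values are arbitrary examples, not recommendations:
from openai import OpenAI

client = OpenAI(
    api_key="<api_key>",  # Replace with your API key
    base_url="https://chat-ai.academiccloud.de/v1",
)

chat_completion = client.chat.completions.create(
    model="meta-llama-3.1-8b-instruct",
    messages=[{"role": "user", "content": "Write a haiku about the sea"}],
    max_tokens=64,          # Upper bound on the number of generated tokens
    temperature=0.7,        # Higher values give more varied output
    frequency_penalty=0.5,  # Penalise tokens that have already appeared often
    seed=42,                # Request reproducible sampling where supported
)

print(chat_completion.choices[0].message.content)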
Chat
It is possible to include an entire conversation in your request. The conversation can come from a previous session with the same or a different model, or from an exchange with a friend or colleague whom you would like to ask further questions (just be sure to update your system prompt to something like “You are a friend/colleague trying to explain something you said that was confusing”).
curl -i -N -X POST \
  --url https://chat-ai.academiccloud.de/v1/chat/completions \
  --header 'Accept: application/json' \
  --header 'Authorization: Bearer <api_key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "meta-llama-3.1-8b-instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant"},
      {"role": "user", "content": "How tall is the Eiffel tower?"},
      {"role": "assistant", "content": "The Eiffel Tower stands at a height of 324 meters (1,063 feet) above ground level. However, if you include the radio antenna on top, the total height is 330 meters (1,083 feet)."},
      {"role": "user", "content": "Are there restaurants?"}
    ],
    "temperature": 0
  }'
For ease of use, you can also access the Chat AI models from a Python script, for example by pasting the code below into a file and executing it.
from openai import OpenAI

# API configuration
api_key = '<api_key>'  # Replace with your API key
base_url = "https://chat-ai.academiccloud.de/v1"
model = "meta-llama-3.1-8b-instruct"  # Choose any available model

# Start OpenAI client
client = OpenAI(
    api_key=api_key,
    base_url=base_url,
)

# Get response
chat_completion = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "How tall is the Eiffel tower?"},
        {"role": "assistant", "content": "The Eiffel Tower stands at a height of 324 meters (1,063 feet) above ground level. However, if you include the radio antenna on top, the total height is 330 meters (1,083 feet)."},
        {"role": "user", "content": "Are there restaurants?"}
    ],
    model=model,
)

# Print full response as JSON
print(chat_completion)  # You can extract the response text from the JSON object
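For instance, to print only the generated text rather than the full response object, you can read the message content of the first choice:
print(chat_completion.choices[0].message.content)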
In certain cases, a long response can be expected from the model, which may take a while with the above method, since the entire response is generated before being printed. Streaming can be used instead to receive the response incrementally as it is being generated.
from openai import OpenAI

# API configuration
api_key = '<api_key>'  # Replace with your API key
base_url = "https://chat-ai.academiccloud.de/v1"
model = "meta-llama-3.1-8b-instruct"  # Choose any available model

# Start OpenAI client
client = OpenAI(
    api_key=api_key,
    base_url=base_url,
)

# Get stream
stream = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Name the capital city of each country on earth, and describe its main attraction",
        }
    ],
    model=model,
    stream=True,
)

# Print out the response as it arrives
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
If you use Visual Studio Code or a JetBrains IDE, the recommended way to get the most out of your API key, particularly for code completion, is to install the Continue plugin and configure it accordingly. Refer to CoCo AI for further details.
Image
The API specification is compatible with the OpenAI Image API. However, fetching images from the web is not supported; images must be uploaded as part of the request.
See the following minimal example in Python.
import base64
from openai import OpenAI

# API configuration
api_key = '<api_key>'  # Replace with your API key
base_url = "https://chat-ai.academiccloud.de/v1"
model = "internvl2-8b"  # Choose any model with image capability

# Start OpenAI client
client = OpenAI(
    api_key=api_key,
    base_url=base_url,
)

# Function to encode a local image as a base64 string
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

# Path to your image
image_path = "test-image.png"

# Getting the base64 string
base64_image = encode_image(image_path)

response = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What is in this image?",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        # The media type must match the uploaded file (PNG here)
                        "url": f"data:image/png;base64,{base64_image}"
                    },
                },
            ],
        }
    ],
)

print(response.choices[0])
Embeddings
Embeddings are only available via the API and follow the same interface as the OpenAI Embeddings API.
See the following minimal example.
curl https://chat-ai.academiccloud.de/v1/embeddings \
  -H "Authorization: Bearer <api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The food was delicious and the waiter...",
    "model": "e5-mistral-7b-instruct",
    "encoding_format": "float"
  }'
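The same request can also be issued from Python; a minimal sketch using the openai client, assuming the package is installed:
from openai import OpenAI

client = OpenAI(
    api_key="<api_key>",  # Replace with your API key
    base_url="https://chat-ai.academiccloud.de/v1",
)

# Request an embedding vector for a single input string
response = client.embeddings.create(
    model="e5-mistral-7b-instruct",
    input="The food was delicious and the waiter...",
    encoding_format="float",
)

print(response.data[0].embedding[:8])  # First few dimensions of the vector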
See the following code example for developing RAG applications with LlamaIndex: https://gitlab-ce.gwdg.de/hpc-team-public/chat-ai-llamaindex-examples
Developer reference
The GitHub repositories SAIA-Hub, SAIA-HPC and Chat AI provide all the components of the SAIA architecture.
Further services
If you have more questions, feel free to contact us at support@gwdg.de.