# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: SAIA
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Ali
    family-names: Doosthosseini
    email: ali.doosthosseini@uni-goettingen.de
    affiliation: University of Göttingen
    orcid: 'https://orcid.org/0000-0002-0654-1268'
  - given-names: Jonathan
    family-names: Decker
    email: jonathan.decker@uni-goettingen.de
    affiliation: University of Göttingen
    orcid: 'https://orcid.org/0000-0002-7384-7304'
  - given-names: Hendrik
    family-names: Nolte
    email: hendrik.nolte@gwdg.de
    affiliation: GWDG
    orcid: 'https://orcid.org/0000-0003-2138-8510'
  - given-names: Julian M.
    family-names: Kunkel
    email: julian.kunkel@gwdg.de
    affiliation: GWDG
    orcid: 'https://orcid.org/0000-0002-6915-1179'
identifiers:
  - type: doi
    value: 10.21203/rs.3.rs-6648693/v1
  - type: url
    value: 'https://www.researchsquare.com/article/rs-6648693/v1'
repository-code: 'https://github.com/gwdg/chat-ai'
url: 'https://chat-ai.academiccloud.de'
abstract: >-
  Recent developments indicate a shift toward web services
  that employ ever larger AI models, e.g., Large Language
  Models (LLMs), requiring powerful hardware for inference.
  High-Performance Computing (HPC) systems are commonly
  equipped with such hardware for large-scale computation
  tasks. However, HPC infrastructure is inherently
  unsuitable for hosting real-time web services due to
  network, security, and scheduling constraints. While
  various efforts exist to integrate external scheduling
  solutions, these often require compromises in terms of
  security or usability for existing HPC users. In this
  paper, we present SAIA, a Slurm-native platform consisting
  of a scheduler and a proxy. The scheduler interacts with
  Slurm to ensure the availability and scalability of
  services, while the proxy provides external access, which
  is secured via confined SSH commands. We have demonstrated
  SAIA's applicability by deploying a large-scale LLM web
  service that has served over 50,000 users.
keywords:
  - AI
  - HPC
  - Slurm
license: GPL-3.0
version: v0.8.1
date-released: '2024-02-22'