# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: SAIA
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Ali
    family-names: Doosthosseini
    email: ali.doosthosseini@uni-goettingen.de
    affiliation: University of Göttingen
    orcid: 'https://orcid.org/0000-0002-0654-1268'
  - given-names: Jonathan
    family-names: Decker
    email: jonathan.decker@uni-goettingen.de
    affiliation: University of Göttingen
    orcid: 'https://orcid.org/0000-0002-7384-7304'
  - given-names: Hendrik
    family-names: Nolte
    email: hendrik.nolte@gwdg.de
    affiliation: GWDG
    orcid: 'https://orcid.org/0000-0003-2138-8510'
  - given-names: Julian M.
    family-names: Kunkel
    email: julian.kunkel@gwdg.de
    affiliation: GWDG
    orcid: 'https://orcid.org/0000-0002-6915-1179'
identifiers:
  - type: doi
    value: 10.21203/rs.3.rs-6648693/v1
  - type: url
    value: 'https://www.researchsquare.com/article/rs-6648693/v1'
repository-code: 'https://github.com/gwdg/chat-ai'
url: 'https://chat-ai.academiccloud.de'
abstract: >-
  Recent developments indicate a shift toward web services
  that employ ever larger AI models, e.g., Large Language
  Models (LLMs), requiring powerful hardware for inference.
  High-Performance Computing (HPC) systems are commonly
  equipped with such hardware for large-scale computation
  tasks. However, HPC infrastructure is inherently
  unsuitable for hosting real-time web services due to
  network, security, and scheduling constraints. While
  various efforts exist to integrate external scheduling
  solutions, these often require compromises in terms of
  security or usability for existing HPC users. In this
  paper, we present SAIA, a Slurm-native platform consisting
  of a scheduler and a proxy. The scheduler interacts with
  Slurm to ensure the availability and scalability of
  services, while the proxy provides external access, which
  is secured via confined SSH commands. We have demonstrated
  SAIA's applicability by deploying a large-scale LLM web
  service that has served over 50,000 users.
keywords:
  - AI
  - HPC
  - Slurm
license: GPL-3.0
version: v0.8.1
date-released: '2024-02-22'