| 1 | +# Architecture Documentation |
| 2 | + |
| 3 | +## System Architecture |
| 4 | + |
| 5 | +Release Controller is a Kubernetes-native system that uses Custom Resources (ReleasePayloads) and ImageStreams to manage OpenShift releases. The architecture follows a controller pattern where multiple controllers work together to orchestrate the release process. |
| 6 | + |
| 7 | +### High-Level Architecture Diagram |
| 8 | + |
| 9 | +```mermaid |
| 10 | +graph TB |
| 11 | + subgraph "Input Sources" |
| 12 | + ART[ART ImageStreams<br/>ocp/4.15-art-latest] |
| 13 | + CI[CI Builds] |
| 14 | + end |
| 15 | + |
| 16 | + subgraph "Release Controller System" |
| 17 | + RC[Release Controller<br/>Core Orchestrator] |
| 18 | + API[Release Controller API<br/>Web UI & REST API] |
| 19 | + RPC[Release Payload Controller<br/>CRD Manager] |
| 20 | + RRC[Release Reimport Controller<br/>Reimport Handler] |
| 21 | + end |
| 22 | + |
| 23 | + subgraph "Kubernetes Cluster" |
| 24 | + IS[ImageStreams<br/>Source & Release] |
| 25 | + RP[ReleasePayloads<br/>CRs] |
| 26 | + JOBS[Kubernetes Jobs<br/>Release Creation] |
| 27 | + PJ[ProwJobs<br/>Verification Tests] |
| 28 | + end |
| 29 | + |
| 30 | + subgraph "Output" |
| 31 | + REGISTRY[Container Registry<br/>Published Releases] |
| 32 | + WEB[Web Interface<br/>Release Information] |
| 33 | + JIRA[Jira<br/>Issue Tracking] |
| 34 | + end |
| 35 | + |
| 36 | + ART -->|Updates| IS |
| 37 | + CI -->|Creates| IS |
| 38 | + RC -->|Monitors| IS |
| 39 | + RC -->|Creates| RP |
| 40 | + RC -->|Launches| JOBS |
| 41 | + RC -->|Creates| PJ |
| 42 | + RC -->|Mirrors| IS |
| 43 | + |
| 44 | + RPC -->|Manages| RP |
| 45 | + RPC -->|Monitors| PJ |
| 46 | + RPC -->|Updates| RP |
| 47 | + |
| 48 | + RRC -->|Reimports| IS |
| 49 | + |
| 50 | + API -->|Reads| IS |
| 51 | + API -->|Reads| RP |
| 52 | + API -->|Serves| WEB |
| 53 | + |
| 54 | + JOBS -->|Publishes| REGISTRY |
| 55 | + RC -->|Updates| JIRA |
| 56 | + |
| 57 | + style RC fill:#e1f5ff |
| 58 | + style RPC fill:#fff4e1 |
| 59 | + style API fill:#e8f5e9 |
| 60 | +``` |
| 61 | + |
| 62 | +## Component Architecture |
| 63 | + |
| 64 | +### Release Creation Flow |
| 65 | + |
| 66 | +```mermaid |
| 67 | +sequenceDiagram |
| 68 | + participant ART as ART ImageStream |
| 69 | + participant RC as Release Controller |
| 70 | + participant RP as ReleasePayload |
| 71 | + participant Job as Creation Job |
| 72 | + participant Registry as Container Registry |
| 73 | + participant PJ as ProwJobs |
| 74 | + |
| 75 | + ART->>RC: ImageStream Updated |
| 76 | + RC->>RC: Check Release Config |
| 77 | + RC->>RP: Create ReleasePayload |
| 78 | + RC->>Job: Launch Creation Job |
| 79 | + Job->>Job: Assemble Release |
| 80 | + Job->>Registry: Push Release Image |
| 81 | + Job->>RC: Update Status |
| 82 | + RC->>PJ: Launch Verification Jobs |
| 83 | + PJ->>RC: Report Results |
| 84 | + RC->>RP: Update Payload Status |
| 85 | +``` |
| 86 | + |
| 87 | +### Release Verification Flow |
| 88 | + |
| 89 | +```mermaid |
| 90 | +graph TD |
| 91 | + A[Release Created] --> B[ReleasePayload Created] |
| 92 | + B --> C[Verification Jobs Launched] |
| 93 | + C --> D{All Jobs Pass?} |
| 94 | + D -->|Yes| E[Payload Accepted] |
| 95 | + D -->|No| F[Payload Rejected] |
| 96 | + E --> G[Release Published] |
| 97 | + F --> H[Release Blocked] |
| 98 | + |
| 99 | + style E fill:#e8f5e9 |
| 100 | + style F fill:#ffebee |
| 101 | +``` |
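The accept/reject decision in the diagram above can be sketched as a small function. This is a minimal illustration, not the controller's actual logic: the `JobState` enum, the `decide_payload` name, and the extra `Pending` outcome (for jobs that have not finished yet) are all assumptions introduced here.

```python
from enum import Enum


class JobState(Enum):
    """Hypothetical verification job states; not taken from the real API."""
    SUCCESS = "success"
    FAILURE = "failure"
    PENDING = "pending"


def decide_payload(job_states):
    """Return the payload outcome for a mapping of job name -> JobState.

    Mirrors the diagram: any failed job rejects the payload, and the
    payload is accepted only once every job has succeeded. "Pending" is
    an assumed intermediate result for unfinished or missing jobs.
    """
    states = list(job_states.values())
    if any(state is JobState.FAILURE for state in states):
        return "Rejected"
    if states and all(state is JobState.SUCCESS for state in states):
        return "Accepted"
    return "Pending"


# Job names below are placeholders for illustration only.
print(decide_payload({"job-a": JobState.SUCCESS, "job-b": JobState.SUCCESS}))
print(decide_payload({"job-a": JobState.SUCCESS, "job-b": JobState.FAILURE}))
```

Treating a single failure as decisive keeps the function a direct reading of the "All Jobs Pass?" branch; a real deployment may weigh jobs differently.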
| 102 | + |
| 103 | +### Data Flow Diagram |
| 104 | + |
| 105 | +```mermaid |
| 106 | +flowchart TD |
| 107 | + subgraph "Input Sources" |
| 108 | + ARTStreams[ART ImageStreams] |
| 109 | + CIBuilds[CI Build Outputs] |
| 110 | + Config[Release Configs] |
| 111 | + end |
| 112 | + |
| 113 | + subgraph "Processing" |
| 114 | + RC[Release Controller] |
| 115 | + RPC[Release Payload Controller] |
| 116 | + RRC[Release Reimport Controller] |
| 117 | + end |
| 118 | + |
| 119 | + subgraph "Storage" |
| 120 | + ImageStreams[ImageStreams] |
| 121 | + ReleasePayloads[ReleasePayloads] |
| 122 | + GCS[GCS Artifacts] |
| 123 | + Audit[Audit Logs] |
| 124 | + end |
| 125 | + |
| 126 | + subgraph "Output" |
| 127 | + Registry[Container Registry] |
| 128 | + WebUI[Web Interface] |
| 129 | + Jira[Jira Issues] |
| 130 | + end |
| 131 | + |
| 132 | + ARTStreams --> RC |
| 133 | + CIBuilds --> RC |
| 134 | + Config --> RC |
| 135 | + |
| 136 | + RC --> ImageStreams |
| 137 | + RC --> ReleasePayloads |
| 138 | + RC --> GCS |
| 139 | + RC --> Audit |
| 140 | + |
| 141 | + RPC --> ReleasePayloads |
| 142 | + RRC --> ImageStreams |
| 143 | + |
| 144 | + ImageStreams --> Registry |
| 145 | + ReleasePayloads --> WebUI |
| 146 | + RC --> Jira |
| 147 | + |
| 148 | + style RC fill:#e1f5ff |
| 149 | + style RPC fill:#fff4e1 |
| 150 | +``` |
| 151 | + |
| 152 | +## Deployment Architecture |
| 153 | + |
| 154 | +### Production Deployment |
| 155 | + |
| 156 | +```mermaid |
| 157 | +graph TB |
| 158 | + subgraph "Release Controller Namespace" |
| 159 | + RCPod[Release Controller Pod] |
| 160 | + APIPod[API Server Pod] |
| 161 | + RPCPod[Release Payload Controller Pod] |
| 162 | + RRCPod[Reimport Controller Pod] |
| 163 | + end |
| 164 | + |
| 165 | + subgraph "Kubernetes Cluster" |
| 166 | + ImageStreams[ImageStreams] |
| 167 | + ReleasePayloads[ReleasePayloads] |
| 168 | + Jobs[Kubernetes Jobs] |
| 169 | + ProwJobs[ProwJobs] |
| 170 | + end |
| 171 | + |
| 172 | + subgraph "External Services" |
| 173 | + Registry[Container Registry] |
| 174 | + GCS[Google Cloud Storage] |
| 175 | + Jira[Jira] |
| 176 | + Prow[Prow CI] |
| 177 | + end |
| 178 | + |
| 179 | + RCPod --> ImageStreams |
| 180 | + RCPod --> ReleasePayloads |
| 181 | + RCPod --> Jobs |
| 182 | + RCPod --> ProwJobs |
| 183 | + |
| 184 | + RPCPod --> ReleasePayloads |
| 185 | + RRCPod --> ImageStreams |
| 186 | + |
| 187 | + APIPod --> ImageStreams |
| 188 | + APIPod --> ReleasePayloads |
| 189 | + |
| 190 | + Jobs --> Registry |
| 191 | + RCPod --> GCS |
| 192 | + RCPod --> Jira |
| 193 | + ProwJobs --> Prow |
| 194 | + |
| 195 | + style RCPod fill:#e1f5ff |
| 196 | + style RPCPod fill:#fff4e1 |
| 197 | +``` |
| 198 | + |
| 199 | +### Controller Architecture |
| 200 | + |
| 201 | +The release-payload-controller runs multiple sub-controllers: |
| 202 | + |
| 203 | +```mermaid |
| 204 | +graph LR |
| 205 | + subgraph "release-payload-controller" |
| 206 | + PCC[Payload Creation Controller] |
| 207 | + PMC[Payload Mirror Controller] |
| 208 | + PJC[ProwJob Controller] |
| 209 | + PAC[Payload Accepted Controller] |
| 210 | + PRC[Payload Rejected Controller] |
| 211 | + PVC[Payload Verification Controller] |
| 212 | + end |
| 213 | + |
| 214 | + subgraph "Kubernetes API" |
| 215 | + RP[ReleasePayloads] |
| 216 | + Jobs[Jobs] |
| 217 | + PJ[ProwJobs] |
| 218 | + end |
| 219 | + |
| 220 | + PCC --> RP |
| 221 | + PCC --> Jobs |
| 222 | + PMC --> RP |
| 223 | + PMC --> Jobs |
| 224 | + PJC --> PJ |
| 225 | + PJC --> RP |
| 226 | + PAC --> RP |
| 227 | + PRC --> RP |
| 228 | + PVC --> RP |
| 229 | + |
| 230 | + style PCC fill:#e1f5ff |
| 231 | + style PJC fill:#fff4e1 |
| 232 | +``` |
| 233 | + |
| 234 | +## Component Interaction Diagram |
| 235 | + |
| 236 | +```mermaid |
| 237 | +graph TB |
| 238 | + subgraph "Monitoring Layer" |
| 239 | + RC[Release Controller] |
| 240 | + end |
| 241 | + |
| 242 | + subgraph "Management Layer" |
| 243 | + RPC[Release Payload Controller] |
| 244 | + RRC[Release Reimport Controller] |
| 245 | + end |
| 246 | + |
| 247 | + subgraph "Presentation Layer" |
| 248 | + API[Release Controller API] |
| 249 | + end |
| 250 | + |
| 251 | + subgraph "Integration Layer" |
| 252 | + Jira[Jira Integration] |
| 253 | + Signer[Release Signer] |
| 254 | + Audit[Audit Backend] |
| 255 | + end |
| 256 | + |
| 257 | + RC --> RPC |
| 258 | + RC --> RRC |
| 259 | + RC --> API |
| 260 | + RC --> Jira |
| 261 | + RC --> Signer |
| 262 | + RC --> Audit |
| 263 | + |
| 264 | + RPC --> RC |
| 265 | + API --> RC |
| 266 | + |
| 267 | + style RC fill:#e1f5ff |
| 268 | + style RPC fill:#fff4e1 |
| 269 | +``` |
| 270 | + |
| 271 | +## Key Design Patterns |
| 272 | + |
| 273 | +### 1. Controller Pattern |
| 274 | +All components follow the Kubernetes controller pattern: |
| 275 | +- Watch for resource changes |
| 276 | +- Reconcile desired state |
| 277 | +- Handle errors and retries gracefully |
| 278 | +- Update resource status |
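The four steps above can be sketched as an in-memory reconcile loop. This is a simplified illustration under stated assumptions: `Resource`, `Controller`, the `max_retries` limit, and the `"Error"` status are hypothetical names, and a real controller would be driven by Kubernetes watches rather than a local queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for a watched object such as a ReleasePayload."""
    name: str
    desired: str
    status: str = ""


@dataclass
class Controller:
    max_retries: int = 3
    queue: deque = field(default_factory=deque)
    retries: dict = field(default_factory=dict)

    def enqueue(self, resource):
        # In a real controller, a watch event would trigger this call.
        self.queue.append(resource)

    def reconcile(self, resource):
        # Move the observed state toward the desired state and record it.
        # The "fail" value simulates an error for demonstration purposes.
        if resource.desired == "fail":
            raise RuntimeError("simulated reconcile error")
        resource.status = resource.desired

    def run(self):
        # Drain the queue, requeueing failed items until max_retries is hit.
        while self.queue:
            resource = self.queue.popleft()
            try:
                self.reconcile(resource)
                self.retries.pop(resource.name, None)
            except RuntimeError:
                attempts = self.retries.get(resource.name, 0) + 1
                self.retries[resource.name] = attempts
                if attempts < self.max_retries:
                    self.queue.append(resource)
                else:
                    resource.status = "Error"


controller = Controller()
healthy = Resource(name="payload-a", desired="Ready")
broken = Resource(name="payload-b", desired="fail")
controller.enqueue(healthy)
controller.enqueue(broken)
controller.run()
print(healthy.status, broken.status)
```

Requeueing on failure instead of raising keeps one bad resource from blocking the others, which is the main reason the pattern separates the queue from the reconcile step.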
| 279 | + |
| 280 | +### 2. Custom Resources |
| 281 | +The ReleasePayload CRD represents release state: |
| 282 | +- Tracks release creation progress |
| 283 | +- Manages verification status |
| 284 | +- Handles acceptance/rejection |
| 285 | +- Stores release metadata |
| 286 | + |
| 287 | +### 3. ImageStream-Based |
| 288 | +The system uses OpenShift ImageStreams for: |
| 289 | +- Source image tracking |
| 290 | +- Release image creation |
| 291 | +- Image mirroring |
| 292 | +- Point-in-time snapshots |
| 293 | + |
| 294 | +### 4. Job-Based Execution |
| 295 | +Release operations execute as Kubernetes Jobs: |
| 296 | +- Release creation jobs |
| 297 | +- Mirror jobs |
| 298 | +- Verification jobs |
| 299 | +- Each job is independent and retryable |
| 300 | + |
| 301 | +### 5. Event-Driven |
| 302 | +The system responds to events: |
| 303 | +- ImageStream updates trigger releases |
| 304 | +- Job completions trigger status updates |
| 305 | +- ReleasePayload changes trigger actions |
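The event-to-action mapping listed above can be sketched as a small dispatcher. This is only an illustration: the `EventBus` class, the event type strings, and the handler messages are hypothetical, and the real system reacts to Kubernetes watch events rather than an in-process bus.

```python
class EventBus:
    """Minimal in-process dispatcher mapping event types to handlers."""

    def __init__(self):
        self._handlers = {}

    def on(self, event_type, handler):
        # Register a handler for one event type.
        self._handlers.setdefault(event_type, []).append(handler)

    def emit(self, event_type, subject):
        # Call every handler registered for this type and collect results.
        # Unknown event types produce an empty list.
        return [handler(subject) for handler in self._handlers.get(event_type, [])]


bus = EventBus()
# Event names are assumptions chosen to mirror the three bullets above.
bus.on("imagestream.updated", lambda name: f"start release for {name}")
bus.on("job.completed", lambda name: f"update status for {name}")
bus.on("releasepayload.changed", lambda name: f"evaluate actions for {name}")

print(bus.emit("imagestream.updated", "4.15-art-latest"))
```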
| 306 | + |
| 307 | +## Security Architecture |
| 308 | + |
| 309 | +```mermaid |
| 310 | +graph TB |
| 311 | + subgraph "Authentication" |
| 312 | + ServiceAccounts[K8s Service Accounts] |
| 313 | + RegistryAuth[Registry Authentication] |
| 314 | + end |
| 315 | + |
| 316 | + subgraph "Authorization" |
| 317 | + RBAC[Kubernetes RBAC] |
| 318 | + ImageStreamPermissions[ImageStream Permissions] |
| 319 | + end |
| 320 | + |
| 321 | + subgraph "Security Features" |
| 322 | + Signing[Release Signing] |
| 323 | + Audit[Audit Logging] |
| 324 | + Verification[Release Verification] |
| 325 | + end |
| 326 | + |
| 327 | + ServiceAccounts --> RBAC |
| 328 | + RegistryAuth --> ImageStreamPermissions |
| 329 | + |
| 330 | + RBAC --> Signing |
| 331 | + ImageStreamPermissions --> Audit |
| 332 | + Signing --> Verification |
| 333 | + |
| 334 | + style Signing fill:#e1f5ff |
| 335 | + style Audit fill:#fff4e1 |
| 336 | +``` |
| 337 | + |
| 338 | +## Scalability Considerations |
| 339 | + |
| 340 | +1. **Horizontal Scaling**: Controllers can be scaled horizontally |
| 341 | +2. **Job Parallelization**: Multiple release jobs can run concurrently |
| 342 | +3. **Caching**: ImageStream informers cache resource state |
| 343 | +4. **Resource Management**: Garbage collection manages old releases |
| 344 | +5. **Efficient API Usage**: Informers and watches reduce direct calls to the Kubernetes API |
| 345 | + |
| 346 | +## Monitoring and Observability |
| 347 | + |
| 348 | +```mermaid |
| 349 | +graph LR |
| 350 | + subgraph "Metrics Collection" |
| 351 | + Prometheus[Prometheus] |
| 352 | + Metrics[Metrics Endpoints] |
| 353 | + end |
| 354 | + |
| 355 | + subgraph "Logging" |
| 356 | + K8sLogs[Kubernetes Logs] |
| 357 | + AuditLogs[Audit Logs] |
| 358 | + end |
| 359 | + |
| 360 | + subgraph "Tracing" |
| 361 | + OpenTelemetry[OpenTelemetry] |
| 362 | + end |
| 363 | + |
| 364 | + subgraph "Dashboards" |
| 365 | + Grafana[Grafana] |
| 366 | + WebUI[Web UI] |
| 367 | + end |
| 368 | + |
| 369 | + Prometheus --> Grafana |
| 370 | + Metrics --> Prometheus |
| 371 | + K8sLogs --> Grafana |
| 372 | + AuditLogs --> Grafana |
| 373 | + OpenTelemetry --> Grafana |
| 374 | + WebUI --> Metrics |
| 375 | + |
| 376 | + style Prometheus fill:#e1f5ff |
| 377 | + style Grafana fill:#fff4e1 |
| 378 | +``` |
| 379 | + |