
CUMULUS-4788: Move replication for different tables into separate modules#4403

Merged
indiejames merged 6 commits into master from cumulus-4788-split-replication-service
Apr 17, 2026

Conversation

@indiejames
Contributor

Splits ECS replication service into multiple services to prevent resource exhaustion

Changes

  • Replication service terraform moved to sub-modules
  • Multiple services are now instantiated - one for each replication table group

PR Checklist

  • Update CHANGELOG
  • Unit tests
  • Ad-hoc testing - Deploy changes and test manually
  • Integration tests

📝 Note:
For most pull requests, please Squash and merge to maintain a clean and readable commit history.

@indiejames indiejames force-pushed the cumulus-4788-split-replication-service branch from 44f6d0c to 257d138 Compare April 15, 2026 19:29

@chris-durbin chris-durbin left a comment


There are a lot of variables that need to be set up - do we have a tfvars example template that could be filled in similar to other cumulus deployments? I'm also not sure of the mechanics for figuring out and setting all of the top level deployment variables.

}

variable "ecs_infrastructure_role" {
type = object({

Add descriptions for these 4 variables

Contributor Author


done
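For context, the resolution presumably adds a `description` to each of the four variables; a hypothetical sketch in that style (the object fields shown are assumptions, not taken from the diff):

```hcl
# Hypothetical example: IAM role metadata passed into the replication sub-module.
variable "ecs_infrastructure_role" {
  description = "IAM role assumed by the ECS infrastructure (name and ARN)"
  type = object({
    name = string
    arn  = string
  })
}
```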


locals {
full_name = "${var.prefix}-replication"
# full_name = "${var.prefix}-replication"

Remove commented out code.

Contributor Author


done

name = "${var.prefix}-CumulusIcebergReplicationECSCluster"
tags = var.tags
}
# resource "aws_security_group" "no_ingress_all_egress" {

Remove commented out code.

Contributor Author


done

description = "Subnets for database cluster. Requires at least 2 across multiple AZs"
type = list(string)
variable "subnet" {
description = "Subnet for Fargate tasks"

It's probably worth a note on why you are using a single subnet (I think you said an EBS limitation).

Contributor Author


Done
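The resolution presumably documents the single-subnet constraint in the variable itself; a hedged sketch (exact wording assumed). The underlying reason is real: EBS volumes are tied to a single Availability Zone, so tasks attaching one cannot be spread across subnets in different AZs.

```hcl
# Single subnet (a string, not a list): EBS volumes are AZ-specific,
# so Fargate tasks that attach an EBS volume must all run in one AZ.
variable "subnet" {
  description = "Subnet for Fargate tasks. A single subnet is used because attached EBS volumes are limited to one Availability Zone."
  type        = string
}
```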

type = object({ bucket = string, key = string, region = string })
}

variable "iceberg_s3_bucket" {

It looks like we aren't creating these buckets in Terraform - is the plan to create them via the script that sets up the Iceberg tables?

Contributor Author


I thought that was what you said, but we can create them in the terraform if you want.


Ok I think a terraform variable here and created by the loading script is probably the most flexible - we can always change our minds later.

kafka_image = var.kafka_image
connect_image = var.connect_image
bootstrap_image = var.bootstrap_image
pg_db = "postgres"

I think the db name should be coming from an output of the RDS module.

Contributor Author


I don't think it can? The terraform is not setting up the DB, is it? That's some script running migrations, I think.


In that case we should make it a variable to be passed in instead of hardcoding to postgres.

Contributor Author


Done
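A minimal sketch of the agreed change, replacing the hardcoded value with a passed-in variable (the default shown is an assumption):

```hcl
# The database name is supplied by the deployer rather than hardcoded,
# since this Terraform does not create the database itself.
variable "pg_db" {
  description = "Name of the PostgreSQL database to replicate from"
  type        = string
  default     = "postgres"
}
```

with the module call then using `pg_db = var.pg_db` in place of the literal.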

}

variable "slot_name" {
description = "The name for the replication slot"
Member


This variable needs a better name or at least a better description.

Member


maybe something like replication_table_group?

Contributor Author


Done.

@@ -0,0 +1,19 @@
output "task_execution_role" {
Member


Add descriptions for the outputs.

Contributor Author


done
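For reference, an output with a description in the requested style might look like this (the value expression is assumed, not taken from the diff):

```hcl
output "task_execution_role" {
  description = "ARN of the IAM role used to execute the replication ECS tasks"
  value       = aws_iam_role.task_execution_role.arn
}
```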

@indiejames
Contributor Author

> There are a lot of variables that need to be set up - do we have a tfvars example template that could be filled in similar to other cumulus deployments? I'm also not sure of the mechanics for figuring out and setting all of the top level deployment variables.

There already is a tfvars example file in the rds-iceberg-replications-tf. I'll add the new variables for this PR to it.
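A hypothetical fragment of such a tfvars example, limited to variables mentioned in this review (all values are placeholders):

```hcl
# terraform.tfvars example fragment (placeholder values)
prefix                  = "my-deployment"
pg_db                   = "postgres"
subnet                  = "subnet-0123456789abcdef0"
iceberg_s3_bucket       = "my-deployment-iceberg"
replication_table_group = "granules"
```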

@indiejames indiejames force-pushed the cumulus-4788-split-replication-service branch from 01e060a to 87337a6 Compare April 17, 2026 13:26
@indiejames indiejames merged commit 8a58854 into master Apr 17, 2026
9 checks passed
@indiejames indiejames deleted the cumulus-4788-split-replication-service branch April 17, 2026 14:36