[BUG] Hardcoded auto_expand_replicas 0-20 causes yellow cluster state when same_shard.host setting is enabled #1559

@Pigueiras


What is the bug?
The commit 43b15dac5c80b97a0a6a1300e5239a57a0cccd30 introduces a hardcoded auto_expand_replicas: 0-20 when creating new indices.

This causes issues in clusters where the setting:

"cluster.routing.allocation.same_shard.host": "true"

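is enabled. In these environments not all replicas can be allocated, because auto-expand sizes the replica count from the number of data nodes while the setting allows only one copy of a shard per host. This leads to a persistent yellow cluster state whenever the Security Analytics plugin creates a new index. For example, with 6 data nodes spread over 3 hosts, auto-expand asks for 5 replicas (6 copies of each shard), but only 3 copies can be placed, so 3 replicas per shard stay unassigned.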

Defining an index template like the following:

GET /_index_template/disable-autoexpand-sap
{
  "index_templates": [
    {
      "name": "disable-autoexpand-sap",
      "index_template": {
        "index_patterns": [
          ".opensearch-sap*"
        ],
        "template": {
          "settings": {
            "index": {
              "auto_expand_replicas": "false",
              "number_of_replicas": 1
            }
          }
        },
        "composed_of": [],
        "priority": 0
      }
    }
  ]
}

This doesn't help, since the values are hardcoded when the plugin creates the index 😢
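
One way to confirm that the template is being bypassed is to read the settings back from an index the plugin has already created. This is a minimal sketch, assuming the plugin's indices match the same `.opensearch-sap*` pattern used in the template and are visible to a wildcard request:

```
GET .opensearch-sap*/_settings/index.auto_expand_replicas,index.number_of_replicas?flat_settings=true
```

If the response still reports `"index.auto_expand_replicas": "0-20"`, the settings passed explicitly in the create-index request took precedence over the template, which is the expected behavior for explicit request settings.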

How can one reproduce the bug?

If you run several OpenSearch nodes on the same host, you can simply enable the setting:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.same_shard.host": "true"
  }
}

After that, installing the plugin should create the `.opensearch-sap-*` indices with the auto-expanded number of replicas, and the cluster will go yellow because it can't allocate all of them.
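
The resulting state can be checked with the standard cluster APIs. The requests below are a sketch rather than captured output. The `.opensearch-sap*` pattern is assumed from the index names above, and the column list is only one reasonable choice:

```
GET _cluster/health?filter_path=status,unassigned_shards

GET _cat/shards/.opensearch-sap*?v&h=index,shard,prirep,state,node,unassigned.reason

GET _cluster/allocation/explain
```

The first request shows the overall status and the number of unassigned shards, and the second lists which replica copies (`prirep` = `r`) are `UNASSIGNED`. Called without a body, the allocation explain API picks an arbitrary unassigned shard. In this scenario its per-node decisions would be expected to name the `same_shard` decider as the reason the copy cannot be placed.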

What is the expected behavior?

I can think of a few solutions:

  • Do not set number_of_replicas explicitly; let the cluster default apply.
  • Set a lower default (e.g., 0-2), which is more conservative and less likely to fail.
  • Make the value configurable, either through plugin settings or an environment variable.

Please let me know which approach you'd prefer, and I’d be happy to open a PR with the proposed fix.

What is your host/environment?

  • OS: Alma9
  • Version 2.18.0

Do you have any screenshots?
Not applicable

Do you have any additional context?
Nope :)
