DAOS-17306 doc: self-healing properties, interactive rebuild #18023

Open
kccain wants to merge 4 commits into master from kccain/daos_17306_doc

@kccain commented Apr 15, 2026

For the DAOS version 2.8 release, add two major sections to the DAOS Administrator's Guide:

  • self-healing properties / policy controls (DAOS-17306)
  • explicit / interactive rebuild control (DAOS-17281)

Doc-only: true

Signed-off-by: Kenneth Cain <kenneth.cain@hpe.com>
Signed-off-by: Tom Nabarro <thomas.nabarro@hpe.com>

Steps for the author:

  • Commit message follows the guidelines.
  • Appropriate Features or Test-tag pragmas were used.
  • Appropriate Functional Test Stages were run.
  • At least two positive code reviews including at least one code owner from each category referenced in the PR.
  • Testing is complete. If necessary, forced-landing label added and a reason added in a comment.

After all prior steps are complete:

  • Gatekeeper requested (daos-gatekeeper added as a reviewer).

@kccain force-pushed the kccain/daos_17306_doc branch from b137cd4 to 185c293 on April 15, 2026 19:07
@github-actions

Ticket title is 'Enable/disable auto recovery'
Status is 'Resolved'
Labels: '2.8pp'
https://daosio.atlassian.net/browse/DAOS-17306

kccain and others added 2 commits April 16, 2026 13:44
For the DAOS version 2.8 release, add two major sections to the
DAOS Administrator's Guide:
- self-healing properties / policy controls (DAOS-17306)
- explicit / interactive rebuild control (DAOS-17281)

Doc-only: true

Signed-off-by: Kenneth Cain <kenneth.cain@hpe.com>
Signed-off-by: Tom Nabarro <thomas.nabarro@hpe.com>
output (introduced in PR #17371 / DAOS 2.6).

The new section explains:
- Field values (normal vs degraded)
- When to check this field
- Example usage with exclude-only self-heal policies
- How to verify exclusion completed when auto-rebuild is disabled

Updated pool query examples throughout to show the Data redundancy
field for consistency with DAOS 2.6+ output.

Particularly useful for scenarios where system.self_heal is set to
exclude,pool_exclude or pool self_heal has exclude bit set without
rebuild, to confirm exclusion has occurred.

Related-to: #17371
Doc-only: true
Signed-off-by: Tom Nabarro <thomas.nabarro@hpe.com>
@tanabarr force-pushed the kccain/daos_17306_doc branch from 4cc9216 to caf846e on April 16, 2026 12:46
@tanabarr

Had to force-push to clean up the commit list; hope that's okay.

Comment thread docs/admin/self_healing.md Outdated
Comment thread docs/admin/self_healing.md Outdated
- Rebuild busy, 42 objs, 21 recs
- Data redundancy: degraded
```

Contributor Author

Something I thought of, but no change requested: there are other cases, like drain, that would show "Rebuild busy" but "Data redundancy: normal".
I guess extend might show something similar, but it's probably not worth enumerating too many cases.
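
To make the distinction above concrete, here is a minimal Python sketch that reads the two fields from pool query text. It assumes only the line shapes visible in the snippet under review ("Rebuild busy, ..." and "Data redundancy: ..."); the function name `summarize_pool_query` and the sample strings are hypothetical, and the real output may contain other fields or states.

```python
import re


def summarize_pool_query(text):
    """Extract the rebuild state and data redundancy value from query text.

    Hypothetical helper: it only recognizes the two line shapes quoted in
    this thread and returns None for a field that is not present.
    """
    rebuild = re.search(r"Rebuild (\w+)", text)
    redundancy = re.search(r"Data redundancy:\s*(\w+)", text)
    return {
        "rebuild": rebuild.group(1) if rebuild else None,
        "redundancy": redundancy.group(1) if redundancy else None,
    }


# Sample copied from the snippet under review (rebuild after an exclusion).
after_exclude = "- Rebuild busy, 42 objs, 21 recs\n- Data redundancy: degraded\n"

# Assumed shape for the drain case described in the comment above.
during_drain = "- Rebuild busy, 42 objs, 21 recs\n- Data redundancy: normal\n"

print(summarize_pool_query(after_exclude))
print(summarize_pool_query(during_drain))
```

The point of the sketch is that "Rebuild busy" alone does not identify an exclusion; the redundancy field has to be read alongside it.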

Comment thread docs/admin/self_healing.md Outdated
Comment thread docs/admin/self_healing.md Outdated
tanabarr previously approved these changes Apr 18, 2026
Comment thread docs/admin/rebuild_controls.md
Comment thread docs/admin/rebuild_controls.md
Comment thread docs/admin/rebuild_controls.md
Comment thread docs/admin/rebuild_controls.md Outdated
Comment thread docs/admin/rebuild_controls.md Outdated
Comment thread docs/admin/rebuild_controls.md
Comment thread docs/admin/self_healing.md Outdated
@daosbuild3

Test stage Functional Hardware Medium MD on SSD completed with status UNSTABLE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net/job/daos-stack/job/daos//view/change-requests/job/PR-18023/5/testReport/

Co-authored-by: Ken Cain <kenneth.cain@hpe.com>
Signed-off-by: Tom Nabarro <thomas.nabarro@hpe.com>
@tanabarr force-pushed the kccain/daos_17306_doc branch from b61f74c to 8b9ad4b on April 23, 2026 11:07
Doc-only: true

Signed-off-by: Kenneth Cain <kenneth.cain@hpe.com>
Signed-off-by: Tom Nabarro <thomas.nabarro@hpe.com>
@tanabarr force-pushed the kccain/daos_17306_doc branch from 8b9ad4b to 12c2f3a on April 23, 2026 13:05
@kccain marked this pull request as ready for review April 23, 2026 15:40
@kccain requested a review from a team as a code owner April 23, 2026 15:40
Labels

None yet

3 participants