Skip to content

Commit e7df539

Browse files
author
yawzhang
committed
Fix race condition in GC
When GC resets move_to_chunk via purge_reserved_chunk(), stale repl_reqs may still exist and be cleaned up by background gc_repl_reqs(). This causes two race conditions: 1. Stale rreq frees blk on NEW allocator after reset (wrong allocator) 2. Stale rreq frees blk on OLD allocator during reset, accessing destroyed superblock and causing crash
1 parent 97f2794 commit e7df539

File tree

3 files changed

+12
-2
lines changed

3 files changed

+12
-2
lines changed

conanfile.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,7 @@
1010

1111
class HomeObjectConan(ConanFile):
1212
name = "homeobject"
13-
version = "4.1.4"
13+
version = "4.1.5"
1414

1515
homepage = "https://github.com/eBay/HomeObject"
1616
description = "Blob Store built on HomeStore"

src/lib/homestore_backend/gc_manager.cpp

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1064,6 +1064,16 @@ bool GCManager::pdev_gc_actor::purge_reserved_chunk(chunk_id_t chunk, const uint
10641064
"chunk_id={} is a reserved chunk, expected to have a GC state, but actuall state is {} ", chunk,
10651065
vchunk->m_state);
10661066

1067+
// Clear all rreqs on move_to_chunk BEFORE purge_reserved_chunk() resets the allocator.
1068+
// This prevents race conditions where:
1069+
// 1. Stale rreq frees blk on NEW allocator after reset (wrong allocator)
1070+
// 2. Stale rreq frees blk on OLD allocator during reset (accessing destroyed superblock)
1071+
// NOTE: This introduces potential double-free risk with gc_repl_reqs() background thread.
1072+
// See https://github.com/eBay/HomeObject/issues/401.
1073+
GCLOGD(task_id, pg_id, NO_SHARD_ID, "clear all rreqs on chunk={} before resetting it", vchunk->get_chunk_id());
1074+
auto hs_pg = m_hs_home_object->get_hs_pg(pg_id);
1075+
hs_pg->repl_dev_->clear_chunk_req(vchunk->get_chunk_id());
1076+
10671077
GCLOGD(task_id, pg_id, NO_SHARD_ID, "reset chunk={} before using it for gc", vchunk->get_chunk_id());
10681078
vchunk->reset(); // reset the chunk to make sure it is empty
10691079

src/lib/homestore_backend/heap_chunk_selector.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -130,7 +130,7 @@ csharedChunk HeapChunkSelector::select_specific_chunk(const pg_id_t pg_id, const
130130

131131
if (chunk->m_state == ChunkState::GC) {
132132
LOGDEBUGMOD(homeobject, "v_chunk_id={} for pg={} is pchunk_id={}, in GC state, wait and retry!",
133-
chunk->get_chunk_id(), v_chunk_id, pg_id);
133+
v_chunk_id, pg_id, chunk->get_chunk_id());
134134
} else {
135135
if (chunk->m_state == ChunkState::AVAILABLE) {
136136
chunk->m_state = ChunkState::INUSE;

0 commit comments

Comments
 (0)