Copying GC implementation

Current GC progress:
- [x] Objects
- [x] Strings
- [x] Arrays
- [x] ByteArrays
- [x] Basic message allocator GC
- [x] Basic GC testing
- [x] Add `gc_check` in every place where we allocate memory
- [x] More rigorous testing
- [x] Closures (like objects)
- [x] Hashes (WIP Marco)
- [ ] Actors need to wait on receiver when message allocator is full
- [ ] Automatic message allocator expansion
- [ ] Double-check that all example programs work reliably with GC

I'm just sharing this plan/thoughts in case people want to comment. Plush doesn't yet have a garbage collector. The language has actors, which are independent interpreters running in separate threads. I've designed the system to try and minimize locking. There is no global VM lock, and currently each actor has its own allocator. I wanted to avoid a situation where actors would need to lock on a global allocator each time memory needs to be allocated.

When sending messages to another actor, the data needs to be copied. At the moment, the way this is done is that each actor has a mailbox allocator for incoming messages. This makes it possible for the actor sending the message to lock on the mailbox allocator of the receiver. That means the receiver can continue its execution uninterrupted while messages are being copied into its mailbox.
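As a rough illustration of the locking scheme described above, here is a minimal sketch in which each actor owns a mailbox behind its own lock, and a sender locks only the receiver's mailbox while copying a payload in. The names `Mailbox`, `SharedMailbox`, `new_mailbox`, `send` and `receive_all` are hypothetical and not taken from the Plush codebase, and a plain byte arena stands in for the real mailbox allocator. For simplicity this sketch copies messages out on receive, which the actual design would prefer to avoid.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a per-actor mailbox allocator:
/// a byte arena plus the location of each message inside it.
pub struct Mailbox {
    arena: Vec<u8>,
    msgs: Vec<(usize, usize)>, // (offset, length) of each message
}

/// Each actor holds one of these; senders hold clones of the Arc.
pub type SharedMailbox = Arc<Mutex<Mailbox>>;

pub fn new_mailbox() -> SharedMailbox {
    Arc::new(Mutex::new(Mailbox { arena: Vec::new(), msgs: Vec::new() }))
}

/// Called on the sender's thread. Only the receiver's mailbox is locked,
/// so the receiver's main allocator is never touched by the sender.
pub fn send(mailbox: &SharedMailbox, payload: &[u8]) {
    let mut mb = mailbox.lock().unwrap();
    let offset = mb.arena.len();
    mb.arena.extend_from_slice(payload);
    mb.msgs.push((offset, payload.len()));
}

/// Called on the receiver's thread when it chooses to check for messages.
/// Copies every pending message out and resets the arena.
pub fn receive_all(mailbox: &SharedMailbox) -> Vec<Vec<u8>> {
    let mut mb = mailbox.lock().unwrap();
    let Mailbox { arena, msgs } = &mut *mb;
    let out: Vec<Vec<u8>> = msgs
        .drain(..)
        .map(|(off, len)| arena[off..off + len].to_vec())
        .collect();
    arena.clear();
    out
}
```

The receiver only contends for the lock at the moment it drains its mailbox; between receives it can allocate from its own allocator without any synchronization.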

In terms of garbage collection, I'd like to go for a simple approach. I want to avoid generational garbage collection for now. We could either go with mark-and-sweep or a copying GC. There are two big advantages with a copying GC. The first is that bump allocation is very fast. The second is that collection time is proportional to the amount of live memory rather than to the total heap size. These things could be an advantage in a language that allocates a lot of temporary data, especially since modern CPUs are very fast at copying memory.
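To show why bump allocation is cheap, here is a minimal sketch of a bump allocator over a fixed-size buffer. `BumpAlloc` and its methods are hypothetical names for illustration, not Plush's actual allocator. Allocation is an alignment round-up, an addition and a bounds check; a `None` result is the point where a real system would trigger a collection.

```rust
/// Hypothetical bump allocator over a fixed-size byte buffer.
pub struct BumpAlloc {
    mem: Vec<u8>,
    next: usize, // offset of the first free byte
}

impl BumpAlloc {
    pub fn new(capacity: usize) -> Self {
        BumpAlloc { mem: vec![0; capacity], next: 0 }
    }

    /// Reserve `size` bytes at an offset that is a multiple of `align`
    /// (which must be a power of two). Returns the offset into the buffer,
    /// or None when the buffer is full.
    /// Note: offsets are aligned relative to the start of the buffer,
    /// not as absolute addresses.
    pub fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        // Round `next` up to the requested alignment.
        let start = self.next.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.mem.len() {
            return None; // full: a GC would be triggered here
        }
        self.next = end;
        Some(start)
    }

    /// Number of bytes consumed so far, including alignment padding.
    pub fn bytes_used(&self) -> usize {
        self.next
    }
}
```

There is no free list and no per-object free operation; space is reclaimed all at once when the live objects are copied elsewhere and the buffer is reset.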

I'm trying to design the system so that garbage collection for one actor can run without interrupting the other actors. I would expect that even if an actor has several hundred megabytes of live data, collection should take less than 10 milliseconds, meaning no visible penalty in the context of an interactive game.

There is already logic to copy objects for sending a message in deepcopy.rs, and so maybe some of that code could be reused or adapted for a copying GC. In theory, you could traverse and copy objects from an old allocator into a new allocator.
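The idea of traversing and copying objects from an old allocator into a new one can be sketched as a Cheney-style breadth-first copy. This is not the code in deepcopy.rs; `Value`, `Obj`, `forward` and `collect` are hypothetical names, the heap is modelled as a `Vec` of objects addressed by index, and a side table of forwarding indices stands in for the forwarding pointers a real implementation would likely store in object headers.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Ref(usize), // index of an object in the heap
}

#[derive(Clone, Debug, PartialEq)]
pub struct Obj {
    pub fields: Vec<Value>,
}

/// If `v` is a reference, make sure its target has been copied into `to`
/// and return a reference to the new location. Non-references pass through.
fn forward(v: Value, from: &[Obj], to: &mut Vec<Obj>, fwd: &mut [Option<usize>]) -> Value {
    match v {
        Value::Ref(old) => {
            let new = match fwd[old] {
                Some(new) => new, // already copied
                None => {
                    let new = to.len();
                    // The copy's fields still hold from-space indices;
                    // they are fixed up when the scan loop reaches it.
                    to.push(from[old].clone());
                    fwd[old] = Some(new);
                    new
                }
            };
            Value::Ref(new)
        }
        other => other,
    }
}

/// Copy everything reachable from `roots` out of `from` into a fresh heap,
/// updating the roots in place. Unreachable objects are never visited.
pub fn collect(from: &[Obj], roots: &mut [Value]) -> Vec<Obj> {
    let mut to: Vec<Obj> = Vec::new();
    let mut fwd: Vec<Option<usize>> = vec![None; from.len()];
    for r in roots.iter_mut() {
        *r = forward(*r, from, &mut to, &mut fwd);
    }
    // Cheney scan: objects between `scan` and `to.len()` are copied
    // but their fields have not been forwarded yet.
    let mut scan = 0;
    while scan < to.len() {
        for i in 0..to[scan].fields.len() {
            let v = to[scan].fields[i];
            let nv = forward(v, from, &mut to, &mut fwd);
            to[scan].fields[i] = nv;
        }
        scan += 1;
    }
    to
}
```

The work done is proportional to the number of reachable objects and their fields; garbage in the old heap costs nothing, and cycles are handled by the forwarding table.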

There is a tricky bit, which is that when an actor receives a message, we need to transfer ownership of this message from the mailbox allocator to the receiving actor. Ideally we'd like to be able to do that without copying this data. This poses a bit of a challenge because the mailbox allocator can get full and need to trigger its own GC. In that case it may need to synchronize with the actor. Similarly, if the main allocator of an actor is full, it may need to trigger GC. I suppose this is a simpler situation, because if an actor's main GC triggers, we can simply leave the data in its mailbox allocator where it is, neither copying nor moving it.

Another tricky aspect is when an actor calls into host functions. This may cause data to be allocated. If a host function can allocate new data, this could trigger a GC. In this case we need to be careful to make sure the GC is aware of all the roots so that we don't end up with objects accidentally being collected.
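One common way to keep the GC aware of values held by host functions is a shadow stack of roots. The sketch below is an assumption about how this could look, not how Plush does it: `Heap`, `Handle`, `root_mark`, `release_roots` and `host_concat` are all hypothetical. Every allocation is registered as a root, handles refer to root slots rather than to object locations, and the host function releases its roots on return.

```rust
/// Hypothetical heap with a shadow stack of roots.
pub struct Heap {
    objs: Vec<String>, // object payloads (no inter-object refs in this sketch)
    roots: Vec<usize>, // shadow stack: indices into `objs` that must survive
    capacity: usize,
}

/// A handle names a root slot, so it stays valid when the GC moves objects.
#[derive(Clone, Copy)]
pub struct Handle(usize);

impl Heap {
    pub fn new(capacity: usize) -> Self {
        Heap { objs: Vec::new(), roots: Vec::new(), capacity }
    }

    /// Copying collection: only rooted objects are copied over,
    /// and each root is updated to the object's new index.
    pub fn gc(&mut self) {
        let mut to: Vec<String> = Vec::with_capacity(self.capacity);
        let mut fwd: Vec<Option<usize>> = vec![None; self.objs.len()];
        for r in self.roots.iter_mut() {
            let new = match fwd[*r] {
                Some(n) => n,
                None => {
                    to.push(std::mem::take(&mut self.objs[*r]));
                    fwd[*r] = Some(to.len() - 1);
                    to.len() - 1
                }
            };
            *r = new;
        }
        self.objs = to;
    }

    /// Allocate a string, collecting first if the heap is full.
    /// The new object is rooted immediately.
    pub fn alloc(&mut self, s: &str) -> Handle {
        if self.objs.len() >= self.capacity {
            self.gc();
        }
        assert!(self.objs.len() < self.capacity, "out of memory");
        self.objs.push(s.to_string());
        self.roots.push(self.objs.len() - 1);
        Handle(self.roots.len() - 1)
    }

    pub fn get(&self, h: Handle) -> &str {
        &self.objs[self.roots[h.0]]
    }

    /// Remember the current height of the shadow stack.
    pub fn root_mark(&self) -> usize {
        self.roots.len()
    }

    /// Drop every root pushed since `mark`; handles above it become invalid.
    pub fn release_roots(&mut self, mark: usize) {
        self.roots.truncate(mark);
    }

    pub fn obj_count(&self) -> usize {
        self.objs.len()
    }
}

/// Hypothetical host function that allocates twice. The second allocation
/// may trigger a GC; `ha` survives because it is on the shadow stack.
pub fn host_concat(heap: &mut Heap, a: &str, b: &str) -> String {
    let mark = heap.root_mark();
    let ha = heap.alloc(a);
    let hb = heap.alloc(b);
    let result = format!("{}{}", heap.get(ha), heap.get(hb));
    heap.release_roots(mark);
    result
}
```

If `host_concat` held a raw index into `objs` instead of a `Handle`, the collection triggered by the second allocation could leave that index pointing at the wrong object, which is the kind of accidental collection the paragraph above warns about.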

Copying GC implementation #25