I noticed an ergonomic inconvenience, or maybe I am using the API the wrong way.
asynch::Client takes only a writer (W), while asynch::SimpleClient takes a single transport (RW) that has to implement both Read and Write.
In the case of e.g. BufferedUart this is fine, because both the Read and Write traits are implemented.
In the case of DMA, this is not true. I have to split the UART and manually provide a buffer to get an RX half that implements the Read trait. See https://docs.embassy.dev/embassy-stm32/git/stm32wb55rg/usart/struct.UartRx.html#method.into_ring_buffered
So in order to use asynch::SimpleClient, I then need to wrap my new TX and RX halves in a struct that itself implements Read and Write and acts as the transport.
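For illustration, the wrapper I mean looks roughly like the sketch below. To keep it self-contained I use std::io::Read and std::io::Write as stand-ins for the async embedded-io traits the real transport would need, so this is not the exact code from my project. The name SplitTransport and its methods are hypothetical.

```rust
use std::io::{self, Read, Write};

/// Hypothetical adapter that bundles a separate RX half and TX half into
/// one value implementing both Read and Write.
/// Assumption: std::io traits stand in for the async embedded-io traits.
pub struct SplitTransport<R, W> {
    rx: R,
    tx: W,
}

impl<R, W> SplitTransport<R, W> {
    pub fn new(rx: R, tx: W) -> Self {
        Self { rx, tx }
    }

    /// Hand the two halves back to the caller.
    pub fn into_parts(self) -> (R, W) {
        (self.rx, self.tx)
    }
}

// Reads are forwarded to the RX half only.
impl<R: Read, W> Read for SplitTransport<R, W> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.rx.read(buf)
    }
}

// Writes and flushes are forwarded to the TX half only.
impl<R, W: Write> Write for SplitTransport<R, W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.tx.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.tx.flush()
    }
}
```

It is pure delegation, which is why it feels like boilerplate that every user of a split UART would have to repeat.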
If asynch::SimpleClient had separate parameters for Read and Write, this would not be necessary.
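To make the suggestion concrete, here is a minimal sketch of the shape I have in mind: a client that is generic over a reader and a writer separately. Everything in it is hypothetical (TwoHalfClient, send) and it is not the crate's actual API. As a simplification it uses the blocking std::io traits in place of the async ones.

```rust
use std::io::{self, Read, Write};

/// Hypothetical client with separate reader and writer parameters.
/// Assumption: blocking std::io traits stand in for the async traits.
pub struct TwoHalfClient<R, W> {
    rx: R,
    tx: W,
}

impl<R: Read, W: Write> TwoHalfClient<R, W> {
    pub fn new(rx: R, tx: W) -> Self {
        Self { rx, tx }
    }

    /// Write a command on the TX half, then read up to `buf.len()` bytes
    /// of response from the RX half. Returns the number of bytes read.
    pub fn send(&mut self, cmd: &[u8], buf: &mut [u8]) -> io::Result<usize> {
        self.tx.write_all(cmd)?;
        self.tx.flush()?;
        self.rx.read(buf)
    }
}
```

With this shape, the two halves obtained from splitting the UART could be passed in directly, and a type that implements both traits could still be supported through a thin constructor.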