The only missing piece is that you do not need to perform those B-tree writes (e.g. to an index or a clustered table) synchronously. Asynchronous I/O does not affect transaction response time or throughput unless the system's I/O capacity is exceeded.
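To make the synchronous/asynchronous distinction concrete, here is a minimal Python sketch. Everything in it is hypothetical: `DeferredPageWriter`, `commit`, and `wait_for_writes` are illustrative names, plain dicts stand in for the buffer pool and the on-disk pages, and durability (for example via a log) is deliberately left out.

```python
import queue
import threading


class DeferredPageWriter:
    """Hypothetical sketch: commit() updates the page in RAM and returns;
    the page write itself is done later by a background thread."""

    def __init__(self):
        self.buffered = {}     # stands in for pages held in RAM
        self.disk_pages = {}   # stands in for B-tree pages on disk
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def commit(self, page_id, data):
        # The caller only waits for the in-memory update and the enqueue.
        self.buffered[page_id] = data
        self._pending.put((page_id, data))

    def _drain(self):
        # Background writer: applies queued page writes one by one.
        while True:
            page_id, data = self._pending.get()
            self.disk_pages[page_id] = data
            self._pending.task_done()

    def wait_for_writes(self):
        # Block until every queued page write has been applied.
        self._pending.join()


writer = DeferredPageWriter()
writer.commit(7, "row-a")   # returns without waiting for the page write
writer.wait_for_writes()    # only needed here to inspect the result
```

The queue in this sketch is unbounded, so it only illustrates the case where the writer keeps up. Once writes arrive faster than the background thread can apply them, the backlog grows, which is the "I/O capacity exceeded" case.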
LSM and its variants are, more or less, another implementation of bufferpools: they collect changes in RAM and then write the changed data in consistent chunks. If the changed chunks are scattered across the disk, we get either random writes or the need to rebuild a consistent state of the data at startup (not so bad, but the amount to rebuild should be kept manageable).
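A rough Python sketch of the "collect in RAM, write in consistent chunks" idea follows. It is not how any particular LSM engine is implemented: `ChunkedBuffer` and its `limit` parameter are made-up names, the chunks are in-memory lists standing in for sequentially written files, and merging of chunks is omitted.

```python
import bisect


class ChunkedBuffer:
    """Hypothetical sketch: buffer changes in RAM, then emit them as
    sorted, immutable chunks instead of updating pages in place."""

    def __init__(self, limit=3):
        self.limit = limit   # assumed flush threshold (number of keys)
        self.buffer = {}     # changes collected in RAM
        self.chunks = []     # flushed chunks, oldest first

    def put(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.limit:
            self.flush()

    def flush(self):
        # One chunk = one sorted batch, written in a single pass.
        if self.buffer:
            self.chunks.append(sorted(self.buffer.items()))
            self.buffer = {}

    def get(self, key):
        if key in self.buffer:
            return self.buffer[key]
        # Newest chunk wins, so search from the most recent one backwards.
        for chunk in reversed(self.chunks):
            i = bisect.bisect_left(chunk, (key,))
            if i < len(chunk) and chunk[i][0] == key:
                return chunk[i][1]
        return None


store = ChunkedBuffer(limit=2)
store.put("b", 1)
store.put("a", 2)   # reaches the limit, so one sorted chunk is flushed
store.put("a", 3)   # newer value for "a", still only in RAM
```

Because a read has to consult the buffer and then every chunk from newest to oldest, the number of chunks needs to stay bounded, which is the same concern as keeping the startup rebuild manageable.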