# Storage Benchmarks

Generated on 2026-03-27T12:08:47+09:00 with:

```bash
nix run ./nix/test-cluster#cluster -- bench-storage
```

## CoronaFS

Cluster network baseline, measured with `iperf3` from `node04` to `node01` before the storage tests:

| Metric | Result |
|---|---:|
| TCP throughput | 45.92 MiB/s |
| TCP retransmits | 193 |

Measured from `node04`; the local worker disk is the baseline. CoronaFS now has two relevant data paths in the lab: the controller export sourced from `node01`, and the node-local export materialized onto the worker that actually attaches the mutable VM disk.

| Metric | Local Disk | Controller Export | Node-local Export |
|---|---:|---:|---:|
| Sequential write | 679.05 MiB/s | 30.35 MiB/s | 395.06 MiB/s |
| Sequential read | 2723.40 MiB/s | 42.70 MiB/s | 709.14 MiB/s |
| 4k random read | 16958 IOPS | 2034 IOPS | 5087 IOPS |
| 4k queued random read (`iodepth=32`) | 106026 IOPS | 14261 IOPS | 28898 IOPS |

Queue-depth profile (`libaio`, `iodepth=32`) from the same worker:

| Metric | Local Disk | Controller Export | Node-local Export |
|---|---:|---:|---:|
| Depth-32 write | 3417.45 MiB/s | 39.26 MiB/s | 178.04 MiB/s |
| Depth-32 read | 12996.47 MiB/s | 55.71 MiB/s | 112.88 MiB/s |

Node-local materialization timing and target-node steady-state read path:

| Metric | Result |
|---|---:|
| Node04 materialize latency | 9.23 s |
| Node05 materialize latency | 5.82 s |
| Node05 node-local sequential read | 709.14 MiB/s |

PlasmaVMC now prefers the worker-local CoronaFS export for mutable node-local volumes, even when the underlying materialization is a qcow2 overlay. The VM runtime section below is therefore the closest end-to-end proxy for real local-attach VM I/O, while the node-local export numbers remain useful for CoronaFS service consumers and for diagnosing exporter overhead.

## LightningStor

Measured from `node03` against the S3-compatible endpoint on `node01`.
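The throughput figures in this report are size-over-elapsed-time conversions from timed transfers. A minimal sketch of that conversion; the elapsed time below is an illustrative example, not a measured value from this run:

```shell
# Illustrative conversion from a timed transfer to MiB/s.
# elapsed_s is a made-up example value, not a measurement from this report.
size_mib=256
elapsed_s=14.066
awk -v s="$size_mib" -v t="$elapsed_s" 'BEGIN { printf "%.2f MiB/s\n", s / t }'
```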
The object path exercised the distributed backend with replication across the worker storage nodes.

Cluster network baseline for this client, measured with `iperf3` from `node03` to `node01` before the storage tests:

| Metric | Result |
|---|---:|
| TCP throughput | 45.99 MiB/s |
| TCP retransmits | 207 |

### Large-object path

| Metric | Result |
|---|---:|
| Object size | 256 MiB |
| Upload throughput | 18.20 MiB/s |
| Download throughput | 39.21 MiB/s |

### Small-object batch

Measured as 32 objects of 4 MiB each (128 MiB total).

| Metric | Result |
|---|---:|
| Batch upload throughput | 18.96 MiB/s |
| Batch download throughput | 39.88 MiB/s |
| PUT rate | 4.74 objects/s |
| GET rate | 9.97 objects/s |

### Parallel small-object batch

Measured as the same 32 objects of 4 MiB each, but with 8 concurrent client jobs from `node03`.

| Metric | Result |
|---|---:|
| Parallel batch upload throughput | 16.23 MiB/s |
| Parallel batch download throughput | 26.07 MiB/s |
| Parallel PUT rate | 4.06 objects/s |
| Parallel GET rate | 6.52 objects/s |

## VM Image Path

Measured against the `PlasmaVMC -> LightningStor artifact -> CoronaFS-backed managed volume` clone path on `node01`.

| Metric | Result |
|---|---:|
| Guest image artifact size | 2017 MiB |
| Guest image virtual size | 4096 MiB |
| `CreateImage` latency | 66.49 s |
| First image-backed `CreateVolume` latency | 16.86 s |
| Second image-backed `CreateVolume` latency | 0.12 s |

## VM Runtime Path

Measured against the real `StartVm -> qemu attach -> guest boot -> guest fio` path on a worker node, using a CoronaFS-backed root disk and data disk.
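The guest-side benchmark is a fio run. A hedged sketch of a job file with this shape; the target directory, sizes, and runtimes are assumptions for illustration, not the harness's actual configuration:

```ini
; Hypothetical fio job file approximating the guest-side benchmark.
; Directory, size, and runtime values are assumptions, not the real config.
[global]
directory=/mnt/data        ; assumed mount point of the CoronaFS-backed data disk
size=1g
runtime=30
time_based=1
direct=1

[seq-write]
rw=write
bs=1m
stonewall

[seq-read]
rw=read
bs=1m
stonewall

[rand-read-4k]
rw=randread
bs=4k
ioengine=libaio
iodepth=1
stonewall
```

`stonewall` serializes the three jobs so each phase runs alone, matching the separate sequential-write, sequential-read, and 4k random-read rows reported below.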
| Metric | Result |
|---|---:|
| `StartVm` to qemu attach | 0.60 s |
| `StartVm` to guest benchmark result | 35.69 s |
| Guest sequential write | 123.49 MiB/s |
| Guest sequential read | 1492.71 MiB/s |
| Guest 4k random read | 25550 IOPS |

## Assessment

- CoronaFS controller-export reads are currently 1.6% of the measured local-disk baseline on this nested-QEMU lab cluster.
- CoronaFS controller-export 4k random reads are currently 12.0% of the measured local-disk baseline.
- CoronaFS controller-export queued 4k random reads are currently 13.5% of the measured local queued-random-read baseline.
- CoronaFS controller-export sequential reads are currently 93.0% of the measured node04->node01 TCP baseline, which isolates the centralized source path from raw cluster-network limits.
- CoronaFS controller-export depth-32 reads are currently 0.4% of the local depth-32 baseline.
- CoronaFS node-local reads are currently 26.0% of the measured local-disk baseline, which is the more relevant steady-state signal for mutable VM disks after attachment.
- CoronaFS node-local 4k random reads are currently 30.0% of the measured local-disk baseline.
- CoronaFS node-local queued 4k random reads are currently 27.3% of the measured local queued-random-read baseline.
- CoronaFS node-local depth-32 reads are currently 0.9% of the local depth-32 baseline.
- The target worker's node-local read path is 26.0% of the measured local sequential-read baseline after materialization, which is a better proxy for restart and migration steady state than the old shared-export read.
- PlasmaVMC now attaches writable node-local volumes through the worker-local CoronaFS export, so the guest-runtime section should be treated as the real local VM steady-state path rather than the node-local export numbers alone.
- CoronaFS single-depth writes remain sensitive to the nested-QEMU/VDE lab transport, so the queued-depth and guest-runtime numbers are still a more reliable proxy for real VM workload behavior than the single-stream write figure alone.
- The central export path is now best understood as a source/materialization path; the worker-local export is the path that should determine VM-disk readiness going forward.
- LightningStor's replicated S3 path is working correctly, but 18.20 MiB/s upload and 39.21 MiB/s download are still lab-grade numbers rather than strong object-store throughput.
- LightningStor large-object downloads are currently 85.3% of the node03->node01 TCP baseline, which indicates how much of the headroom is being lost above the raw network path.
- The current S3 frontend tuning baseline is the built-in 16 MiB streaming threshold with multipart PUT/FETCH concurrency of 8; that combination is the best default observed on this lab cluster so far.
- LightningStor uploads should be read against the replication write quorum and the same ~45.99 MiB/s lab network ceiling; this environment still limits end-to-end throughput well before modern bare-metal NICs would.
- LightningStor's small-object batch path is also functional, but 4.74 PUT/s and 9.97 GET/s still indicate a lab cluster rather than a tuned object-storage deployment.
- The parallel small-object profile is the more relevant control-plane/object-ingest signal; it currently reaches 4.06 PUT/s and 6.52 GET/s.
- The VM image section measures clone/materialization cost, not guest runtime I/O.
- The PlasmaVMC local image-backed clone fast path is now active again; a 0.12 s second clone indicates the CoronaFS qcow2 backing-file path is being hit on `node01` rather than falling back to eager raw materialization.
- The VM runtime section is the real `PlasmaVMC + CoronaFS + QEMU virtio-blk + guest kernel` path; use it to judge whether QEMU/NBD tuning is helping.
- The local sequential-write baseline is noisy in this environment, so the read and random-read deltas are the more reliable signal.
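The percentage figures in this assessment are simple ratios of measured values against their baselines, taken from the tables earlier in the report. A sketch of the derivation:

```shell
# How the assessment percentages are derived: measured value as a
# percentage of its baseline. Values are copied from the tables above.
ratio() { awk -v m="$1" -v b="$2" 'BEGIN { printf "%.1f%%\n", 100 * m / b }'; }

ratio 42.70  2723.40   # controller-export seq read vs local-disk seq read
ratio 709.14 2723.40   # node-local seq read vs local-disk seq read
ratio 42.70  45.92     # controller-export seq read vs node04->node01 TCP ceiling
ratio 39.21  45.99     # LightningStor download vs node03->node01 TCP ceiling
```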