Storage Benchmarks
Generated on 2026-03-27T12:08:47+09:00 with:
nix run ./nix/test-cluster#cluster -- bench-storage
CoronaFS
Cluster network baseline, measured with iperf3 from node04 to node01 before the storage tests:
| Metric | Result |
|---|---|
| TCP throughput | 45.92 MiB/s |
| TCP retransmits | 193 |
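A baseline like this can be reduced from `iperf3 -c <host> -J` JSON output. The sketch below parses the standard `end.sum_received.bits_per_second` and `end.sum_sent.retransmits` fields; the report object is synthetic, constructed to match the table above rather than captured from the run.

```python
import json

def iperf3_summary(report_json: str) -> dict:
    """Extract TCP throughput (MiB/s) and retransmits from `iperf3 -J` output."""
    end = json.loads(report_json)["end"]
    bits_per_second = end["sum_received"]["bits_per_second"]  # receiver-side goodput
    return {
        "throughput_mib_s": round(bits_per_second / 8 / 2**20, 2),
        "retransmits": end["sum_sent"]["retransmits"],  # sender-side TCP retransmits
    }

# Synthetic report shaped like iperf3's JSON summary (values chosen to match the table).
report = json.dumps({
    "end": {
        "sum_received": {"bits_per_second": 45.92 * 8 * 2**20},
        "sum_sent": {"retransmits": 193},
    }
})
print(iperf3_summary(report))  # {'throughput_mib_s': 45.92, 'retransmits': 193}
```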
All CoronaFS figures below were measured from node04, with the local worker disk as the baseline. CoronaFS now has two relevant data paths in the lab: the controller export sourced from node01, and the node-local export materialized onto the worker that actually attaches the mutable VM disk.
| Metric | Local Disk | Controller Export | Node-local Export |
|---|---|---|---|
| Sequential write | 679.05 MiB/s | 30.35 MiB/s | 395.06 MiB/s |
| Sequential read | 2723.40 MiB/s | 42.70 MiB/s | 709.14 MiB/s |
| 4k random read | 16958 IOPS | 2034 IOPS | 5087 IOPS |
| 4k queued random read (iodepth=32) | 106026 IOPS | 14261 IOPS | 28898 IOPS |
Queue-depth profile (libaio, iodepth=32) from the same worker:
| Metric | Local Disk | Controller Export | Node-local Export |
|---|---|---|---|
| Depth-32 write | 3417.45 MiB/s | 39.26 MiB/s | 178.04 MiB/s |
| Depth-32 read | 12996.47 MiB/s | 55.71 MiB/s | 112.88 MiB/s |
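The harness's actual fio job definitions were not captured in this report; a plausible job file for the depth-32 profile above (libaio, iodepth=32), with placeholder paths and sizes, would look like:

```ini
; Hypothetical fio job approximating the queue-depth profile above;
; filename and size are placeholders, not taken from bench-storage.
[global]
ioengine=libaio
direct=1
iodepth=32
bs=1M
size=4g
runtime=30
time_based

[seq-read-depth32]
rw=read
filename=/mnt/coronafs/bench.dat

[seq-write-depth32]
stonewall
rw=write
filename=/mnt/coronafs/bench.dat

[randread-4k-depth32]
stonewall
rw=randread
bs=4k
filename=/mnt/coronafs/bench.dat
```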
Node-local materialization timing and target-node steady-state read path:
| Metric | Result |
|---|---|
| Node04 materialize latency | 9.23 s |
| Node05 materialize latency | 5.82 s |
| Node05 node-local sequential read | 709.14 MiB/s |
PlasmaVMC now prefers the worker-local CoronaFS export for mutable node-local volumes, even when the underlying materialization is a qcow2 overlay. The VM runtime section below is therefore the closest end-to-end proxy for real local-attach VM I/O, while the node-local export numbers remain useful for CoronaFS service consumers and for diagnosing exporter overhead.
LightningStor
Measured from node03 against the S3-compatible endpoint on node01.
The object path exercised the distributed backend with replication across the worker storage nodes.
Cluster network baseline for this client, measured with iperf3 from node03 to node01 before the storage tests:
| Metric | Result |
|---|---|
| TCP throughput | 45.99 MiB/s |
| TCP retransmits | 207 |
Large-object path
| Metric | Result |
|---|---|
| Object size | 256 MiB |
| Upload throughput | 18.20 MiB/s |
| Download throughput | 39.21 MiB/s |
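Throughput here is object size over elapsed wall-clock time, so the table implies transfer times of roughly 14 s for the upload and 6.5 s for the download (derived figures, not separate measurements):

```python
OBJECT_MIB = 256  # large-object size from the table above

def transfer_seconds(size_mib: float, throughput_mib_s: float) -> float:
    """Wall-clock time implied by a size-over-time throughput figure."""
    return round(size_mib / throughput_mib_s, 1)

print(transfer_seconds(OBJECT_MIB, 18.20))  # upload: 14.1
print(transfer_seconds(OBJECT_MIB, 39.21))  # download: 6.5
```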
Small-object batch
Measured as 32 objects of 4 MiB each (128 MiB total).
| Metric | Result |
|---|---|
| Batch upload throughput | 18.96 MiB/s |
| Batch download throughput | 39.88 MiB/s |
| PUT rate | 4.74 objects/s |
| GET rate | 9.97 objects/s |
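The per-object rates are consistent with the batch throughputs divided by the fixed 4 MiB object size:

```python
OBJECT_MIB = 4  # size of each object in the batch

def object_rate(batch_throughput_mib_s: float) -> float:
    """Objects per second implied by batch throughput over a fixed object size."""
    return round(batch_throughput_mib_s / OBJECT_MIB, 2)

print(object_rate(18.96))  # PUT rate: 4.74 objects/s
print(object_rate(39.88))  # GET rate: 9.97 objects/s
```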
Parallel small-object batch
Measured as the same 32 objects of 4 MiB each, but with 8 concurrent client jobs from node03.
| Metric | Result |
|---|---|
| Parallel batch upload throughput | 16.23 MiB/s |
| Parallel batch download throughput | 26.07 MiB/s |
| Parallel PUT rate | 4.06 objects/s |
| Parallel GET rate | 6.52 objects/s |
VM Image Path
Measured against the PlasmaVMC -> LightningStor artifact -> CoronaFS-backed managed volume clone path on node01.
| Metric | Result |
|---|---|
| Guest image artifact size | 2017 MiB |
| Guest image virtual size | 4096 MiB |
| CreateImage latency | 66.49 s |
| First image-backed CreateVolume latency | 16.86 s |
| Second image-backed CreateVolume latency | 0.12 s |
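Derived from the clone latencies above: the second, backing-file-style clone is roughly two orders of magnitude cheaper than the first, eagerly materialized one.

```python
first_clone_s = 16.86   # first image-backed CreateVolume: materializes image data
second_clone_s = 0.12   # second CreateVolume: fast path, effectively metadata-only

speedup = first_clone_s / second_clone_s
print(f"second-clone speedup: ~{speedup:.0f}x")
```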
VM Runtime Path
Measured against the real StartVm -> qemu attach -> guest boot -> guest fio path on a worker node, using a CoronaFS-backed root disk and data disk.
| Metric | Result |
|---|---|
| StartVm to qemu attach | 0.60 s |
| StartVm to guest benchmark result | 35.69 s |
| Guest sequential write | 123.49 MiB/s |
| Guest sequential read | 1492.71 MiB/s |
| Guest 4k random read | 25550 IOPS |
Assessment
- CoronaFS controller-export sequential reads are currently 1.6% of the measured local-disk baseline on this nested-QEMU lab cluster.
- CoronaFS controller-export 4k random reads are currently 12.0% of the measured local-disk baseline.
- CoronaFS controller-export queued 4k random reads are currently 13.5% of the measured local queued-random-read baseline.
- CoronaFS controller-export sequential reads are currently 93.0% of the measured node04->node01 TCP baseline, which isolates the centralized source path from raw cluster-network limits.
- CoronaFS controller-export depth-32 reads are currently 0.4% of the local depth-32 baseline.
- CoronaFS node-local sequential reads are currently 26.0% of the measured local-disk baseline, which is the more relevant steady-state signal for mutable VM disks after attachment.
- CoronaFS node-local 4k random reads are currently 30.0% of the measured local-disk baseline.
- CoronaFS node-local queued 4k random reads are currently 27.3% of the measured local queued-random-read baseline.
- CoronaFS node-local depth-32 reads are currently 0.9% of the local depth-32 baseline.
- The target worker's node-local read path is 26.0% of the measured local sequential-read baseline after materialization, which is the better proxy for restart and migration steady state than the old shared-export read.
- PlasmaVMC now attaches writable node-local volumes through the worker-local CoronaFS export, so the guest-runtime section should be treated as the real local VM steady-state path rather than the node-local export numbers alone.
- CoronaFS single-depth writes remain sensitive to the nested-QEMU/VDE lab transport, so the queued-depth and guest-runtime numbers are still the more reliable proxy for real VM workload behavior than the single-stream write figure alone.
- The central export path is now best understood as a source/materialization path; the worker-local export is the path that should determine VM-disk readiness going forward.
- LightningStor's replicated S3 path is working correctly, but 18.20 MiB/s upload and 39.21 MiB/s download are still lab-grade numbers rather than strong object-store throughput.
- LightningStor large-object downloads are currently 85.3% of the node03->node01 TCP baseline, which bounds how much throughput the object layer itself is costing above the raw network path.
- The current S3 frontend tuning baseline is the built-in 16 MiB streaming threshold with multipart PUT/FETCH concurrency of 8; that combination is the best default observed on this lab cluster so far.
- LightningStor uploads should be read against the replication write quorum and the same ~45.99 MiB/s lab network ceiling; this environment still limits end-to-end throughput well before modern bare-metal NICs would.
- LightningStor's small-object batch path is also functional, but 4.74 PUT/s and 9.97 GET/s still indicate a lab cluster rather than a tuned object-storage deployment.
- The parallel small-object profile is the more relevant control-plane/object-ingest signal; it currently reaches 4.06 PUT/s and 6.52 GET/s.
- The VM image section measures clone/materialization cost, not guest runtime I/O.
- The PlasmaVMC local image-backed clone fast path is now active again; a 0.12 s second clone indicates the CoronaFS qcow2 backing-file path is being hit on node01 rather than falling back to eager raw materialization.
- The VM runtime section is the real PlasmaVMC + CoronaFS + QEMU virtio-blk + guest kernel path; use it to judge whether QEMU/NBD tuning is helping.
- The local sequential-write baseline is noisy in this environment, so the read and random-read deltas are the more reliable signal.
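The percentage figures in this assessment can be reproduced directly from the tables above:

```python
def pct(part: float, whole: float) -> float:
    """Percentage of baseline, rounded to one decimal as in the bullets above."""
    return round(part / whole * 100, 1)

# Controller export vs local disk (sequential read, 4k random, queued random)
print(pct(42.70, 2723.40))    # 1.6
print(pct(2034, 16958))       # 12.0
print(pct(14261, 106026))     # 13.5

# Controller-export sequential read vs node04->node01 TCP baseline
print(pct(42.70, 45.92))      # 93.0

# Node-local export vs local disk (sequential read, 4k random)
print(pct(709.14, 2723.40))   # 26.0
print(pct(5087, 16958))       # 30.0

# LightningStor large-object download vs node03->node01 TCP baseline
print(pct(39.21, 45.99))      # 85.3
```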