# Storage Benchmarks

Generated on 2026-03-10T20:02:00+09:00 with:

```shell
nix run ./nix/test-cluster#cluster -- fresh-bench-storage
```

## CoronaFS

Cluster network baseline, measured with iperf3 from node04 to node01 before the storage tests:

| Metric | Result |
| --- | --- |
| TCP throughput | 22.83 MiB/s |
| TCP retransmits | 78 |
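A run along the following lines reproduces this baseline. This is a sketch, not the harness's actual invocation: the 30 s duration and the `jq` post-processing are assumptions; only the node names come from this report.

```shell
# Hypothetical reproduction of the network baseline (node names from this
# report; the harness's exact flags are not shown here).
iperf3 -c node01 -t 30 -J > iperf.json

# iperf3 -J reports bits per second; convert to MiB/s.
jq -r '.end.sum_received.bits_per_second / 8 / 1048576' iperf.json

# Retransmit count for the TCP stream.
jq -r '.end.sum_sent.retransmits' iperf.json
```

The JSON output makes the numbers scriptable, which is how a table like the one above would typically be filled in.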

Measured from node04, with the local worker disk as the baseline. CoronaFS is the shared block-volume path used for mutable VM disks, exported from node01 over NBD.

| Metric | Local Disk | CoronaFS |
| --- | --- | --- |
| Sequential write | 26.36 MiB/s | 5.24 MiB/s |
| Sequential read | 348.77 MiB/s | 10.08 MiB/s |
| 4k random read | 1243 IOPS | 145 IOPS |
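A minimal fio reproduction of these rows might look like the following. The target path, file size, and block sizes are assumptions; the harness's actual job definitions are not shown in this report.

```shell
# Sequential write/read and 4k random read against one target file.
# /mnt/coronafs/bench.dat is a hypothetical mount point for the NBD export.
TARGET=/mnt/coronafs/bench.dat

fio --name=seqwrite --filename="$TARGET" --rw=write    --bs=1M --size=1G \
    --direct=1 --output-format=json > seqwrite.json
fio --name=seqread  --filename="$TARGET" --rw=read     --bs=1M --size=1G \
    --direct=1 --output-format=json > seqread.json
fio --name=randread --filename="$TARGET" --rw=randread --bs=4k --size=1G \
    --direct=1 --output-format=json > randread.json

# fio reports bandwidth in KiB/s and IOPS directly in its JSON output.
jq -r '.jobs[0].read.bw / 1024' seqread.json    # sequential read, MiB/s
jq -r '.jobs[0].read.iops'      randread.json   # 4k random read, IOPS
```

Running the same jobs against a local-disk path gives the baseline column.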

Queue-depth profile (libaio, iodepth=32) from the same worker:

| Metric | Local Disk | CoronaFS |
| --- | --- | --- |
| Depth-32 write | 27.12 MiB/s | 11.42 MiB/s |
| Depth-32 read | 4797.47 MiB/s | 10.06 MiB/s |
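The depth-32 rows add async I/O with a deep queue to the same setup. A sketch, again with a hypothetical target path:

```shell
# Depth-32 profile: same target, but libaio with 32 outstanding requests,
# matching the (libaio, iodepth=32) configuration named above.
TARGET=/mnt/coronafs/bench.dat   # hypothetical mount point

fio --name=qd32read --filename="$TARGET" --rw=read --bs=1M --size=1G \
    --direct=1 --ioengine=libaio --iodepth=32 --output-format=json > qd32.json

jq -r '.jobs[0].read.bw / 1024' qd32.json   # MiB/s at iodepth=32
```

The gap between the single-depth and depth-32 local reads (348.77 vs 4797.47 MiB/s) is the usual sign of page-cache and queue-parallelism effects; CoronaFS barely moving between the two suggests its bottleneck is elsewhere.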

Cross-worker shared-volume visibility, measured by writing on node04 and reading from node05 over the same CoronaFS NBD export:

| Metric | Result |
| --- | --- |
| Cross-worker sequential read | 17.72 MiB/s |
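The cross-worker check reduces to a write on one node and a read-back on the other through the shared export. A sketch, assuming SSH access to the workers and a hypothetical mount path:

```shell
# Write a test file on node04, then read it back from node05 through the
# same shared CoronaFS export. /mnt/coronafs is a hypothetical mount point.
ssh node04 'dd if=/dev/zero of=/mnt/coronafs/xworker.dat bs=1M count=256 conv=fsync'
ssh node05 'dd if=/mnt/coronafs/xworker.dat of=/dev/null bs=1M'
# dd prints elapsed time and throughput on stderr; the second command's
# rate is the cross-worker sequential-read figure.
```

`conv=fsync` on the writer matters here: without it, the read on node05 can race unflushed data and measure something other than shared-volume visibility.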

## LightningStor

Measured from node03 against the S3-compatible endpoint on node01. The object path exercised the distributed backend with replication across the worker storage nodes.

Cluster network baseline for this client, measured with iperf3 from node03 to node01 before the storage tests:

| Metric | Result |
| --- | --- |
| TCP throughput | 18.35 MiB/s |
| TCP retransmits | 78 |

### Large-object path

| Metric | Result |
| --- | --- |
| Object size | 256 MiB |
| Upload throughput | 8.11 MiB/s |
| Download throughput | 7.54 MiB/s |
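A hypothetical reproduction of the large-object upload leg, using a generic S3 client. The bucket name, endpoint port, and credentials are assumptions; only the node and object size come from this report.

```shell
# Hypothetical large-object run against the S3-compatible endpoint on node01.
# Bucket name and port are placeholders, not taken from the cluster config.
SIZE_MIB=256
dd if=/dev/urandom of=obj.bin bs=1M count="$SIZE_MIB" 2>/dev/null

start=$(date +%s)
aws s3 cp obj.bin s3://bench-bucket/obj.bin --endpoint-url http://node01:9000
elapsed=$(( $(date +%s) - start ))

# Throughput = object size over wall-clock seconds.
awk -v s="$SIZE_MIB" -v t="$elapsed" 'BEGIN { printf "%.2f MiB/s\n", s / t }'
```

The download leg is the same with source and destination swapped (`s3://… obj.bin`).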

### Small-object batch

Measured as 32 objects of 4 MiB each (128 MiB total).

| Metric | Result |
| --- | --- |
| Batch upload throughput | 0.81 MiB/s |
| Batch download throughput | 0.83 MiB/s |
| PUT rate | 0.20 objects/s |
| GET rate | 0.21 objects/s |
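The batch numbers fall out of timing a sequential loop of PUTs. A sketch with the same hypothetical bucket and endpoint as above:

```shell
# Hypothetical small-object batch: 32 sequential PUTs of 4 MiB each.
N=32
dd if=/dev/urandom of=small.bin bs=1M count=4 2>/dev/null

start=$(date +%s)
for i in $(seq 1 "$N"); do
    aws s3 cp small.bin "s3://bench-bucket/small-$i.bin" \
        --endpoint-url http://node01:9000
done
elapsed=$(( $(date +%s) - start ))

# Objects per second and aggregate MiB/s for the whole batch.
awk -v n="$N" -v t="$elapsed" 'BEGIN { printf "%.2f objects/s\n", n / t }'
awk -v n="$N" -v t="$elapsed" 'BEGIN { printf "%.2f MiB/s\n", n * 4 / t }'
```

At these rates the loop is dominated by per-object request latency, not payload transfer, which is why the batch throughput sits far below the large-object figure.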

### Parallel small-object batch

Measured as the same 32 objects of 4 MiB each, but with 8 concurrent client jobs from node03.

| Metric | Result |
| --- | --- |
| Parallel batch upload throughput | 3.03 MiB/s |
| Parallel batch download throughput | 2.89 MiB/s |
| Parallel PUT rate | 0.76 objects/s |
| Parallel GET rate | 0.72 objects/s |
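One portable way to get 8 concurrent client jobs from a single node is `xargs -P`; whether the harness uses this exact mechanism is an assumption, as are the bucket and endpoint:

```shell
# Same 32 objects of 4 MiB, but uploaded by 8 concurrent jobs from node03.
# xargs -P 8 keeps up to 8 aws-cli processes in flight at once.
seq 1 32 | xargs -P 8 -I{} \
    aws s3 cp small.bin "s3://bench-bucket/par-{}.bin" \
        --endpoint-url http://node01:9000
```

The roughly 3.8x speedup over the sequential batch (0.76 vs 0.20 PUT/s) is consistent with latency-bound requests overlapping rather than the data path saturating.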

## VM Image Path

Measured against the full PlasmaVMC -> LightningStor artifact -> CoronaFS-backed managed-volume path on node01, rather than any synthetic stand-in.

| Metric | Result |
| --- | --- |
| Guest image artifact size | 2017 MiB |
| Guest image virtual size | 4096 MiB |
| CreateImage latency | 176.03 s |
| First image-backed CreateVolume latency | 76.51 s |
| Second image-backed CreateVolume latency | 170.49 s |

## Assessment

- CoronaFS shared-volume reads are currently 2.9% of the measured local-disk baseline on this nested-QEMU lab cluster.
- CoronaFS 4k random reads are currently 11.7% of the measured local-disk baseline.
- CoronaFS cross-worker reads are currently 5.1% of the measured local-disk sequential-read baseline, which is the more relevant signal for VM restart and migration paths.
- CoronaFS sequential reads are currently 44.2% of the measured node04->node01 TCP baseline, which helps separate NBD/export overhead from raw cluster-network limits.
- CoronaFS depth-32 reads are currently 0.2% of the local depth-32 baseline, which is a better proxy for queued guest I/O than the single-depth path.
- The shared-volume path is functionally correct for mutable VM disks and migration tests, but its read-side throughput is still too low to call production-ready for heavier VM workloads.
- LightningStor's replicated S3 path is working correctly, but 8.11 MiB/s upload and 7.54 MiB/s download are still lab-grade numbers rather than strong object-store throughput.
- LightningStor large-object downloads are currently 41.1% of the node03->node01 TCP baseline, which indicates how much of the headroom is being lost above the raw network path.
- LightningStor's small-object batch path is also functional, but 0.20 PUT/s and 0.21 GET/s still indicate a lab cluster rather than a tuned object-storage deployment.
- The parallel small-object profile is the more relevant control-plane/object-ingest signal; it currently reaches 0.76 PUT/s and 0.72 GET/s.
- The VM image path is now measured directly rather than inferred. The cold CreateVolume path includes artifact fetch plus CoronaFS population; the warm CreateVolume path isolates repeated CoronaFS population from an already cached image.
- The local sequential-write baseline is noisy in this environment, so the read and random-read deltas are the more reliable signal.
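The percentages in the bullets above follow directly from the tables and can be recomputed with a one-liner each:

```shell
# Recompute the headline ratios from the measured numbers in this report.
awk 'BEGIN { printf "shared read vs local disk:   %.1f%%\n", 10.08 / 348.77 * 100 }'
awk 'BEGIN { printf "4k randread vs local disk:   %.1f%%\n", 145   / 1243   * 100 }'
awk 'BEGIN { printf "cross-worker vs local read:  %.1f%%\n", 17.72 / 348.77 * 100 }'
awk 'BEGIN { printf "shared read vs node04 TCP:   %.1f%%\n", 10.08 / 22.83  * 100 }'
```

Keeping the arithmetic next to the tables makes it easy to spot when a future baseline run invalidates one of these bullets.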