# Storage Benchmarks

Generated on 2026-03-10T20:02:00+09:00 with:

```bash
nix run ./nix/test-cluster#cluster -- fresh-bench-storage
```

## CoronaFS

Cluster network baseline, measured with `iperf3` from `node04` to `node01` before the storage tests:

| Metric | Result |
|---|---:|
| TCP throughput | 22.83 MiB/s |
| TCP retransmits | 78 |

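For comparison with raw `iperf3` output, which reports bits per second rather than MiB/s, the TCP baseline converts as follows (a quick sanity check, not part of the benchmark harness):

```bash
# Convert the measured TCP baseline from MiB/s to Mbit/s.
# 1 MiB/s = 1048576 * 8 / 1e6 Mbit/s = 8.388608 Mbit/s
awk 'BEGIN { printf "node04 -> node01 TCP baseline: %.1f Mbit/s\n", 22.83 * 8.388608 }'
```

So the cluster network between these two nodes tops out below 200 Mbit/s in this nested lab, which bounds everything measured below it.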
Measured from `node04`. The local worker disk is the baseline; CoronaFS is the shared block-volume path used for mutable VM disks, exported from `node01` over NBD.

| Metric | Local Disk | CoronaFS |
|---|---:|---:|
| Sequential write | 26.36 MiB/s | 5.24 MiB/s |
| Sequential read | 348.77 MiB/s | 10.08 MiB/s |
| 4k random read | 1243 IOPS | 145 IOPS |

Queue-depth profile (`libaio`, `iodepth=32`) from the same worker:

| Metric | Local Disk | CoronaFS |
|---|---:|---:|
| Depth-32 write | 27.12 MiB/s | 11.42 MiB/s |
| Depth-32 read | 4797.47 MiB/s | 10.06 MiB/s |

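The report only pins down `libaio` and `iodepth=32` for this profile; a job file of roughly the following shape would reproduce the read side, with every other parameter being an assumption for illustration:

```ini
; depth32.fio -- hypothetical reconstruction of the depth-32 read profile.
; Only ioengine and iodepth are taken from this report; block size, file
; size, and the target path are assumed placeholders.
[depth32-read]
ioengine=libaio
iodepth=32
rw=read
bs=1M
size=1G
direct=1
filename=/path/to/benchfile
```

Running such a job against the local disk and the CoronaFS mount from the same worker is what produces a like-for-like depth-32 comparison.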
Cross-worker shared-volume visibility, measured by writing on `node04` and reading from `node05` over the same CoronaFS NBD export:

| Metric | Result |
|---|---:|
| Cross-worker sequential read | 17.72 MiB/s |

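In the same spirit as the comparisons in the assessment section, the cross-worker read can be expressed against the node04->node01 TCP baseline (values copied from the tables in this report):

```bash
# Cross-worker sequential read (17.72 MiB/s) as a fraction of the
# node04 -> node01 TCP baseline (22.83 MiB/s).
awk 'BEGIN { printf "cross-worker read vs network baseline: %.1f%%\n", 100 * 17.72 / 22.83 }'
```

This suggests the cross-worker path is largely network-bound rather than limited by the NBD export itself.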
## LightningStor

Measured from `node03` against the S3-compatible endpoint on `node01`. The object path exercised the distributed backend with replication across the worker storage nodes.

Cluster network baseline for this client, measured with `iperf3` from `node03` to `node01` before the storage tests:

| Metric | Result |
|---|---:|
| TCP throughput | 18.35 MiB/s |
| TCP retransmits | 78 |

### Large-object path

| Metric | Result |
|---|---:|
| Object size | 256 MiB |
| Upload throughput | 8.11 MiB/s |
| Download throughput | 7.54 MiB/s |

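The throughput rows imply the following wall-clock time per 256 MiB transfer (a derived sanity check, not a separately measured number):

```bash
# Implied wall-clock time to move one 256 MiB object at the measured rates.
awk 'BEGIN {
  printf "upload:   %.1f s\n", 256 / 8.11
  printf "download: %.1f s\n", 256 / 7.54
}'
```
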
### Small-object batch

Measured as 32 objects of 4 MiB each (128 MiB total).

| Metric | Result |
|---|---:|
| Batch upload throughput | 0.81 MiB/s |
| Batch download throughput | 0.83 MiB/s |
| PUT rate | 0.20 objects/s |
| GET rate | 0.21 objects/s |

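As a consistency check, the object rates follow directly from the batch throughputs divided by the 4 MiB object size, and should match the rate rows above:

```bash
# PUT/GET object rates implied by the batch throughputs and the
# 4 MiB object size.
awk 'BEGIN {
  printf "implied PUT rate: %.2f objects/s\n", 0.81 / 4
  printf "implied GET rate: %.2f objects/s\n", 0.83 / 4
}'
```
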
### Parallel small-object batch

Measured as the same 32 objects of 4 MiB each, but with 8 concurrent client jobs from `node03`.

| Metric | Result |
|---|---:|
| Parallel batch upload throughput | 3.03 MiB/s |
| Parallel batch download throughput | 2.89 MiB/s |
| Parallel PUT rate | 0.76 objects/s |
| Parallel GET rate | 0.72 objects/s |

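Comparing against the serial batch above gives the effective scaling from 8 concurrent jobs, which is clearly sub-linear (a derived number, computed from the two tables):

```bash
# Throughput scaling of 8 concurrent jobs over the serial batch run.
awk 'BEGIN {
  printf "upload speedup:   %.1fx with 8 jobs\n", 3.03 / 0.81
  printf "download speedup: %.1fx with 8 jobs\n", 2.89 / 0.83
}'
```
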
## VM Image Path

Measured against the real `PlasmaVMC -> LightningStor artifact -> CoronaFS-backed managed volume` path on `node01`.

| Metric | Result |
|---|---:|
| Guest image artifact size | 2017 MiB |
| Guest image virtual size | 4096 MiB |
| `CreateImage` latency | 176.03 s |
| First image-backed `CreateVolume` latency | 76.51 s |
| Second image-backed `CreateVolume` latency | 170.49 s |

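If the `CreateImage` latency were dominated purely by moving the 2017 MiB artifact (an assumption; the call also does other work), the implied transfer rate would be:

```bash
# Implied transfer rate, assuming CreateImage time is spent moving
# the 2017 MiB artifact. This is an upper-bound sketch, not a
# separately measured number.
awk 'BEGIN { printf "implied CreateImage transfer rate: %.1f MiB/s\n", 2017 / 176.03 }'
```
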
## Assessment

- CoronaFS shared-volume reads are currently 2.9% of the measured local-disk baseline on this nested-QEMU lab cluster.
- CoronaFS 4k random reads are currently 11.7% of the measured local-disk baseline.
- CoronaFS cross-worker reads are currently 5.1% of the measured local-disk sequential-read baseline, which is the more relevant signal for VM restart and migration paths.
- CoronaFS sequential reads are currently 44.2% of the measured node04->node01 TCP baseline, which helps separate NBD/export overhead from raw cluster-network limits.
- CoronaFS depth-32 reads are currently 0.2% of the local depth-32 baseline, which is a better proxy for queued guest I/O than the single-depth path.
- The shared-volume path is functionally correct for mutable VM disks and migration tests, but its read-side throughput is still too low to call production-ready for heavier VM workloads.
- LightningStor's replicated S3 path is working correctly, but 8.11 MiB/s upload and 7.54 MiB/s download are still lab-grade numbers rather than strong object-store throughput.
- LightningStor large-object downloads are currently 41.1% of the node03->node01 TCP baseline measured for this client, which indicates how much headroom is being lost above the raw network path.
- LightningStor's small-object batch path is also functional, but 0.20 PUT/s and 0.21 GET/s still indicate a lab cluster rather than a tuned object-storage deployment.
- The parallel small-object profile is the more relevant control-plane/object-ingest signal; it currently reaches 0.76 PUT/s and 0.72 GET/s.
- The VM image path is now measured directly rather than inferred. The cold `CreateVolume` path includes artifact fetch plus CoronaFS population; the warm `CreateVolume` path isolates repeated CoronaFS population from an already cached image.
- The local sequential-write baseline is noisy in this environment, so the read and random-read deltas are the more reliable signal.
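The percentages quoted in the bullets above can be reproduced directly from the raw table values earlier in this report:

```bash
# Recompute the assessment ratios from the measured table values.
awk 'BEGIN {
  printf "shared read vs local read:     %.1f%%\n", 100 * 10.08 / 348.77
  printf "4k random vs local:            %.1f%%\n", 100 * 145 / 1243
  printf "cross-worker vs local read:    %.1f%%\n", 100 * 17.72 / 348.77
  printf "seq read vs node04 TCP:        %.1f%%\n", 100 * 10.08 / 22.83
  printf "depth-32 read vs local:        %.1f%%\n", 100 * 10.06 / 4797.47
  printf "object download vs node03 TCP: %.1f%%\n", 100 * 7.54 / 18.35
}'
```
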