# LightningStor Specification

> Version: 1.0 | Status: Draft | Last Updated: 2025-12-08

## 1. Overview

### 1.1 Purpose

LightningStor is an S3-compatible object storage service that provides durable, scalable blob storage for the cloud platform. Applications can store and retrieve any amount of data with high availability, using either standard S3 API operations or a native gRPC interface intended for internal services.

The name "LightningStor" reflects fast, reliable storage: the "Lightning" prefix denotes speed and matches the cloud platform's family branding.

### 1.2 Scope

- **In scope**: S3-compatible API (GET, PUT, DELETE, LIST, multipart upload), bucket management, object versioning, object metadata, access control via aegis, multi-tenant isolation (org/project scoped buckets), chunked internal storage, presigned URLs
- **Out of scope**: CDN/edge caching, full S3 feature parity (bucket policies and lifecycle rules are planned), cross-region replication, S3 Select, Glacier-tier storage

### 1.3 Design Goals

- **S3 API compatibility**: Support standard S3 clients (AWS SDK, s3cmd, rclone)
- **Multi-tenant from day one**: Bucket scoping to org/project with aegis integration
- **Pluggable storage backends**: Abstract storage layer covering local filesystem and distributed storage
- **High throughput**: Chunked storage for large objects, parallel uploads
- **Cloud-native**: gRPC internal API, Prometheus metrics, health checks
- **Consistent metadata**: Chainfire for bucket/object metadata with strong consistency

## 2. Architecture

### 2.1 Crate Structure

```
lightningstor/
├── crates/
│   ├── lightningstor-api/       # gRPC service implementations
│   ├── lightningstor-client/    # Rust client library
│   ├── lightningstor-s3/        # S3-compatible HTTP layer
│   ├── lightningstor-server/    # Server binary
│   ├── lightningstor-storage/   # Storage backend abstraction
│   └── lightningstor-types/     # Shared types
└── proto/
    └── lightningstor.proto      # gRPC API definitions
```

### 2.2 Component Topology

```
┌─────────────────────────────────────────────────────────────────┐
│                      LightningStor Server                       │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │ lightningstor-  │  │ lightningstor-  │  │ lightningstor-  │  │
│  │       s3        │  │       api       │  │     storage     │  │
│  │   (HTTP/REST)   │  │     (gRPC)      │  │    (backend)    │  │
│  └────────┬────────┘  └────────┬────────┘  └────────┬────────┘  │
│           │                    │                    │           │
│           └────────────────────┼────────────────────┘           │
│                                │                                │
│                         ┌──────▼──────┐                         │
│                         │    Core     │                         │
│                         │  (objects,  │                         │
│                         │   buckets)  │                         │
│                         └──────┬──────┘                         │
└────────────────────────────────┼────────────────────────────────┘
                                 │
                  ┌──────────────┼──────────────┐
                  ▼              ▼              ▼
            ┌───────────┐  ┌───────────┐  ┌───────────┐
            │ Chainfire │  │   Blob    │  │   Aegis   │
            │(metadata) │  │  Storage  │  │   (IAM)   │
            └───────────┘  └───────────┘  └───────────┘
```

### 2.3 Data Flow

```
[S3 Client]   → [S3 HTTP Layer] → [Core Service] → [Storage Backend]
                                        │                  │
[gRPC Client] → [gRPC API] ─────────────┘                  │
                                        │                  │
                                        ▼                  ▼
                                   [Chainfire]        [Blob Store]
                                   (metadata)         (object data)
```

### 2.4 Dependencies

| Crate | Version | Purpose |
|-------|---------|---------|
| tokio | 1.x | Async runtime |
| tonic | 0.12 | gRPC framework |
| axum | 0.7 | S3 HTTP API |
| prost | 0.13 | Protocol buffers |
| aws-sigv4 | 1.x | S3 signature verification |
| uuid | 1.x | Object/chunk identifiers |
| sha2 | 0.10 | Content checksums |
| dashmap | 6.x | Concurrent caches |

## 3. Core Concepts

### 3.1 Bucket

A container for objects, scoped to an organization and optionally a project.

```rust
pub struct Bucket {
    pub name: String,                   // Unique within its org or project scope
    pub org_id: String,                 // Owner organization
    pub project_id: Option<String>,     // Optional project scope
    pub versioning: VersioningConfig,   // Versioning state
    pub created_at: u64,                // Creation timestamp (Unix ms)
    pub updated_at: u64,                // Last modification (Unix ms)
    pub created_by: String,             // Principal ID
    pub storage_class: StorageClass,    // Default storage class
    pub quota: Option<BucketQuota>,     // Size/object limits
    pub metadata: HashMap<String, String>,
    pub tags: HashMap<String, String>,
}

pub enum VersioningConfig {
    Disabled,   // No versioning (default)
    Enabled,    // Keep all versions
    Suspended,  // Stop versioning, keep existing versions
}

pub enum StorageClass {
    Standard,           // Default, high availability
    ReducedRedundancy,  // Lower durability, lower cost (future)
    Archive,            // Cold storage (future)
}

pub struct BucketQuota {
    pub max_size_bytes: Option<u64>,
    pub max_objects: Option<u64>,
}
```

**Bucket Naming Rules**:
- 3-63 characters
- Lowercase letters, numbers, hyphens
- Must start with letter or number
- Unique within org (or project if project-scoped)
- Fully qualified name: `{org_id}/{project_id}/{name}` or `{org_id}/{name}`
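
The naming rules above can be checked mechanically. The following is a minimal sketch rather than the project's implementation: `BucketNameError`, `validate_bucket_name`, and `qualified_bucket_name` are hypothetical names, and uniqueness within the org or project is left out because it needs a metadata lookup.

```rust
/// Reasons a bucket name can be rejected (hypothetical error type).
#[derive(Debug, PartialEq)]
pub enum BucketNameError {
    BadLength(usize),
    BadCharacter(char),
    BadFirstCharacter(char),
}

/// Check a bucket name against the syntactic naming rules.
pub fn validate_bucket_name(name: &str) -> Result<(), BucketNameError> {
    // Rule: 3-63 characters.
    let len = name.chars().count();
    if !(3..=63).contains(&len) {
        return Err(BucketNameError::BadLength(len));
    }
    // Rule: only lowercase ASCII letters, digits, and hyphens.
    if let Some(c) = name
        .chars()
        .find(|c| !(c.is_ascii_lowercase() || c.is_ascii_digit() || *c == '-'))
    {
        return Err(BucketNameError::BadCharacter(c));
    }
    // Rule: must start with a letter or number, so a leading hyphen is rejected.
    match name.chars().next() {
        Some('-') => Err(BucketNameError::BadFirstCharacter('-')),
        _ => Ok(()),
    }
}

/// Build `{org_id}/{project_id}/{name}` or `{org_id}/{name}`.
pub fn qualified_bucket_name(org_id: &str, project_id: Option<&str>, name: &str) -> String {
    match project_id {
        Some(project) => format!("{org_id}/{project}/{name}"),
        None => format!("{org_id}/{name}"),
    }
}
```

Returning a typed error instead of a `bool` lets the S3 layer report which rule failed when it maps the failure to `InvalidBucketName`.
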

### 3.2 Object

A stored blob with metadata, identified by a key within a bucket.

```rust
pub struct Object {
    pub bucket: String,                     // Parent bucket name
    pub key: String,                        // Object key (path-like)
    pub version_id: Option<String>,         // Version identifier
    pub size: u64,                          // Content size in bytes
    pub etag: String,                       // Content hash (MD5 or composite)
    pub content_type: String,               // MIME type
    pub content_encoding: Option<String>,
    pub checksum: ObjectChecksum,           // SHA256 or other
    pub storage_class: StorageClass,
    pub metadata: HashMap<String, String>,  // User metadata (x-amz-meta-*)
    pub created_at: u64,
    pub updated_at: u64,
    pub delete_marker: bool,                // Versioned delete marker
    pub chunks: Vec<ChunkRef>,              // Internal storage references
}

pub struct ObjectChecksum {
    pub algorithm: ChecksumAlgorithm,
    pub value: String,                      // Hex-encoded
}

pub enum ChecksumAlgorithm {
    Sha256,
    Sha1,
    Md5,
    Crc32,
    Crc32c,
}

pub struct ChunkRef {
    pub chunk_id: String,                   // UUID
    pub offset: u64,                        // Offset in object
    pub size: u64,                          // Chunk size
    pub checksum: String,                   // Chunk checksum
}
```

### 3.3 ObjectKey

Composite key identifying an object within the storage namespace.

```rust
pub struct ObjectKey {
    pub org_id: String,
    pub project_id: Option<String>,
    pub bucket: String,
    pub key: String,
    pub version_id: Option<String>,
}

impl ObjectKey {
    /// Storage path: org/{org}/project/{proj}/bucket/{bucket}/key/{key}
    pub fn to_storage_path(&self) -> String;

    /// Parse from storage path
    pub fn from_storage_path(path: &str) -> Result<Self>;

    /// S3-style ARN: arn:lightningstor:{org}:{project}:{bucket}/{key}
    pub fn to_arn(&self) -> String;
}
```
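
The three `ObjectKey` methods are only declared above. One possible implementation is sketched below under stated assumptions: the `project/{proj}` segment is omitted for org-level buckets, the ARN leaves the project field empty in that case, the parser returns `Option` instead of the spec's `Result`, and `version_id` does not appear in the path. None of these choices are confirmed by the spec.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct ObjectKey {
    pub org_id: String,
    pub project_id: Option<String>,
    pub bucket: String,
    pub key: String,
    pub version_id: Option<String>,
}

impl ObjectKey {
    /// org/{org}/project/{proj}/bucket/{bucket}/key/{key}
    /// Assumption: the project segment is dropped when `project_id` is None.
    pub fn to_storage_path(&self) -> String {
        let mut path = format!("org/{}", self.org_id);
        if let Some(project) = &self.project_id {
            path.push_str(&format!("/project/{project}"));
        }
        path.push_str(&format!("/bucket/{}/key/{}", self.bucket, self.key));
        path
    }

    /// Inverse of `to_storage_path`; returns None on malformed input.
    /// Everything after `/key/` is the object key, so keys may contain `/`.
    pub fn from_storage_path(path: &str) -> Option<Self> {
        let rest = path.strip_prefix("org/")?;
        let (org_id, rest) = rest.split_once('/')?;
        let (project_id, rest) = match rest.strip_prefix("project/") {
            Some(r) => {
                let (project, r) = r.split_once('/')?;
                (Some(project.to_string()), r)
            }
            None => (None, rest),
        };
        let rest = rest.strip_prefix("bucket/")?;
        let (bucket, rest) = rest.split_once('/')?;
        let key = rest.strip_prefix("key/")?;
        Some(ObjectKey {
            org_id: org_id.to_string(),
            project_id,
            bucket: bucket.to_string(),
            key: key.to_string(),
            version_id: None, // not encoded in the path in this sketch
        })
    }

    /// arn:lightningstor:{org}:{project}:{bucket}/{key}
    /// Assumption: the project field is left empty for org-level buckets.
    pub fn to_arn(&self) -> String {
        format!(
            "arn:lightningstor:{}:{}:{}/{}",
            self.org_id,
            self.project_id.as_deref().unwrap_or(""),
            self.bucket,
            self.key
        )
    }
}
```

Parsing relies on bucket names never containing `/`, which the bucket naming rules guarantee.
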

### 3.4 MultipartUpload

State for chunked uploads of large objects.

```rust
pub struct MultipartUpload {
    pub upload_id: String,              // UUID
    pub bucket: String,
    pub key: String,
    pub org_id: String,
    pub project_id: Option<String>,
    pub initiated_at: u64,
    pub initiated_by: String,           // Principal ID
    pub storage_class: StorageClass,
    pub metadata: HashMap<String, String>,
    pub parts: Vec<UploadPart>,
    pub status: UploadStatus,
}

pub struct UploadPart {
    pub part_number: u32,               // 1-10000
    pub etag: String,                   // Part content hash
    pub size: u64,
    pub chunk_id: String,               // Storage reference
    pub uploaded_at: u64,
}

pub enum UploadStatus {
    InProgress,
    Completing,
    Completed,
    Aborted,
}
```

**Multipart Upload Limits**:
- Part size: 5 MiB - 5 GiB
- Part count: 1 - 10,000
- Object size: up to 5 TiB
- Upload timeout: 7 days (configurable)
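
These limits interact: 10,000 parts at the 5 MiB minimum cover only 50,000 MiB (about 48.8 GiB), so larger objects need larger parts, and a 5 TiB object needs parts of at least 525 MiB once rounded up to a whole MiB. The sketch below shows one way a client could pick a part size. `plan_part_size` and `part_count` are hypothetical helpers, and it assumes, as in S3, that the final part may be smaller than the minimum.

```rust
const MIB: u64 = 1024 * 1024;
const MIN_PART_SIZE: u64 = 5 * MIB; // 5 MiB
const MAX_PART_SIZE: u64 = 5 * 1024 * MIB; // 5 GiB
const MAX_PARTS: u64 = 10_000;
const MAX_OBJECT_SIZE: u64 = 5 * 1024 * 1024 * MIB; // 5 TiB

/// Integer division rounding up.
fn ceil_div(a: u64, b: u64) -> u64 {
    (a + b - 1) / b
}

/// Smallest whole-MiB part size that fits `object_size` into at most
/// MAX_PARTS parts. Returns None for empty or oversized objects.
pub fn plan_part_size(object_size: u64) -> Option<u64> {
    if object_size == 0 || object_size > MAX_OBJECT_SIZE {
        return None;
    }
    let needed = ceil_div(object_size, MAX_PARTS);
    let rounded = ceil_div(needed, MIB) * MIB;
    Some(rounded.clamp(MIN_PART_SIZE, MAX_PART_SIZE))
}

/// Number of parts produced when splitting `object_size` into `part_size` pieces.
pub fn part_count(object_size: u64, part_size: u64) -> u64 {
    ceil_div(object_size, part_size)
}
```

For a 100 MiB object this yields the 5 MiB minimum and 20 parts; for the 5 TiB maximum it yields 525 MiB parts, which stays under the 10,000-part cap.
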

## 4. API

### 4.1 gRPC Services

#### Object Service (`lightningstor.v1.ObjectService`)
```protobuf
service ObjectService {
  // Object operations
  rpc PutObject(PutObjectRequest) returns (PutObjectResponse);
  rpc GetObject(GetObjectRequest) returns (stream GetObjectResponse);
  rpc HeadObject(HeadObjectRequest) returns (HeadObjectResponse);
  rpc DeleteObject(DeleteObjectRequest) returns (DeleteObjectResponse);
  rpc DeleteObjects(DeleteObjectsRequest) returns (DeleteObjectsResponse);
  rpc CopyObject(CopyObjectRequest) returns (CopyObjectResponse);

  // Listing
  rpc ListObjects(ListObjectsRequest) returns (ListObjectsResponse);
  rpc ListObjectVersions(ListObjectVersionsRequest) returns (ListObjectVersionsResponse);

  // Multipart
  rpc CreateMultipartUpload(CreateMultipartUploadRequest) returns (CreateMultipartUploadResponse);
  rpc UploadPart(stream UploadPartRequest) returns (UploadPartResponse);
  rpc CompleteMultipartUpload(CompleteMultipartUploadRequest) returns (CompleteMultipartUploadResponse);
  rpc AbortMultipartUpload(AbortMultipartUploadRequest) returns (AbortMultipartUploadResponse);
  rpc ListMultipartUploads(ListMultipartUploadsRequest) returns (ListMultipartUploadsResponse);
  rpc ListParts(ListPartsRequest) returns (ListPartsResponse);
}

message PutObjectRequest {
  string bucket = 1;
  string key = 2;
  bytes content = 3;                        // For small objects
  string content_type = 4;
  map<string, string> metadata = 5;
  ChecksumAlgorithm checksum_algorithm = 6;
  string checksum_value = 7;                // Pre-computed by client
}

message GetObjectRequest {
  string bucket = 1;
  string key = 2;
  optional string version_id = 3;
  optional string range = 4;                // "bytes=0-1023"
  optional string if_match = 5;             // ETag condition
  optional string if_none_match = 6;
}

message GetObjectResponse {
  ObjectMetadata metadata = 1;              // First message only
  bytes content = 2;                        // Streamed chunks
}
```
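
The `range` field of `GetObjectRequest` carries an HTTP-style byte range such as `"bytes=0-1023"`. The sketch below shows how the server might resolve that string against an object's size. `ByteRange` and `parse_range` are hypothetical names, only single ranges are handled, and multi-range requests are rejected by returning `None`.

```rust
/// Inclusive byte range, already clamped to the object's size.
#[derive(Debug, PartialEq)]
pub struct ByteRange {
    pub start: u64,
    pub end: u64,
}

/// Parse "bytes=START-END", "bytes=START-", or "bytes=-SUFFIX" for an
/// object of `size` bytes. Returns None when the range is malformed or
/// cannot be satisfied.
pub fn parse_range(header: &str, size: u64) -> Option<ByteRange> {
    let spec = header.strip_prefix("bytes=")?;
    let (first, last) = spec.split_once('-')?;
    if size == 0 {
        return None;
    }
    let range = match (first.is_empty(), last.is_empty()) {
        (true, true) => return None,
        // "-N": the final N bytes of the object.
        (true, false) => {
            let n: u64 = last.parse().ok()?;
            if n == 0 {
                return None;
            }
            ByteRange { start: size.saturating_sub(n), end: size - 1 }
        }
        // "N-": from byte N to the end.
        (false, true) => ByteRange { start: first.parse().ok()?, end: size - 1 },
        // "A-B": explicit bounds, with the end clamped to the last byte.
        (false, false) => ByteRange {
            start: first.parse().ok()?,
            end: last.parse::<u64>().ok()?.min(size - 1),
        },
    };
    if range.start > range.end {
        None
    } else {
        Some(range)
    }
}
```

An unsatisfiable range, for example one that starts past the end of the object, comes back as `None`, leaving the caller to choose the error response.
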

#### Bucket Service (`lightningstor.v1.BucketService`)
```protobuf
service BucketService {
  rpc CreateBucket(CreateBucketRequest) returns (Bucket);
  rpc HeadBucket(HeadBucketRequest) returns (HeadBucketResponse);
  rpc DeleteBucket(DeleteBucketRequest) returns (DeleteBucketResponse);
  rpc ListBuckets(ListBucketsRequest) returns (ListBucketsResponse);

  // Versioning
  rpc GetBucketVersioning(GetBucketVersioningRequest) returns (VersioningConfig);
  rpc PutBucketVersioning(PutBucketVersioningRequest) returns (VersioningConfig);

  // Tagging
  rpc GetBucketTagging(GetBucketTaggingRequest) returns (BucketTagging);
  rpc PutBucketTagging(PutBucketTaggingRequest) returns (BucketTagging);
  rpc DeleteBucketTagging(DeleteBucketTaggingRequest) returns (Empty);
}

message CreateBucketRequest {
  string name = 1;
  string org_id = 2;
  optional string project_id = 3;
  VersioningConfig versioning = 4;
  StorageClass storage_class = 5;
  map<string, string> tags = 6;
}
```

### 4.2 S3-Compatible HTTP API

The S3 HTTP layer (`lightningstor-s3`) exposes standard S3 REST endpoints.

**Bucket Operations**:
```
PUT    /{bucket}                 CreateBucket
HEAD   /{bucket}                 HeadBucket
DELETE /{bucket}                 DeleteBucket
GET    /                         ListBuckets
GET    /{bucket}?versioning      GetBucketVersioning
PUT    /{bucket}?versioning      PutBucketVersioning
```

**Object Operations**:
```
PUT    /{bucket}/{key}           PutObject
GET    /{bucket}/{key}           GetObject
HEAD   /{bucket}/{key}           HeadObject
DELETE /{bucket}/{key}           DeleteObject
POST   /{bucket}?delete          DeleteObjects (bulk)
PUT    /{bucket}/{key}           CopyObject (selected by the x-amz-copy-source header)
GET    /{bucket}?list-type=2     ListObjectsV2
GET    /{bucket}?versions        ListObjectVersions
```

**Multipart Upload**:
```
POST   /{bucket}/{key}?uploads                    CreateMultipartUpload
PUT    /{bucket}/{key}?partNumber=N&uploadId=X    UploadPart
POST   /{bucket}/{key}?uploadId=X                 CompleteMultipartUpload
DELETE /{bucket}/{key}?uploadId=X                 AbortMultipartUpload
GET    /{bucket}?uploads                          ListMultipartUploads
GET    /{bucket}/{key}?uploadId=X                 ListParts
```

**Presigned URLs**:
```
GET    /{bucket}/{key}?X-Amz-Algorithm=AWS4-HMAC-SHA256&...
PUT    /{bucket}/{key}?X-Amz-Algorithm=AWS4-HMAC-SHA256&...
```

### 4.3 Authentication

**S3 Signature V4**:
- AWS Signature Version 4 for S3 HTTP API
- Access Key ID mapped to aegis service account
- Secret Access Key stored in aegis as credential

```rust
pub struct S3Credentials {
    pub access_key_id: String,      // Mapped to principal
    pub secret_access_key: String,  // Stored encrypted
    pub principal_id: String,       // aegis principal reference
    pub org_id: String,
    pub project_id: Option<String>,
    pub created_at: u64,
    pub expires_at: Option<u64>,
}
```

**gRPC Authentication**:
- aegis internal tokens (mTLS for service-to-service)
- Bearer token in `authorization` metadata

### 4.4 Client Library
```rust
use lightningstor_client::LightningStorClient;

let client = LightningStorClient::connect("http://127.0.0.1:9000").await?;

// Create bucket
client.create_bucket(CreateBucketRequest {
    name: "my-bucket".into(),
    org_id: "org-1".into(),
    project_id: Some("proj-1".into()),
    ..Default::default()
}).await?;

// Put object
client.put_object(PutObjectRequest {
    bucket: "my-bucket".into(),
    key: "path/to/object.txt".into(),
    content: b"Hello, World!".to_vec(),
    content_type: "text/plain".into(),
    ..Default::default()
}).await?;

// Get object (streaming)
let mut stream = client.get_object(GetObjectRequest {
    bucket: "my-bucket".into(),
    key: "path/to/object.txt".into(),
    ..Default::default()
}).await?;

// Collect the streamed chunks into one buffer.
// Calling `next()` assumes a `StreamExt` trait is in scope.
let mut content = Vec::new();
while let Some(chunk) = stream.next().await {
    content.extend(chunk?.content);
}

// Multipart upload for large files
let upload = client.create_multipart_upload(CreateMultipartUploadRequest {
    bucket: "my-bucket".into(),
    key: "large-file.bin".into(),
    ..Default::default()
}).await?;

// `file_chunks` is an iterator over the file's parts; part numbers start at 1.
let mut parts = Vec::new();
for (i, chunk) in file_chunks.enumerate() {
    let part = client.upload_part(upload.upload_id.clone(), i as u32 + 1, chunk).await?;
    parts.push(part);
}

client.complete_multipart_upload(upload.upload_id, parts).await?;
```

## 5. Storage Backend

### 5.1 Backend Trait
```rust
#[async_trait]
pub trait StorageBackend: Send + Sync {
    /// Store a chunk of data
    async fn put_chunk(&self, chunk_id: &str, data: &[u8]) -> Result<()>;

    /// Retrieve a chunk
    async fn get_chunk(&self, chunk_id: &str) -> Result<Vec<u8>>;

    /// Retrieve a range within a chunk
    async fn get_chunk_range(&self, chunk_id: &str, offset: u64, length: u64) -> Result<Vec<u8>>;

    /// Delete a chunk
    async fn delete_chunk(&self, chunk_id: &str) -> Result<bool>;

    /// Check if chunk exists
    async fn chunk_exists(&self, chunk_id: &str) -> Result<bool>;

    /// Backend capabilities
    fn capabilities(&self) -> BackendCapabilities;
}

pub struct BackendCapabilities {
    pub max_chunk_size: u64,
    pub supports_range_reads: bool,
    pub supports_streaming: bool,
    pub durability: DurabilityLevel,
}

pub enum DurabilityLevel {
    Local,              // Single node
    Replicated(u32),    // N-way replication
    ErasureCoded,       // EC with configurable params
}
```

### 5.2 Backend Implementations

**Local Filesystem** (development/single-node):
```rust
pub struct LocalFsBackend {
    base_path: PathBuf,
    shard_depth: u8,        // Directory sharding depth
}

// Storage layout:
// {base_path}/{shard1}/{shard2}/{chunk_id}
// e.g., /data/chunks/ab/cd/abcd1234-...
```
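
The sharded layout can be derived from the chunk ID alone. A minimal sketch follows, assuming each shard directory is named after two leading characters of the ID (as in the `ab/cd/abcd1234-...` example) and that IDs consist only of hex digits and hyphens. `chunk_path` is a hypothetical helper, not part of the documented backend.

```rust
use std::path::{Path, PathBuf};

/// Build `{base_path}/{shard1}/.../{chunk_id}` using `shard_depth`
/// two-character shard directories taken from the start of the ID.
/// Returns None for IDs that are too short or contain unexpected
/// characters, which also keeps `/` and `..` out of the path.
pub fn chunk_path(base_path: &Path, chunk_id: &str, shard_depth: u8) -> Option<PathBuf> {
    // Assumption: chunk IDs are UUID-like (hex digits and hyphens only).
    if !chunk_id.chars().all(|c| c.is_ascii_hexdigit() || c == '-') {
        return None;
    }
    let needed = usize::from(shard_depth) * 2;
    let prefix: Vec<char> = chunk_id.chars().take(needed).collect();
    if prefix.len() < needed || !prefix.iter().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    let mut path = base_path.to_path_buf();
    for pair in prefix.chunks(2) {
        path.push(pair.iter().collect::<String>());
    }
    path.push(chunk_id);
    Some(path)
}
```

Sharding keeps any single directory from accumulating millions of entries; with two hex characters per level, `shard_depth = 2` spreads chunks over 65,536 leaf directories.
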

**Distributed Backend** (production):
```rust
pub struct DistributedBackend {
    nodes: Vec<NodeEndpoint>,
    replication_factor: u32,
    placement_strategy: PlacementStrategy,
}

pub enum PlacementStrategy {
    Random,
    ConsistentHash,
    ZoneAware { zones: Vec<Zone> },
}
```
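
The spec does not say how `ConsistentHash` placement chooses nodes. As an illustration only, the sketch below uses rendezvous (highest-random-weight) hashing, a related technique swapped in here because it fits in a few lines: every node is scored against the chunk ID and the top `replication_factor` nodes hold the replicas. `score` and `place_chunk` are hypothetical names, node endpoints are assumed to be distinct, and `DefaultHasher` is not guaranteed to be stable across Rust releases, so a real deployment would need a fixed hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministic score for a (node, chunk) pair.
fn score(node: &str, chunk_id: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    node.hash(&mut hasher);
    chunk_id.hash(&mut hasher);
    hasher.finish()
}

/// Pick up to `replication_factor` distinct nodes for a chunk by ranking
/// all nodes on their score and keeping the highest ones.
pub fn place_chunk<'a>(
    nodes: &'a [String],
    chunk_id: &str,
    replication_factor: usize,
) -> Vec<&'a str> {
    let mut ranked: Vec<(u64, &'a str)> = nodes
        .iter()
        .map(|n| (score(n, chunk_id), n.as_str()))
        .collect();
    // Highest score first; ties fall back to comparing the node names.
    ranked.sort_by(|a, b| b.cmp(a));
    ranked
        .into_iter()
        .take(replication_factor)
        .map(|(_, node)| node)
        .collect()
}
```

The property this shares with consistent hashing is stability: removing a node that was not selected for a chunk leaves that chunk's placement unchanged.
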

### 5.3 Chunk Management
```rust
pub struct ChunkManager {
    backend: Arc<dyn StorageBackend>,
    chunk_size: u64,        // Default: 8 MiB
    min_chunk_size: u64,    // 1 MiB
    max_chunk_size: u64,    // 64 MiB
}

impl ChunkManager {
    /// Split content into chunks and store
    pub async fn store_object(&self, content: &[u8]) -> Result<Vec<ChunkRef>>;

    /// Retrieve object content from chunks
    pub async fn retrieve_object(&self, chunks: &[ChunkRef]) -> Result<Vec<u8>>;

    /// Retrieve range across chunks
    pub async fn retrieve_range(
        &self,
        chunks: &[ChunkRef],
        offset: u64,
        length: u64
    ) -> Result<Vec<u8>>;

    /// Delete all chunks for an object
    pub async fn delete_object(&self, chunks: &[ChunkRef]) -> Result<()>;
}
```
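
The arithmetic behind `store_object` and `retrieve_range` can be shown without any I/O. The sketch below is an assumption about how the split might work, not the actual `ChunkManager` code: `plan_chunks` cuts an object into fixed-size chunks and `slices_for_range` maps a byte range onto the chunks it touches. Both names are hypothetical, the `ChunkRef` here omits the checksum field, and placeholder IDs stand in for UUIDs.

```rust
const MIB: u64 = 1024 * 1024;

/// Trimmed-down `ChunkRef` (checksum omitted for brevity).
#[derive(Debug, Clone, PartialEq)]
pub struct ChunkRef {
    pub chunk_id: String,
    pub offset: u64,
    pub size: u64,
}

/// Split an object of `object_size` bytes into `chunk_size` pieces;
/// only the last chunk may be shorter.
pub fn plan_chunks(object_size: u64, chunk_size: u64) -> Vec<ChunkRef> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let mut chunks = Vec::new();
    let mut offset: u64 = 0;
    while offset < object_size {
        let size = chunk_size.min(object_size - offset);
        // Placeholder ID; the spec uses UUIDs.
        let chunk_id = format!("chunk-{}", chunks.len());
        chunks.push(ChunkRef { chunk_id, offset, size });
        offset += size;
    }
    chunks
}

/// For a read of `length` bytes starting at `offset`, return
/// (chunk index, offset inside the chunk, bytes to read) for each chunk touched.
pub fn slices_for_range(chunks: &[ChunkRef], offset: u64, length: u64) -> Vec<(usize, u64, u64)> {
    let end = offset.saturating_add(length); // exclusive end of the request
    chunks
        .iter()
        .enumerate()
        .filter_map(|(i, c)| {
            let start = offset.max(c.offset);
            let stop = end.min(c.offset + c.size);
            if start < stop {
                Some((i, start - c.offset, stop - start))
            } else {
                None
            }
        })
        .collect()
}
```

With the default 8 MiB chunk size, a 20 MiB object becomes chunks of 8, 8, and 4 MiB, and a 2 MiB read starting at offset 7 MiB touches the last 1 MiB of the first chunk and the first 1 MiB of the second. Each tuple lines up with the arguments of `get_chunk_range` on the backend trait.
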

## 6. Metadata Storage

### 6.1 Chainfire Key Schema

**Buckets**:
```
lightningstor/buckets/{org_id}/{bucket_name}                      # Bucket record
lightningstor/buckets/{org_id}/{project_id}/{bucket_name}         # Project-scoped
lightningstor/buckets/by-project/{project_id}/{bucket_name}       # Project index
```

**Objects**:
```
lightningstor/objects/{org_id}/{bucket}/{key}                     # Current version
lightningstor/objects/{org_id}/{bucket}/{key}/v/{version_id}      # Specific version
lightningstor/objects/{org_id}/{bucket}/{key}/versions            # Version list
```

**Multipart Uploads**:
```
lightningstor/uploads/{upload_id}                                 # Upload record
lightningstor/uploads/by-bucket/{bucket}/{upload_id}              # Bucket index
lightningstor/uploads/{upload_id}/parts/{part_number}             # Part records
```

**S3 Credentials**:
```
lightningstor/credentials/{access_key_id}                         # Credential lookup
lightningstor/credentials/by-principal/{principal_id}/{key_id}    # Principal index
```

### 6.2 Object Listing
```rust
pub struct ListObjectsRequest {
    pub bucket: String,
    pub prefix: Option<String>,
    pub delimiter: Option<String>,      // For hierarchy (usually "/")
    pub max_keys: u32,                  // Default: 1000
    pub continuation_token: Option<String>,
    pub start_after: Option<String>,
}

pub struct ListObjectsResponse {
    pub contents: Vec<ObjectSummary>,
    pub common_prefixes: Vec<String>,   // "Directories" when using delimiter
    pub is_truncated: bool,
    pub next_continuation_token: Option<String>,
}
```
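
How `prefix`, `delimiter`, and `max_keys` combine is easiest to see on a plain sorted key list. The sketch below is a simplified in-memory model, not the Chainfire-backed implementation: `Listing` and `list_keys` are hypothetical names, keys are bare strings instead of `ObjectSummary` values, and continuation tokens are left out.

```rust
/// Simplified listing result (keys only, no continuation token).
#[derive(Debug, Default, PartialEq)]
pub struct Listing {
    pub contents: Vec<String>,
    pub common_prefixes: Vec<String>,
    pub is_truncated: bool,
}

/// List `keys` (which must be sorted) under `prefix`. Keys that contain the
/// delimiter after the prefix are rolled up into one common prefix each.
pub fn list_keys(keys: &[&str], prefix: &str, delimiter: Option<&str>, max_keys: usize) -> Listing {
    let mut out = Listing::default();
    for &key in keys.iter().filter(|k| k.starts_with(prefix)) {
        // Compute the common prefix this key rolls up into, if any.
        let rolled_up = delimiter.filter(|d| !d.is_empty()).and_then(|d| {
            let rest = &key[prefix.len()..];
            rest.find(d)
                .map(|i| key[..prefix.len() + i + d.len()].to_string())
        });
        // Sorted input means equal common prefixes are adjacent.
        if let Some(cp) = &rolled_up {
            if out.common_prefixes.last() == Some(cp) {
                continue;
            }
        }
        // Each key and each common prefix counts once toward max_keys.
        if out.contents.len() + out.common_prefixes.len() == max_keys {
            out.is_truncated = true;
            break;
        }
        match rolled_up {
            Some(cp) => out.common_prefixes.push(cp),
            None => out.contents.push(key.to_string()),
        }
    }
    out
}
```

Listing with an empty prefix and `/` as delimiter returns top-level keys in `contents` and one entry per "directory" in `common_prefixes`; repeating the call with a returned common prefix as the new prefix walks one level down.
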

## 7. Multi-Tenancy

### 7.1 Scope Hierarchy
```
System (platform operators)
  └─ Organization (tenant boundary)
      ├─ Org-level buckets (shared across projects)
      └─ Project (workload isolation)
          └─ Project-level buckets
```

### 7.2 Bucket Scoping
```rust
pub enum BucketScope {
    /// Bucket accessible to all projects in org
    Organization { org_id: String },

    /// Bucket scoped to specific project
    Project { org_id: String, project_id: String },
}

impl Bucket {
    pub fn scope(&self) -> BucketScope {
        match &self.project_id {
            Some(pid) => BucketScope::Project {
                org_id: self.org_id.clone(),
                project_id: pid.clone()
            },
            None => BucketScope::Organization {
                org_id: self.org_id.clone()
            },
        }
    }
}
```

### 7.3 Access Control Integration
```rust
// aegis action patterns for lightningstor
const ACTIONS: &[&str] = &[
    "lightningstor:buckets:create",
    "lightningstor:buckets:get",
    "lightningstor:buckets:list",
    "lightningstor:buckets:delete",
    "lightningstor:objects:put",
    "lightningstor:objects:get",
    "lightningstor:objects:list",
    "lightningstor:objects:delete",
    "lightningstor:objects:copy",
    "lightningstor:uploads:create",
    "lightningstor:uploads:complete",
    "lightningstor:uploads:abort",
];

// Resource path format
// org/{org_id}/project/{project_id}/bucket/{bucket_name}
// org/{org_id}/project/{project_id}/bucket/{bucket_name}/object/{key}

async fn authorize_object_access(
    iam: &IamClient,
    principal: &PrincipalRef,
    action: &str,
    bucket: &Bucket,
    key: Option<&str>,
) -> Result<()> {
    let resource = ResourceRef {
        kind: match key {
            Some(_) => "object".into(),
            None => "bucket".into(),
        },
        id: key.unwrap_or(&bucket.name).into(),
        org_id: bucket.org_id.clone(),
        project_id: bucket.project_id.clone().unwrap_or_default(),
        ..Default::default()
    };

    let allowed = iam.authorize(principal, action, &resource).await?;
    if !allowed {
        return Err(Error::AccessDenied);
    }
    Ok(())
}
```

### 7.4 Quotas
```rust
pub struct StorageQuota {
    pub scope: BucketScope,
    pub limits: StorageLimits,
    pub usage: StorageUsage,
}

pub struct StorageLimits {
    pub max_buckets: Option<u32>,
    pub max_total_size_bytes: Option<u64>,
    pub max_objects_per_bucket: Option<u64>,
    pub max_object_size_bytes: Option<u64>,
}

pub struct StorageUsage {
    pub bucket_count: u32,
    pub total_size_bytes: u64,
    pub object_count: u64,
}
```
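
A quota check before accepting a new object might look like the sketch below. It is an illustration under assumptions: a limit of `None` means "unlimited", the per-bucket object count is passed in separately because `StorageUsage` only tracks a scope-wide total, and overwrites of existing keys are not modeled. `QuotaError` and `check_put` are hypothetical names.

```rust
#[derive(Debug, Default)]
pub struct StorageLimits {
    pub max_buckets: Option<u32>,
    pub max_total_size_bytes: Option<u64>,
    pub max_objects_per_bucket: Option<u64>,
    pub max_object_size_bytes: Option<u64>,
}

#[derive(Debug, Default)]
pub struct StorageUsage {
    pub bucket_count: u32,
    pub total_size_bytes: u64,
    pub object_count: u64,
}

/// Which limit a write would break (hypothetical error type).
#[derive(Debug, PartialEq)]
pub enum QuotaError {
    ObjectTooLarge,
    TotalSizeExceeded,
    ObjectCountExceeded,
}

/// Decide whether a new object of `new_object_size` bytes may be written
/// to a bucket that currently holds `objects_in_bucket` objects.
pub fn check_put(
    limits: &StorageLimits,
    usage: &StorageUsage,
    objects_in_bucket: u64,
    new_object_size: u64,
) -> Result<(), QuotaError> {
    // A limit of None is treated as unlimited.
    if limits.max_object_size_bytes.map_or(false, |max| new_object_size > max) {
        return Err(QuotaError::ObjectTooLarge);
    }
    let new_total = usage.total_size_bytes.saturating_add(new_object_size);
    if limits.max_total_size_bytes.map_or(false, |max| new_total > max) {
        return Err(QuotaError::TotalSizeExceeded);
    }
    if limits.max_objects_per_bucket.map_or(false, |max| objects_in_bucket >= max) {
        return Err(QuotaError::ObjectCountExceeded);
    }
    Ok(())
}
```

In the S3 layer, a failed check would most naturally surface as the `QuotaExceeded` or `EntityTooLarge` error code.
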

## 8. Configuration

### 8.1 Config File Format (TOML)
```toml
[server]
grpc_addr = "0.0.0.0:9001"
s3_addr = "0.0.0.0:9000"

[server.tls]
cert_file = "/etc/lightningstor/tls/server.crt"
key_file = "/etc/lightningstor/tls/server.key"
ca_file = "/etc/lightningstor/tls/ca.crt"

[metadata]
backend = "chainfire"
chainfire_endpoints = ["http://chainfire-1:2379", "http://chainfire-2:2379"]

[storage]
backend = "local"                       # "local" | "distributed"
data_dir = "/var/lib/lightningstor/data"
chunk_size_bytes = 8388608              # 8 MiB
shard_depth = 2

[storage.distributed]
# For distributed backend
nodes = ["http://store-1:9002", "http://store-2:9002", "http://store-3:9002"]
replication_factor = 3
placement_strategy = "consistent_hash"

[iam]
endpoint = "http://aegis:9090"
service_account = "lightningstor"
token_path = "/var/run/secrets/iam/token"

[s3]
region = "us-east-1"                    # Default region for S3 compat
signature_version = "v4"
presigned_url_ttl_seconds = 3600

[limits]
max_object_size_bytes = 5497558138880   # 5 TiB
max_multipart_parts = 10000
multipart_upload_timeout_hours = 168    # 7 days
max_keys_per_list = 1000

[logging]
level = "info"
format = "json"
```

### 8.2 Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `LIGHTNINGSTOR_CONFIG` | - | Config file path |
| `LIGHTNINGSTOR_GRPC_ADDR` | `0.0.0.0:9001` | gRPC listen address |
| `LIGHTNINGSTOR_S3_ADDR` | `0.0.0.0:9000` | S3 HTTP listen address |
| `LIGHTNINGSTOR_LOG_LEVEL` | `info` | Log level |
| `LIGHTNINGSTOR_DATA_DIR` | `/var/lib/lightningstor` | Data directory |

### 8.3 CLI Arguments
```
lightningstor-server [OPTIONS]
  -c, --config <PATH>       Config file path
      --grpc-addr <ADDR>    gRPC listen address
      --s3-addr <ADDR>      S3 HTTP listen address
  -l, --log-level <LEVEL>   Log level
  -h, --help                Print help
  -V, --version             Print version
```

## 9. Security

### 9.1 Authentication

**S3 HTTP API**:
- AWS Signature V4 verification
- Access key mapped to aegis principal
- Request signing with secret key

**gRPC API**:
- mTLS for service-to-service
- aegis bearer tokens
- Optional API key header

### 9.2 Authorization
- All operations authorized via aegis
- Bucket-level and object-level permissions
- Scope enforcement (org/project boundaries)
- Owner-based access patterns supported

### 9.3 Data Security
- TLS 1.3 for all transport
- Server-side encryption at rest (planned)
- Client-side encryption supported
- Checksum verification on all operations

### 9.4 Audit
- All operations logged with principal, action, resource
- Integration with platform audit system
- S3 access logs (planned)

## 10. Operations

### 10.1 Deployment

**Single Node (Development)**:
```bash
lightningstor-server --config config.toml
```

**Production Cluster**:
```bash
# Multiple stateless API servers behind load balancer
lightningstor-server --config config.toml

# Shared metadata (Chainfire cluster)
# Shared blob storage (distributed backend or shared filesystem)
```

### 10.2 Monitoring

**Metrics (Prometheus)**:

| Metric | Type | Description |
|--------|------|-------------|
| `lightningstor_requests_total` | Counter | Total requests by operation |
| `lightningstor_request_duration_seconds` | Histogram | Request latency |
| `lightningstor_object_size_bytes` | Histogram | Object sizes |
| `lightningstor_objects_total` | Gauge | Total objects |
| `lightningstor_storage_bytes` | Gauge | Total storage used |
| `lightningstor_multipart_uploads_active` | Gauge | Active multipart uploads |
| `lightningstor_s3_errors_total` | Counter | S3 API errors by code |

**Health Endpoints**:
- `GET /health` - Liveness
- `GET /ready` - Readiness (metadata and storage connected)

### 10.3 Backup & Recovery
- **Metadata**: Chainfire snapshots
- **Blob data**: Backend-dependent replication
- **Cross-region**: Planned via bucket replication

## 11. Compatibility

### 11.1 API Versioning
- gRPC package: `lightningstor.v1`
- S3 API: Compatible with AWS S3 2006-03-01
- Semantic versioning for breaking changes

### 11.2 S3 Compatibility Matrix

| Operation | Status | Notes |
|-----------|--------|-------|
| PutObject | Supported | Including metadata |
| GetObject | Supported | Range requests supported |
| HeadObject | Supported | |
| DeleteObject | Supported | |
| DeleteObjects | Supported | Bulk delete |
| CopyObject | Supported | Same-bucket only initially |
| ListObjectsV2 | Supported | |
| ListObjectVersions | Supported | |
| CreateMultipartUpload | Supported | |
| UploadPart | Supported | |
| CompleteMultipartUpload | Supported | |
| AbortMultipartUpload | Supported | |
| ListMultipartUploads | Supported | |
| ListParts | Supported | |
| CreateBucket | Supported | |
| HeadBucket | Supported | |
| DeleteBucket | Supported | Must be empty |
| ListBuckets | Supported | |
| GetBucketVersioning | Supported | |
| PutBucketVersioning | Supported | |
| GetBucketTagging | Supported | |
| PutBucketTagging | Supported | |
| Presigned URLs | Supported | GET/PUT |
| Bucket Policy | Planned | |
| Lifecycle Rules | Planned | |
| Cross-Region Replication | Planned | |
| S3 Select | Not planned | |

### 11.3 SDK Compatibility
Tested with:
- AWS SDK (all languages)
- boto3 (Python)
- aws-sdk-rust
- s3cmd
- rclone
- MinIO client

## Appendix

### A. Error Codes

| Error | HTTP | Description |
|-------|------|-------------|
| NoSuchBucket | 404 | Bucket does not exist |
| NoSuchKey | 404 | Object does not exist |
| BucketAlreadyExists | 409 | Bucket name taken |
| BucketNotEmpty | 409 | Cannot delete non-empty bucket |
| AccessDenied | 403 | Permission denied |
| InvalidBucketName | 400 | Invalid bucket name format |
| InvalidArgument | 400 | Invalid request parameter |
| EntityTooLarge | 400 | Object exceeds size limit |
| InvalidPart | 400 | Invalid multipart part |
| InvalidPartOrder | 400 | Parts not in order |
| NoSuchUpload | 404 | Multipart upload not found |
| QuotaExceeded | 403 | Storage quota exceeded |
| InternalError | 500 | Server error |
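
The table translates directly into a lookup the S3 layer could use when building error responses. This is a minimal sketch: the `S3ErrorCode` enum and its method names are hypothetical, and only the codes and statuses come from the table.

```rust
/// Error codes from the table above (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum S3ErrorCode {
    NoSuchBucket,
    NoSuchKey,
    BucketAlreadyExists,
    BucketNotEmpty,
    AccessDenied,
    InvalidBucketName,
    InvalidArgument,
    EntityTooLarge,
    InvalidPart,
    InvalidPartOrder,
    NoSuchUpload,
    QuotaExceeded,
    InternalError,
}

impl S3ErrorCode {
    /// HTTP status code, as listed in the error table.
    pub fn http_status(self) -> u16 {
        use S3ErrorCode::*;
        match self {
            InvalidBucketName | InvalidArgument | EntityTooLarge | InvalidPart
            | InvalidPartOrder => 400,
            AccessDenied | QuotaExceeded => 403,
            NoSuchBucket | NoSuchKey | NoSuchUpload => 404,
            BucketAlreadyExists | BucketNotEmpty => 409,
            InternalError => 500,
        }
    }

    /// The code string as it appears in the table; the variant names
    /// match the codes, so Debug formatting is enough.
    pub fn code(self) -> String {
        format!("{self:?}")
    }
}
```

Because the `match` has no wildcard arm, adding a new variant without assigning it a status is a compile error.
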

### B. Port Assignments

| Port | Protocol | Purpose |
|------|----------|---------|
| 9000 | HTTP | S3-compatible API |
| 9001 | gRPC | Native API |
| 9002 | gRPC | Storage node (distributed) |

### C. Glossary
- **Bucket**: Container for objects, scoped to org/project
- **Object**: Stored blob with metadata, identified by key
- **Key**: Object identifier within bucket (path-like string)
- **Version**: Specific version of an object (when versioning enabled)
- **Chunk**: Internal storage unit for object data
- **Multipart Upload**: Chunked upload mechanism for large objects
- **ETag**: Entity tag (content hash) for cache validation
- **Presigned URL**: Time-limited URL for direct object access

### D. Integration Examples

**PlasmaVMC Image Storage**:
```rust
// Store VM image in LightningStor
let client = LightningStorClient::connect(config.image_store.endpoint).await?;

client.put_object(PutObjectRequest {
    bucket: "vm-images".into(),
    key: format!("{}/{}/{}", org_id, image_id, version),
    content_type: "application/octet-stream".into(),
    ..Default::default()
}).await?;
```

**Backup Storage**:
```rust
// Store backups with versioning
client.create_bucket(CreateBucketRequest {
    name: "backups".into(),
    org_id: org_id.into(),
    versioning: VersioningConfig::Enabled,
    ..Default::default()
}).await?;
```

### E. Performance Considerations
- **Chunk size**: The 8 MiB default balances throughput against memory use
- **Parallel uploads**: Use multipart upload for objects larger than 100 MiB
- **Connection pooling**: Reuse gRPC/HTTP connections
- **Metadata caching**: Hot bucket/object metadata is cached
- **Range requests**: Use them for partial access to avoid reading the full object