K8s Hosting Architecture Research
Executive Summary
This document evaluates three architecture options for bringing Kubernetes hosting capabilities to PlasmaCloud: k3s-style architecture, k0s-style architecture, and a custom Rust implementation. After analyzing complexity, integration requirements, multi-tenant isolation, development timeline, and production reliability, we recommend adopting a k3s-style architecture with selective component replacement as the optimal path to MVP.
The k3s approach provides a battle-tested foundation with full Kubernetes API compatibility, enabling rapid time-to-market (3-4 months to MVP) while allowing strategic integration with PlasmaCloud components through standard interfaces (CNI, CSI, CRI, LoadBalancer controllers). Multi-tenant isolation requirements can be satisfied using namespace separation, RBAC, and network policies. While this approach involves some Go code (k3s itself, containerd), the integration points with PlasmaCloud's Rust components are well-defined through standard Kubernetes interfaces.
Option 1: k3s-style Architecture
Overview
k3s is a CNCF-certified lightweight Kubernetes distribution packaged as a single <70MB binary. It consolidates the Kubernetes control plane components (API server, scheduler, controller manager) together with the node components (kubelet, kube-proxy) into a single process, dramatically simplifying deployment and operations. Despite its lightweight nature, k3s maintains full Kubernetes API compatibility and supports both single-server and high-availability configurations.
Key Features
Single Binary Architecture
- All control plane components run in a single Server or Agent process
- Containerd handles container lifecycle functions (CRI integration)
- Memory footprint: <512MB for control plane, <50MB for worker nodes
- Fast deployment: typically under 30 seconds
Flexible Datastore Options
- SQLite (default): Embedded, zero-configuration, suitable for single-server setups
- Embedded etcd: For high-availability (HA) multi-server deployments
- External datastores: MySQL, PostgreSQL, etcd (via Kine proxy layer)
Bundled Components
- Container Runtime: containerd (embedded)
- CNI: Flannel with VXLAN backend (default, replaceable)
- Ingress: Traefik (default, replaceable)
- Service Load Balancer: ServiceLB (Klipper-lb, replaceable)
- DNS: CoreDNS
- Helm Controller: Deploys Helm charts via CRDs
Component Flexibility
All embedded components can be disabled, allowing replacement with custom implementations:
k3s server --disable traefik --disable servicelb --flannel-backend=none
Pros
- Rapid Time-to-Market: Production-ready solution with minimal development effort
- Battle-Tested: Used in thousands of production deployments (e.g., Chick-fil-A's 2000+ edge locations)
- Full API Compatibility: 100% Kubernetes API coverage, certified by CNCF
- Low Resource Overhead: Efficient resource usage suitable for both edge and cloud deployments
- Easy Operations: Single binary simplifies upgrades, patching, and deployment automation
- Proven Multi-Tenancy: Standard Kubernetes namespace/RBAC isolation patterns
- Integration Points: Well-defined interfaces (CNI, CSI, CRI, Service controllers) for custom component integration
- Active Ecosystem: Large community, regular updates, extensive documentation
Cons
- Go Codebase: k3s and containerd are written in Go, not Rust (potential operational/debugging complexity)
- Limited Control: Core components are opaque; debugging deep issues requires Go expertise
- Component Coupling: While replaceable, default components are tightly integrated
- Not Pure Rust: Doesn't align with PlasmaCloud's Rust-first philosophy
- Overhead: Still carries full Kubernetes complexity internally despite simplified deployment
Integration Analysis
PlasmaVMC (Compute Backend)
- Approach: Keep containerd as default CRI for container workloads
- Alternative: Develop custom CRI implementation to run Pods as lightweight VMs (Firecracker/KVM)
- Effort: High (6-8 weeks for custom CRI); Low (1 week if using containerd)
- Recommendation: Start with containerd, consider custom CRI in Phase 2 for VM-based pod isolation
PrismNET (Pod Networking)
- Approach: Replace Flannel with custom CNI plugin backed by PrismNET
- Interface: Standard CNI 1.0.0 specification
- Implementation: Rust binary + daemon for pod NIC creation, IPAM, routing via PrismNET SDN
- Effort: 4-5 weeks (CNI plugin + PrismNET integration)
- Benefits: Unified network control, OVN integration, advanced SDN features
FlashDNS (Service Discovery)
- Approach: Replace CoreDNS or run as secondary DNS with custom controller
- Implementation: K8s controller watches Services/Endpoints, updates FlashDNS records
- Interface: Standard K8s informers/client-go (or kube-rs)
- Effort: 2-3 weeks (controller + FlashDNS API integration)
- Benefits: Pattern-based reverse DNS, unified DNS management
FiberLB (LoadBalancer Services)
- Approach: Replace ServiceLB with custom LoadBalancer controller
- Implementation: K8s controller watches Services (type=LoadBalancer), provisions FiberLB L4/L7 frontends
- Interface: Standard Service controller pattern
- Effort: 3-4 weeks (controller + FiberLB API integration)
- Benefits: Advanced L7 features, unified load balancing
LightningStor (Persistent Volumes)
- Approach: Develop CSI driver for LightningStor
- Interface: CSI 1.x specification (ControllerService + NodeService)
- Implementation: Rust CSI driver (gRPC server) + sidecar containers
- Effort: 5-6 weeks (CSI driver + volume provisioning/attach/mount logic)
- Benefits: Dynamic volume provisioning, snapshots, cloning
IAM (Authentication/RBAC)
- Approach: K8s webhook authentication + custom authorizer backed by IAM
- Implementation: Webhook server validates tokens via IAM, maps users to K8s RBAC roles
- Interface: Standard K8s authentication/authorization webhooks
- Effort: 3-4 weeks (webhook server + IAM integration + RBAC mapping)
- Benefits: Unified identity, PlasmaCloud IAM policies enforced in K8s
Effort Estimate
Phase 1: MVP (3-4 months)
- Week 1-2: k3s deployment, basic cluster setup, testing
- Week 3-6: PrismNET CNI plugin development
- Week 7-9: FiberLB LoadBalancer controller
- Week 10-12: IAM authentication webhook
- Week 13-14: Integration testing, documentation
- Week 15-16: Beta testing, hardening
Phase 2: Advanced Features (2-3 months)
- FlashDNS service discovery controller
- LightningStor CSI driver
- Custom CRI for VM-based pods (optional)
- Multi-tenant isolation enhancements
Total: 5-7 months to production-ready platform
Option 2: k0s-style Architecture
Overview
k0s is an open-source, all-inclusive Kubernetes distribution shipped as a single binary but architected for strong component modularity. Unlike k3s's process consolidation, k0s runs components as separate processes supervised by the k0s binary, enabling true control plane/worker separation and flexible component replacement. The k0s approach emphasizes production-grade deployments with enhanced security isolation.
Key Features
Modular Component Architecture
- k0s binary acts as process supervisor for control plane components
- Components run as separate "naked" processes (not containers)
- No kubelet or container runtime on controllers by default
- Workers use containerd (high-level) + runc (low-level) by default
True Control Plane/Worker Separation
- Controllers cannot run workloads (no kubelet by default)
- Protects controllers from rogue workloads
- Reduces control plane attack surface
- Workers cannot access etcd directly (security isolation)
Flexible Component Replacement
- Each component can be replaced independently
- Clear boundaries between components
- Easier to swap CNI, CSI, or other plugins
- Supports custom infrastructure controllers
k0smotron Extension
- Control plane runs on existing cluster
- No direct networking between control/worker planes
- Enhanced multi-tenant isolation
- Suitable for hosted Kubernetes offerings
Pros
- Production-Grade Design: True control/worker separation enhances security
- Component Modularity: Easier to replace individual components without affecting others
- Security Isolation: Workers cannot access etcd; controllers isolated from workloads
- Battle-Tested: Used in enterprise production environments
- Full API Compatibility: 100% Kubernetes API coverage, CNCF-certified
- Clear Boundaries: Process-level separation simplifies understanding and debugging
- Multi-Tenancy Ready: k0smotron provides excellent hosted K8s architecture
- Integration Flexibility: Modular design makes PlasmaCloud component integration cleaner
Cons
- Go Codebase: k0s is written in Go (same as k3s)
- Higher Resource Usage: Separate processes consume more memory than k3s's unified approach
- Complex Architecture: Process supervision adds operational complexity
- Smaller Community: Less adoption than k3s, fewer community resources
- Not Pure Rust: Doesn't align with Rust-first philosophy
- Learning Curve: Unique architecture requires understanding k0s-specific patterns
Integration Analysis
PlasmaVMC (Compute Backend)
- Approach: Replace containerd with custom CRI or run containerd for containers
- Benefits: Modular design makes CRI replacement cleaner than k3s
- Effort: 6-8 weeks for custom CRI (similar to k3s)
- Recommendation: Modular architecture supports phased CRI replacement
PrismNET (Pod Networking)
- Approach: Custom CNI plugin (same as k3s)
- Benefits: Clean component boundary for CNI integration
- Effort: 4-5 weeks (identical to k3s)
- Advantages: k0s's modularity makes CNI swap more straightforward
FlashDNS (Service Discovery)
- Approach: Controller watching Services/Endpoints (same as k3s)
- Benefits: Process separation provides clearer integration point
- Effort: 2-3 weeks (identical to k3s)
FiberLB (LoadBalancer Services)
- Approach: Custom LoadBalancer controller (same as k3s)
- Benefits: k0s's worker isolation protects FiberLB control plane
- Effort: 3-4 weeks (identical to k3s)
LightningStor (Persistent Volumes)
- Approach: CSI driver (same as k3s)
- Benefits: Modular design simplifies CSI deployment
- Effort: 5-6 weeks (identical to k3s)
IAM (Authentication/RBAC)
- Approach: Authentication webhook (same as k3s)
- Benefits: Control plane isolation enhances IAM security
- Effort: 3-4 weeks (identical to k3s)
Effort Estimate
Phase 1: MVP (4-5 months)
- Week 1-3: k0s deployment, cluster setup, understanding architecture
- Week 4-7: PrismNET CNI plugin development
- Week 8-10: FiberLB LoadBalancer controller
- Week 11-13: IAM authentication webhook
- Week 14-16: Integration testing, documentation
- Week 17-18: Beta testing, hardening
Phase 2: Advanced Features (2-3 months)
- FlashDNS service discovery controller
- LightningStor CSI driver
- k0smotron evaluation for multi-tenant isolation
- Custom CRI exploration
Total: 6-8 months to production-ready platform
Note: Timeline is longer than k3s due to:
- Smaller community (fewer examples/resources)
- More complex architecture requiring deeper understanding
- Less documentation for edge cases
Option 3: Custom Rust Implementation
Overview
Build a minimal Kubernetes API server and control plane components from scratch in Rust, implementing only essential APIs required for container orchestration. This approach provides maximum control and alignment with PlasmaCloud's Rust-first philosophy but requires significant development effort to reach production readiness.
Minimal K8s API Subset
Core APIs (Essential)
Core API Group (/api/v1)
- Namespaces: Tenant isolation, resource grouping
- Pods: Container specifications, lifecycle management
- Services: Network service discovery, load balancing
- ConfigMaps: Configuration data injection
- Secrets: Sensitive data storage
- PersistentVolumes: Storage resources
- PersistentVolumeClaims: Storage requests
- Nodes: Worker node registration and status
- Events: Audit trail and debugging
Apps API Group (/apis/apps/v1)
- Deployments: Declarative pod management, rolling updates
- StatefulSets: Stateful applications with stable network IDs
- DaemonSets: One pod per node (logging, monitoring agents)
Batch API Group (/apis/batch/v1)
- Jobs: Run-to-completion workloads
- CronJobs: Scheduled job execution
RBAC API Group (/apis/rbac.authorization.k8s.io/v1)
- Roles/RoleBindings: Namespace-scoped permissions
- ClusterRoles/ClusterRoleBindings: Cluster-wide permissions
Networking API Group (/apis/networking.k8s.io/v1)
- NetworkPolicies: Pod-to-pod traffic control
- Ingress: HTTP/HTTPS routing (optional for MVP)
Storage API Group (/apis/storage.k8s.io/v1)
- StorageClasses: Dynamic volume provisioning
- VolumeAttachments: Volume lifecycle management
Total Estimate: ~25-30 API resource types (vs. 50+ in full Kubernetes)
Architecture Design
Component Stack
API Server (Rust)
- RESTful API endpoint (actix-web/axum)
- Authentication/authorization (IAM integration)
- Admission controllers
- OpenAPI spec generation
- Watch API (WebSocket for resource changes)
Controller Manager (Rust)
- Deployment controller (replica management)
- Service controller (endpoint management)
- Job controller (batch workload management)
- Built using kube-rs runtime abstractions
Scheduler (Rust)
- Pod-to-node assignment
- Resource-aware scheduling (CPU, memory, storage)
- Affinity/anti-affinity rules
- Extensible filter/score framework
Kubelet (Rust or adapt existing)
- Pod lifecycle management on nodes
- CRI client for container runtime (containerd/PlasmaVMC)
- Volume mounting (CSI client)
- Health checks (liveness/readiness probes)
- Challenge: Complex component, may need to use existing Go kubelet
Datastore (FlareDB or etcd)
- Cluster state storage
- Watch API support (real-time change notifications)
- Strong consistency guarantees
- Option A: Use FlareDB (Rust, PlasmaCloud-native)
- Option B: Use embedded etcd (proven, standard)
Integration Components
- CNI plugin for PrismNET (same as other options)
- CSI driver for LightningStor (same as other options)
- LoadBalancer controller for FiberLB (same as other options)
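The heart of the custom API server in the stack above is a consistent store with change notification, mimicking the Kubernetes list/watch pattern. A minimal std-only sketch (the in-memory map stands in for FlareDB or etcd, which would add durability and consistency; key and event shapes are illustrative):

```rust
use std::collections::HashMap;
use std::sync::mpsc;

// Change events delivered to watchers, mirroring the K8s
// watch stream (ADDED/DELETED; MODIFIED omitted for brevity).
#[derive(Clone, Debug, PartialEq)]
enum Event {
    Added(String),
    Deleted(String),
}

// Keyed object store that fans change events out to watchers.
struct Store {
    objects: HashMap<String, String>, // "namespace/name" -> serialized object
    watchers: Vec<mpsc::Sender<Event>>,
}

impl Store {
    fn new() -> Self {
        Store { objects: HashMap::new(), watchers: Vec::new() }
    }

    // Register a watcher; a real Watch API would stream these
    // events over a chunked HTTP response or WebSocket.
    fn watch(&mut self) -> mpsc::Receiver<Event> {
        let (tx, rx) = mpsc::channel();
        self.watchers.push(tx);
        rx
    }

    fn put(&mut self, key: &str, obj: &str) {
        self.objects.insert(key.to_string(), obj.to_string());
        for w in &self.watchers {
            let _ = w.send(Event::Added(key.to_string()));
        }
    }

    fn delete(&mut self, key: &str) {
        if self.objects.remove(key).is_some() {
            for w in &self.watchers {
                let _ = w.send(Event::Deleted(key.to_string()));
            }
        }
    }
}

fn main() {
    let mut store = Store::new();
    let rx = store.watch();
    store.put("tenant-a/web", "{\"kind\":\"Pod\"}");
    store.delete("tenant-a/web");
    assert_eq!(rx.recv().unwrap(), Event::Added("tenant-a/web".to_string()));
    assert_eq!(rx.recv().unwrap(), Event::Deleted("tenant-a/web".to_string()));
}
```

Controllers and the scheduler would each hold a watch receiver and reconcile on every event, which is exactly the informer pattern the kube-rs runtime abstracts.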
Libraries and Ecosystem
- kube-rs: Kubernetes client library (API bindings, controller runtime)
- k8s-openapi: Auto-generated Rust bindings for K8s API types
- krator: Operator framework built on kube-rs
- Krustlet: Example Kubelet implementation in Rust (WebAssembly focus)
Pros
- Pure Rust: Full alignment with PlasmaCloud philosophy (memory safety, performance, maintainability)
- Maximum Control: Complete ownership of codebase, no black boxes
- Minimal Complexity: Only implement APIs actually needed, no legacy cruft
- Deep Integration: Native integration with Chainfire, FlareDB, IAM at code level
- Optimized for PlasmaCloud: Architecture tailored to our specific use cases
- No Go Dependencies: Eliminate Go runtime, simplify operations
- Learning Experience: Team gains deep Kubernetes knowledge
- Differentiation: Unique selling point (Rust-native K8s platform)
Cons
- Extreme Development Effort: 12-18 months to MVP, 24+ months to production-grade
- Not Battle-Tested: Zero production deployments, high risk of bugs
- API Compatibility: Non-standard behavior breaks kubectl, Helm, operators
- Ecosystem Compatibility: Most K8s tools assume full API compliance
- Maintenance Burden: Ongoing effort to maintain, fix bugs, add features
- Talent Acquisition: Hard to hire K8s experts willing to work on custom implementation
- Client Tools: May need custom kubectl/client libraries if APIs diverge
- Certification: No CNCF certification, potential customer concerns
- Kubelet Challenge: Rewriting kubelet is extremely complex (1000s of edge cases)
Integration Analysis
PlasmaVMC (Compute Backend)
- Approach: Custom kubelet with native PlasmaVMC integration or CRI interface
- Benefits: Deep integration, pods-as-VMs native support
- Effort: 10-12 weeks (if using CRI abstraction), 20+ weeks (if custom kubelet)
- Risk: High complexity, many edge cases in pod lifecycle
PrismNET (Pod Networking)
- Approach: Native integration in kubelet or standard CNI plugin
- Benefits: Tight coupling possible, eliminate CNI overhead
- Effort: 4-5 weeks (CNI plugin), 8-10 weeks (native integration)
- Recommendation: Start with CNI for compatibility
FlashDNS (Service Discovery)
- Approach: Service controller with native FlashDNS API calls
- Benefits: Direct integration, no intermediate DNS server
- Effort: 3-4 weeks (controller)
- Advantages: Tighter integration than CoreDNS replacement
FiberLB (LoadBalancer Services)
- Approach: Service controller with native FiberLB API calls
- Benefits: First-class PlasmaCloud integration
- Effort: 3-4 weeks (controller)
- Advantages: Native load balancer support
LightningStor (Persistent Volumes)
- Approach: Native volume plugin or CSI driver
- Benefits: Simplified architecture without CSI overhead
- Effort: 6-8 weeks (native plugin), 5-6 weeks (CSI driver)
- Recommendation: CSI driver for compatibility with K8s ecosystem tools
IAM (Authentication/RBAC)
- Approach: Native IAM integration in API server authentication layer
- Benefits: Zero-hop authentication, unified permissions model
- Effort: 2-3 weeks (direct integration vs. webhook)
- Advantages: Cleanest IAM integration possible
Effort Estimate
Phase 1: Core API Server (6-8 months)
- Months 1-2: API server framework, authentication, basic CRUD for core resources
- Months 3-4: Controller manager (Deployment, Service, Job controllers)
- Months 5-6: Scheduler (basic resource-aware scheduling)
- Months 7-8: Testing, bug fixing, integration with IAM/FlareDB
Phase 2: Kubelet and Runtime (6-8 months)
- Months 9-11: Kubelet implementation (pod lifecycle, CRI client)
- Months 12-13: CNI integration (PrismNET plugin)
- Months 14-15: Volume management (CSI or native LightningStor)
- Months 16: Testing, bug fixing
Phase 3: Production Hardening (6-8 months)
- Months 17-19: LoadBalancer controller, DNS controller
- Months 20-21: Advanced features (StatefulSets, DaemonSets, CronJobs)
- Months 22-24: Production testing, performance tuning, edge case handling
Total: 18-24 months to production-ready platform
Risk Factors
- Kubelet complexity may extend timeline by 3-6 months
- API compatibility issues may require rework
- Performance optimization may take longer than expected
- Production bugs will require ongoing maintenance team
Integration Points
PlasmaVMC (Compute)
Common Approach Across Options
- Use Container Runtime Interface (CRI) for abstraction
- containerd as default runtime (mature, battle-tested)
- Phase 2: Custom CRI implementation for VM-based pods
CRI Integration Details
- Interface: gRPC protocol (RuntimeService + ImageService)
- Operations: RunPodSandbox, CreateContainer, StartContainer, StopContainer, etc.
- PlasmaVMC Adapter: Translate CRI calls to PlasmaVMC API (Firecracker/KVM)
- Benefits: Pod-level isolation via VMs, stronger security boundaries
Implementation Options
- Containerd (Low Risk): Use as-is, defer VM integration
- CRI-PlasmaVMC (Medium Risk): Custom CRI shim, pods run as lightweight VMs
- Native Integration (High Risk, Custom Implementation Only): Direct kubelet-PlasmaVMC coupling
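To make the CRI-PlasmaVMC option concrete, the sketch below shows a tiny slice of the RuntimeService surface a shim would implement, with each pod sandbox mapping to one microVM. The real interface is gRPC with many more RPCs, and the PlasmaVMC calls here are stand-ins for a hypothetical client API:

```rust
// Minimal slice of the CRI RuntimeService a PlasmaVMC shim would
// implement (real CRI is gRPC: RunPodSandbox, CreateContainer, ...).
trait RuntimeService {
    fn run_pod_sandbox(&mut self, pod_name: &str) -> String; // returns sandbox ID
    fn stop_pod_sandbox(&mut self, sandbox_id: &str) -> bool;
}

// Each pod sandbox maps to one PlasmaVMC microVM (Firecracker/KVM),
// giving VM-level pod isolation.
struct PlasmaVmcShim {
    next_id: u64,
    running: Vec<String>,
}

impl RuntimeService for PlasmaVmcShim {
    fn run_pod_sandbox(&mut self, pod_name: &str) -> String {
        // A hypothetical plasmavmc.create_vm(...) call would go here.
        self.next_id += 1;
        let id = format!("vm-{}-{}", pod_name, self.next_id);
        self.running.push(id.clone());
        id
    }

    fn stop_pod_sandbox(&mut self, sandbox_id: &str) -> bool {
        // A hypothetical plasmavmc.stop_vm(sandbox_id) would go here.
        let before = self.running.len();
        self.running.retain(|s| s.as_str() != sandbox_id);
        self.running.len() < before
    }
}

fn main() {
    let mut shim = PlasmaVmcShim { next_id: 0, running: Vec::new() };
    let id = shim.run_pod_sandbox("web");
    println!("sandbox started: {}", id);
    assert!(shim.stop_pod_sandbox(&id));
}
```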
PrismNET (Networking)
CNI Plugin Approach (Recommended)
- Interface: CNI 1.0.0 specification (JSON-based stdin/stdout protocol)
- Components:
- CNI binary (Rust): Creates pod veth pairs, assigns IPs, configures routing
- CNI daemon (Rust): Manages node-level networking, integrates with PrismNET API
- PrismNET Integration: Daemon syncs pod network configs to PrismNET SDN controller
- Features: VXLAN overlays, OVN integration, security groups, network policies
Implementation Steps
- Implement CNI ADD/DEL/CHECK operations (pod lifecycle)
- IPAM (IP address management) via PrismNET or local allocation
- Routing table updates for pod reachability
- Network policy enforcement (optional: eBPF for performance)
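The CNI protocol itself is simple: the runtime execs the plugin binary with the command in the CNI_COMMAND environment variable, network config JSON on stdin, and expects a result JSON on stdout. A dependency-free sketch of the ADD path, assuming the IP would come from PrismNET's IPAM (passed in by the caller here):

```rust
use std::env;

// Build a minimal CNI 1.0.0 ADD result for a pod interface. In a
// real plugin the address would be allocated by PrismNET IPAM.
fn cni_add_result(ifname: &str, ip_cidr: &str, gateway: &str) -> String {
    format!(
        concat!(
            "{{\"cniVersion\":\"1.0.0\",",
            "\"interfaces\":[{{\"name\":\"{}\"}}],",
            "\"ips\":[{{\"address\":\"{}\",\"gateway\":\"{}\"}}]}}"
        ),
        ifname, ip_cidr, gateway
    )
}

fn main() {
    // The runtime (containerd) sets CNI_COMMAND to ADD/DEL/CHECK.
    let cmd = env::var("CNI_COMMAND").unwrap_or_else(|_| "ADD".to_string());
    match cmd.as_str() {
        "ADD" => println!("{}", cni_add_result("eth0", "10.42.0.5/24", "10.42.0.1")),
        "DEL" => { /* release IP to PrismNET IPAM, tear down veth pair */ }
        "CHECK" => { /* verify interface, address, and routes still exist */ }
        other => eprintln!("unsupported CNI command: {}", other),
    }
}
```

A production plugin would parse the stdin config (serde) and do the veth/route work via netlink; only the protocol framing is shown here.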
Benefits
- Unified network management across PlasmaCloud
- Leverage OVN capabilities for advanced networking
- Standard interface (works with any K8s distribution)
FlashDNS (Service Discovery)
Controller Approach (Recommended)
- Interface: Kubernetes Informer API (watch Services, Endpoints)
- Implementation: Rust controller using kube-rs
- Logic:
- Watch Service objects for changes
- Watch Endpoints objects (backend pod IPs)
- Update FlashDNS records: <service>.<namespace>.svc.cluster.local → pod IPs
- Support pattern-based reverse DNS lookups
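The record shapes the controller would maintain can be sketched directly (cluster.local is the default k3s cluster domain; the reverse record follows the standard in-addr.arpa convention):

```rust
// Forward record name for a ClusterIP Service, following the
// standard Kubernetes DNS schema the controller keeps in sync
// in FlashDNS.
fn service_record(service: &str, namespace: &str) -> String {
    format!("{}.{}.svc.cluster.local", service, namespace)
}

// in-addr.arpa name for a pod IPv4 address, for the pattern-based
// reverse lookups mentioned above (octets are reversed).
fn reverse_record(ipv4: &str) -> Option<String> {
    let o: Vec<&str> = ipv4.split('.').collect();
    if o.len() != 4 {
        return None;
    }
    Some(format!("{}.{}.{}.{}.in-addr.arpa", o[3], o[2], o[1], o[0]))
}

fn main() {
    // On a Service/Endpoints event the controller would upsert:
    println!("{} -> 10.42.0.5", service_record("web", "tenant-a"));
    println!("{} -> PTR", reverse_record("10.42.0.5").unwrap());
}
```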
Deployment Options
- Replace CoreDNS: FlashDNS becomes authoritative DNS for cluster
- Secondary DNS: CoreDNS delegates to FlashDNS, fallback for external queries
- Hybrid: CoreDNS for K8s-standard queries, FlashDNS for PlasmaCloud-specific patterns
Benefits
- Unified DNS management (PlasmaCloud VMs + K8s Services)
- Pattern-based reverse DNS for debugging
- Reduced DNS server overhead
FiberLB (Load Balancing)
Controller Approach (Recommended)
- Interface: Kubernetes Informer API (watch Services type=LoadBalancer)
- Implementation: Rust controller using kube-rs
- Logic:
- Watch Service objects with type: LoadBalancer
- Provision FiberLB L4 or L7 load balancer
- Assign external IP, configure backend pool (pod IPs from Endpoints)
- Update Service .status.loadBalancer.ingress with assigned IP
- Handle updates (backend changes, health checks)
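The reconcile step of that controller can be sketched as follows. The FiberLB provisioning call is hypothetical (its real API is not specified here); the shape shows the happy path and the requeue-on-no-endpoints case:

```rust
// Desired state the controller extracts from a Service with
// type: LoadBalancer plus its Endpoints.
struct LbService {
    name: String,
    port: u16,             // Service port to expose on the frontend
    backends: Vec<String>, // ready pod IPs from Endpoints
}

// Hypothetical FiberLB provisioning call: create or update an L4
// frontend for the Service and return the external IP the
// controller writes into .status.loadBalancer.ingress.
fn provision(svc: &LbService) -> Result<String, String> {
    if svc.backends.is_empty() {
        // Reconcile again once Endpoints become ready.
        return Err(format!("{}: no ready endpoints yet", svc.name));
    }
    // A real controller would call the FiberLB API here with
    // svc.port and svc.backends; we fake a VIP allocation.
    Ok("203.0.113.10".to_string())
}

fn main() {
    let svc = LbService {
        name: "web".to_string(),
        port: 443,
        backends: vec!["10.42.0.5".to_string(), "10.42.1.7".to_string()],
    };
    match provision(&svc) {
        Ok(vip) => println!("{}:{} -> status.loadBalancer.ingress = {}", svc.name, svc.port, vip),
        Err(e) => println!("requeue: {}", e),
    }
}
```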
Features
- L4 (TCP/UDP) load balancing for standard Services
- L7 (HTTP/HTTPS) load balancing with Ingress integration (optional)
- Health checks (TCP/HTTP probes)
- SSL termination, session affinity
Benefits
- Unified load balancing across PlasmaCloud
- Advanced L7 features unavailable in default ServiceLB/Traefik
- Native integration with PlasmaCloud networking
LightningStor (Storage)
CSI Driver Approach (Recommended)
- Interface: CSI 1.x specification (gRPC: ControllerService + NodeService + IdentityService)
- Components:
- Controller Plugin: Runs on control plane, handles CreateVolume, DeleteVolume, ControllerPublishVolume
- Node Plugin: Runs on each worker, handles NodeStageVolume, NodePublishVolume (mount operations)
- Sidecar Containers: external-provisioner, external-attacher, node-driver-registrar (standard K8s components)
Implementation Steps
- IdentityService: Driver name, capabilities
- ControllerService: Volume CRUD operations (LightningStor API calls)
- NodeService: Volume attach/mount on worker nodes (iSCSI or NBD)
- StorageClass configuration: Parameters for LightningStor (replication, performance tier)
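The IdentityService step is the smallest piece and anchors everything else: the driver name it reports is what StorageClasses reference. A real driver serves this over gRPC (in Rust, e.g. tonic with prost-generated CSI types); the plain struct below just mirrors the payload, and the driver name is a hypothetical choice:

```rust
// Fields of the CSI Identity.GetPluginInfo response.
#[derive(Debug)]
struct GetPluginInfoResponse {
    name: String,           // DNS-style driver name, referenced by
                            // the StorageClass `provisioner:` field
    vendor_version: String,
}

// Hypothetical identity for the LightningStor CSI driver.
fn get_plugin_info() -> GetPluginInfoResponse {
    GetPluginInfoResponse {
        name: "lightningstor.csi.plasmacloud.io".to_string(),
        vendor_version: "0.1.0".to_string(),
    }
}

fn main() {
    let info = get_plugin_info();
    println!("csi driver: {} v{}", info.name, info.vendor_version);
}
```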
Features
- Dynamic provisioning (PVCs automatically create volumes)
- Volume snapshots
- Volume cloning
- Resize support (expand PVCs)
Benefits
- Standard interface (works with any K8s distribution)
- Ecosystem compatibility (backup tools, operators that use PVCs)
- Unified storage management
IAM (Authentication/RBAC)
Webhook Approach (k3s/k0s)
- Interface: Kubernetes authentication/authorization webhooks (HTTPS POST)
- Implementation: Rust webhook server
- Authentication Flow:
- kubectl sends request with Bearer token to K8s API server
- API server forwards token to IAM webhook
- Webhook validates token via IAM, returns UserInfo (username, groups, UID)
- API server uses UserInfo for RBAC checks
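Step 3 of the flow returns an authentication.k8s.io/v1 TokenReview object. A sketch of the response body the webhook would produce (hand-rolled JSON keeps it dependency-free; a real server would use serde, and the username/group values are illustrative):

```rust
// Identity IAM resolves from a bearer token (the IAM call itself
// is elided).
struct UserInfo {
    username: String,
    groups: Vec<String>,
}

// Build the TokenReview response body the webhook returns to the
// k3s API server.
fn token_review_response(user: Option<&UserInfo>) -> String {
    match user {
        Some(u) => {
            let groups: Vec<String> =
                u.groups.iter().map(|g| format!("\"{}\"", g)).collect();
            format!(
                "{{\"apiVersion\":\"authentication.k8s.io/v1\",\"kind\":\"TokenReview\",\"status\":{{\"authenticated\":true,\"user\":{{\"username\":\"{}\",\"groups\":[{}]}}}}}}",
                u.username,
                groups.join(",")
            )
        }
        // Invalid or expired token: authentication fails.
        None => String::from(
            "{\"apiVersion\":\"authentication.k8s.io/v1\",\"kind\":\"TokenReview\",\"status\":{\"authenticated\":false}}",
        ),
    }
}

fn main() {
    let alice = UserInfo {
        username: "alice".to_string(),
        groups: vec!["tenant-a:admins".to_string()],
    };
    println!("{}", token_review_response(Some(&alice)));
    println!("{}", token_review_response(None));
}
```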
Authorization Integration (Optional)
- Webhook: API server sends SubjectAccessReview to IAM
- Logic: IAM evaluates PlasmaCloud policies, returns Allowed/Denied
- Benefits: Unified policy enforcement across PlasmaCloud + K8s
RBAC Mapping
- Map PlasmaCloud IAM roles to K8s RBAC roles
- Synchronize permissions via controller
- Example: plasmacloud:project:admin → K8s ClusterRole: admin
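The mapping the sync controller applies can be a simple lookup. The IAM role names on the left are illustrative, not a fixed schema; admin, edit, and view are the standard user-facing ClusterRoles shipped with Kubernetes:

```rust
// Map a PlasmaCloud IAM role to the default K8s ClusterRole the
// sync controller would reference in a RoleBinding.
fn map_role(iam_role: &str) -> Option<&'static str> {
    match iam_role {
        "plasmacloud:project:admin" => Some("admin"),
        "plasmacloud:project:developer" => Some("edit"),
        "plasmacloud:project:viewer" => Some("view"),
        _ => None, // unknown roles grant no K8s permissions
    }
}

fn main() {
    assert_eq!(map_role("plasmacloud:project:admin"), Some("admin"));
    assert_eq!(map_role("plasmacloud:project:intruder"), None);
    println!("role mapping ok");
}
```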
Native Integration (Custom Implementation)
- Directly integrate IAM into API server authentication layer
- Zero-hop authentication (no webhook latency)
- Unified permissions model (single source of truth)
Benefits
- Unified identity management
- PlasmaCloud IAM policies enforced in K8s
- Simplified user experience (single login)
Decision Matrix
| Criteria | k3s-style | k0s-style | Custom Rust | Weight |
|---|---|---|---|---|
| Time to MVP | 3-4 months ⭐⭐⭐⭐⭐ | 4-5 months ⭐⭐⭐⭐ | 18-24 months ⭐ | 25% |
| Production Reliability | Battle-tested ⭐⭐⭐⭐⭐ | Battle-tested ⭐⭐⭐⭐⭐ | Untested ⭐ | 20% |
| Integration Difficulty | Standard interfaces ⭐⭐⭐⭐ | Standard interfaces ⭐⭐⭐⭐⭐ | Native integration ⭐⭐⭐⭐⭐ | 15% |
| Multi-Tenant Isolation | K8s standard ⭐⭐⭐⭐ | Enhanced (k0smotron) ⭐⭐⭐⭐⭐ | Custom (flexible) ⭐⭐⭐⭐ | 15% |
| Complexity vs Control | Low complexity, less control ⭐⭐⭐ | Medium complexity, medium control ⭐⭐⭐⭐ | High complexity, full control ⭐⭐⭐⭐⭐ | 10% |
| Rust Alignment | Go codebase ⭐ | Go codebase ⭐ | Pure Rust ⭐⭐⭐⭐⭐ | 5% |
| API Compatibility | 100% K8s API ⭐⭐⭐⭐⭐ | 100% K8s API ⭐⭐⭐⭐⭐ | Partial API ⭐⭐ | 5% |
| Maintenance Burden | Low (upstream updates) ⭐⭐⭐⭐⭐ | Low (upstream updates) ⭐⭐⭐⭐⭐ | High (full ownership) ⭐ | 5% |
| Weighted Score | 4.30 | 4.45 | 2.70 | 100% |
Scoring: ⭐ (1) = Poor, ⭐⭐ (2) = Fair, ⭐⭐⭐ (3) = Good, ⭐⭐⭐⭐ (4) = Very Good, ⭐⭐⭐⭐⭐ (5) = Excellent
Detailed Analysis
Time to MVP (25% weight)
- k3s wins with fastest path to market (3-4 months)
- k0s slightly slower due to smaller community and more complex architecture
- Custom implementation requires 18-24 months, unacceptable for MVP
Production Reliability (20% weight)
- Both k3s and k0s are battle-tested with thousands of production deployments
- Custom implementation has zero production track record, high risk
Integration Difficulty (15% weight)
- k0s edges ahead with cleaner modular boundaries
- Both k3s/k0s use standard interfaces (CNI, CSI, CRI, webhooks)
- Custom implementation allows native integration but requires building everything
Multi-Tenant Isolation (15% weight)
- k0s excels with k0smotron architecture (true control/worker plane separation)
- k3s provides standard K8s namespace/RBAC isolation (sufficient for most use cases)
- Custom implementation offers flexibility but requires building isolation mechanisms
Complexity vs Control (10% weight)
- Custom implementation offers maximum control but extreme complexity
- k0s provides good balance with modular architecture
- k3s prioritizes simplicity over control
Rust Alignment (5% weight)
- Only custom implementation aligns with Rust-first philosophy
- Both k3s and k0s are Go-based (operational impact minimal with standard interfaces)
API Compatibility (5% weight)
- k3s and k0s provide 100% K8s API compatibility (ecosystem compatibility)
- Custom implementation likely has gaps (breaks kubectl, Helm, operators)
Maintenance Burden (5% weight)
- k3s and k0s receive upstream updates, security patches
- Custom implementation requires dedicated maintenance team
Recommendation
We recommend adopting a k3s-style architecture with selective component replacement as the optimal path to MVP.
Primary Recommendation: k3s-style Architecture
Rationale
- Fastest Time to Market: 3-4 months to MVP vs. 4-5 months (k0s) or 18-24 months (custom)
- Proven Reliability: Battle-tested in thousands of production deployments, including large-scale edge deployments
- Full API Compatibility: 100% Kubernetes API coverage ensures ecosystem compatibility (kubectl, Helm, operators, monitoring tools)
- Low Risk: Mature codebase with active community and regular security updates
- Clean Integration Points: Standard interfaces (CNI, CSI, CRI, webhooks) allow PlasmaCloud component integration without forking k3s
- Acceptable Trade-offs:
- Go codebase is acceptable given integration happens via standard interfaces
- Operations team doesn't need deep k3s internals knowledge for day-to-day tasks
- Debugging deep issues is rare with mature software
Implementation Strategy
Phase 1: MVP (3-4 months)
- Deploy k3s with default components (containerd, Flannel, CoreDNS, Traefik)
- Develop and deploy PrismNET CNI plugin (replace Flannel)
- Develop and deploy FiberLB LoadBalancer controller (replace ServiceLB)
- Develop and deploy IAM authentication webhook
- Multi-tenant isolation: namespace separation + RBAC + network policies
- Testing and documentation
Phase 2: Production Hardening (2-3 months)
7. Develop and deploy FlashDNS service discovery controller
8. Develop and deploy LightningStor CSI driver
9. HA setup with embedded etcd (multi-master)
10. Monitoring and logging integration
11. Production testing and performance tuning
Phase 3: Advanced Features (3-4 months, optional)
12. Custom CRI implementation for VM-based pods (integrate PlasmaVMC)
13. Enhanced multi-tenant isolation (dedicated control planes via vcluster or similar)
14. Advanced networking features (BGP, network policies)
15. Disaster recovery and backup
Component Replacement Strategy
| Component | Default (k3s) | PlasmaCloud Replacement | Timeline |
|---|---|---|---|
| Container Runtime | containerd | Keep (or custom CRI Phase 3) | Phase 1 / Phase 3 |
| CNI | Flannel | PrismNET CNI plugin | Phase 1 (Week 3-6) |
| DNS | CoreDNS | FlashDNS controller | Phase 2 (Week 17-19) |
| Load Balancer | ServiceLB | FiberLB controller | Phase 1 (Week 7-9) |
| Storage | local-path | LightningStor CSI driver | Phase 2 (Week 20-22) |
| Auth/RBAC | Static tokens | IAM webhook | Phase 1 (Week 10-12) |
Multi-Tenant Isolation Strategy
- Namespace Isolation: Each tenant gets dedicated namespace(s)
- RBAC: Roles/RoleBindings restrict cross-tenant access
- Network Policies: Block pod-to-pod communication across tenants
- Resource Quotas: Prevent resource monopolization
- Pod Security Standards: Enforce security baselines per tenant
- Monitoring: Tenant-level metrics and logging with filtering
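The namespace convention the layers above build on can be made explicit. The tenant-prefixed naming below is an assumption for illustration, not k3s behavior; in a real cluster this check would live in an admission webhook, with RBAC RoleBindings and NetworkPolicies scoped to the same namespaces:

```rust
// A tenant may only create or use namespaces under its own prefix.
fn namespace_allowed(tenant_id: &str, namespace: &str) -> bool {
    namespace == format!("tenant-{}", tenant_id)
        || namespace.starts_with(&format!("tenant-{}-", tenant_id))
}

fn main() {
    assert!(namespace_allowed("a42", "tenant-a42"));
    assert!(namespace_allowed("a42", "tenant-a42-staging"));
    assert!(!namespace_allowed("a42", "tenant-b7-prod"));
    println!("namespace checks ok");
}
```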
Risks and Mitigations
| Risk | Mitigation |
|---|---|
| Go codebase (not Rust) | Use standard interfaces, minimize deep k3s interactions |
| Limited control over core | Fork only if absolutely necessary, contribute upstream when possible |
| Multi-tenant isolation gaps | Layer multiple isolation mechanisms (namespace + RBAC + NetworkPolicy) |
| Vendor lock-in to Rancher | k3s is open-source (Apache 2.0), can fork if needed |
Alternative Recommendation: k0s-style Architecture
If the following conditions apply, consider k0s instead:
- Enhanced security isolation is critical: k0smotron provides true control/worker plane separation
- Timeline flexibility: 4-5 months to MVP is acceptable
- Future-proofing: Modular architecture simplifies component replacement in Phase 3+
- Hosted K8s offering: k0smotron architecture is ideal for multi-tenant hosted Kubernetes
Trade-offs vs. k3s:
- Slower time to market (+1-2 months)
- Smaller community (fewer resources for troubleshooting)
- More complex architecture (higher learning curve)
- Better modularity (easier component replacement)
Why Not Custom Rust Implementation?
Reject for MVP, consider for long-term differentiation:
- Timeline unacceptable: 18-24 months to production-ready vs. 3-4 months (k3s)
- High risk: Zero production deployments, unknown bugs, maintenance burden
- Ecosystem incompatibility: Partial K8s API breaks kubectl, Helm, operators
- Talent challenges: Hard to hire K8s experts for custom implementation
- Opportunity cost: Engineering effort better spent on PlasmaCloud differentiators
Reconsider if:
- Unique requirements that k3s/k0s cannot satisfy (unlikely given standard interfaces)
- Long-term competitive advantage requires Rust-native K8s (2-3 year horizon)
- Team has deep K8s internals expertise (kubelet, scheduler, controller-manager)
Compromise approach:
- Start with k3s for MVP
- Gradually replace components with Rust implementations (CNI, CSI, controllers)
- Evaluate custom API server in Year 2-3 if strategic value is clear
Next Steps
If Recommendation Accepted (k3s-style Architecture)
Step 2 (S2): Architecture Design Document
- Detailed PlasmaCloud K8s architecture diagram
- Component interaction flows (API server → IAM, kubelet → PlasmaVMC, etc.)
- Data flow diagrams (pod creation, service routing, volume provisioning)
- Network architecture (pod networking, service networking, ingress)
- Security architecture (authentication, authorization, network policies)
- High-availability design (multi-master, etcd, load balancing)
Step 3 (S3): CNI Plugin Design
- PrismNET CNI plugin specification
- CNI binary interface (ADD/DEL/CHECK operations)
- CNI daemon architecture (node networking, OVN integration)
- IPAM strategy (PrismNET-based or local allocation)
- Network policy enforcement approach (eBPF or iptables)
- Testing plan (unit tests, integration tests with k3s)
Step 4 (S4): LoadBalancer Controller Design
- FiberLB controller specification
- Service watch logic (Informer pattern)
- FiberLB provisioning API integration
- Health check configuration
- L4 vs. L7 decision criteria
- Testing plan
Step 5 (S5): IAM Integration Design
- Authentication webhook specification
- Token validation flow (IAM API calls)
- UserInfo mapping (IAM roles → K8s RBAC)
- Authorization webhook (optional, future)
- RBAC synchronization controller (optional)
- Testing plan
Step 6 (S6): Implementation Roadmap
- Week-by-week breakdown of Phase 1 work
- Team assignments (who builds CNI, LoadBalancer controller, IAM webhook)
- Milestone definitions (what constitutes MVP, beta, GA)
- Testing strategy (unit, integration, end-to-end, chaos)
- Documentation plan (user docs, operator docs, developer docs)
- Go/no-go criteria for production launch
Research Validation Tasks
Before proceeding to S2, validate the following:
- k3s Component Replacement: Deploy k3s cluster, disable Flannel, test custom CNI plugin replacement
- LoadBalancer Controller: Deploy sample controller, watch Services, verify lifecycle
- Authentication Webhook: Deploy test webhook server, configure k3s API server, verify token flow
- Multi-Tenancy: Create namespaces, RBAC roles, NetworkPolicies; test isolation
- Integration Testing: Verify k3s works with PlasmaCloud network environment
Timeline: 1-2 weeks for validation tasks
References
k3s Architecture
- K3s Architecture Documentation
- K3s GitHub Repository
- What is K3s and How is it Different from K8s? | Traefik Labs
- K3s Cluster Datastore Options
- Lightweight and powerful: K3s at a glance - NETWAYS
k0s Architecture
- k0s Architecture Documentation
- k0s GitHub Repository
- Understanding k0s: a lightweight Kubernetes distribution | CNCF
- k0s vs k3s Comparison Chart | Mirantis
Comparisons
- Comparing K0s vs K3s vs K8s: Key Differences & Use Cases
- K0s Vs. K3s Vs. K8s: The Differences And Use Cases | nOps
- Lightweight Kubernetes Distributions: Performance Comparison (ACM 2023)
Kubernetes APIs
CNI Integration
- Kubernetes Network Plugins
- Container Network Interface (CNI) Specification
- Kubernetes CNI: The Ultimate Guide (2025)
- CNI GitHub Repository
CSI Integration
- Container Storage Interface (CSI) for Kubernetes GA
- Kubernetes CSI: Basics and How to Build a CSI Driver
- Kubernetes Persistent Volumes
- CSI Developer Documentation
CRI Integration
- Kubernetes Container Runtimes
- Container Runtime Interface (CRI)
- Kubernetes Containerd Integration Goes GA
Rust Kubernetes Ecosystem
- kube-rs: Rust Kubernetes Client and Controller Runtime
- Rust and Kubernetes: A Match Made in Heaven
- Write Your Next Kubernetes Controller in Rust
- Using Kubernetes with Rust | Shuttle
Multi-Tenancy
- Kubernetes Multi-tenancy
- Kubernetes Multi-Tenancy: Implementation Guide (2025)
- Best Practices for Isolation in K8s Multi-Tenant Environments
- Kubernetes Multi-Tenancy: Three Key Approaches
Document Version: 1.0 Last Updated: 2025-12-09 Author: PlasmaCloud Architecture Team Status: For Review