# Service Dependencies Diagram

**Document Version:** 1.0
**Last Updated:** 2025-12-10

## Service Startup Order

```
┌─────────────────────────────────────────────────────────────────────────┐
│                  PlasmaCloud Service Dependency Graph                   │
│                       (systemd unit dependencies)                       │
└─────────────────────────────────────────────────────────────────────────┘

                               System Boot
                                    │
                                    v
                           ┌──────────────────┐
                           │ systemd (PID 1)  │
                           └────────┬─────────┘
                                    │
                                    v
                    ┌───────────────────────────────┐
                    │         basic.target          │
                    │ • mounts filesystems          │
                    │ • activates swap              │
                    └───────────────┬───────────────┘
                                    │
                                    v
                    ┌───────────────────────────────┐
                    │        network.target         │
                    │ • brings up network interfaces│
                    │ • configures IP addresses     │
                    └───────────────┬───────────────┘
                                    │
                                    v
                    ┌───────────────────────────────┐
                    │     network-online.target     │
                    │ • waits for network ready     │
                    │ • ensures DNS resolution      │
                    └───────────────┬───────────────┘
                                    │
                                    v
                         ┌─────────────────────┐
                         │  multi-user.target  │
                         └──────────┬──────────┘
                                    │
                 ┌──────────────────┼──────────────────┐
                 │                  │                  │
                 v                  v                  v
             [Level 1]          [Level 2]         [Level 3+]
            Foundation        Core Services  Application Services


Level 1: Foundation Services (No dependencies outside this level)
═══════════════════════════════════════════════════════════════════════════

┌────────────────────────────────────────────────────────────────────────┐
│ Chainfire                                                              │
│ ├─ After:    network-online.target                                     │
│ ├─ Type:     notify (systemd-aware)                                    │
│ ├─ Ports:    2379 (API), 2380 (Raft), 2381 (Gossip)                    │
│ ├─ Data:     /var/lib/chainfire                                        │
│ └─ Start:    ~10 seconds                                               │
│                                                                        │
│ Purpose:   Distributed configuration store, service discovery          │
│ Critical:  Yes (all other services depend on this)                     │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│ FlareDB                                                                │
│ ├─ After:    network-online.target, chainfire.service                  │
│ ├─ Requires: chainfire.service                                         │
│ ├─ Type:     notify                                                    │
│ ├─ Ports:    2479 (API), 2480 (Raft)                                   │
│ ├─ Data:     /var/lib/flaredb                                          │
│ └─ Start:    ~15 seconds (after Chainfire)                             │
│                                                                        │
│ Purpose:   Time-series database for metrics and events                 │
│ Critical:  Yes (IAM and monitoring depend on this)                     │
└────────────────────────────────────────────────────────────────────────┘


Level 2: Core Services (Depend on Chainfire + FlareDB)
═══════════════════════════════════════════════════════════════════════════

┌────────────────────────────────────────────────────────────────────────┐
│ IAM (Identity and Access Management)                                   │
│ ├─ After:    flaredb.service                                           │
│ ├─ Requires: flaredb.service                                           │
│ ├─ Type:     simple                                                    │
│ ├─ Port:     8080 (API)                                                │
│ ├─ Backend:  FlareDB (stores users, roles, tokens)                     │
│ └─ Start:    ~5 seconds (after FlareDB)                                │
│                                                                        │
│ Purpose:   Authentication and authorization for all APIs               │
│ Critical:  Yes (API access requires IAM tokens)                        │
└────────────────────────────────────────────────────────────────────────┘


Level 3: Application Services (Parallel startup)
═══════════════════════════════════════════════════════════════════════════

┌────────────────────────────────────────────────────────────────────────┐
│ PlasmaVMC (Virtual Machine Controller)                                 │
│ ├─ After:    chainfire.service, iam.service                            │
│ ├─ Wants:    chainfire.service, iam.service                            │
│ ├─ Type:     notify                                                    │
│ ├─ Port:     9090 (API)                                                │
│ └─ Start:    ~10 seconds                                               │
│                                                                        │
│ Purpose:   VM lifecycle management and orchestration                   │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│ PrismNET (Software-Defined Networking)                                 │
│ ├─ After:    chainfire.service, iam.service                            │
│ ├─ Wants:    chainfire.service                                         │
│ ├─ Type:     notify                                                    │
│ ├─ Ports:    9091 (API), 4789 (VXLAN)                                  │
│ └─ Start:    ~8 seconds                                                │
│                                                                        │
│ Purpose:   Virtual networking, VXLAN overlay management                │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│ FlashDNS (High-Performance DNS)                                        │
│ ├─ After:    chainfire.service                                         │
│ ├─ Wants:    chainfire.service                                         │
│ ├─ Type:     forking                                                   │
│ ├─ Ports:    53 (DNS), 853 (DoT)                                       │
│ └─ Start:    ~3 seconds                                                │
│                                                                        │
│ Purpose:   DNS resolution for VMs and services                         │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│ FiberLB (Layer 4/7 Load Balancer)                                      │
│ ├─ After:    chainfire.service, iam.service                            │
│ ├─ Wants:    chainfire.service                                         │
│ ├─ Type:     notify                                                    │
│ ├─ Ports:    9092 (API), 80 (HTTP), 443 (HTTPS)                        │
│ └─ Start:    ~5 seconds                                                │
│                                                                        │
│ Purpose:   Load balancing and traffic distribution                     │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│ LightningStor (Distributed Block Storage)                              │
│ ├─ After:    chainfire.service, flaredb.service                        │
│ ├─ Wants:    chainfire.service                                         │
│ ├─ Type:     notify                                                    │
│ ├─ Ports:    9093 (API), 9094 (Replication), 3260 (iSCSI)              │
│ └─ Start:    ~12 seconds                                               │
│                                                                        │
│ Purpose:   Block storage for VMs and containers                        │
└────────────────────────────────────────────────────────────────────────┘

┌────────────────────────────────────────────────────────────────────────┐
│ K8sHost (Kubernetes Node Agent)                                        │
│ ├─ After:    chainfire.service, plasmavmc.service, prismnet.service    │
│ ├─ Wants:    chainfire.service, prismnet.service                       │
│ ├─ Type:     notify                                                    │
│ ├─ Ports:    10250 (Kubelet), 10256 (Health)                           │
│ └─ Start:    ~15 seconds                                               │
│                                                                        │
│ Purpose:   Kubernetes node agent for container orchestration           │
└────────────────────────────────────────────────────────────────────────┘
```

## Dependency Visualization (ASCII)

```
┌─────────────────────────────────────────────────────────────────────────┐
│                         Service Dependency Tree                         │
│                          (direction: top-down)                          │
└─────────────────────────────────────────────────────────────────────────┘

                           network-online.target
                                     │
                                     │ After
                                     v
                             ┌───────────────┐
                             │   Chainfire   │ (Level 1)
                             │  Port: 2379   │
                             └───────┬───────┘
                                     │
                        ┌────────────┼────────────┐
                        │ Requires   │ Wants      │ Wants
                        v            v            v
                ┌────────────┐ ┌──────────┐ ┌──────────┐
                │  FlareDB   │ │ PrismNET │ │ FlashDNS │
                │ Port: 2479 │ │Port: 9091│ │Port: 53  │
                └──────┬─────┘ └──────────┘ └──────────┘
                       │
             ┌─────────┼────────┬──────────┐
             │ Requires│ Wants  │ Wants    │ Wants
             v         v        v          v
    ┌─────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
    │   IAM   │ │PlasmaVMC │ │ FiberLB  │ │Lightning │
    │Port:8080│ │Port: 9090│ │Port: 9092│ │Port: 9093│
    └─────────┘ └─────┬────┘ └──────────┘ └──────────┘
                      │
                      │ Wants
                      v
               ┌─────────────┐
               │   K8sHost   │ (Level 3)
               │ Port: 10250 │
               └─────────────┘

Legend:
  Requires: Hard dependency (with After=, the service is not started if the dependency fails to start)
  Wants:    Soft dependency (the service starts even if the dependency fails)
  After:    Ordering only (start after the dependency; does not require it to succeed)
```

## Runtime Dependencies (Data Flow)

```
┌─────────────────────────────────────────────────────────────────────────┐
│                       Service Communication Flow                        │
└─────────────────────────────────────────────────────────────────────────┘

   External Client
          │
          │ HTTPS (8080)
          v
  ┌────────────────┐
  │    FiberLB     │  Load balances requests
  └───────┬────────┘
          │
          │ Forward to
          v
  ┌────────────────┐       ┌──────────────┐
  │      IAM       │──────>│   FlareDB    │  Validate token
  │  (Auth check)  │<──────│ (Token store)│
  └───────┬────────┘       └──────────────┘
          │
          │ Token valid
          v
  ┌────────────────┐       ┌──────────────┐       ┌──────────────┐
  │   PlasmaVMC    │──────>│  Chainfire   │──────>│ Worker Node  │
  │ (API handler)  │<──────│(Coordination)│<──────│  (VM host)   │
  └────────────────┘       └──────────────┘       └──────────────┘
          │
          │ Allocate storage
          v
  ┌────────────────┐       ┌──────────────┐
  │ LightningStor  │──────>│   FlareDB    │  Store metadata
  │ (Block device) │<──────│  (Metadata)  │
  └────────────────┘       └──────────────┘
          │
          │ Configure network
          v
  ┌────────────────┐       ┌──────────────┐
  │    PrismNET    │──────>│   FlashDNS   │  Register DNS
  │ (VXLAN setup)  │<──────│ (Resolution) │
  └────────────────┘       └──────────────┘
```

## Failure Impact Analysis

```
┌─────────────────────────────────────────────────────────────────────────┐
│                          Failure Impact Matrix                          │
└─────────────────────────────────────────────────────────────────────────┘

Service Fails     │ Impact                           │ Mitigation
──────────────────┼──────────────────────────────────┼────────────────────
Chainfire         │ ✗ Total cluster failure          │ Raft quorum (3/5)
                  │ ✗ All services lose config       │ Data replicated
                  │ ✗ New VMs cannot start           │ Existing VMs run
                  │                                  │ Auto-leader election
──────────────────┼──────────────────────────────────┼────────────────────
FlareDB           │ ✗ Metrics not collected          │ Raft quorum (3/5)
                  │ ✗ IAM auth fails                 │ Cache last tokens
                  │ ⚠ Existing VMs continue          │ New VMs blocked
                  │                                  │ Data replicated
──────────────────┼──────────────────────────────────┼────────────────────
IAM               │ ✗ New API requests fail          │ Token cache (TTL)
                  │ ⚠ Existing sessions valid        │ Multiple instances
                  │ ⚠ Internal services unaffected   │ Load balanced
──────────────────┼──────────────────────────────────┼────────────────────
PlasmaVMC         │ ✗ Cannot create/delete VMs       │ Multiple instances
                  │ ✓ Existing VMs unaffected        │ Stateless (uses DB)
                  │ ⚠ VM monitoring stops            │ Auto-restart VMs
──────────────────┼──────────────────────────────────┼────────────────────
PrismNET          │ ✗ Cannot create new networks     │ Multiple instances
                  │ ✓ Existing networks work         │ Distributed agents
                  │ ⚠ VXLAN tunnels persist          │ Control plane HA
──────────────────┼──────────────────────────────────┼────────────────────
FlashDNS          │ ⚠ DNS resolution fails           │ Multiple instances
                  │ ✓ Existing connections work      │ DNS caching
                  │ ⚠ New connections affected       │ Fallback DNS
──────────────────┼──────────────────────────────────┼────────────────────
FiberLB           │ ⚠ Load balancing stops           │ Multiple instances
                  │ ✓ Direct API access works        │ VIP failover
                  │ ⚠ Client requests may time out   │ Health checks
──────────────────┼──────────────────────────────────┼────────────────────
LightningStor     │ ⚠ Storage I/O may degrade        │ Replication (3x)
                  │ ✓ Replicas on other nodes        │ Auto-rebalance
                  │ ✗ New volumes cannot be created  │ Multi-node cluster
──────────────────┼──────────────────────────────────┼────────────────────
K8sHost           │ ⚠ Pods on failed node evicted    │ Pod replicas
                  │ ✓ Cluster continues              │ Kubernetes HA
                  │ ⚠ Capacity reduced               │ Auto-rescheduling

Legend:
  ✗ Complete service failure
  ⚠ Partial service degradation
  ✓ No impact or minimal impact
```

## Service Health Check Endpoints

```
┌─────────────────────────────────────────────────────────────────────────┐
│                     Health Check Endpoint Reference                     │
└─────────────────────────────────────────────────────────────────────────┘

Service       │ Endpoint                         │ Expected Response
──────────────┼──────────────────────────────────┼────────────────────────
Chainfire     │ https://host:2379/health         │ {"status":"healthy",
              │                                  │  "raft":"leader",
              │                                  │  "cluster_size":3}
──────────────┼──────────────────────────────────┼────────────────────────
FlareDB       │ https://host:2479/health         │ {"status":"healthy",
              │                                  │  "raft":"follower",
              │                                  │  "chainfire":"connected"}
──────────────┼──────────────────────────────────┼────────────────────────
IAM           │ https://host:8080/health         │ {"status":"healthy",
              │                                  │  "database":"connected",
              │                                  │  "version":"1.0.0"}
──────────────┼──────────────────────────────────┼────────────────────────
PlasmaVMC     │ https://host:9090/health         │ {"status":"healthy",
              │                                  │  "vms_running":42}
──────────────┼──────────────────────────────────┼────────────────────────
PrismNET      │ https://host:9091/health         │ {"status":"healthy",
              │                                  │  "networks":5}
──────────────┼──────────────────────────────────┼────────────────────────
FlashDNS      │ dig @host +short health.local    │ 127.0.0.1 (A record)
              │ https://host:853/health          │ {"status":"healthy"}
──────────────┼──────────────────────────────────┼────────────────────────
FiberLB       │ https://host:9092/health         │ {"status":"healthy",
              │                                  │  "backends":3}
──────────────┼──────────────────────────────────┼────────────────────────
LightningStor │ https://host:9093/health         │ {"status":"healthy",
              │                                  │  "volumes":15,
              │                                  │  "total_gb":5000}
──────────────┼──────────────────────────────────┼────────────────────────
K8sHost       │ https://host:10250/healthz       │ ok (HTTP 200)
```

## First-Boot Service Dependencies

```
┌─────────────────────────────────────────────────────────────────────────┐
│                     First-Boot Automation Services                      │
│                         (T032.S4 - First-Boot)                          │
└─────────────────────────────────────────────────────────────────────────┘

 network-online.target
           │
           v
  ┌───────────────────┐
  │ chainfire.service │
  └────────┬──────────┘
           │ After
           v
  ┌────────────────────────────────┐
  │ chainfire-cluster-join.service │  (First-boot)
  │ ├─ Reads cluster-config.json   │
  │ ├─ Detects bootstrap mode      │
  │ └─ Joins cluster or waits      │
  └────────┬───────────────────────┘
           │ After
           v
  ┌─────────────────┐
  │ flaredb.service │
  └────────┬────────┘
           │ After
           v
  ┌────────────────────────────────┐
  │ flaredb-cluster-join.service   │  (First-boot)
  │ ├─ Waits for FlareDB healthy   │
  │ └─ Joins FlareDB cluster       │
  └────────┬───────────────────────┘
           │ After
           v
  ┌─────────────────┐
  │   iam.service   │
  └────────┬────────┘
           │ After
           v
  ┌────────────────────────────────┐
  │ iam-initial-setup.service      │  (First-boot)
  │ ├─ Creates admin user          │
  │ └─ Initializes IAM             │
  └────────┬───────────────────────┘
           │ After
           v
  ┌────────────────────────────────┐
  │ cluster-health-check.service   │  (First-boot)
  │ ├─ Validates all services      │
  │ ├─ Checks Raft quorum          │
  │ └─ Reports cluster ready       │
  └────────────────────────────────┘
           │
           v
  ┌─────────────────────────────┐
  │        Cluster Ready        │
  │ (multi-user.target reached) │
  └─────────────────────────────┘
```

## Systemd Unit Configuration Examples

```ini
# Chainfire service (example)
[Unit]
Description=Chainfire Distributed Configuration Service
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStart=/nix/store/.../bin/chainfire-server --config /etc/nixos/chainfire.toml
Restart=on-failure
RestartSec=10s
TimeoutStartSec=60s

# Environment
Environment="CHAINFIRE_LOG_LEVEL=info"
EnvironmentFile=-/etc/nixos/secrets/chainfire.env

# Permissions
User=chainfire
Group=chainfire
StateDirectory=chainfire
ConfigurationDirectory=chainfire

# Security hardening
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
NoNewPrivileges=true

[Install]
WantedBy=multi-user.target

# FlareDB service (example)
[Unit]
Description=FlareDB Time-Series Database
After=network-online.target chainfire.service
Requires=chainfire.service
Wants=network-online.target

[Service]
Type=notify
ExecStart=/nix/store/.../bin/flaredb-server --config /etc/nixos/flaredb.toml
Restart=on-failure
RestartSec=10s
TimeoutStartSec=90s

# Dependencies: wait until Chainfire's health endpoint answers successfully.
# -f makes curl exit non-zero on HTTP errors, so the loop keeps waiting;
# the wait counts against TimeoutStartSec.
ExecStartPre=/bin/sh -c 'until curl -kfs -o /dev/null https://localhost:2379/health; do sleep 5; done'

[Install]
WantedBy=multi-user.target

# First-boot cluster join (example)
[Unit]
Description=Chainfire Cluster Join (First Boot)
After=chainfire.service
Requires=chainfire.service
Before=flaredb-cluster-join.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/nix/store/.../bin/cluster-join.sh --service chainfire
Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target
```

---

**Document End**
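
The `After=` edges listed under Service Startup Order imply a start order that can be checked mechanically. The sketch below is illustrative and not part of PlasmaCloud: it copies those edges into a Python dictionary, groups the units into waves that could start in parallel, and sums the approximate start times along the longest chain. The unit names for FlashDNS, FiberLB, LightningStor, and K8sHost are assumptions (the document only spells out `chainfire.service`, `flaredb.service`, `iam.service`, `plasmavmc.service`, and `prismnet.service`), and the timing estimate assumes each service reports ready only after its quoted start time, which may not hold for `Type=simple` units such as IAM.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# After= edges copied from the unit summaries (network-online.target omitted).
# Names the document never writes as "<name>.service" are assumed here.
AFTER = {
    "chainfire": [],
    "flaredb": ["chainfire"],
    "iam": ["flaredb"],
    "plasmavmc": ["chainfire", "iam"],
    "prismnet": ["chainfire", "iam"],
    "flashdns": ["chainfire"],
    "fiberlb": ["chainfire", "iam"],
    "lightningstor": ["chainfire", "flaredb"],
    "k8shost": ["chainfire", "plasmavmc", "prismnet"],
}

# Approximate start times in seconds, as quoted in the same summaries.
START_SECONDS = {
    "chainfire": 10, "flaredb": 15, "iam": 5, "plasmavmc": 10, "prismnet": 8,
    "flashdns": 3, "fiberlb": 5, "lightningstor": 12, "k8shost": 15,
}


def startup_waves(after):
    """Group units into waves; every unit in a wave has all its After= units done."""
    sorter = TopologicalSorter(after)
    sorter.prepare()  # raises graphlib.CycleError if the edges form a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def ready_times(after, start_seconds):
    """Earliest time each unit is ready if it starts once its After= units are ready."""
    finish = {}
    for name in TopologicalSorter(after).static_order():
        begin = max((finish[dep] for dep in after[name]), default=0)
        finish[name] = begin + start_seconds[name]
    return finish


if __name__ == "__main__":
    for number, wave in enumerate(startup_waves(AFTER), start=1):
        print(f"wave {number}: {', '.join(wave)}")
    times = ready_times(AFTER, START_SECONDS)
    print(f"estimated cold start: ~{max(times.values())} s")
```

Under these assumptions the waves are Chainfire; FlareDB and FlashDNS; IAM and LightningStor; FiberLB, PlasmaVMC, and PrismNET; then K8sHost. The longest chain (Chainfire, FlareDB, IAM, PlasmaVMC, K8sHost) sums to roughly 55 seconds. The waves are a finer grouping than the three levels in the diagram, since ordering alone would let FlashDNS start as soon as Chainfire is ready.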