Network Reference Guide

Document Version: 1.0
Last Updated: 2025-12-10

Complete Port Matrix

Service Port Overview

| Service | API Port | Raft/Consensus | Additional | Protocol | Source | Destination |
| --- | --- | --- | --- | --- | --- | --- |
| Chainfire | 2379 | 2380 | 2381 (gossip) | TCP | Cluster nodes | Cluster nodes |
| FlareDB | 2479 | 2480 | - | TCP | Cluster nodes | Cluster nodes |
| IAM | 8080 | - | - | TCP | Clients, nodes | Control plane |
| PlasmaVMC | 9090 | - | - | TCP | Clients, nodes | Control plane |
| PrismNET | 9091 | - | 4789 (VXLAN) | TCP/UDP | Cluster nodes | Cluster nodes |
| FlashDNS | 53 | - | 853 (DoT) | TCP/UDP | Clients, nodes | Cluster nodes |
| FiberLB | 9092 | - | 80, 443 (passthrough) | TCP | Clients | Load balancers |
| LightningStor | 9093 | 9094 | 3260 (iSCSI) | TCP | Worker nodes | Storage nodes |
| K8sHost | 10250 | - | 2379, 2380 | TCP | Control plane | Worker nodes |

Detailed Port Breakdown

Chainfire

| Port | Direction | Purpose | Source Subnet | Destination | Required |
| --- | --- | --- | --- | --- | --- |
| 2379 | Inbound | Client API | 10.0.0.0/8 | Control plane | Yes |
| 2380 | Inbound | Raft consensus | Control plane | Control plane | Yes |
| 2381 | Inbound | Gossip protocol | Cluster nodes | Cluster nodes | Yes |
| 2379 | Outbound | Client API | Control plane | Control plane | Yes |
| 2380 | Outbound | Raft replication | Control plane | Control plane | Yes |
| 2381 | Outbound | Gossip protocol | Cluster nodes | Cluster nodes | Yes |

Firewall Rules:

# iptables
iptables -A INPUT -p tcp --dport 2379 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 2380 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 2381 -s 10.0.200.0/24 -j ACCEPT

# nftables
nft add rule inet filter input tcp dport 2379 ip saddr 10.0.0.0/8 accept
nft add rule inet filter input tcp dport { 2380, 2381 } ip saddr 10.0.200.0/24 accept
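
The per-service rules in this guide all share one shape: a protocol, a destination port, and an optional source subnet. As a rough sketch, a helper can print the matching iptables and nftables lines for review. The function name `emit_rules` is hypothetical and not part of any tool; nothing is applied to the running firewall.

```sh
# Print an iptables rule and the equivalent nftables rule for one port.
# Usage: emit_rules PROTO PORT [SOURCE_CIDR]
# PORT may be a single port or an iptables-style range such as 30000:32767.
emit_rules() (
    proto=$1
    port=$2
    src=${3:-}

    case $proto in
        tcp|udp) ;;
        *) echo "unsupported protocol: $proto" >&2; exit 1 ;;
    esac

    # nftables writes ranges as 30000-32767 rather than 30000:32767
    nft_port=$(printf '%s' "$port" | tr ':' '-')

    if [ -n "$src" ]; then
        echo "iptables -A INPUT -p $proto --dport $port -s $src -j ACCEPT"
        echo "nft add rule inet filter input $proto dport $nft_port ip saddr $src accept"
    else
        echo "iptables -A INPUT -p $proto --dport $port -j ACCEPT"
        echo "nft add rule inet filter input $proto dport $nft_port accept"
    fi
)
```

For example, `emit_rules tcp 2379 10.0.0.0/8` prints the two client API rules shown above. The output is meant to be read and pasted, not piped straight into a shell.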

FlareDB

| Port | Direction | Purpose | Source Subnet | Destination | Required |
| --- | --- | --- | --- | --- | --- |
| 2479 | Inbound | Client API | 10.0.0.0/8 | Control plane | Yes |
| 2480 | Inbound | Raft consensus | Control plane | Control plane | Yes |
| 2479 | Outbound | Client API | Control plane | Control plane | Yes |
| 2480 | Outbound | Raft replication | Control plane | Control plane | Yes |

Firewall Rules:

# iptables
iptables -A INPUT -p tcp --dport 2479 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 2480 -s 10.0.200.0/24 -j ACCEPT

# nftables
nft add rule inet filter input tcp dport 2479 ip saddr 10.0.0.0/8 accept
nft add rule inet filter input tcp dport 2480 ip saddr 10.0.200.0/24 accept

IAM

| Port | Direction | Purpose | Source Subnet | Destination | Required |
| --- | --- | --- | --- | --- | --- |
| 8080 | Inbound | API (HTTP) | 10.0.0.0/8 | Control plane | Yes |
| 8443 | Inbound | API (HTTPS) | 10.0.0.0/8 | Control plane | Optional |

Firewall Rules:

# iptables
iptables -A INPUT -p tcp --dport 8080 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 8443 -s 10.0.0.0/8 -j ACCEPT

# nftables
nft add rule inet filter input tcp dport { 8080, 8443 } ip saddr 10.0.0.0/8 accept

PlasmaVMC

| Port | Direction | Purpose | Source Subnet | Destination | Required |
| --- | --- | --- | --- | --- | --- |
| 9090 | Inbound | API | 10.0.0.0/8 | Control plane | Yes |

Firewall Rules:

# iptables
iptables -A INPUT -p tcp --dport 9090 -s 10.0.0.0/8 -j ACCEPT

# nftables
nft add rule inet filter input tcp dport 9090 ip saddr 10.0.0.0/8 accept

PrismNET

| Port | Direction | Purpose | Source Subnet | Destination | Required |
| --- | --- | --- | --- | --- | --- |
| 9091 | Inbound | API | 10.0.0.0/8 | Control plane | Yes |
| 4789 | Inbound | VXLAN overlay | Cluster nodes | Cluster nodes | Yes |

Firewall Rules:

# iptables
iptables -A INPUT -p tcp --dport 9091 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p udp --dport 4789 -s 10.0.200.0/24 -j ACCEPT

# nftables
nft add rule inet filter input tcp dport 9091 ip saddr 10.0.0.0/8 accept
nft add rule inet filter input udp dport 4789 ip saddr 10.0.200.0/24 accept

FlashDNS

| Port | Direction | Purpose | Source Subnet | Destination | Required |
| --- | --- | --- | --- | --- | --- |
| 53 | Inbound | DNS (UDP) | 10.0.0.0/8 | Cluster nodes | Yes |
| 53 | Inbound | DNS (TCP) | 10.0.0.0/8 | Cluster nodes | Yes |
| 853 | Inbound | DNS-over-TLS | 10.0.0.0/8 | Cluster nodes | Optional |

Firewall Rules:

# iptables
iptables -A INPUT -p udp --dport 53 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 53 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 853 -s 10.0.0.0/8 -j ACCEPT

# nftables
nft add rule inet filter input udp dport 53 ip saddr 10.0.0.0/8 accept
nft add rule inet filter input tcp dport { 53, 853 } ip saddr 10.0.0.0/8 accept

FiberLB

| Port | Direction | Purpose | Source Subnet | Destination | Required |
| --- | --- | --- | --- | --- | --- |
| 9092 | Inbound | API | 10.0.0.0/8 | Load balancers | Yes |
| 80 | Inbound | HTTP (passthrough) | 0.0.0.0/0 | Load balancers | Optional |
| 443 | Inbound | HTTPS (passthrough) | 0.0.0.0/0 | Load balancers | Optional |

Firewall Rules:

# iptables
iptables -A INPUT -p tcp --dport 9092 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j ACCEPT  # Allow from anywhere
iptables -A INPUT -p tcp --dport 443 -j ACCEPT

# nftables
nft add rule inet filter input tcp dport 9092 ip saddr 10.0.0.0/8 accept
nft add rule inet filter input tcp dport { 80, 443 } accept

K8sHost

| Port | Direction | Purpose | Source Subnet | Destination | Required |
| --- | --- | --- | --- | --- | --- |
| 10250 | Inbound | Kubelet API | Control plane | Worker nodes | Yes |
| 10256 | Inbound | Health check | Control plane | Worker nodes | Optional |
| 30000-32767 | Inbound | NodePort services | Clients | Worker nodes | Optional |

Firewall Rules:

# iptables
iptables -A INPUT -p tcp --dport 10250 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 10256 -s 10.0.200.0/24 -j ACCEPT
iptables -A INPUT -p tcp --dport 30000:32767 -s 10.0.0.0/8 -j ACCEPT

# nftables
nft add rule inet filter input tcp dport { 10250, 10256 } ip saddr 10.0.200.0/24 accept
nft add rule inet filter input tcp dport 30000-32767 ip saddr 10.0.0.0/8 accept

Management and Infrastructure Ports

| Service | Port | Protocol | Purpose | Source | Destination |
| --- | --- | --- | --- | --- | --- |
| SSH | 22 | TCP | Remote management | Admin subnet | All nodes |
| NTP | 123 | UDP | Time synchronization | All nodes | NTP servers |
| DHCP | 67, 68 | UDP | IP address assignment | PXE clients | PXE server |
| TFTP | 69 | UDP | PXE bootloader download | PXE clients | PXE server |
| HTTP | 80 | TCP | PXE boot scripts/images | PXE clients | PXE server |
| HTTPS | 443 | TCP | Secure management | Admin clients | All nodes |
| Prometheus | 9100 | TCP | Node exporter metrics | Prometheus | All nodes |
| IPMI | 623 | UDP | BMC remote management | Admin subnet | BMC network |

Firewall Rules (Management):

# iptables
iptables -A INPUT -p tcp --dport 22 -s 10.0.10.0/24 -j ACCEPT
iptables -A INPUT -p udp --dport 123 -j ACCEPT
iptables -A INPUT -p tcp --dport 9100 -s 10.0.10.0/24 -j ACCEPT

# nftables
nft add rule inet filter input tcp dport 22 ip saddr 10.0.10.0/24 accept
nft add rule inet filter input udp dport 123 accept
nft add rule inet filter input tcp dport 9100 ip saddr 10.0.10.0/24 accept

DHCP Option Reference

Standard DHCP Options

| Option | Name | Type | Purpose | Example Value |
| --- | --- | --- | --- | --- |
| 1 | Subnet Mask | IP | Network subnet mask | 255.255.255.0 |
| 3 | Router | IP | Default gateway | 10.0.100.1 |
| 6 | Domain Name Server | IP list | DNS servers | 10.0.100.1, 8.8.8.8 |
| 12 | Host Name | String | Client hostname | node01 |
| 15 | Domain Name | String | DNS domain suffix | example.com |
| 28 | Broadcast Address | IP | Broadcast address | 10.0.100.255 |
| 42 | NTP Servers | IP list | Time servers | 10.0.100.1 |
| 51 | Lease Time | Int32 | DHCP lease duration (seconds) | 86400 |
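
Options 1 and 28 are linked: the broadcast address is the network address with every host bit set. The sketch below derives one from the other using plain shell arithmetic, which can help cross-check a hand-written `dhcpd.conf`. The function names are hypothetical, and the code assumes a shell with at least 64-bit arithmetic (true of current bash and dash builds) and octets written without leading zeros.

```sh
# Convert a dotted-quad address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1                     # split the address on dots
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Usage: broadcast_for ADDRESS NETMASK
broadcast_for() (
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    # network address OR inverted mask (limited to 32 bits)
    int_to_ip $(( ($ip & $mask) | (~$mask & 0xFFFFFFFF) ))
)
```

With the example values from the table, `broadcast_for 10.0.100.50 255.255.255.0` yields `10.0.100.255`.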

PXE-Specific DHCP Options

| Option | Name | Type | Purpose | Example Value |
| --- | --- | --- | --- | --- |
| 60 | Vendor Class ID | String | Client vendor identification | PXEClient |
| 66 | TFTP Server Name | String | TFTP server hostname or IP | 10.0.100.10 |
| 67 | Boot File Name | String | Boot file to download | undionly.kpxe |
| 77 | User Class | String | Client user class (iPXE detection) | iPXE |
| 93 | Client Architecture | Uint16 | Client architecture type | 0x0000 (BIOS), 0x0007 (UEFI x64) |
| 94 | Client Network Interface | Bytes | NIC type and version | 0x010201 (UNDI v2.1) |
| 97 | UUID/GUID | Bytes | Client system UUID | Machine-specific |

Option 93 (Client Architecture) Values

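| Value | Architecture | Boot Method |
| --- | --- | --- |
| 0x0000 | x86 BIOS | Legacy PXE |
| 0x0001 | NEC PC-98 | Not supported |
| 0x0002 | EFI Itanium | EFI PXE |
| 0x0006 | x86 UEFI (IA32) | UEFI PXE |
| 0x0007 | x64 UEFI | UEFI PXE |
| 0x0008 | EFI Xscale | Not supported |
| 0x0009 | EFI byte code (EBC) in the IANA registry; x64 UEFI in RFC 4578 as published, so some firmware reports it | UEFI PXE |
| 0x000a | ARM 32-bit UEFI | ARM PXE |
| 0x000b | ARM 64-bit UEFI | ARM PXE |
| 0x000f | x86 UEFI HTTP Boot | HTTP Boot |
| 0x0010 | x64 UEFI HTTP Boot | HTTP Boot |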

ISC DHCP Configuration Examples

Basic PXE Configuration:

# /etc/dhcp/dhcpd.conf

# Global options
option architecture-type code 93 = unsigned integer 16;
default-lease-time 600;
max-lease-time 7200;
authoritative;

# Subnet configuration
subnet 10.0.100.0 netmask 255.255.255.0 {
    range 10.0.100.100 10.0.100.200;
    option routers 10.0.100.1;
    option domain-name-servers 10.0.100.1, 8.8.8.8;
    option domain-name "example.com";
    option broadcast-address 10.0.100.255;
    option ntp-servers 10.0.100.1;

    # PXE boot server
    next-server 10.0.100.10;

    # Boot file selection based on architecture
    if exists user-class and option user-class = "iPXE" {
        filename "http://10.0.100.10:8080/boot/ipxe/boot.ipxe";
    } elsif option architecture-type = 00:00 {
        filename "undionly.kpxe";
    } elsif option architecture-type = 00:07 {
        filename "ipxe.efi";
    } elsif option architecture-type = 00:09 {
        filename "ipxe.efi";
    } else {
        filename "ipxe.efi";
    }
}

# Static host reservations
host node01 {
    hardware ethernet 52:54:00:12:34:56;
    fixed-address 10.0.100.50;
    option host-name "node01";
}
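
When debugging a client that fetches the wrong boot file, it can help to restate the selection logic outside of dhcpd. The following is a minimal sketch that mirrors the `if`/`elsif` chain in the subnet block above. The function name `boot_file_for` is hypothetical; the file names and URL are copied from the configuration.

```sh
# Usage: boot_file_for USER_CLASS ARCH
# USER_CLASS is the option 77 value ("" if absent).
# ARCH is the option 93 value, decimal (7) or 0x-prefixed hex (0x0007).
boot_file_for() (
    user_class=$1
    arch=$(( $2 ))

    if [ "$user_class" = "iPXE" ]; then
        # Second stage: iPXE is already running, hand it the boot script
        echo "http://10.0.100.10:8080/boot/ipxe/boot.ipxe"
    elif [ "$arch" -eq 0 ]; then
        # Legacy BIOS firmware
        echo "undionly.kpxe"
    else
        # 0x0007, 0x0009 and the fallback branch all serve the EFI binary
        echo "ipxe.efi"
    fi
)
```

The architecture value can be read from a packet capture of the client's DHCPDISCOVER, then passed in as, for example, `boot_file_for "" 0x0007`.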

Advanced PXE Configuration with Classes:

# Define client classes
class "pxeclients" {
    match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
}

class "ipxeclients" {
    match if exists user-class and option user-class = "iPXE";
}

# Subnet configuration
subnet 10.0.100.0 netmask 255.255.255.0 {
    # ... (basic options) ...

    # Different boot files per class
    class "ipxeclients" {
        filename "http://10.0.100.10:8080/boot/ipxe/boot.ipxe";
    }

    class "pxeclients" {
        if option architecture-type = 00:00 {
            filename "undionly.kpxe";
        } elsif option architecture-type = 00:07 {
            filename "ipxe.efi";
        }
    }
}

DNS Zone File Examples

Forward Zone (example.com)

; /var/named/example.com.zone
$TTL 86400
@   IN  SOA     ns1.example.com. admin.example.com. (
        2025121001  ; Serial
        3600        ; Refresh (1 hour)
        1800        ; Retry (30 minutes)
        604800      ; Expire (1 week)
        86400       ; Minimum TTL (1 day)
)

; Name servers
@           IN  NS      ns1.example.com.
@           IN  NS      ns2.example.com.

; Name server A records
ns1         IN  A       10.0.200.10
ns2         IN  A       10.0.200.11

; Control plane nodes
node01      IN  A       10.0.200.10
node02      IN  A       10.0.200.11
node03      IN  A       10.0.200.12

; Worker nodes
worker01    IN  A       10.0.200.20
worker02    IN  A       10.0.200.21
worker03    IN  A       10.0.200.22

; Service VIPs (virtual IPs for load balancing)
chainfire   IN  A       10.0.200.100
flaredb     IN  A       10.0.200.101
iam         IN  A       10.0.200.102
plasmavmc   IN  A       10.0.200.103

; Service CNAMEs (point to VIP or specific node)
api         IN  CNAME   iam.example.com.
db          IN  CNAME   flaredb.example.com.
vm          IN  CNAME   plasmavmc.example.com.

; Wildcard for ingress (optional)
*.apps      IN  A       10.0.200.105
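
Secondaries only pick up a zone change when the SOA serial increases. The serial above follows the common `YYYYMMDDNN` convention, and a small helper can compute the next value. This is a sketch under that assumption; `next_serial` is a hypothetical name, and the second argument exists only so the date can be fixed for testing.

```sh
# Usage: next_serial CURRENT_SERIAL [TODAY_YYYYMMDD]
next_serial() (
    current=$1
    today=${2:-$(date +%Y%m%d)}
    candidate="${today}01"        # first revision of today

    if [ "$current" -ge "$candidate" ]; then
        # Already edited today (or serial is ahead): just increment
        echo $(( $current + 1 ))
    else
        echo "$candidate"
    fi
)
```

A second edit on 2025-12-10 would move the serial from `2025121001` to `2025121002`; the first edit on a later day starts again at revision `01`.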

Reverse Zone (10.0.200.0/24)

; /var/named/200.0.10.in-addr.arpa.zone
$TTL 86400
@   IN  SOA     ns1.example.com. admin.example.com. (
        2025121001  ; Serial
        3600        ; Refresh
        1800        ; Retry
        604800      ; Expire
        86400       ; Minimum TTL
)

; Name servers
@               IN  NS      ns1.example.com.
@               IN  NS      ns2.example.com.

; Owner names are relative to the zone origin (200.0.10.in-addr.arpa.),
; so only the last octet of each address is listed.

; Control plane nodes
10              IN  PTR     node01.example.com.
11              IN  PTR     node02.example.com.
12              IN  PTR     node03.example.com.

; Worker nodes
20              IN  PTR     worker01.example.com.
21              IN  PTR     worker02.example.com.
22              IN  PTR     worker03.example.com.

; Service VIPs
100             IN  PTR     chainfire.example.com.
101             IN  PTR     flaredb.example.com.
102             IN  PTR     iam.example.com.
103             IN  PTR     plasmavmc.example.com.
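
In a zone whose origin is `200.0.10.in-addr.arpa.`, an owner name without a trailing dot is relative to that origin, so a /24 reverse zone needs only the last octet per record. The fully qualified form reverses all four octets and ends with a dot. A short sketch of both forms, using hypothetical helper names:

```sh
# Print the fully qualified reverse-lookup name for an IPv4 address.
reverse_name() (
    IFS=.
    set -- $1                     # split the address on dots
    echo "$4.$3.$2.$1.in-addr.arpa."
)

# Print a PTR line suitable for a /24 reverse zone file.
# Usage: ptr_line ADDRESS FQDN   (FQDN should end with a dot)
ptr_line() (
    ip=$1
    fqdn=$2
    # ${ip##*.} keeps only the last octet, the owner relative to the origin
    printf '%-15s IN  PTR     %s\n' "${ip##*.}" "$fqdn"
)
```

`reverse_name 10.0.200.10` gives `10.200.0.10.in-addr.arpa.`, which is the name a resolver actually queries, while `ptr_line 10.0.200.10 node01.example.com.` produces the short form used inside the zone file.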

DNS Configuration (BIND9)

// /etc/named.conf

options {
    directory "/var/named";
    listen-on port 53 { 10.0.200.10; 127.0.0.1; };
    allow-query { 10.0.0.0/8; localhost; };
    recursion yes;
    forwarders { 8.8.8.8; 8.8.4.4; };
};

zone "example.com" IN {
    type master;
    file "example.com.zone";
    allow-update { none; };
};

zone "200.0.10.in-addr.arpa" IN {
    type master;
    file "200.0.10.in-addr.arpa.zone";
    allow-update { none; };
};

Firewall Rule Templates

iptables Complete Ruleset

#!/bin/bash
# Builds the ruleset and saves it to /etc/iptables/rules.v4

# Flush existing rules
iptables -F
iptables -X
iptables -t nat -F
iptables -t nat -X
iptables -t mangle -F
iptables -t mangle -X

# Default policies
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT ACCEPT

# Allow loopback
iptables -A INPUT -i lo -j ACCEPT

# Allow established connections
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT

# Allow SSH from management network
iptables -A INPUT -p tcp --dport 22 -s 10.0.10.0/24 -j ACCEPT

# Allow ICMP (ping)
iptables -A INPUT -p icmp -j ACCEPT

# PlasmaCloud services (cluster subnet only)
iptables -A INPUT -p tcp --dport 2379 -s 10.0.200.0/24 -j ACCEPT  # Chainfire API
iptables -A INPUT -p tcp --dport 2380 -s 10.0.200.0/24 -j ACCEPT  # Chainfire Raft
iptables -A INPUT -p tcp --dport 2381 -s 10.0.200.0/24 -j ACCEPT  # Chainfire Gossip
iptables -A INPUT -p tcp --dport 2479 -s 10.0.200.0/24 -j ACCEPT  # FlareDB API
iptables -A INPUT -p tcp --dport 2480 -s 10.0.200.0/24 -j ACCEPT  # FlareDB Raft

# Allow IAM from internal network
iptables -A INPUT -p tcp --dport 8080 -s 10.0.0.0/8 -j ACCEPT

# Allow PlasmaVMC from internal network
iptables -A INPUT -p tcp --dport 9090 -s 10.0.0.0/8 -j ACCEPT

# Allow FlashDNS
iptables -A INPUT -p udp --dport 53 -s 10.0.0.0/8 -j ACCEPT
iptables -A INPUT -p tcp --dport 53 -s 10.0.0.0/8 -j ACCEPT

# Allow PrismNET VXLAN
iptables -A INPUT -p udp --dport 4789 -s 10.0.200.0/24 -j ACCEPT

# Allow Prometheus metrics from monitoring server
iptables -A INPUT -p tcp --dport 9100 -s 10.0.10.5 -j ACCEPT

# Log dropped packets (optional, for debugging)
iptables -A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables INPUT DROP: " --log-level 7

# Save rules
iptables-save > /etc/iptables/rules.v4
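
After loading a ruleset it is worth confirming that the open ports match the port matrix. One possible approach, sketched here, is to filter `iptables-save` output down to the accepted destination ports. The function name `list_open_ports` is hypothetical, and the parser only understands single `--dport` matches on the INPUT chain (not `--dports` multiport lists).

```sh
# Read iptables-save output on stdin and print "proto port source"
# for every INPUT rule that accepts traffic to a specific port.
list_open_ports() {
    awk '
        $1 == "-A" && $2 == "INPUT" {
            proto = ""; port = ""; src = "any"; accept = 0
            for (i = 3; i <= NF; i++) {
                if ($i == "-p")      proto = $(i + 1)
                if ($i == "--dport") port  = $(i + 1)
                if ($i == "-s")      src   = $(i + 1)
                if ($i == "-j" && $(i + 1) == "ACCEPT") accept = 1
            }
            if (accept && port != "") print proto, port, src
        }
    '
}
```

Typical use would be `iptables-save | list_open_ports` as root, or `list_open_ports < /etc/iptables/rules.v4` against the saved file.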

nftables Complete Ruleset

#!/usr/sbin/nft -f
# /etc/nftables.conf

flush ruleset

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;

        # Allow loopback
        iif lo accept

        # Allow established connections
        ct state established,related accept

        # Allow ICMP
        ip protocol icmp accept
        ip6 nexthdr icmpv6 accept

        # Allow SSH from management network
        tcp dport 22 ip saddr 10.0.10.0/24 accept

        # PlasmaCloud services (cluster subnet)
        tcp dport { 2379, 2380, 2381 } ip saddr 10.0.200.0/24 accept  # Chainfire
        tcp dport { 2479, 2480 } ip saddr 10.0.200.0/24 accept        # FlareDB

        # PlasmaCloud services (internal network)
        tcp dport { 8080, 9090 } ip saddr 10.0.0.0/8 accept

        # FlashDNS
        udp dport 53 ip saddr 10.0.0.0/8 accept
        tcp dport 53 ip saddr 10.0.0.0/8 accept

        # PrismNET VXLAN
        udp dport 4789 ip saddr 10.0.200.0/24 accept

        # Prometheus metrics
        tcp dport 9100 ip saddr 10.0.10.5 accept

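        # Log dropped packets (the limit must precede the log statement,
        # otherwise every packet is logged before the limit is evaluated)
        limit rate 5/minute log prefix "nftables drop: " level debug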
    }

    chain forward {
        type filter hook forward priority 0; policy drop;
    }

    chain output {
        type filter hook output priority 0; policy accept;
    }
}

NixOS Firewall Configuration

# In configuration.nix
{ config, pkgs, lib, ... }:

{
  networking.firewall = {
    enable = true;

    # Allow specific ports
    allowedTCPPorts = [ 22 ];  # SSH only

    # Allow ports from specific sources (requires extraCommands)
    extraCommands = ''
      # Chainfire
      iptables -A INPUT -p tcp --dport 2379 -s 10.0.200.0/24 -j ACCEPT
      iptables -A INPUT -p tcp --dport 2380 -s 10.0.200.0/24 -j ACCEPT
      iptables -A INPUT -p tcp --dport 2381 -s 10.0.200.0/24 -j ACCEPT

      # FlareDB
      iptables -A INPUT -p tcp --dport 2479 -s 10.0.200.0/24 -j ACCEPT
      iptables -A INPUT -p tcp --dport 2480 -s 10.0.200.0/24 -j ACCEPT

      # IAM
      iptables -A INPUT -p tcp --dport 8080 -s 10.0.0.0/8 -j ACCEPT

      # PlasmaVMC
      iptables -A INPUT -p tcp --dport 9090 -s 10.0.0.0/8 -j ACCEPT

      # FlashDNS
      iptables -A INPUT -p udp --dport 53 -s 10.0.0.0/8 -j ACCEPT
      iptables -A INPUT -p tcp --dport 53 -s 10.0.0.0/8 -j ACCEPT

      # PrismNET VXLAN
      iptables -A INPUT -p udp --dport 4789 -s 10.0.200.0/24 -j ACCEPT
    '';

    extraStopCommands = ''
      # Cleanup on firewall stop
      iptables -D INPUT -p tcp --dport 2379 -s 10.0.200.0/24 -j ACCEPT || true
      # ... (other cleanup) ...
    '';
  };
}

VLAN Tagging Guide

VLAN Configuration Overview

| VLAN ID | Name | Subnet | Purpose |
| --- | --- | --- | --- |
| 10 | Management | 10.0.10.0/24 | BMC/IPMI, admin access |
| 100 | Provisioning | 10.0.100.0/24 | PXE boot, temporary |
| 200 | Production | 10.0.200.0/24 | Cluster communication |
| 300 | Client | 10.0.30.0/24 | External client access |
| 400 | Storage | 10.0.40.0/24 | iSCSI, NFS, block storage |
| n/a (VXLAN, UDP 4789) | VXLAN Overlay | Dynamic | PrismNET virtual networks |
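
802.1Q VLAN IDs are 12 bits wide with 0 and 4095 reserved, so usable IDs run from 1 to 4094, and each IPv4 octet is limited to 0-255. A quick sanity check on planned values before they reach switch or host configuration might look like the sketch below. Both function names are hypothetical.

```sh
# Succeeds if the argument is a usable 802.1Q VLAN ID (1-4094).
valid_vlan_id() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;      # empty or non-numeric
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 4094 ]
}

# Succeeds if the argument is a dotted-quad IPv4 address.
valid_ipv4() (
    case $1 in
        ''|*[!0-9.]*|.*|*.) exit 1 ;; # bad characters, leading or trailing dot
    esac
    IFS=.
    set -- $1
    [ $# -eq 4 ] || exit 1
    for octet in "$@"; do
        [ -n "$octet" ] && [ "$octet" -le 255 ] || exit 1
    done
)
```

For instance, `valid_vlan_id 4789` fails because 4789 is the VXLAN UDP port rather than a VLAN ID, and `valid_ipv4 10.0.300.10` fails because 300 does not fit in one octet.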

Linux VLAN Configuration (ip command)

# Create VLAN interface
ip link add link eth0 name eth0.100 type vlan id 100
ip link set dev eth0.100 up

# Assign IP address
ip addr add 10.0.100.50/24 dev eth0.100

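# The connected route for 10.0.100.0/24 is installed automatically when
# the address is assigned; an explicit "ip route add" would fail with
# "File exists". Verify with:
ip route show dev eth0.100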

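# Make persistent (systemd-networkd)
# The parent interface must list the VLAN, or networkd will not create it
cat > /etc/systemd/network/10-eth0.network <<EOF
[Match]
Name=eth0

[Network]
VLAN=eth0.100
EOF

cat > /etc/systemd/network/10-eth0.100.netdev <<EOF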
[NetDev]
Name=eth0.100
Kind=vlan

[VLAN]
Id=100
EOF

cat > /etc/systemd/network/20-eth0.100.network <<EOF
[Match]
Name=eth0.100

[Network]
Address=10.0.100.50/24
Gateway=10.0.100.1
DNS=10.0.100.1
EOF

systemctl restart systemd-networkd

NixOS VLAN Configuration

# In configuration.nix
{ config, pkgs, lib, ... }:

{
  networking = {
    vlans = {
      vlan100 = {
        id = 100;
        interface = "eth0";
      };
      vlan200 = {
        id = 200;
        interface = "eth0";
      };
      vlan300 = {
        id = 300;
        interface = "eth0";
      };
    };

    interfaces = {
      vlan100 = {
        ipv4.addresses = [{
          address = "10.0.100.50";
          prefixLength = 24;
        }];
      };
      vlan200 = {
        ipv4.addresses = [{
          address = "10.0.200.10";
          prefixLength = 24;
        }];
      };
      vlan300 = {
        ipv4.addresses = [{
          address = "10.0.30.10";  # each octet must be 0-255
          prefixLength = 24;
        }];
      };
    };

    defaultGateway = {
      address = "10.0.200.1";
      interface = "vlan200";
    };
  };
}

Switch Configuration Examples

Cisco IOS:

! Trunk port (to server)
interface GigabitEthernet0/1
  description Server node01
  switchport mode trunk
  switchport trunk allowed vlan 10,100,200,300,400
  switchport trunk native vlan 100
  spanning-tree portfast trunk

! Access port (single VLAN)
interface GigabitEthernet0/2
  description Client access
  switchport mode access
  switchport access vlan 300
  spanning-tree portfast

HP/Aruba Procurve:

! Trunk port
interface ethernet 1
  description Server node01
  tagged vlan 10,200,300,400
  untagged vlan 100

! Access port
interface ethernet 2
  description Client access
  untagged vlan 300

Linux Bridge (for testing):

# Create bridge with VLAN filtering
ip link add br0 type bridge vlan_filtering 1
ip link set br0 up

# Add interface to bridge
ip link set eth1 master br0

# Add VLAN to bridge
bridge vlan add vid 100 dev br0 self
bridge vlan add vid 200 dev br0 self

# Add VLAN to port
bridge vlan add vid 100 dev eth1
bridge vlan add vid 200 dev eth1

Network Troubleshooting Flowcharts

PXE Boot Failure Troubleshooting

┌─────────────────────────────┐
│ Server powers on, no PXE    │
│ boot or no IP address       │
└──────────────┬──────────────┘
               │
               v
      ┌────────────────┐
      │ Is BIOS/UEFI   │───No───> Enter BIOS, enable network boot
      │ PXE enabled?   │          Set boot order: Network → Disk
      └────────┬───────┘
               │ Yes
               v
      ┌────────────────┐
      │ DHCP server    │───No───> Check DHCP server:
      │ running?       │          - systemctl status dhcpd4
      └────────┬───────┘          - Verify interface config
               │ Yes              - Check firewall (UDP 67/68)
               v
      ┌────────────────┐
      │ Server getting │───No───> Monitor DHCP logs:
      │ IP address?    │          - journalctl -u dhcpd4 -f
      └────────┬───────┘          - tcpdump -i eth0 port 67
               │ Yes              - Verify server is on same subnet
               v
      ┌────────────────┐
      │ TFTP download  │───No───> Check TFTP server:
      │ working?       │          - systemctl status atftpd
      └────────┬───────┘          - tftp localhost -c get undionly.kpxe
               │ Yes              - Verify files exist
               v
      ┌────────────────┐
      │ iPXE loads and │───No───> Check HTTP server:
      │ downloads boot │          - systemctl status nginx
      │ script?        │          - curl http://10.0.100.10/boot/ipxe/boot.ipxe
      └────────┬───────┘
               │ Yes
               v
      ┌────────────────┐
      │ Kernel/initrd  │───No───> Verify netboot images:
      │ download and   │          - Check file sizes (bzImage ~10MB, initrd ~200MB)
      │ boot?          │          - Verify HTTP accessibility
      └────────┬───────┘          - Check console for error messages
               │ Yes
               v
      ┌────────────────┐
      │ NixOS installer│
      │ boots, SSH     │
      │ accessible     │
      └────────────────┘
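
The file-size check near the end of the flowchart catches truncated or empty netboot images, a common cause of kernel or initrd failures. A small sketch of that check follows; `check_min_size` is a hypothetical helper, and the thresholds in the usage note are illustrative values derived from the approximate sizes above.

```sh
# Usage: check_min_size FILE MIN_BYTES
# Prints a one-line verdict and fails if the file is missing or too small.
check_min_size() (
    file=$1
    min=$2

    if [ ! -f "$file" ]; then
        echo "missing: $file"
        exit 1
    fi

    # Arithmetic expansion strips any padding some wc builds emit
    size=$(( $(wc -c < "$file") ))

    if [ "$size" -lt "$min" ]; then
        echo "too small: $file ($size bytes, expected at least $min)"
        exit 1
    fi
    echo "ok: $file ($size bytes)"
)
```

On the PXE server this could be run as `check_min_size /path/to/bzImage 5000000` and `check_min_size /path/to/initrd 100000000`, where the paths stand in for wherever the HTTP server serves the images from.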

Cluster Join Failure Troubleshooting

┌─────────────────────────────┐
│ Node boots but does not     │
│ join cluster                │
└──────────────┬──────────────┘
               │
               v
      ┌────────────────┐
      │ Check first-   │
      │ boot logs:     │
      │ journalctl -u  │
      │ chainfire-     │
      │ cluster-join   │
      └────────┬───────┘
               │
               v
      ┌────────────────┐
      │ Service        │───No───> Check main service:
      │ started?       │          - systemctl status chainfire.service
      └────────┬───────┘          - journalctl -u chainfire.service
               │ Yes              - Verify config file exists
               v
      ┌────────────────┐
      │ cluster-config │───No───> Check configuration:
      │ .json exists?  │          - ls -l /etc/nixos/secrets/cluster-config.json
      └────────┬───────┘          - jq . /etc/nixos/secrets/cluster-config.json
               │ Yes
               v
      ┌────────────────┐
      │ Health check   │───No───> Wait or troubleshoot:
      │ passes?        │          - curl -k https://localhost:2379/health
      └────────┬───────┘          - Check TLS certificates
               │ Yes              - Check port not in use
               v
      ┌────────────────┐
      │ Bootstrap mode │
      │ or join mode?  │
      └───┬────────┬───┘
          │        │
     Bootstrap   Join
          │        │
          v        v
   ┌──────────┐ ┌──────────┐
   │ Peers    │ │ Leader   │───No───> Check network:
   │ reachable│ │ reachable│          - ping leader
   │?         │ │?         │          - curl -k https://leader:2379/health
   └────┬─────┘ └────┬─────┘          - Check firewall
        │ Yes        │ Yes
        v            v
   ┌──────────┐ ┌──────────┐
   │ Cluster  │ │ Join API │───No───> Manual join:
   │ forms    │ │ succeeds?│          - curl -k -X POST https://leader:2379/admin/member/add
   │ auto-    │ └────┬─────┘
   │ matically│      │ Yes
   └──────────┘      v
        │       ┌──────────┐
        └──────>│ Cluster  │
                │ healthy  │
                └──────────┘
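
The "Wait or troubleshoot" branch of the health check usually means polling until the service answers. A generic retry wrapper, sketched below, keeps that loop out of ad hoc scripts. The name `retry` and the attempt counts in the usage note are assumptions, not part of the cluster tooling.

```sh
# Usage: retry TRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or TRIES attempts have been made.
retry() (
    tries=$1
    delay=$2
    shift 2

    n=1
    while :; do
        "$@" && exit 0                  # success: stop immediately
        [ "$n" -ge "$tries" ] && exit 1 # out of attempts
        n=$(( $n + 1 ))
        sleep "$delay"
    done
)
```

Polling the local health endpoint from the flowchart could then be written as `retry 30 2 curl -ksf https://localhost:2379/health`, which gives the service about a minute to come up before reporting failure.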

Network Connectivity Troubleshooting

┌─────────────────────────────┐
│ Nodes cannot communicate    │
└──────────────┬──────────────┘
               │
               v
      ┌────────────────┐
      │ Basic IP       │───No───> Check network config:
      │ connectivity?  │          - ip addr show
      │ (ping)         │          - ip route show
      └────────┬───────┘          - Fix interface/routing
               │ Yes
               v
      ┌────────────────┐
      │ DNS resolution │───No───> Check DNS:
      │ working?       │          - cat /etc/resolv.conf
      │ (dig/nslookup) │          - dig @10.0.200.1 node01.example.com
      └────────┬───────┘          - Add to /etc/hosts as workaround
               │ Yes
               v
      ┌────────────────┐
      │ Specific port  │───No───> Check firewall:
      │ reachable?     │          - iptables -L -n | grep <port>
      │ (nc -zv)       │          - Add firewall rules
      └────────┬───────┘          - Restart service
               │ Yes
               v
      ┌────────────────┐
      │ TLS handshake  │───No───> Check certificates:
      │ succeeds?      │          - openssl s_client -connect host:port
      │ (openssl)      │          - Verify cert paths
      └────────┬───────┘          - Check cert expiry
               │ Yes
               v
      ┌────────────────┐
      │ Application    │
      │ responds       │
      └────────────────┘
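
The flowchart above can be walked as a script that stops at the first failing layer. This is a sketch only: `step` and `check_node` are hypothetical names, `ping -W` assumes the Linux iputils flavour, and name resolution is tested with `getent hosts` (the system resolver path) rather than `dig`.

```sh
# Run one check quietly and print a verdict line.
# Usage: step LABEL COMMAND [ARGS...]
step() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok   $label"
    else
        echo "FAIL $label"
        return 1
    fi
}

# Walk the layers in flowchart order; && stops at the first failure.
# Usage: check_node HOST PORT
check_node() {
    host=$1
    port=$2
    step "ping $host"      ping -c 1 -W 2 "$host" &&
    step "dns $host"       getent hosts "$host" &&
    step "tcp $host:$port" nc -z -w 3 "$host" "$port" &&
    step "tls $host:$port" sh -c "openssl s_client -connect $host:$port </dev/null"
}
```

Running `check_node node01.example.com 2379` prints one line per layer, so the first `FAIL` line points at the branch of the flowchart to follow.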

Document End