# photoncloud-monorepo/docs/por/T057-k8shost-resource-management/task.yaml
id: T057
name: k8shost Resource Management
goal: Implement proper IP Address Management (IPAM) and tenant-aware scheduling for k8shost
status: complete
priority: P1
owner: peerB
created: 2025-12-12
depends_on: []
blocks: [T039]
context: |
  **Findings from T049 Audit:**
  - `k8shost/crates/k8shost-server/src/scheduler.rs`: `// TODO: Get list of active tenants from IAM or FlareDB`
  - `k8shost/crates/k8shost-server/src/services/service.rs`: `/// TODO: Implement proper IP allocation with IPAM`

  **Strategic Value:**
  - Essential for multi-tenant isolation and efficient resource utilization.
  - Required for Production Readiness (T039).
acceptance:
  - k8shost scheduler is tenant-aware (can prioritize/constrain pods by tenant)
  - Pluggable IPAM system implemented for Service IP allocation
  - IPAM integrates with PrismNET for IP assignment and management
  - Integration tests for tenant scheduling and IPAM
steps:
  - step: S1
    name: IPAM System Design & Spec
    done: Define IPAM system architecture and API (integration with PrismNET)
    status: complete
    started: 2025-12-12 18:30 JST
    completed: 2025-12-12 18:45 JST
    owner: peerA
    priority: P1
    outputs:
      - path: S1-ipam-spec.md
        note: IPAM system specification (250+ lines)
    notes: |
      Designed IPAM integration between k8shost and PrismNET:
      - ServiceIPPool resource for ClusterIP and LoadBalancer IPs
      - IpamService gRPC API in PrismNET
      - IpamClient for k8shost integration
      - Per-tenant IP pool isolation
      - ChainFire-backed storage for consistency
      - Backward-compatible fallback to local counter
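      The backward-compatible fallback could be sketched roughly as below. This is a
      hypothetical illustration, not the actual `ipam_client.rs` code: the
      `LocalCounter` type, the `allocate_service_ip` name, and the 10.96.0.0/24
      fallback range are all assumptions for the sketch.

      ```rust
      use std::net::Ipv4Addr;
      use std::sync::atomic::{AtomicU32, Ordering};

      /// Hypothetical fallback state: a monotonic host counter used only
      /// when the PrismNET IPAM service is unreachable.
      struct LocalCounter {
          next_host: AtomicU32,
      }

      /// Prefer the IP returned by the IPAM gRPC call; on error, derive an
      /// address from the local counter so Service creation still succeeds.
      fn allocate_service_ip(
          ipam_result: Result<Ipv4Addr, String>, // outcome of the IPAM RPC
          fallback: &LocalCounter,
      ) -> Ipv4Addr {
          match ipam_result {
              Ok(ip) => ip,
              Err(_) => {
                  // Stay within an assumed 10.96.0.10..10.96.0.249 range.
                  let n = fallback.next_host.fetch_add(1, Ordering::SeqCst);
                  Ipv4Addr::new(10, 96, 0, (10 + n % 240) as u8)
              }
          }
      }
      ```

      The design point is that the error arm never propagates: IPAM outages
      degrade to local allocation rather than blocking Service creation.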
  - step: S2
    name: Service IP Allocation
    done: Implement IPAM integration for k8shost Service IPs
    status: complete
    started: 2025-12-12 20:03 JST
    completed: 2025-12-12 23:35 JST
    owner: peerB
    priority: P1
    outputs:
      - path: prismnet/crates/prismnet-server/src/services/ipam.rs
        note: IpamService gRPC implementation (310 LOC)
      - path: prismnet/crates/prismnet-server/src/metadata.rs
        note: IPAM metadata storage methods (+150 LOC)
      - path: k8shost/crates/k8shost-server/src/ipam_client.rs
        note: IpamClient gRPC wrapper (100 LOC)
    notes: |
      **Implementation Complete (1,030 LOC)**

      PrismNET IPAM (730 LOC):
      ✅ ServiceIPPool types with CIDR + HashSet allocation tracking
      ✅ IPAM proto definitions (6 RPCs: Create/Get/List pools, Allocate/Release/Get IPs)
      ✅ IpamService gRPC implementation with next-available-IP algorithm
      ✅ ChainFire metadata storage (6 methods)
      ✅ Registered in prismnet-server main.rs

      k8shost Integration (150 LOC):
      ✅ IpamClient gRPC wrapper
      ✅ ServiceServiceImpl updated to use IPAM (allocate on create, release on delete)
      ✅ PrismNetConfig added to k8shost config
      ✅ Tests updated

      Technical highlights:
      - Tenant isolation via (org_id, project_id) scoping
      - IPv4 CIDR enumeration (skips network/broadcast, starts at .10)
      - Auto-pool-selection by type (ClusterIp/LoadBalancer/NodePort)
      - Best-effort IP release on service deletion
      - ChainFire persistence with JSON serialization
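      The next-available-IP scan over a pool's CIDR could look roughly like
      this. A minimal sketch, assuming a HashSet of allocated addresses as
      noted above; the function name and signature are illustrative, not the
      actual ipam.rs code.

      ```rust
      use std::collections::HashSet;
      use std::net::Ipv4Addr;

      /// Return the first unallocated host address in the pool's CIDR.
      /// Skips the network and broadcast addresses and, per the allocation
      /// convention noted above, starts scanning at host .10.
      fn next_available_ip(
          cidr_base: Ipv4Addr,
          prefix_len: u8, // e.g. 24 for a /24 pool
          allocated: &HashSet<Ipv4Addr>,
      ) -> Option<Ipv4Addr> {
          let host_bits = 32 - u32::from(prefix_len);
          let network = u32::from(cidr_base) & (!0u32 << host_bits);
          let broadcast = network | ((1u32 << host_bits) - 1);
          // `network + 10 .. broadcast` excludes both reserved endpoints.
          ((network + 10)..broadcast)
              .map(Ipv4Addr::from)
              .find(|ip| !allocated.contains(ip))
      }
      ```

      Returning `None` when the range is exhausted lets the caller surface a
      pool-full error instead of handing out the broadcast address.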
  - step: S3
    name: Tenant-Aware Scheduler
    done: Modify scheduler to respect tenant constraints/priorities
    status: complete
    started: 2025-12-12 23:36 JST
    completed: 2025-12-12 23:45 JST
    owner: peerB
    priority: P1
    outputs:
      - path: k8shost/crates/k8shost-server/src/scheduler.rs
        note: Tenant-aware scheduler with quota enforcement (+150 LOC)
      - path: k8shost/crates/k8shost-server/src/storage.rs
        note: list_all_pods for tenant discovery (+35 LOC)
    notes: |
      **Implementation Complete (185 LOC)**
      ✅ CreditService client integration (CREDITSERVICE_ENDPOINT env var)
      ✅ Tenant discovery via pod query (get_active_tenants)
      ✅ Quota enforcement (check_quota_for_pod) before scheduling
      ✅ Resource cost calculation matching PodServiceImpl pattern
      ✅ Best-effort reliability (logs warnings, continues on errors)

      Architecture decisions:
      - Pragmatic tenant discovery: query pods for unique (org_id, project_id)
      - Best-effort quota: availability over strict consistency
      - Cost consistency: same formula as admission control
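      The pragmatic tenant-discovery step amounts to deduplicating tenant
      pairs out of the pod list. A sketch under assumed types (the real `Pod`
      struct and `get_active_tenants` live in k8shost-server; this is only an
      illustration of the dedup):

      ```rust
      use std::collections::BTreeSet;

      /// Hypothetical pod record carrying only the tenant-scoping fields.
      struct Pod {
          org_id: String,
          project_id: String,
      }

      /// Pragmatic tenant discovery: instead of querying IAM or FlareDB,
      /// collect the unique (org_id, project_id) pairs from current pods.
      fn get_active_tenants(pods: &[Pod]) -> BTreeSet<(String, String)> {
          pods.iter()
              .map(|p| (p.org_id.clone(), p.project_id.clone()))
              .collect()
      }
      ```

      A tenant with no pods is invisible to the scheduler under this scheme,
      which is acceptable here because such a tenant has nothing to schedule.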
evidence:
  - item: S1 IPAM System Design
    desc: |
      Created IPAM integration specification:
      File: S1-ipam-spec.md (250+ lines)

      Key design decisions:
      - ServiceIPPool resource: per-tenant IP pools for ClusterIP and LoadBalancer
      - IpamService gRPC: AllocateServiceIP, ReleaseServiceIP, GetIPAllocation
      - Storage: ChainFire-backed pools and allocations
      - Tenant isolation: separate pools per org_id/project_id
      - Backward compat: fallback to local counter if IPAM unavailable

      Architecture:
      - k8shost → IpamClient → PrismNET IpamService
      - PrismNET stores pools in /prismnet/ipam/pools/{org}/{proj}/{pool}
      - Allocations tracked in /prismnet/ipam/allocations/{org}/{proj}/{ip}

      Implementation phases:
      1. PrismNET IpamService (new gRPC service)
      2. k8shost IpamClient integration
      3. Default pool auto-provisioning
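      The two key layouts above can be captured by small formatting helpers.
      Hypothetical function names, shown only to make the layout concrete:

      ```rust
      /// Build the ChainFire key under which a tenant's pool is stored,
      /// following the /prismnet/ipam/pools/{org}/{proj}/{pool} layout.
      fn pool_key(org: &str, proj: &str, pool: &str) -> String {
          format!("/prismnet/ipam/pools/{org}/{proj}/{pool}")
      }

      /// Build the per-IP allocation key, following the
      /// /prismnet/ipam/allocations/{org}/{proj}/{ip} layout.
      fn allocation_key(org: &str, proj: &str, ip: &str) -> String {
          format!("/prismnet/ipam/allocations/{org}/{proj}/{ip}")
      }
      ```

      Keying allocations by tenant first keeps listing and cleanup scoped to
      one (org, project) prefix, which is what the isolation goal requires.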
    files:
      - docs/por/T057-k8shost-resource-management/S1-ipam-spec.md
    timestamp: 2025-12-12 18:45 JST
notes: |
  Critical for multi-tenant isolation and production deployments.