photoncloud-monorepo/docs/por/T019-overlay-network-implementation/task.yaml

id: T019
name: Overlay Network Implementation (NovaNET)
status: complete
goal: Implement multi-tenant overlay networking with OVN integration for PlasmaVMC
priority: P0
owner: peerA (strategy) + peerB (implementation)
created: 2025-12-08
depends_on: [T015]
context: |
PROJECT.md item 11 specifies overlay networking for multi-tenant isolation.
T015 completed specification work:
- research-summary.md: OVN recommended over Cilium/Calico
- tenant-network-model.md: VPC/subnet/port/security-group model
- plasmavmc-integration.md: VM-port attachment flow
NovaNET will be a new component providing:
- Tenant network isolation (VPC model)
- OVN integration layer (ovsdb, ovn-controller)
- Security groups (firewall rules)
- PlasmaVMC integration hooks
acceptance:
- novanet workspace created (novanet-api, novanet-server, novanet-types)
- gRPC services for VPC, Subnet, Port, SecurityGroup CRUD
- OVN integration layer (ovsdb client)
- PlasmaVMC hook for VM-port attachment
- Integration test showing VM network isolation
steps:
- step: S1
action: NovaNET workspace scaffold
priority: P0
status: complete
owner: peerB
notes: |
Create novanet workspace structure:
- novanet/Cargo.toml (workspace)
- novanet/crates/novanet-api (proto + generated code)
- novanet/crates/novanet-server (gRPC server)
- novanet/crates/novanet-types (domain types)
Pattern: follow fiberlb/flashdns structure
deliverables:
- Workspace compiles
- Proto for VPC, Subnet, Port, SecurityGroup services
outputs:
- path: novanet/crates/novanet-server/src/services/vpc.rs
note: VPC gRPC service implementation
- path: novanet/crates/novanet-server/src/services/subnet.rs
note: Subnet gRPC service implementation
- path: novanet/crates/novanet-server/src/services/port.rs
note: Port gRPC service implementation
- path: novanet/crates/novanet-server/src/services/security_group.rs
note: SecurityGroup gRPC service implementation
- path: novanet/crates/novanet-server/src/main.rs
note: Server binary entry point
- step: S2
action: NovaNET types and metadata store
priority: P0
status: complete
owner: peerB
notes: |
Define domain types from T015 spec:
- VPC (id, org_id, project_id, cidr, name)
- Subnet (id, vpc_id, cidr, gateway, dhcp_enabled)
- Port (id, subnet_id, mac, ip, device_id, device_type)
- SecurityGroup (id, org_id, project_id, name, rules[])
- SecurityGroupRule (direction, protocol, port_range, remote_cidr)
Create NetworkMetadataStore with ChainFire backend.
Key schema:
/novanet/vpcs/{org_id}/{project_id}/{vpc_id}
/novanet/subnets/{vpc_id}/{subnet_id}
/novanet/ports/{subnet_id}/{port_id}
/novanet/security_groups/{org_id}/{project_id}/{sg_id}
Progress (2025-12-08 20:51):
- ✓ Proto: All requests (Get/Update/Delete/List) include org_id/project_id for VPC/Subnet/Port/SecurityGroup
- ✓ Metadata: Tenant-validated signatures implemented with cross-tenant delete denial test
- ✓ Service layer aligned to new signatures (vpc/subnet/port/security_group) and compiling
- ✓ SecurityGroup architectural consistency: org_id added to type/proto/keys (uniform tenant model)
- ✓ chainfire-proto decoupling completed; novanet-api uses vendored protoc
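The key schema above can be sketched as a set of small key builders. This is a minimal illustration only: the function names are hypothetical and not taken from metadata.rs, and prefix-based listing is an assumption about how the ChainFire backend would be queried.

```rust
// Hypothetical helpers illustrating the key schema; names are not from metadata.rs.

/// Key for a VPC, scoped by tenant (org and project).
pub fn vpc_key(org_id: &str, project_id: &str, vpc_id: &str) -> String {
    format!("/novanet/vpcs/{org_id}/{project_id}/{vpc_id}")
}

/// Key for a subnet, scoped by its parent VPC.
pub fn subnet_key(vpc_id: &str, subnet_id: &str) -> String {
    format!("/novanet/subnets/{vpc_id}/{subnet_id}")
}

/// Key for a port, scoped by its parent subnet.
pub fn port_key(subnet_id: &str, port_id: &str) -> String {
    format!("/novanet/ports/{subnet_id}/{port_id}")
}

/// Key for a security group, scoped by tenant like a VPC.
pub fn security_group_key(org_id: &str, project_id: &str, sg_id: &str) -> String {
    format!("/novanet/security_groups/{org_id}/{project_id}/{sg_id}")
}

/// Prefix for listing every VPC of one tenant (assumes the store supports prefix scans).
/// The trailing slash keeps "proj-1" from matching keys under "proj-10".
pub fn vpc_list_prefix(org_id: &str, project_id: &str) -> String {
    format!("/novanet/vpcs/{org_id}/{project_id}/")
}
```

Because the tenant identifiers are part of the key itself, a lookup built from the wrong org_id or project_id cannot address another tenant's VPC.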
deliverables:
- Types defined
- Metadata store with CRUD
- Unit tests
outputs:
- path: novanet/crates/novanet-server/src/metadata.rs
note: Async metadata store with ChainFire backend
- step: S3
action: gRPC control plane services
priority: P0
status: complete
owner: peerB
notes: |
Implement gRPC services:
- VpcService: Create, Get, List, Delete
- SubnetService: Create, Get, List, Delete
- PortService: Create, Get, List, Delete, AttachDevice, DetachDevice
- SecurityGroupService: Create, Get, List, Delete, AddRule, RemoveRule
deliverables:
- All services functional
- cargo check passes
- step: S4
action: OVN integration layer
priority: P1
status: complete
owner: peerB
notes: |
Create OVN client for network provisioning:
- OvnClient struct connecting to ovsdb (northbound)
- create_logical_switch(vpc) -> OVN logical switch
- create_logical_switch_port(port) -> OVN LSP
- create_acl(security_group_rule) -> OVN ACL
Note: The initial implementation can use a mock/stub for CI.
A real OVN deployment requires ovn-northd and ovsdb-server to be running.
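A mock mode along these lines might look like the sketch below. It is not the code in ovn/client.rs: the OvnMode and OvnError types, the "vpc-"/"port-" naming scheme, and the in-memory sets are all assumptions made for illustration, and the real-mode branch is deliberately left unimplemented.

```rust
use std::collections::BTreeSet;

/// Operating mode (hypothetical type; how the real client selects it is not shown).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OvnMode {
    Mock,
    Real,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum OvnError {
    Unsupported,
    Duplicate(String),
    NotFound(String),
}

pub struct OvnClient {
    mode: OvnMode,
    switches: BTreeSet<String>,
    ports: BTreeSet<(String, String)>,
}

impl OvnClient {
    pub fn new(mode: OvnMode) -> Self {
        Self { mode, switches: BTreeSet::new(), ports: BTreeSet::new() }
    }

    /// Records a logical switch for a VPC; the "vpc-" prefix is an assumed naming scheme.
    pub fn create_logical_switch(&mut self, vpc_id: &str) -> Result<String, OvnError> {
        if self.mode == OvnMode::Real {
            // Talking to the northbound database is out of scope for this sketch.
            return Err(OvnError::Unsupported);
        }
        let name = format!("vpc-{vpc_id}");
        if !self.switches.insert(name.clone()) {
            return Err(OvnError::Duplicate(name));
        }
        Ok(name)
    }

    /// Records a logical switch port on an existing switch.
    pub fn create_logical_switch_port(&mut self, switch: &str, port_id: &str) -> Result<String, OvnError> {
        if self.mode == OvnMode::Real {
            return Err(OvnError::Unsupported);
        }
        if !self.switches.contains(switch) {
            return Err(OvnError::NotFound(switch.to_string()));
        }
        let name = format!("port-{port_id}");
        if !self.ports.insert((switch.to_string(), name.clone())) {
            return Err(OvnError::Duplicate(name));
        }
        Ok(name)
    }
}
```

Keeping the mock state inside the client lets CI assert on provisioning order (switch before port) without ovn-northd or ovsdb-server.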
deliverables:
- OvnClient with basic operations
- Mock mode for testing
outputs:
- path: novanet/crates/novanet-server/src/ovn/client.rs
note: OvnClient mock/real scaffold with LS/LSP/ACL ops, env-configured
- path: novanet/crates/novanet-server/src/services
note: VPC/Port/SG services invoke OVN provisioning hooks post-metadata writes
- step: S5
action: PlasmaVMC integration hooks
priority: P1
status: complete
owner: peerB
notes: |
Add network attachment to PlasmaVMC:
- Extend VM spec with network_ports: [PortId]
- On VM create: request ports from NovaNET
- Pass port info to hypervisor (tap device name, MAC)
- On VM delete: release ports
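The create/delete flow above could be sketched as follows. Every name here (PortBackend, FakeNovaNet, on_vm_create, on_vm_delete, the "tap-" naming, the MAC format) is hypothetical and stands in for the real NovaNET client; rolling back already-attached ports on failure is a design choice of this sketch, not something the task notes specify.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PortId(pub String);

/// What the hypervisor needs per port: tap device name and MAC.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PortInfo {
    pub tap_device: String,
    pub mac: String,
}

/// VM spec with an optional list of ports; empty means no networking.
#[derive(Debug, Default)]
pub struct VmSpec {
    pub name: String,
    pub network_ports: Vec<PortId>,
}

/// Hypothetical stand-in for the NovaNET client.
pub trait PortBackend {
    fn attach(&mut self, port: &PortId, vm: &str) -> Result<PortInfo, String>;
    fn detach(&mut self, port: &PortId);
}

/// In-memory fake that tracks which VM holds each port.
#[derive(Default)]
pub struct FakeNovaNet {
    pub attached: HashMap<PortId, String>,
}

impl PortBackend for FakeNovaNet {
    fn attach(&mut self, port: &PortId, vm: &str) -> Result<PortInfo, String> {
        if self.attached.contains_key(port) {
            return Err(format!("port {} already attached", port.0));
        }
        self.attached.insert(port.clone(), vm.to_string());
        Ok(PortInfo {
            tap_device: format!("tap-{}", port.0), // assumed naming scheme
            mac: format!("02:00:00:00:00:{:02x}", self.attached.len()), // placeholder MAC
        })
    }

    fn detach(&mut self, port: &PortId) {
        self.attached.remove(port);
    }
}

/// On VM create: attach every requested port, undoing earlier attaches if one fails.
pub fn on_vm_create<B: PortBackend>(backend: &mut B, spec: &VmSpec) -> Result<Vec<PortInfo>, String> {
    let mut infos = Vec::new();
    for (i, port) in spec.network_ports.iter().enumerate() {
        match backend.attach(port, &spec.name) {
            Ok(info) => infos.push(info),
            Err(e) => {
                for done in &spec.network_ports[..i] {
                    backend.detach(done);
                }
                return Err(e);
            }
        }
    }
    Ok(infos)
}

/// On VM delete: release every port the VM held.
pub fn on_vm_delete<B: PortBackend>(backend: &mut B, spec: &VmSpec) {
    for port in &spec.network_ports {
        backend.detach(port);
    }
}
```

A spec with an empty network_ports list passes through on_vm_create without touching the backend, which is how an optional field leaves existing VM flows unchanged.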
deliverables:
- PlasmaVMC network hooks
- Integration test
outputs:
- path: plasmavmc/crates/plasmavmc-types/src/vm.rs
note: NetworkSpec extended with subnet_id and port_id fields
- path: plasmavmc/crates/plasmavmc-server/src/novanet_client.rs
note: NovaNET client wrapper for port management (82 lines)
- path: plasmavmc/crates/plasmavmc-server/src/vm_service.rs
note: VM lifecycle hooks for NovaNET port attach/detach
- step: S6
action: Integration test
priority: P1
status: complete
owner: peerB
notes: |
End-to-end test:
1. Create VPC, Subnet via gRPC
2. Create Port
3. Create VM with port attachment (mock hypervisor)
4. Verify port status updated
5. Test security group rules (mock ACL check)
deliverables:
- Integration tests passing
- Evidence log
outputs:
- path: plasmavmc/crates/plasmavmc-server/tests/novanet_integration.rs
note: E2E integration test (246 lines) covering VPC/Subnet/Port creation and the VM attach/detach lifecycle
blockers:
- description: "CRITICAL SECURITY: Proto+metadata allow Get/Update/Delete by ID without tenant validation (R6 escalation)"
owner: peerB
status: resolved
severity: critical
discovered: "2025-12-08 18:38 (peerA strategic review of 000170)"
details: |
Proto layer (novanet.proto:50-84):
- GetVpcRequest/UpdateVpcRequest/DeleteVpcRequest only have 'id' field
- Missing org_id/project_id tenant context
Metadata layer (metadata.rs:220-282):
- get_vpc_by_id/update_vpc/delete_vpc use ID index without tenant check
- ID index pattern (/novanet/vpc_ids/{id}) bypasses tenant scoping
- Same for Subnet, Port, SecurityGroup operations
Pattern violation:
- FiberLB/FlashDNS/LightningSTOR: delete methods take full object
- NovaNET: delete methods take only ID (allows bypass)
Attack vector:
- Attacker learns a VPC ID via leak or guess
- Calls GetVpc(id) or DeleteVpc(id) without org/project context
- Reads or deletes the victim's VPC, because the ID index lookup never checks tenant ownership
Violates: Multi-tenant isolation hard guardrail (PROJECT.md)
fix_required: |
OPTION A (Recommended - Pattern Match + Defense-in-Depth):
1. Proto: Add org_id/project_id to Get/Update/Delete requests for all resources
2. Metadata signatures:
- delete_vpc(&self, org_id: &str, project_id: &str, id: &VpcId) -> Result<Option<Vpc>>
- update_vpc(&self, org_id: &str, project_id: &str, id: &VpcId, ...) -> Result<Option<Vpc>>
Alternative: delete_vpc(&self, vpc: &Vpc), matching the FiberLB/FlashDNS pattern
3. Make *_by_id methods private (internal helpers only)
4. Add test: cross-tenant Get/Delete with wrong org/project returns NotFound/PermissionDenied
OPTION B (Auth Layer Validation):
- gRPC services extract caller org_id/project_id from auth context
- After *_by_id fetch, validate object.org_id == caller.org_id
- Return PermissionDenied on mismatch
- Still lacks defense-in-depth at data layer
DECISION: Option A required (defense-in-depth + pattern consistency)
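A minimal sketch of the Option A idea, assuming an in-memory map in place of ChainFire. It is simplified relative to the signatures listed above: it is synchronous, takes &mut self, uses plain &str IDs instead of VpcId, and returns Option rather than Result<Option<Vpc>>. InMemoryStore and put_vpc are hypothetical names.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Vpc {
    pub id: String,
    pub org_id: String,
    pub project_id: String,
    pub cidr: String,
    pub name: String,
}

/// Hypothetical in-memory stand-in for the metadata store.
#[derive(Default)]
pub struct InMemoryStore {
    vpcs: HashMap<String, Vpc>,
}

/// Tenant-scoped key; there is no ID-only index in this sketch.
fn vpc_key(org_id: &str, project_id: &str, id: &str) -> String {
    format!("/novanet/vpcs/{org_id}/{project_id}/{id}")
}

impl InMemoryStore {
    pub fn put_vpc(&mut self, vpc: Vpc) {
        let key = vpc_key(&vpc.org_id, &vpc.project_id, &vpc.id);
        self.vpcs.insert(key, vpc);
    }

    /// Returns the VPC only when the caller's tenant matches the stored one.
    pub fn get_vpc(&self, org_id: &str, project_id: &str, id: &str) -> Option<&Vpc> {
        self.vpcs.get(&vpc_key(org_id, project_id, id))
    }

    /// Deletes only within the caller's tenant; a wrong org/project yields None
    /// and leaves the other tenant's VPC untouched.
    pub fn delete_vpc(&mut self, org_id: &str, project_id: &str, id: &str) -> Option<Vpc> {
        self.vpcs.remove(&vpc_key(org_id, project_id, id))
    }
}
```

With this shape, knowing a VPC ID alone is not enough to reach the object, which is the defense-in-depth property the decision asks for at the data layer.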
progress: |
2025-12-08 20:15:
- Proto, metadata, and service layers updated to enforce tenant context on Get/Update/Delete/List for VPC/Subnet/Port
- SecurityGroup list now takes org/project
- Cross-tenant delete denial test added (metadata::tests::test_cross_tenant_delete_denied)
- cargo test -p novanet-server passes (tenant isolation coverage)
next: "Proceed to S3 gRPC control-plane wiring"
evidence:
- "2025-12-08: cargo test -p novanet-server :: ok (tenant isolation tests passing)"
- "2025-12-08: proto updated for tenant-scoped Get/Update/Delete/List (novanet/crates/novanet-api/proto/novanet.proto)"
notes: |
NovaNET naming: Nova (star) + NET (network) = bright network
Risk: OVN complexity requires real infrastructure for full testing.
Mitigation: Use mock/stub mode for CI; document manual OVN testing.
Risk: PlasmaVMC changes may break existing functionality.
Mitigation: Add network_ports as optional field; existing tests unchanged.