Nightlight

A Prometheus-compatible metrics storage system with mTLS support, written in Rust.

Overview

Nightlight is a high-performance time-series database designed to replace VictoriaMetrics in environments requiring open-source mTLS support. It provides:

  • Prometheus Compatibility: Remote write ingestion and PromQL query support
  • mTLS Security: Mutual TLS authentication for all connections
  • Push-based Ingestion: Accept metrics via Prometheus remote_write protocol
  • Scalable Storage: Efficient time-series storage with compression and retention
  • PromQL Engine: Query metrics using the Prometheus query language

This project is part of the cloud infrastructure stack (PROJECT.md Item 12).

Architecture

For detailed architecture documentation, see docs/por/T033-nightlight/DESIGN.md.

High-Level Components

┌─────────────────────────────────────────────────────────────────┐
│                         Nightlight Server                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────┐        ┌──────────────────┐              │
│  │  HTTP Ingestion  │        │   gRPC Query     │              │
│  │  (remote_write)  │        │   (PromQL API)   │              │
│  └────────┬─────────┘        └────────┬─────────┘              │
│           │                           │                         │
│           ▼                           ▼                         │
│  ┌──────────────────────────────────────────────┐              │
│  │            Storage Engine                     │              │
│  │  - In-memory head block (WAL-backed)         │              │
│  │  - Persistent blocks (Gorilla compression)   │              │
│  │  - Inverted index (label → series)           │              │
│  │  - Compaction & retention                    │              │
│  └──────────────────────────────────────────────┘              │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
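
The inverted index in the storage engine maps labels to series. As a rough illustration only, the sketch below keeps a posting set per (label name, label value) pair and intersects the sets for multi-label lookups. The names InvertedIndex, SeriesId, insert, and lookup are hypothetical and are not taken from the Nightlight source.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical series identifier; the real ID type is not given in this README.
type SeriesId = u64;

/// Maps each (label name, label value) pair to the set of series carrying it.
#[derive(Default)]
struct InvertedIndex {
    postings: HashMap<(String, String), BTreeSet<SeriesId>>,
}

impl InvertedIndex {
    /// Registers a series under every one of its labels.
    fn insert(&mut self, id: SeriesId, labels: &[(&str, &str)]) {
        for (name, value) in labels {
            self.postings
                .entry((name.to_string(), value.to_string()))
                .or_default()
                .insert(id);
        }
    }

    /// Returns the series that match all of the given exact-match label pairs.
    /// An empty matcher list yields an empty set in this sketch.
    fn lookup(&self, matchers: &[(&str, &str)]) -> BTreeSet<SeriesId> {
        let mut result: Option<BTreeSet<SeriesId>> = None;
        for (name, value) in matchers {
            let key = (name.to_string(), value.to_string());
            let postings = self.postings.get(&key).cloned().unwrap_or_default();
            result = Some(match result {
                None => postings,
                Some(acc) => acc.intersection(&postings).copied().collect(),
            });
        }
        result.unwrap_or_default()
    }
}
```

Regex matchers and persistence are left out; a real index would also need to handle series deletion during retention.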

Crates

  • nightlight-api: gRPC client library and protobuf definitions
  • nightlight-types: Core data types (Metric, TimeSeries, Label, Sample)
  • nightlight-server: Main server implementation
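
To give a feel for the core data types, here is a minimal sketch of what Label, Sample, and TimeSeries might look like. The field names and derives are assumptions; the definitions in nightlight-types may differ, and Metric is omitted.

```rust
/// A single name/value label pair, e.g. job="prometheus".
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Label {
    pub name: String,
    pub value: String,
}

/// One observation: a Unix timestamp in milliseconds plus a float value.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Sample {
    pub timestamp_ms: i64,
    pub value: f64,
}

/// A labelled sequence of samples.
#[derive(Debug, Clone, PartialEq)]
pub struct TimeSeries {
    pub labels: Vec<Label>,
    pub samples: Vec<Sample>,
}

impl TimeSeries {
    /// Returns the value of the reserved `__name__` label, if present.
    pub fn metric_name(&self) -> Option<&str> {
        self.labels
            .iter()
            .find(|label| label.name == "__name__")
            .map(|label| label.value.as_str())
    }
}
```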

Building

Prerequisites

  • Rust 1.75 or later
  • Protocol Buffers compiler (provided via protoc-bin-vendored)

Build Commands

# Build all crates
cargo build --release

# Build specific crate
cargo build -p nightlight-server --release

# Run tests
cargo test

# Type-check the code without producing binaries
cargo check

NixOS

The project includes Nix flake support (per T024 patterns):

# Build with Nix
nix build

# Enter development shell
nix develop

Configuration

Configuration is specified in YAML format. Default location: config.yaml

Example Configuration

server:
  grpc_addr: "0.0.0.0:9100"           # gRPC query API
  http_addr: "0.0.0.0:9101"           # HTTP API: remote_write and PromQL queries
  max_concurrent_streams: 100
  query_timeout_seconds: 30
  max_samples_per_query: 10000000

storage:
  data_dir: "/var/lib/nightlight"
  retention_days: 15
  wal_segment_size_mb: 128
  block_duration_hours: 2
  max_head_samples: 1000000
  compaction_interval_seconds: 3600

# Optional: Enable mTLS (T027 unified TLS pattern)
tls:
  cert_file: "/etc/nightlight/tls/cert.pem"
  key_file: "/etc/nightlight/tls/key.pem"
  ca_file: "/etc/nightlight/tls/ca.pem"
  require_client_cert: true
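
The storage settings interact through simple time arithmetic. The sketch below shows how block_duration_hours and retention_days could translate into block boundaries and an expiry check. It assumes blocks are aligned to multiples of the block duration since the Unix epoch, which this README does not state; the function names are hypothetical.

```rust
const MS_PER_HOUR: i64 = 3_600_000;
const MS_PER_DAY: i64 = 24 * MS_PER_HOUR;

/// Start of the block containing `ts_ms`, assuming blocks are aligned to
/// multiples of `block_duration_hours` since the Unix epoch.
fn block_start_ms(ts_ms: i64, block_duration_hours: i64) -> i64 {
    let width = block_duration_hours * MS_PER_HOUR;
    ts_ms.div_euclid(width) * width
}

/// True when a block ending at `block_end_ms` lies entirely outside the
/// retention window that ends at `now_ms`.
fn is_expired(block_end_ms: i64, now_ms: i64, retention_days: i64) -> bool {
    block_end_ms < now_ms - retention_days * MS_PER_DAY
}
```

With the example values (2-hour blocks, 15-day retention), a sample at 02:00:00.005 UTC falls in the block starting at 02:00:00, and a block becomes eligible for deletion once its end is more than 15 days old.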

Running

# Run with default config
./target/release/nightlight-server

# Run with custom config
./target/release/nightlight-server --config /path/to/config.yaml

Usage

Ingesting Metrics

Nightlight implements the Prometheus remote_write protocol v1.0 for push-based metric ingestion.

Using Prometheus Remote Write

Configure Prometheus to push metrics to Nightlight:

# prometheus.yml
remote_write:
  - url: "http://localhost:9101/api/v1/write"
    queue_config:
      capacity: 10000
      max_shards: 10
      batch_send_deadline: 5s
    # Optional: mTLS configuration
    tls_config:
      cert_file: client.pem
      key_file: client-key.pem
      ca_file: ca.pem

Using the API Directly

You can also push metrics directly using the remote_write protocol:

# Run the example to push sample metrics
cargo run --example push_metrics

The remote_write endpoint (POST /api/v1/write) expects:

  • Content-Type: application/x-protobuf
  • Content-Encoding: snappy
  • Body: Snappy-compressed Prometheus WriteRequest protobuf

See examples/push_metrics.rs for a complete implementation example.
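
For illustration, the header requirements above can be checked before the body is decompressed. This is a sketch using only the standard library; WriteRequestError and validate_write_headers are hypothetical names, not part of the Nightlight API, and Snappy decoding and protobuf parsing are not shown.

```rust
/// Hypothetical error type for rejected remote_write requests.
#[derive(Debug, PartialEq)]
enum WriteRequestError {
    MissingHeader(&'static str),
    UnsupportedContentType(String),
    UnsupportedEncoding(String),
}

/// Case-insensitive header lookup over (name, value) pairs.
fn header<'a>(headers: &[(&'a str, &'a str)], name: &str) -> Option<&'a str> {
    headers
        .iter()
        .find(|(k, _)| k.eq_ignore_ascii_case(name))
        .map(|(_, v)| *v)
}

/// Checks the Content-Type and Content-Encoding expected by POST /api/v1/write.
fn validate_write_headers(headers: &[(&str, &str)]) -> Result<(), WriteRequestError> {
    let content_type = header(headers, "Content-Type")
        .ok_or(WriteRequestError::MissingHeader("Content-Type"))?;
    if !content_type.eq_ignore_ascii_case("application/x-protobuf") {
        return Err(WriteRequestError::UnsupportedContentType(content_type.to_string()));
    }
    let encoding = header(headers, "Content-Encoding")
        .ok_or(WriteRequestError::MissingHeader("Content-Encoding"))?;
    if !encoding.eq_ignore_ascii_case("snappy") {
        return Err(WriteRequestError::UnsupportedEncoding(encoding.to_string()));
    }
    Ok(())
}
```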

Features

  • Snappy Compression: Efficient compression for wire transfer
  • Label Validation: Prometheus-compliant label name validation
  • Backpressure: HTTP 429 when write buffer is full
  • Sample Validation: Rejects NaN and Inf values
  • Buffered Writes: In-memory batching for performance
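
A minimal sketch of how the validation and backpressure features might fit together. The label rule follows the Prometheus label-name pattern [a-zA-Z_][a-zA-Z0-9_]*. WriteBuffer and its methods are hypothetical, and only the 429 status comes from this README; the 400 and 204 codes are assumptions.

```rust
/// True if `name` matches the Prometheus label-name pattern
/// [a-zA-Z_][a-zA-Z0-9_]*.
fn is_valid_label_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Rejects NaN and positive or negative infinity.
fn is_valid_sample_value(value: f64) -> bool {
    value.is_finite()
}

/// Hypothetical bounded in-memory buffer for accepted sample values.
struct WriteBuffer {
    capacity: usize,
    samples: Vec<f64>,
}

impl WriteBuffer {
    fn new(capacity: usize) -> Self {
        WriteBuffer { capacity, samples: Vec::with_capacity(capacity) }
    }

    /// Returns the HTTP status a handler might send for this sample:
    /// 400 for an invalid value (assumed), 429 when full, 204 on success (assumed).
    fn try_push(&mut self, value: f64) -> u16 {
        if !is_valid_sample_value(value) {
            return 400;
        }
        if self.samples.len() >= self.capacity {
            return 429;
        }
        self.samples.push(value);
        204
    }
}
```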

Querying Metrics

Nightlight provides a Prometheus-compatible HTTP API for querying metrics using PromQL.

API Endpoints

Instant Query

Query metric values at a specific point in time:

GET /api/v1/query?query=<promql>&time=<timestamp_ms>

# Example
curl 'http://localhost:9101/api/v1/query?query=up&time=1234567890000'

Parameters:

  • query (required): PromQL expression
  • time (optional): Unix timestamp in milliseconds (defaults to current time)
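
PromQL expressions often contain characters such as braces, quotes, and brackets that must be percent-encoded in a query string. The sketch below builds an instant-query URL with the standard library only; percent_encode and instant_query_url are hypothetical helper names, and a real client would more likely rely on an HTTP library's URL builder.

```rust
/// Percent-encodes everything except RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Builds the instant-query URL; `time_ms` is omitted when `None`,
/// in which case the server defaults to the current time.
fn instant_query_url(base: &str, promql: &str, time_ms: Option<i64>) -> String {
    let mut url = format!(
        "{}/api/v1/query?query={}",
        base.trim_end_matches('/'),
        percent_encode(promql)
    );
    if let Some(t) = time_ms {
        url.push_str(&format!("&time={}", t));
    }
    url
}
```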

Response format:

{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {"__name__": "up", "job": "prometheus"},
        "value": [1234567890000, 1.0]
      }
    ]
  }
}

Range Query

Query metric values over a time range:

GET /api/v1/query_range?query=<promql>&start=<ts>&end=<ts>&step=<duration_ms>

# Example
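# -G sends the parameters as a GET query string; --data-urlencode escapes
# the brackets and parentheses that curl would otherwise treat as URL globs
curl -G 'http://localhost:9101/api/v1/query_range' \
  --data-urlencode 'query=rate(http_requests_total[5m])' \
  --data-urlencode 'start=1234567890000' \
  --data-urlencode 'end=1234571490000' \
  --data-urlencode 'step=60000'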

Parameters:

  • query (required): PromQL expression
  • start (required): Start timestamp in milliseconds
  • end (required): End timestamp in milliseconds
  • step (required): Step duration in milliseconds
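
As a worked example of how start, end, and step relate, the sketch below lists the evaluation timestamps for a range query. It assumes evaluation at start, start + step, and so on up to and including end, as Prometheus does; this README does not confirm that Nightlight behaves identically, and the function name is hypothetical.

```rust
/// Evaluation timestamps for a range query, inclusive of both ends.
/// Returns an empty list for a non-positive step or an inverted range.
fn range_query_timestamps(start_ms: i64, end_ms: i64, step_ms: i64) -> Vec<i64> {
    if step_ms <= 0 || end_ms < start_ms {
        return Vec::new();
    }
    (0..=(end_ms - start_ms) / step_ms)
        .map(|i| start_ms + i * step_ms)
        .collect()
}
```

For the example request (a one-hour window with a 60000 ms step) this gives 3600000 / 60000 + 1 = 61 evaluation points.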

Label Values

Get all values for a specific label:

GET /api/v1/label/<label_name>/values

# Example
curl 'http://localhost:9101/api/v1/label/job/values'

Series Metadata

Get metadata for all series:

GET /api/v1/series

# Example
curl 'http://localhost:9101/api/v1/series'

Supported PromQL

Nightlight implements a practical subset of PromQL intended to cover roughly 80% of common use cases:

Selectors:

# Metric name
http_requests_total

# Label matching
http_requests_total{method="GET"}
http_requests_total{method="GET", status="200"}

# Label operators
metric{label="value"}      # Exact match
metric{label!="value"}     # Not equal
metric{label=~"regex"}     # Regex match
metric{label!~"regex"}     # Negative regex

Range Selectors:

http_requests_total[5m]    # Last 5 minutes
http_requests_total[1h]    # Last 1 hour
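
The text inside the brackets of a range selector is a duration. A minimal sketch of turning it into milliseconds follows; parse_range_ms is a hypothetical name, and the units beyond m and h (ms, s, d) are assumed from standard PromQL rather than confirmed for Nightlight.

```rust
/// Parses a simple duration such as "5m" or "1h" into milliseconds.
/// Returns None for missing digits, unknown units, or overflow.
fn parse_range_ms(text: &str) -> Option<i64> {
    // Index of the first non-digit character, where the unit begins.
    let split = text.find(|c: char| !c.is_ascii_digit())?;
    let (digits, unit) = text.split_at(split);
    let count: i64 = digits.parse().ok()?;
    let unit_ms: i64 = match unit {
        "ms" => 1,
        "s" => 1_000,
        "m" => 60_000,
        "h" => 3_600_000,
        "d" => 86_400_000,
        _ => return None,
    };
    count.checked_mul(unit_ms)
}
```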

Aggregations:

sum(http_requests_total)
avg(http_requests_total)
min(http_requests_total)
max(http_requests_total)
count(http_requests_total)
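
Each aggregation reduces the current values of all matching series to a single number. The sketch below shows that reduction over a plain slice; the Aggregation enum and aggregate function are hypothetical, and grouping clauses such as by (...) are not covered.

```rust
/// The five aggregation operators listed above.
#[derive(Debug, Clone, Copy)]
enum Aggregation {
    Sum,
    Avg,
    Min,
    Max,
    Count,
}

/// Reduces one value per series to a single result.
/// Returns None when no series matched.
fn aggregate(op: Aggregation, values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let sum: f64 = values.iter().sum();
    Some(match op {
        Aggregation::Sum => sum,
        Aggregation::Avg => sum / values.len() as f64,
        Aggregation::Min => values.iter().copied().fold(f64::INFINITY, f64::min),
        Aggregation::Max => values.iter().copied().fold(f64::NEG_INFINITY, f64::max),
        Aggregation::Count => values.len() as f64,
    })
}
```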

Functions:

# Rate functions
rate(http_requests_total[5m])       # Per-second rate
irate(http_requests_total[5m])      # Instant rate (last 2 points)
increase(http_requests_total[1h])   # Total increase over time
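
The three functions differ mainly in which samples they use. The sketch below is a simplified version: it treats any drop in value as a counter reset, and it divides by the time between the first and last sample rather than extrapolating to the window edges as Prometheus does. Whether Nightlight extrapolates is not stated here, so treat this as an approximation; Sample is redeclared so the block stands alone.

```rust
#[derive(Debug, Clone, Copy)]
struct Sample {
    timestamp_ms: i64,
    value: f64,
}

/// Total counter increase across the samples; a drop counts as a reset to zero.
fn increase(samples: &[Sample]) -> Option<f64> {
    if samples.len() < 2 {
        return None;
    }
    let mut total = 0.0;
    for pair in samples.windows(2) {
        let (prev, next) = (pair[0].value, pair[1].value);
        total += if next >= prev { next - prev } else { next };
    }
    Some(total)
}

/// Per-second rate over the span between the first and last sample.
fn rate(samples: &[Sample]) -> Option<f64> {
    let first = samples.first()?;
    let last = samples.last()?;
    let elapsed_s = (last.timestamp_ms - first.timestamp_ms) as f64 / 1000.0;
    if elapsed_s <= 0.0 {
        return None;
    }
    Some(increase(samples)? / elapsed_s)
}

/// Per-second rate computed from the last two samples only.
fn irate(samples: &[Sample]) -> Option<f64> {
    if samples.len() < 2 {
        return None;
    }
    rate(&samples[samples.len() - 2..])
}
```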

Example Client

Run the example query client to test all query endpoints:

cargo run --example query_metrics

See examples/query_metrics.rs for implementation details.

Grafana Integration

Configure Grafana to use Nightlight as a Prometheus data source:

  1. Add a new Prometheus data source
  2. Set URL to http://localhost:9101
  3. (Optional) Configure mTLS certificates
  4. Test the connection with an instant query

Grafana will automatically use the /api/v1/query and /api/v1/query_range endpoints for dashboard queries.

Development Roadmap

The workspace scaffold (S2) provides the foundation. Implementation proceeds in stages:

  • S2 (Scaffold): Complete - workspace structure, types, protobuf definitions
  • S3 (Push Ingestion): Complete - Prometheus remote_write endpoint with validation, compression, and buffering (34 tests passing)
  • S4 (PromQL Engine): Complete - Query execution engine with instant/range queries, aggregations, rate functions (42 tests passing)
  • S5 (Storage Layer): Pending - persistent time-series storage backend
  • S6 (Integration): Pending - NixOS module, testing, documentation

See docs/por/T033-nightlight/task.yaml for detailed task tracking.

Integration

Service Ports

  • 9100: gRPC query API (mTLS)
  • 9101: HTTP API for remote_write ingestion and PromQL queries (mTLS)

Monitoring

Nightlight exports its own metrics on the standard /metrics endpoint for self-monitoring.

License

MIT OR Apache-2.0

References