Nightlight
A Prometheus-compatible metrics storage system with mTLS support, written in Rust.
Overview
Nightlight is a high-performance time-series database designed to replace VictoriaMetrics in environments requiring open-source mTLS support. It provides:
- Prometheus Compatibility: Remote write ingestion and PromQL query support
- mTLS Security: Mutual TLS authentication for all connections
- Push-based Ingestion: Accept metrics via Prometheus remote_write protocol
- Scalable Storage: Efficient time-series storage with compression and retention
- PromQL Engine: Query metrics using the Prometheus query language
This project is part of the cloud infrastructure stack (PROJECT.md Item 12).
Architecture
For detailed architecture documentation, see docs/por/T033-nightlight/DESIGN.md.
High-Level Components
┌───────────────────────────────────────────────────────────────┐
│                       Nightlight Server                       │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌──────────────────┐        ┌──────────────────┐             │
│  │ HTTP Ingestion   │        │ gRPC Query       │             │
│  │ (remote_write)   │        │ (PromQL API)     │             │
│  └────────┬─────────┘        └────────┬─────────┘             │
│           │                           │                       │
│           ▼                           ▼                       │
│  ┌────────────────────────────────────────────────┐           │
│  │                 Storage Engine                 │           │
│  │  - In-memory head block (WAL-backed)           │           │
│  │  - Persistent blocks (Gorilla compression)     │           │
│  │  - Inverted index (label → series)             │           │
│  │  - Compaction & retention                      │           │
│  └────────────────────────────────────────────────┘           │
│                                                               │
└───────────────────────────────────────────────────────────────┘
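To make the "Inverted index (label → series)" box more concrete, here is a minimal sketch of how such an index could be laid out. The `InvertedIndex` type, the `SeriesId` alias, and both method names are assumptions made for illustration; the actual storage engine design is described in DESIGN.md and may differ.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical series identifier; the real storage engine may use another type.
type SeriesId = u64;

/// Maps each (label name, label value) pair to the set of series carrying it.
#[derive(Default)]
struct InvertedIndex {
    postings: HashMap<(String, String), BTreeSet<SeriesId>>,
}

impl InvertedIndex {
    /// Register a series under every one of its labels.
    fn insert(&mut self, id: SeriesId, labels: &[(&str, &str)]) {
        for (name, value) in labels {
            self.postings
                .entry((name.to_string(), value.to_string()))
                .or_default()
                .insert(id);
        }
    }

    /// Return the series matching all of the given equality matchers.
    /// An empty matcher list yields an empty result in this sketch.
    fn lookup(&self, matchers: &[(&str, &str)]) -> BTreeSet<SeriesId> {
        let mut result: Option<BTreeSet<SeriesId>> = None;
        for (name, value) in matchers {
            let key = (name.to_string(), value.to_string());
            let ids = self.postings.get(&key).cloned().unwrap_or_default();
            // Intersect the posting lists so that every matcher must hold.
            result = Some(match result {
                None => ids,
                Some(acc) => acc.intersection(&ids).copied().collect(),
            });
        }
        result.unwrap_or_default()
    }
}
```

A selector such as `up{job="node"}` would then translate to `lookup(&[("__name__", "up"), ("job", "node")])`. Regex and negative matchers need more than a posting-list intersection and are left out here.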
Crates
- nightlight-api: gRPC client library and protobuf definitions
- nightlight-types: Core data types (Metric, TimeSeries, Label, Sample)
- nightlight-server: Main server implementation
Building
Prerequisites
- Rust 1.75 or later
- Protocol Buffers compiler (provided via protoc-bin-vendored)
Build Commands
# Build all crates
cargo build --release
# Build specific crate
cargo build -p nightlight-server --release
# Run tests
cargo test
# Check code without building
cargo check
NixOS
The project includes Nix flake support (per T024 patterns):
# Build with Nix
nix build
# Enter development shell
nix develop
Configuration
Configuration is specified in YAML format. Default location: config.yaml
Example Configuration
server:
  grpc_addr: "0.0.0.0:9100" # gRPC query API
  http_addr: "0.0.0.0:9101" # HTTP remote_write endpoint
  max_concurrent_streams: 100
  query_timeout_seconds: 30
  max_samples_per_query: 10000000
storage:
  data_dir: "/var/lib/nightlight"
  retention_days: 15
  wal_segment_size_mb: 128
  block_duration_hours: 2
  max_head_samples: 1000000
  compaction_interval_seconds: 3600
# Optional: Enable mTLS (T027 unified TLS pattern)
tls:
  cert_file: "/etc/nightlight/tls/cert.pem"
  key_file: "/etc/nightlight/tls/key.pem"
  ca_file: "/etc/nightlight/tls/ca.pem"
  require_client_cert: true
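As a rough guide to what the storage values in the example imply, the sketch below derives two numbers from them. `StorageConfig` is a hypothetical mirror of three of the YAML keys, not the server's actual config type, and the calculation assumes that retention is applied per block, that compaction is ignored, and that "mb" means MiB.

```rust
/// Hypothetical mirror of part of the `storage` section; field names follow the YAML keys.
struct StorageConfig {
    retention_days: u64,
    block_duration_hours: u64,
    wal_segment_size_mb: u64,
}

impl StorageConfig {
    /// Values taken from the example configuration.
    fn example() -> Self {
        StorageConfig {
            retention_days: 15,
            block_duration_hours: 2,
            wal_segment_size_mb: 128,
        }
    }

    /// Number of block-sized windows needed to cover the retention period, rounded up.
    /// Assumption: retention is enforced per block and compaction is ignored.
    fn blocks_in_retention(&self) -> Option<u64> {
        if self.block_duration_hours == 0 {
            return None;
        }
        let hours = self.retention_days * 24;
        Some((hours + self.block_duration_hours - 1) / self.block_duration_hours)
    }

    /// WAL segment size in bytes, assuming "mb" means MiB (1024 * 1024 bytes).
    fn wal_segment_size_bytes(&self) -> u64 {
        self.wal_segment_size_mb * 1024 * 1024
    }
}
```

With the example values, 15 days of 2-hour blocks gives 180 block windows, and a 128 MiB WAL segment is 134217728 bytes.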
Running
# Run with default config
./target/release/nightlight-server
# Run with custom config
./target/release/nightlight-server --config /path/to/config.yaml
Usage
Ingesting Metrics
Nightlight implements the Prometheus remote_write protocol v1.0 for push-based metric ingestion.
Using Prometheus Remote Write
Configure Prometheus to push metrics to Nightlight:
# prometheus.yml
remote_write:
  - url: "http://localhost:9101/api/v1/write"
    queue_config:
      capacity: 10000
      max_shards: 10
      batch_send_deadline: 5s
    # Optional: mTLS configuration
    tls_config:
      cert_file: client.pem
      key_file: client-key.pem
      ca_file: ca.pem
Using the API Directly
You can also push metrics directly using the remote_write protocol:
# Run the example to push sample metrics
cargo run --example push_metrics
The remote_write endpoint (POST /api/v1/write) expects:
- Content-Type: application/x-protobuf
- Content-Encoding: snappy
- Body: Snappy-compressed Prometheus WriteRequest protobuf
See examples/push_metrics.rs for a complete implementation example.
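The header requirements listed for POST /api/v1/write could be checked along the following lines. This is a standalone sketch using only the standard library: `header` and `check_write_headers` are hypothetical helpers, not functions from nightlight-server, and the error type is a plain string for brevity.

```rust
/// Look up a header value by name, ignoring ASCII case as HTTP header names require.
fn header<'a>(headers: &[(&'a str, &'a str)], name: &str) -> Option<&'a str> {
    headers
        .iter()
        .find(|(k, _)| k.eq_ignore_ascii_case(name))
        .map(|(_, v)| *v)
}

/// Check the two headers a remote_write request is expected to carry.
/// Assumption: values must match exactly (no media type parameters).
fn check_write_headers(headers: &[(&str, &str)]) -> Result<(), String> {
    match header(headers, "Content-Type") {
        Some(v) if v.eq_ignore_ascii_case("application/x-protobuf") => {}
        other => return Err(format!("unexpected Content-Type: {:?}", other)),
    }
    match header(headers, "Content-Encoding") {
        Some(v) if v.eq_ignore_ascii_case("snappy") => {}
        other => return Err(format!("unexpected Content-Encoding: {:?}", other)),
    }
    Ok(())
}
```

Decoding the body (Snappy decompression followed by protobuf parsing) needs external crates and is covered by examples/push_metrics.rs rather than this sketch.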
Features
- Snappy Compression: Efficient compression for wire transfer
- Label Validation: Prometheus-compliant label name validation
- Backpressure: HTTP 429 when write buffer is full
- Sample Validation: Rejects NaN and Inf values
- Buffered Writes: In-memory batching for performance
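The validation and backpressure features above can be illustrated with a small sketch. It assumes the classic Prometheus label name pattern `[a-zA-Z_][a-zA-Z0-9_]*`; `WriteBuffer`, `PushError`, and the function names are hypothetical and only model the behaviour described in the list, not the server's actual types.

```rust
/// True if `name` matches the classic Prometheus label pattern [a-zA-Z_][a-zA-Z0-9_]*.
fn is_valid_label_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false, // empty name or invalid first character
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Samples must be finite: NaN and +/-Inf are rejected.
fn is_valid_sample_value(value: f64) -> bool {
    value.is_finite()
}

#[derive(Debug, PartialEq)]
enum PushError {
    /// The sample value was NaN or infinite.
    InvalidSample,
    /// The buffer is full; the HTTP layer would answer with 429.
    BufferFull,
}

/// Hypothetical bounded in-memory write buffer.
struct WriteBuffer {
    capacity: usize,
    pending: Vec<f64>,
}

impl WriteBuffer {
    fn new(capacity: usize) -> Self {
        WriteBuffer { capacity, pending: Vec::with_capacity(capacity) }
    }

    /// Validate the sample first, then apply backpressure if the buffer is full.
    fn try_push(&mut self, value: f64) -> Result<(), PushError> {
        if !is_valid_sample_value(value) {
            return Err(PushError::InvalidSample);
        }
        if self.pending.len() >= self.capacity {
            return Err(PushError::BufferFull);
        }
        self.pending.push(value);
        Ok(())
    }
}
```

Rejecting instead of blocking when the buffer is full lets Prometheus retry the batch later, which is the usual reason for answering with 429.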
Querying Metrics
Nightlight provides a Prometheus-compatible HTTP API for querying metrics using PromQL.
API Endpoints
Instant Query
Query metric values at a specific point in time:
GET /api/v1/query?query=<promql>&time=<timestamp_ms>
# Example
curl 'http://localhost:9101/api/v1/query?query=up&time=1234567890000'
Parameters:
- query (required): PromQL expression
- time (optional): Unix timestamp in milliseconds (defaults to current time)
Response format:
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {"__name__": "up", "job": "prometheus"},
"value": [1234567890000, 1.0]
}
]
}
}
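When calling the instant query endpoint from code, the PromQL expression should be percent-encoded, because expressions often contain characters such as `[`, `{`, or `"`. The sketch below builds the request URL with the standard library only; `percent_encode` and `instant_query_url` are hypothetical helper names, not part of the nightlight-api crate.

```rust
/// Percent-encode everything except RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build the URL for GET /api/v1/query. `time_ms` is optional, as in the API.
fn instant_query_url(base: &str, promql: &str, time_ms: Option<i64>) -> String {
    let mut url = format!(
        "{}/api/v1/query?query={}",
        base.trim_end_matches('/'),
        percent_encode(promql)
    );
    if let Some(t) = time_ms {
        url.push_str(&format!("&time={}", t));
    }
    url
}
```

For example, `instant_query_url("http://localhost:9101", "up", Some(1234567890000))` yields the same URL as the curl example, and `rate(x[5m])` is encoded as `rate%28x%5B5m%5D%29`.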
Range Query
Query metric values over a time range:
GET /api/v1/query_range?query=<promql>&start=<ts>&end=<ts>&step=<duration_ms>
# Example
curl 'http://localhost:9101/api/v1/query_range?query=rate(http_requests_total[5m])&start=1234567890000&end=1234571490000&step=60000'
Parameters:
- query (required): PromQL expression
- start (required): Start timestamp in milliseconds
- end (required): End timestamp in milliseconds
- step (required): Step duration in milliseconds
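A range query is evaluated at a series of timestamps between start and end. The sketch below assumes Nightlight follows the usual Prometheus semantics (evaluate at start, then every step, up to and including end); `evaluation_timestamps` is a hypothetical helper used only to show the arithmetic.

```rust
/// Timestamps (in milliseconds) at which a range query would be evaluated,
/// assuming evaluation at start, start + step, ... up to and including end.
fn evaluation_timestamps(start_ms: i64, end_ms: i64, step_ms: i64) -> Vec<i64> {
    // Reject non-positive steps and inverted ranges instead of looping forever.
    if step_ms <= 0 || end_ms < start_ms {
        return Vec::new();
    }
    (start_ms..=end_ms).step_by(step_ms as usize).collect()
}
```

For the curl example (a one-hour window with a 60000 ms step), this produces 61 evaluation points, which is worth keeping in mind alongside the max_samples_per_query setting.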
Label Values
Get all values for a specific label:
GET /api/v1/label/<label_name>/values
# Example
curl 'http://localhost:9101/api/v1/label/job/values'
Series Metadata
Get metadata for all series:
GET /api/v1/series
# Example
curl 'http://localhost:9101/api/v1/series'
Supported PromQL
Nightlight implements a practical subset of PromQL covering 80% of common use cases:
Selectors:
# Metric name
http_requests_total
# Label matching
http_requests_total{method="GET"}
http_requests_total{method="GET", status="200"}
# Label operators
metric{label="value"} # Exact match
metric{label!="value"} # Not equal
metric{label=~"regex"} # Regex match
metric{label!~"regex"} # Negative regex
Range Selectors:
http_requests_total[5m] # Last 5 minutes
http_requests_total[1h] # Last 1 hour
Aggregations:
sum(http_requests_total)
avg(http_requests_total)
min(http_requests_total)
max(http_requests_total)
count(http_requests_total)
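Each aggregation collapses the values of all matching series at one evaluation timestamp into a single number. A minimal sketch of that reduction follows; the `Aggregation` enum and `aggregate` function are illustrative names, not the query engine's API, and grouping clauses are not modelled.

```rust
#[derive(Debug, Clone, Copy)]
enum Aggregation {
    Sum,
    Avg,
    Min,
    Max,
    Count,
}

/// Reduce the values of all matching series at one timestamp.
/// Returns None when there is nothing to aggregate.
fn aggregate(op: Aggregation, values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let sum: f64 = values.iter().sum();
    Some(match op {
        Aggregation::Sum => sum,
        Aggregation::Avg => sum / values.len() as f64,
        Aggregation::Min => values.iter().copied().fold(f64::INFINITY, f64::min),
        Aggregation::Max => values.iter().copied().fold(f64::NEG_INFINITY, f64::max),
        Aggregation::Count => values.len() as f64,
    })
}
```

With three series reporting 1, 2, and 3, the five operators return 6, 2, 1, 3, and 3 respectively.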
Functions:
# Rate functions
rate(http_requests_total[5m]) # Per-second rate
irate(http_requests_total[5m]) # Instant rate (last 2 points)
increase(http_requests_total[1h]) # Total increase over time
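The three rate functions differ only in which samples they use and whether they divide by elapsed time. The sketch below is a simplified model: it handles counter resets but, unlike Prometheus, does not extrapolate to the window boundaries, and it is not taken from Nightlight's query engine. The `Sample` alias and function signatures are assumptions.

```rust
/// (timestamp in milliseconds, counter value); a hypothetical stand-in for the Sample type.
type Sample = (i64, f64);

/// Total counter increase across the window, treating a drop in value as a reset to zero.
fn increase(samples: &[Sample]) -> Option<f64> {
    if samples.len() < 2 {
        return None;
    }
    let mut total = 0.0;
    for pair in samples.windows(2) {
        let (prev, cur) = (pair[0].1, pair[1].1);
        // After a reset the counter restarts at zero, so `cur` is the whole increase.
        total += if cur >= prev { cur - prev } else { cur };
    }
    Some(total)
}

/// Average per-second rate between the first and last sample in the window.
fn rate(samples: &[Sample]) -> Option<f64> {
    let inc = increase(samples)?;
    let elapsed_ms = samples.last()?.0 - samples.first()?.0;
    if elapsed_ms <= 0 {
        return None;
    }
    Some(inc / (elapsed_ms as f64 / 1000.0))
}

/// Per-second rate computed from the last two samples only.
fn irate(samples: &[Sample]) -> Option<f64> {
    let n = samples.len();
    if n < 2 {
        return None;
    }
    rate(&samples[n - 2..])
}
```

For samples `(0, 0.0)`, `(60000, 60.0)`, `(120000, 180.0)`, the increase is 180, the rate is 1.5 per second, and the irate is 2.0 per second, since only the last minute counts.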
Example Client
Run the example query client to test all query endpoints:
cargo run --example query_metrics
See examples/query_metrics.rs for implementation details.
Grafana Integration
Configure Grafana to use Nightlight as a Prometheus data source:
- Add a new Prometheus data source
- Set URL to http://localhost:9101
- (Optional) Configure mTLS certificates
- Test connection with instant query
Grafana will automatically use the /api/v1/query and /api/v1/query_range endpoints for dashboard queries.
Development Roadmap
This workspace scaffold (S2) provides the foundation. Implementation proceeds as:
- S2 (Scaffold): Complete - workspace structure, types, protobuf definitions
- S3 (Push Ingestion): Complete - Prometheus remote_write endpoint with validation, compression, and buffering (34 tests passing)
- S4 (PromQL Engine): Complete - Query execution engine with instant/range queries, aggregations, rate functions (42 tests passing)
- S5 (Storage Layer): Implement persistent time-series storage backend
- S6 (Integration): NixOS module, testing, documentation
See docs/por/T033-nightlight/task.yaml for detailed task tracking.
Integration
Service Ports
- 9100: gRPC query API (mTLS)
- 9101: HTTP API for remote_write ingestion and PromQL queries (mTLS)
Monitoring
Nightlight exports its own metrics on the standard /metrics endpoint for self-monitoring.
License
MIT OR Apache-2.0
References
- Task: T033 Nightlight (PROJECT.md Item 12)
- Design: docs/por/T033-nightlight/DESIGN.md
- Dependencies: T024 (NixOS), T027 (Unified TLS)
- Prometheus Remote Write: https://prometheus.io/docs/concepts/remote_write_spec/
- PromQL: https://prometheus.io/docs/prometheus/latest/querying/basics/