FlareDB SQL Layer Design

Overview

This document outlines the design for a SQL-compatible layer built on top of FlareDB's KVS foundation. The goal is to enable SQL queries (DDL/DML) while leveraging FlareDB's existing distributed KVS capabilities.

Architecture Principles

  1. KVS Foundation: All SQL data stored as KVS key-value pairs
  2. Simple First: Start with core SQL subset (no JOINs, no transactions initially)
  3. Efficient Encoding: Optimize key encoding for range scans
  4. Namespace Isolation: Use FlareDB namespaces for multi-tenancy

Key Design Decisions

1. SQL Parser

Choice: Use sqlparser-rs crate

  • Mature, well-tested SQL parser
  • Supports MySQL/PostgreSQL/ANSI SQL dialects
  • Easy to extend for custom syntax

2. Table Metadata Schema

Table metadata stored in KVS with special prefix:

Key:   __sql_meta:tables:{table_name}
Value: TableMetadata {
  table_id: u32,
  table_name: String,
  columns: Vec<ColumnDef>,
  primary_key: Vec<String>,
  created_at: u64,
}

ColumnDef {
  name: String,
  data_type: DataType,
  nullable: bool,
  default_value: Option<Value>,
}

DataType enum:
  - Integer
  - BigInt
  - Text
  - Boolean
  - Timestamp

Table ID allocation:

Key:   __sql_meta:next_table_id
Value: u32 (monotonic counter)
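
Two concurrent CREATE TABLE calls must not receive the same ID, so the counter update has to be atomic. The sketch below shows one way this might look, assuming the KVS offers a compare-and-swap primitive. The `Kv` trait, `compare_and_swap`, and the in-memory `MemKv` are hypothetical stand-ins for illustration, not FlareDB's actual API.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal KV interface; not FlareDB's actual API.
trait Kv {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    /// Writes `new` only if the current value equals `expected`; returns true on success.
    fn compare_and_swap(&mut self, key: &[u8], expected: Option<&[u8]>, new: &[u8]) -> bool;
}

/// In-memory stand-in used only for illustration.
#[derive(Default)]
struct MemKv {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl Kv for MemKv {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }

    fn compare_and_swap(&mut self, key: &[u8], expected: Option<&[u8]>, new: &[u8]) -> bool {
        if self.map.get(key).map(|v| v.as_slice()) == expected {
            self.map.insert(key.to_vec(), new.to_vec());
            true
        } else {
            false
        }
    }
}

const NEXT_TABLE_ID_KEY: &[u8] = b"__sql_meta:next_table_id";

/// Allocates the next table ID, retrying if another writer advanced the counter first.
fn allocate_table_id<K: Kv>(kv: &mut K) -> u32 {
    loop {
        let current = kv.get(NEXT_TABLE_ID_KEY);
        // A missing counter means no table has been created yet; start at 1 (an assumption).
        let id = match &current {
            Some(bytes) => {
                let mut buf = [0u8; 4];
                buf.copy_from_slice(bytes); // panics if the stored counter is not 4 bytes
                u32::from_be_bytes(buf)
            }
            None => 1,
        };
        let next = id.checked_add(1).expect("table ID space exhausted");
        if kv.compare_and_swap(NEXT_TABLE_ID_KEY, current.as_deref(), &next.to_be_bytes()) {
            return id;
        }
    }
}
```

If the KVS has no conditional write, allocation would need to be serialized some other way, for example through a single metadata owner.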

3. Row Key Encoding

Efficient key encoding for table rows:

Format: __sql_data:{table_id}:{primary_key_encoded}

Example:
  Table: users (table_id=1)
  Primary key: id=42
  Key: __sql_data:1:42

For composite primary keys:

Format: __sql_data:{table_id}:{pk1}:{pk2}:...

Example:
  Table: order_items (table_id=2)
  Primary key: (order_id=100, item_id=5)
  Key: __sql_data:2:100:5
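
One caveat with decimal text components: if raw_scan() orders keys byte-wise (an assumption about the KVS), then `__sql_data:1:10` sorts before `__sql_data:1:9`, and primary key range scans would not follow numeric order. A common remedy is a fixed-width big-endian encoding with the sign bit flipped. The sketch below illustrates that technique for integer keys only; the function names are hypothetical, and encoding the table ID as binary is likewise an assumption.

```rust
/// Encodes an i64 so that byte-wise (lexicographic) order matches numeric order:
/// flip the sign bit, then write big-endian.
fn encode_i64_ordered(v: i64) -> [u8; 8] {
    ((v as u64) ^ (1u64 << 63)).to_be_bytes()
}

/// Inverse of `encode_i64_ordered`.
fn decode_i64_ordered(bytes: [u8; 8]) -> i64 {
    (u64::from_be_bytes(bytes) ^ (1u64 << 63)) as i64
}

/// Builds `__sql_data:{table_id}:{pk1}:{pk2}:...` with fixed-width binary components.
fn encode_row_key(table_id: u32, pk: &[i64]) -> Vec<u8> {
    let mut key = b"__sql_data:".to_vec();
    key.extend_from_slice(&table_id.to_be_bytes());
    for part in pk {
        key.push(b':');
        key.extend_from_slice(&encode_i64_ordered(*part));
    }
    key
}
```

Because every component has a fixed width, composite keys compare component by component. Text primary keys would need a separate scheme (for example, escaping plus a terminator) so that the `:` separator cannot be confused with key content.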

4. Row Value Encoding

Row values stored as serialized structs:

Value: RowData {
  columns: HashMap<String, Value>,
  version: u64,  // For optimistic concurrency
}

Value enum:
  - Null
  - Integer(i64)
  - Text(String)
  - Boolean(bool)
  - Timestamp(u64)

Serialization: Use bincode for efficient binary encoding
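
The `version` field suggests a compare-and-set style of update: a writer reads a row, prepares its changes, and writes only if the stored version is still the one it read. The sketch below illustrates that check using the types above. `apply_update` and `UpdateError` are hypothetical names, and the row is mutated in memory rather than through the KVS.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
    Text(String),
    Boolean(bool),
    Timestamp(u64),
}

#[derive(Debug, Clone, PartialEq)]
struct RowData {
    columns: HashMap<String, Value>,
    version: u64,
}

#[derive(Debug, PartialEq)]
enum UpdateError {
    /// The row changed since the caller read it.
    VersionConflict { expected: u64, found: u64 },
}

/// Applies `changes` to `stored` only if the caller saw the current version.
/// Returns the new version on success.
fn apply_update(
    stored: &mut RowData,
    expected_version: u64,
    changes: HashMap<String, Value>,
) -> Result<u64, UpdateError> {
    if stored.version != expected_version {
        return Err(UpdateError::VersionConflict {
            expected: expected_version,
            found: stored.version,
        });
    }
    stored.columns.extend(changes);
    stored.version += 1;
    Ok(stored.version)
}
```

On a conflict the caller would typically re-read the row and retry. Against the real KVS, the version check and the write would have to happen atomically for this to be safe.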

5. Query Execution Engine

Simple query execution pipeline:

SQL String
  ↓
[Parser]
  ↓
Abstract Syntax Tree (AST)
  ↓
[Planner]
  ↓
Execution Plan
  ↓
[Executor]
  ↓
Result Set

Supported Operations (v1):

DDL:

  • CREATE TABLE
  • DROP TABLE

DML:

  • INSERT INTO ... VALUES (...)
  • SELECT * FROM table WHERE ...
  • SELECT col1, col2 FROM table WHERE ...
  • UPDATE table SET ... WHERE ...
  • DELETE FROM table WHERE ...

WHERE Clause Support:

  • Simple comparisons: =, !=, <, >, <=, >=
  • Logical operators: AND, OR, NOT
  • Primary key lookups (optimized)
  • Full table scans (for non-PK queries)
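
A minimal sketch of how a WHERE evaluator for this subset might look is given below. The `Expr` and `CmpOp` types are hypothetical (the real layer would translate from the sqlparser-rs AST), and NULL handling is simplified: any comparison involving NULL, a missing column, or mismatched types evaluates to false, which differs from SQL's three-valued logic, most visibly under NOT.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Integer(i64),
    Text(String),
    Boolean(bool),
    Timestamp(u64),
}

#[derive(Debug, Clone, Copy)]
enum CmpOp { Eq, NotEq, Lt, Gt, LtEq, GtEq }

#[derive(Debug, Clone)]
enum Expr {
    Compare { column: String, op: CmpOp, literal: Value },
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
}

/// Orders two values of the same type; Null and mismatched types are incomparable.
fn compare(a: &Value, b: &Value) -> Option<Ordering> {
    match (a, b) {
        (Value::Integer(x), Value::Integer(y)) => Some(x.cmp(y)),
        (Value::Text(x), Value::Text(y)) => Some(x.cmp(y)),
        (Value::Boolean(x), Value::Boolean(y)) => Some(x.cmp(y)),
        (Value::Timestamp(x), Value::Timestamp(y)) => Some(x.cmp(y)),
        _ => None,
    }
}

/// Returns true if `row` satisfies `expr`.
fn eval(expr: &Expr, row: &HashMap<String, Value>) -> bool {
    match expr {
        Expr::Compare { column, op, literal } => {
            // A missing column behaves like NULL: the comparison is false.
            let ordering = row.get(column).and_then(|v| compare(v, literal));
            match (ordering, *op) {
                (None, _) => false,
                (Some(o), CmpOp::Eq) => o == Ordering::Equal,
                (Some(o), CmpOp::NotEq) => o != Ordering::Equal,
                (Some(o), CmpOp::Lt) => o == Ordering::Less,
                (Some(o), CmpOp::Gt) => o == Ordering::Greater,
                (Some(o), CmpOp::LtEq) => o != Ordering::Greater,
                (Some(o), CmpOp::GtEq) => o != Ordering::Less,
            }
        }
        Expr::And(l, r) => eval(l, row) && eval(r, row),
        Expr::Or(l, r) => eval(l, row) || eval(r, row),
        Expr::Not(inner) => !eval(inner, row),
    }
}
```

During a full table scan, the executor would call `eval` once per decoded row and keep the rows for which it returns true.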

Query Optimization:

  • Primary key point lookups → raw_get()
  • Primary key range queries → raw_scan()
  • Non-indexed queries → full table scan
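
The three rules above can be sketched as a small planner function. This is an illustration under simplifying assumptions: a single-column integer primary key and at most one `column op literal` predicate. `AccessPath`, `Predicate`, and `choose_access_path` are hypothetical names.

```rust
use std::ops::Bound;

/// Physical access paths named in the design.
#[derive(Debug, PartialEq)]
enum AccessPath {
    /// Served by raw_get().
    PointGet { pk: i64 },
    /// Served by raw_scan() over the encoded primary key range.
    RangeScan { start: Bound<i64>, end: Bound<i64> },
    /// Scan every row of the table and filter.
    FullScan,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum CmpOp { Eq, NotEq, Lt, Gt, LtEq, GtEq }

/// A single `column op literal` predicate, simplified to integer literals.
struct Predicate {
    column: String,
    op: CmpOp,
    literal: i64,
}

/// Picks the cheapest access path the predicate allows.
fn choose_access_path(pk_column: &str, pred: Option<&Predicate>) -> AccessPath {
    // No predicate, or a predicate on a non-PK column: nothing to narrow the scan with.
    let p = match pred {
        Some(p) if p.column == pk_column => p,
        _ => return AccessPath::FullScan,
    };
    match p.op {
        CmpOp::Eq => AccessPath::PointGet { pk: p.literal },
        CmpOp::Gt => AccessPath::RangeScan { start: Bound::Excluded(p.literal), end: Bound::Unbounded },
        CmpOp::GtEq => AccessPath::RangeScan { start: Bound::Included(p.literal), end: Bound::Unbounded },
        CmpOp::Lt => AccessPath::RangeScan { start: Bound::Unbounded, end: Bound::Excluded(p.literal) },
        CmpOp::LtEq => AccessPath::RangeScan { start: Bound::Unbounded, end: Bound::Included(p.literal) },
        // `!=` excludes a single key, so a range scan gains nothing over a full scan.
        CmpOp::NotEq => AccessPath::FullScan,
    }
}
```

Predicates joined by AND could tighten both bounds of a range scan, while OR across different columns would generally fall back to a full scan; neither case is handled in this sketch.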

6. API Surface

New gRPC service: SqlService

service SqlService {
  rpc Execute(SqlRequest) returns (SqlResponse);
  rpc Query(SqlRequest) returns (stream RowBatch);
}

message SqlRequest {
  string namespace = 1;
  string sql = 2;
}

message SqlResponse {
  oneof result {
    DdlResult ddl_result = 1;
    DmlResult dml_result = 2;
    QueryResult query_result = 3;
    ErrorResult error = 4;
  }
}

message DdlResult {
  string message = 1;  // "Table created", "Table dropped"
}

message DmlResult {
  uint64 rows_affected = 1;
}

message QueryResult {
  repeated string columns = 1;
  repeated Row rows = 2;
}

message Row {
  repeated Value values = 1;
}

message Value {
  oneof value {
    int64 int_value = 1;
    string text_value = 2;
    bool bool_value = 3;
    uint64 timestamp_value = 4;
  }
  bool is_null = 5;
}

7. Namespace Integration

SQL layer respects FlareDB namespaces:

  • Each namespace has isolated SQL tables
  • Table IDs are namespace-scoped
  • Metadata keys include namespace prefix

Key format with namespace:
  {namespace_id}:__sql_meta:tables:{table_name}
  {namespace_id}:__sql_data:{table_id}:{primary_key}
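
The sketch below builds these keys and shows how a full table scan might be bounded to one table's prefix. It assumes a numeric `namespace_id` and uses a `BTreeMap` as a stand-in for an ordered raw_scan(); the helper names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Key under which a table's metadata is stored.
fn table_meta_key(namespace_id: u32, table_name: &str) -> String {
    format!("{}:__sql_meta:tables:{}", namespace_id, table_name)
}

/// Prefix shared by every row of one table. The trailing ':' matters:
/// without it, the prefix for table 1 would also match tables 10, 11, ...
fn row_key_prefix(namespace_id: u32, table_id: u32) -> String {
    format!("{}:__sql_data:{}:", namespace_id, table_id)
}

/// Returns every (key, value) pair whose key starts with `prefix`.
/// The BTreeMap stands in for an ordered KVS scan.
fn scan_prefix<'a>(
    store: &'a BTreeMap<String, String>,
    prefix: &str,
) -> Vec<(&'a str, &'a str)> {
    store
        .range(prefix.to_string()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, v)| (k.as_str(), v.as_str()))
        .collect()
}
```

Seeking to the prefix and stopping at the first non-matching key keeps the scan from touching other tables or other namespaces.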

Implementation Plan

Phase 1: Core Infrastructure (S2)

  • Table metadata storage
  • CREATE TABLE / DROP TABLE
  • Table ID allocation

Phase 2: Row Storage (S3)

  • Row key/value encoding
  • INSERT statement
  • UPDATE statement
  • DELETE statement

Phase 3: Query Engine (S4)

  • SELECT parser
  • WHERE clause evaluator
  • Result set builder
  • Table scan implementation

Phase 4: Integration (S5)

  • E2E tests
  • Example application
  • Performance benchmarks

Performance Considerations

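  1. Primary Key Lookups: O(1) KVS round trips, a single raw_get() per lookup
  2. Range Scans: O(log N + K) via raw_scan(), where K is the number of rows in the range; relies on an order-preserving key encoding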
  3. Full Table Scans: O(N) - unavoidable without indexes
  4. Metadata Access: Cached in memory for frequently accessed tables
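
A read-through cache for table metadata might look like the sketch below. `MetadataCache` and its methods are hypothetical, `TableMetadata` is trimmed to two fields for brevity, and the `load` closure stands in for the KVS read of `__sql_meta:tables:{table_name}`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct TableMetadata {
    table_id: u32,
    table_name: String,
}

#[derive(Default)]
struct MetadataCache {
    tables: HashMap<String, TableMetadata>,
}

impl MetadataCache {
    /// Returns cached metadata, calling `load` only on a cache miss.
    /// A failed load (table not found) is not cached.
    fn get_or_load<F>(&mut self, table_name: &str, load: F) -> Option<TableMetadata>
    where
        F: FnOnce(&str) -> Option<TableMetadata>,
    {
        if let Some(meta) = self.tables.get(table_name) {
            return Some(meta.clone());
        }
        let meta = load(table_name)?;
        self.tables.insert(table_name.to_string(), meta.clone());
        Some(meta)
    }

    /// Drops a cached entry, e.g. after DROP TABLE, so stale metadata is not served.
    fn invalidate(&mut self, table_name: &str) {
        self.tables.remove(table_name);
    }
}
```

In a multi-node deployment, a DROP TABLE handled by one node would also need to invalidate the caches on the others; that mechanism is not covered here.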

Future Enhancements (Out of Scope)

  1. Secondary Indexes: Additional KVS entries for non-PK queries
  2. JOINs: Multi-table query support
  3. Transactions: ACID guarantees across multiple operations
  4. Query Optimizer: Cost-based query planning
  5. SQL Standard Compliance: More data types, functions, etc.

Testing Strategy

  1. Unit Tests: Parser, executor, encoding/decoding
  2. Integration Tests: Full SQL operations via gRPC
  3. E2E Tests: Real-world application scenarios
  4. Performance Tests: Benchmark vs PostgreSQL/SQLite baseline

Example Usage

// Create connection (assumes a tonic-generated client, whose RPC methods take &mut self)
let mut client = SqlServiceClient::connect("http://127.0.0.1:8001").await?;

// Create table
client.execute(SqlRequest {
    namespace: "default".to_string(),
    sql: "CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        email TEXT,
        created_at TIMESTAMP
    )".to_string(),
}).await?;

// Insert data
client.execute(SqlRequest {
    namespace: "default".to_string(),
    sql: "INSERT INTO users (id, name, email) VALUES (1, 'Alice', 'alice@example.com')".to_string(),
}).await?;

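// Query data (Query is server-streaming, so the response wraps a stream of RowBatch)
let mut stream = client.query(SqlRequest {
    namespace: "default".to_string(),
    sql: "SELECT * FROM users WHERE id = 1".to_string(),
}).await?.into_inner();

// Consume batches as they arrive (assumes tonic's Streaming::message())
while let Some(batch) = stream.message().await? {
    // Process the rows in `batch`
}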

Success Criteria

  ✓ CREATE/DROP TABLE working
  ✓ INSERT/UPDATE/DELETE working
  ✓ SELECT with WHERE clause working
  ✓ Primary key lookups optimized
  ✓ Integration tests passing
  ✓ Example application demonstrating CRUD

References