photoncloud-monorepo/docs/por/T037-flaredb-sql-layer/DESIGN.md
centra 5c6eb04a46 T036: Add VM cluster deployment configs for nixos-anywhere
- netboot-base.nix with SSH key auth
- Launch scripts for node01/02/03
- Node configuration.nix and disko.nix
- Nix modules for first-boot automation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 09:59:19 +09:00

299 lines
6.3 KiB
Markdown

# FlareDB SQL Layer Design
## Overview
This document outlines the design for a SQL-compatible layer built on top of FlareDB's KVS foundation. The goal is to enable SQL queries (DDL/DML) while leveraging FlareDB's existing distributed KVS capabilities.
## Architecture Principles
1. **KVS Foundation**: All SQL data stored as KVS key-value pairs
2. **Simple First**: Start with core SQL subset (no JOINs, no transactions initially)
3. **Efficient Encoding**: Optimize key encoding for range scans
4. **Namespace Isolation**: Use FlareDB namespaces for multi-tenancy
## Key Design Decisions
### 1. SQL Parser
**Choice**: Use `sqlparser-rs` crate
- Mature, well-tested SQL parser
- Supports MySQL/PostgreSQL/ANSI SQL dialects
- Easy to extend for custom syntax
### 2. Table Metadata Schema
Table metadata stored in KVS with special prefix:
```
Key: __sql_meta:tables:{table_name}
Value: TableMetadata {
table_id: u32,
table_name: String,
columns: Vec<ColumnDef>,
primary_key: Vec<String>,
created_at: u64,
}
ColumnDef {
name: String,
data_type: DataType,
nullable: bool,
default_value: Option<Value>,
}
DataType enum:
- Integer
- BigInt
- Text
- Boolean
- Timestamp
```
Table ID allocation:
```
Key: __sql_meta:next_table_id
Value: u32 (monotonic counter)
```
### 3. Row Key Encoding
Efficient key encoding for table rows:
```
Format: __sql_data:{table_id}:{primary_key_encoded}
Example:
Table: users (table_id=1)
Primary key: id=42
Key: __sql_data:1:42
```
For composite primary keys:
```
Format: __sql_data:{table_id}:{pk1}:{pk2}:...
Example:
Table: order_items (table_id=2)
Primary key: (order_id=100, item_id=5)
Key: __sql_data:2:100:5
```
### 4. Row Value Encoding
Row values stored as serialized structs:
```
Value: RowData {
columns: HashMap<String, Value>,
version: u64, // For optimistic concurrency
}
Value enum:
- Null
- Integer(i64)
- Text(String)
- Boolean(bool)
- Timestamp(u64)
```
Serialization: Use `bincode` for efficient binary encoding
### 5. Query Execution Engine
Simple query execution pipeline:
```
SQL String
[Parser]
Abstract Syntax Tree (AST)
[Planner]
Execution Plan
[Executor]
Result Set
```
**Supported Operations (v1):**
DDL:
- CREATE TABLE
- DROP TABLE
DML:
- INSERT INTO ... VALUES (...)
- SELECT * FROM table WHERE ...
- SELECT col1, col2 FROM table WHERE ...
- UPDATE table SET ... WHERE ...
- DELETE FROM table WHERE ...
**WHERE Clause Support:**
- Simple comparisons: =, !=, <, >, <=, >=
- Logical operators: AND, OR, NOT
- Primary key lookups (optimized)
- Full table scans (for non-PK queries)
**Query Optimization:**
- Primary key point lookups → raw_get()
- Primary key range queries → raw_scan()
- Non-indexed queries → full table scan
### 6. API Surface
New gRPC service: `SqlService`
```protobuf
service SqlService {
rpc Execute(SqlRequest) returns (SqlResponse);
rpc Query(SqlRequest) returns (stream RowBatch);
}
message SqlRequest {
string namespace = 1;
string sql = 2;
}
message SqlResponse {
oneof result {
DdlResult ddl_result = 1;
DmlResult dml_result = 2;
QueryResult query_result = 3;
ErrorResult error = 4;
}
}
message DdlResult {
string message = 1; // "Table created", "Table dropped"
}
message DmlResult {
uint64 rows_affected = 1;
}
message QueryResult {
repeated string columns = 1;
repeated Row rows = 2;
}
message Row {
repeated Value values = 1;
}
message Value {
oneof value {
int64 int_value = 1;
string text_value = 2;
bool bool_value = 3;
uint64 timestamp_value = 4;
}
bool is_null = 5;
}
```
### 7. Namespace Integration
SQL layer respects FlareDB namespaces:
- Each namespace has isolated SQL tables
- Table IDs are namespace-scoped
- Metadata keys include namespace prefix
```
Key format with namespace:
{namespace_id}:__sql_meta:tables:{table_name}
{namespace_id}:__sql_data:{table_id}:{primary_key}
```
## Implementation Plan
### Phase 1: Core Infrastructure (S2)
- Table metadata storage
- CREATE TABLE / DROP TABLE
- Table ID allocation
### Phase 2: Row Storage (S3)
- Row key/value encoding
- INSERT statement
- UPDATE statement
- DELETE statement
### Phase 3: Query Engine (S4)
- SELECT parser
- WHERE clause evaluator
- Result set builder
- Table scan implementation
### Phase 4: Integration (S5)
- E2E tests
- Example application
- Performance benchmarks
## Performance Considerations
1. **Primary Key Lookups**: O(1) via raw_get()
2. **Range Scans**: O(log N) via raw_scan() with key encoding
3. **Full Table Scans**: O(N) - unavoidable without indexes
4. **Metadata Access**: Cached in memory for frequently accessed tables
## Future Enhancements (Out of Scope)
1. **Secondary Indexes**: Additional KVS entries for non-PK queries
2. **JOINs**: Multi-table query support
3. **Transactions**: ACID guarantees across multiple operations
4. **Query Optimizer**: Cost-based query planning
5. **SQL Standard Compliance**: More data types, functions, etc.
## Testing Strategy
1. **Unit Tests**: Parser, executor, encoding/decoding
2. **Integration Tests**: Full SQL operations via gRPC
3. **E2E Tests**: Real-world application scenarios
4. **Performance Tests**: Benchmark vs PostgreSQL/SQLite baseline
## Example Usage
```rust
// Create connection
let client = SqlServiceClient::connect("http://127.0.0.1:8001").await?;
// Create table
client.execute(SqlRequest {
namespace: "default".to_string(),
sql: "CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT,
created_at TIMESTAMP
)".to_string(),
}).await?;
// Insert data
client.execute(SqlRequest {
namespace: "default".to_string(),
sql: "INSERT INTO users (id, name, email) VALUES (1, 'Alice', 'alice@example.com')".to_string(),
}).await?;
// Query data
let response = client.query(SqlRequest {
namespace: "default".to_string(),
sql: "SELECT * FROM users WHERE id = 1".to_string(),
}).await?;
```
## Success Criteria
✓ CREATE/DROP TABLE working
✓ INSERT/UPDATE/DELETE working
✓ SELECT with WHERE clause working
✓ Primary key lookups optimized
✓ Integration tests passing
✓ Example application demonstrating CRUD
## References
- sqlparser-rs: https://github.com/sqlparser-rs/sqlparser-rs
- FlareDB KVS API: flaredb/proto/kvrpc.proto
- RocksDB encoding: https://github.com/facebook/rocksdb/wiki