Data Model: Relational Document System
Core Entities
Document
Purpose: Represents a markdown file with indexed properties.
Fields:
- id (string): Unique identifier, typically a hash of the file path
- file_path (string): Absolute path to markdown file
- content_hash (string): SHA256 hash of file content for change detection
- last_modified (integer): Unix timestamp of last file modification
- created_at (integer): Timestamp when document was first indexed
- updated_at (integer): Timestamp of last index update
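As a rough illustration of how these fields might be populated, the sketch below builds a Document record from a file on disk. It assumes SHA-256 is also used for the path hash (the source only says "typically file path hash"); the names `build_document` and `has_changed` are hypothetical.

```python
import hashlib
import os
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    id: str
    file_path: str
    content_hash: str
    last_modified: int
    created_at: int
    updated_at: int


def build_document(path: str, now: Optional[int] = None) -> Document:
    """Build a Document record for a file (hypothetical helper)."""
    abs_path = os.path.abspath(path)
    with open(abs_path, "rb") as fh:
        content = fh.read()
    ts = int(time.time()) if now is None else now
    return Document(
        # Assumption: the id is the SHA-256 hex digest of the absolute path.
        id=hashlib.sha256(abs_path.encode("utf-8")).hexdigest(),
        file_path=abs_path,
        content_hash=hashlib.sha256(content).hexdigest(),
        last_modified=int(os.stat(abs_path).st_mtime),
        created_at=ts,
        updated_at=ts,
    )


def has_changed(doc: Document, path: str) -> bool:
    """Return True if the file content no longer matches the stored hash."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest() != doc.content_hash
```

Comparing `content_hash` rather than `last_modified` avoids re-indexing files that were touched but not actually edited.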
Property
Purpose: Individual key-value pairs extracted from YAML headers.
Fields:
- id (string): Unique property identifier
- document_id (string): Foreign key to Document
- key (string): Property name from YAML header
- value (string): Serialized property value
- value_type (string): Data type (string, number, boolean, date, array, object)
- created_at (integer): Timestamp when property was created
- updated_at (integer): Timestamp of last property update
Query
Purpose: Saved query definitions for reuse.
Fields:
- id (string): Unique query identifier
- name (string): Human-readable query name
- definition (string): Query syntax definition
- created_at (integer): Query creation timestamp
- last_used (integer): Timestamp of last query execution
- use_count (integer): Number of times query has been executed
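The usage-tracking fields could be maintained with a single update per execution. The following is a minimal sketch assuming a SQLite backend and a table named `queries`; the helper `record_query_use` and the sample query definition are illustrative only, since the source does not define the query syntax.

```python
import sqlite3
import time
from typing import Optional


def record_query_use(conn: sqlite3.Connection, query_id: str,
                     now: Optional[int] = None) -> None:
    """Bump use_count and set last_used for one saved query."""
    ts = int(time.time()) if now is None else now
    conn.execute(
        "UPDATE queries SET last_used = ?, use_count = use_count + 1 "
        "WHERE id = ?",
        (ts, query_id),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE queries (
        id TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        definition TEXT NOT NULL,
        created_at INTEGER NOT NULL,
        last_used INTEGER,
        use_count INTEGER NOT NULL DEFAULT 0
    )"""
)
# The definition string is a placeholder, not the real query grammar.
conn.execute(
    "INSERT INTO queries (id, name, definition, created_at) VALUES (?, ?, ?, ?)",
    ("q1", "Open tasks", "status = open", 1700000000),
)
record_query_use(conn, "q1", now=1700000100)
record_query_use(conn, "q1", now=1700000200)
row = conn.execute(
    "SELECT last_used, use_count FROM queries WHERE id = ?", ("q1",)
).fetchone()
```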
Schema
Purpose: Metadata about property types and validation rules.
Fields:
- property_key (string): Property name across documents
- detected_type (string): Most common data type for this property
- validation_rules (string): JSON-encoded validation rules
- document_count (integer): Number of documents containing this property
- created_at (integer): Timestamp when schema entry was created
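One way to derive `detected_type` and `document_count` is to aggregate over the property rows. This is a sketch under the assumption that each row is available as a `(document_id, key, value_type)` tuple; `summarize_schema` is a hypothetical name.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def summarize_schema(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, dict]:
    """Aggregate (document_id, key, value_type) rows into schema entries."""
    type_counts = defaultdict(Counter)
    docs = defaultdict(set)
    for document_id, key, value_type in rows:
        type_counts[key][value_type] += 1
        docs[key].add(document_id)
    return {
        key: {
            # Most common type wins; ties go to the type seen first.
            "detected_type": counts.most_common(1)[0][0],
            "document_count": len(docs[key]),
        }
        for key, counts in type_counts.items()
    }


rows = [
    ("d1", "priority", "number"),
    ("d2", "priority", "number"),
    ("d3", "priority", "string"),
    ("d1", "title", "string"),
]
schema = summarize_schema(rows)
```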
Relationships
Document ↔ Property
- One-to-many: Each document has multiple properties
- Cascade delete: Properties are removed when document is deleted
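The cascade rule can be expressed directly in the schema. The sketch below assumes SQLite, where foreign key enforcement has to be switched on per connection; the tables are reduced to the columns needed for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores foreign key constraints unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE documents (
        id TEXT PRIMARY KEY,
        file_path TEXT NOT NULL UNIQUE
    );
    CREATE TABLE properties (
        id TEXT PRIMARY KEY,
        document_id TEXT NOT NULL
            REFERENCES documents(id) ON DELETE CASCADE,
        key TEXT NOT NULL,
        value TEXT,
        value_type TEXT NOT NULL
    );
    """
)
conn.execute("INSERT INTO documents VALUES ('d1', '/notes/a.md')")
conn.executemany(
    "INSERT INTO properties VALUES (?, ?, ?, ?, ?)",
    [
        ("p1", "d1", "title", "Example", "string"),
        ("p2", "d1", "priority", "2", "number"),
    ],
)
# Deleting the document removes its properties as well.
conn.execute("DELETE FROM documents WHERE id = 'd1'")
remaining = conn.execute("SELECT COUNT(*) FROM properties").fetchone()[0]
```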
Document ↔ Query
- Many-to-many: A query can match multiple documents, and a document can be matched by multiple queries
- Junction table: QueryResults stores execution history
Property ↔ Schema
- Many-to-one: Multiple properties with same key map to one schema entry
Data Types
Supported Property Types
- string: Text values (default type)
- number: Numeric values (integer or float)
- boolean: true/false values
- date: ISO 8601 date strings
- array: JSON-encoded arrays
- object: JSON-encoded objects (nested structures)
Type Detection Logic
- Parse YAML value using native YAML parser
- Apply type detection rules:
  - Strings matching ISO 8601 format → date
  - Numeric strings without decimals → number (integer)
  - Numeric strings with decimals → number (float)
  - "true"/"false" (case insensitive) → boolean
  - Arrays/objects → respective types
  - Everything else → string
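The detection rules above could be sketched as a single function applied to values already returned by the YAML parser. The ISO 8601 check here is a simplified regular expression covering `YYYY-MM-DD` with an optional time part, which is an assumption rather than a full ISO 8601 implementation; `detect_type` is a hypothetical name.

```python
import datetime
import re
from typing import Any

# Simplified ISO 8601 subset: date, optionally followed by a time and offset.
_ISO_DATE = re.compile(
    r"^\d{4}-\d{2}-\d{2}"
    r"(?:[T ]\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?(?:Z|[+-]\d{2}:?\d{2})?)?$"
)
_INTEGER = re.compile(r"^[+-]?\d+$")
_FLOAT = re.compile(r"^[+-]?\d+\.\d+$")


def detect_type(value: Any) -> str:
    """Map a parsed YAML value to one of the supported property types."""
    # bool is checked before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, datetime.date):  # also covers datetime.datetime
        return "date"
    if isinstance(value, (list, tuple)):
        return "array"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        text = value.strip()
        if _ISO_DATE.match(text):
            try:
                # Reject well-formed but impossible dates such as 2024-13-45.
                datetime.date.fromisoformat(text[:10])
                return "date"
            except ValueError:
                pass
        if _INTEGER.match(text) or _FLOAT.match(text):
            return "number"
        if text.lower() in ("true", "false"):
            return "boolean"
    return "string"
```

The date rule runs before the numeric rules, matching the order in the list, although the two patterns do not overlap in this sketch.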
Indexing Strategy
Primary Indices
- documents.file_path (unique)
- properties.document_id (foreign key)
- properties.key (for property-based queries)
- queries.id (unique)
Composite Indices
- properties(document_id, key) for fast document property lookup
- properties(key, value_type) for type-constrained queries
- queries(last_used) for recent query tracking
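For illustration, the indices listed above might be declared as follows, assuming SQLite and hypothetical index names prefixed with `idx_`. The tables carry only the columns the indices touch, and the uniqueness of `queries.id` is covered by its primary key.

```python
import sqlite3

DDL = """
CREATE TABLE documents (id TEXT PRIMARY KEY, file_path TEXT NOT NULL);
CREATE TABLE properties (
    id TEXT PRIMARY KEY,
    document_id TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT,
    value_type TEXT NOT NULL
);
CREATE TABLE queries (id TEXT PRIMARY KEY, last_used INTEGER);

-- Primary indices
CREATE UNIQUE INDEX idx_documents_file_path ON documents(file_path);
CREATE INDEX idx_properties_document_id ON properties(document_id);
CREATE INDEX idx_properties_key ON properties(key);

-- Composite indices
CREATE INDEX idx_properties_document_key ON properties(document_id, key);
CREATE INDEX idx_properties_key_type ON properties(key, value_type);
CREATE INDEX idx_queries_last_used ON queries(last_used);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the tables and indices in one script."""
    conn.executescript(DDL)


def index_names(conn: sqlite3.Connection) -> list:
    """Return the names of the explicitly created indices."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND name LIKE 'idx%' ORDER BY name"
    )
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
```

Because a B-tree index can serve lookups on its leftmost columns, the composite indices on `(document_id, key)` and `(key, value_type)` may already cover the single-column lookups on `document_id` and `key`; whether the separate single-column indices are worth keeping depends on the workload.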
Validation Rules
Document Validation
- File must exist and be readable
- File must have valid YAML header (--- delimiters)
- YAML must parse without errors
- File must be UTF-8 encoded
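The document checks above can be sketched as a function that returns a list of error messages. The YAML parser is passed in as a callable so the sketch stays within the standard library; `validate_document` and `extract_front_matter` are hypothetical names.

```python
import os
from typing import Any, Callable, List, Optional


def extract_front_matter(text: str) -> Optional[str]:
    """Return the text between the opening and closing --- lines, if any."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i])
    return None


def validate_document(path: str,
                      parse_yaml: Callable[[str], Any]) -> List[str]:
    """Apply the document validation rules; an empty list means valid."""
    if not os.path.isfile(path) or not os.access(path, os.R_OK):
        return ["file does not exist or is not readable"]
    with open(path, "rb") as fh:
        raw = fh.read()
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]
    header = extract_front_matter(text)
    if header is None:
        return ["missing YAML header delimited by ---"]
    try:
        parse_yaml(header)
    except Exception as exc:  # parser-specific error types vary
        return [f"YAML parse error: {exc}"]
    return []
```

With PyYAML installed, `yaml.safe_load` could be passed as `parse_yaml`; any callable that raises on invalid input would work.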
Property Validation
- Property keys must be non-empty strings
- Property values must be serializable
- Array/object values must be valid JSON
- Date values must match ISO 8601 format
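A possible shape for the property checks is shown below. It assumes values are serialized as JSON when they are not already strings, and it uses `datetime.fromisoformat`, which accepts only the subset of ISO 8601 supported by the running Python version; `validate_property` is a hypothetical name.

```python
import datetime
import json
from typing import Any, List


def validate_property(key: Any, value: Any, value_type: str) -> List[str]:
    """Apply the property validation rules; an empty list means valid."""
    errors = []
    if not isinstance(key, str) or not key.strip():
        errors.append("key must be a non-empty string")

    # Serialize the value the way it would be stored (assumption: JSON).
    try:
        if isinstance(value, str):
            serialized = value
        elif isinstance(value, datetime.date):
            serialized = value.isoformat()
        else:
            serialized = json.dumps(value)
    except (TypeError, ValueError):
        errors.append("value is not serializable")
        return errors

    if value_type in ("array", "object"):
        expected = list if value_type == "array" else dict
        try:
            if not isinstance(json.loads(serialized), expected):
                errors.append(f"value is not a JSON {value_type}")
        except ValueError:  # json.JSONDecodeError subclasses ValueError
            errors.append("value is not valid JSON")
    elif value_type == "date":
        try:
            datetime.datetime.fromisoformat(serialized)
        except ValueError:
            errors.append("value does not match ISO 8601 format")
    return errors
```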
Query Validation
- Query syntax must be valid according to defined grammar
- Query must reference existing properties
- Query complexity must be within performance limits