notex.nvim/specs/002-notex-is-a/data-model.md

Data Model: Relational Document System

Core Entities

Document

Purpose: Represents a markdown file with indexed properties.

Fields:

  • id (string): Unique identifier, typically a hash of the file path
  • file_path (string): Absolute path to the markdown file
  • content_hash (string): SHA-256 hash of the file content, used for change detection
  • last_modified (integer): Unix timestamp of last file modification
  • created_at (integer): Timestamp when document was first indexed
  • updated_at (integer): Timestamp of last index update
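
As an illustration of how these fields might be populated, here is a minimal Python sketch. The `Document` dataclass and the helpers `build_document` and `has_changed` are hypothetical names, and deriving `id` from a SHA-256 of the absolute path is an assumption; the spec only says the id is "typically" a path hash.

```python
import hashlib
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Document:
    id: str
    file_path: str
    content_hash: str
    last_modified: int
    created_at: int
    updated_at: int


def build_document(path: Path, now: Optional[int] = None) -> Document:
    """Build a Document record for a markdown file (hypothetical helper)."""
    absolute = path.resolve()
    now = int(time.time()) if now is None else now
    return Document(
        # Assumption: the id is the SHA-256 of the absolute path.
        id=hashlib.sha256(str(absolute).encode("utf-8")).hexdigest(),
        file_path=str(absolute),
        content_hash=hashlib.sha256(absolute.read_bytes()).hexdigest(),
        last_modified=int(absolute.stat().st_mtime),
        created_at=now,
        updated_at=now,
    )


def has_changed(doc: Document, path: Path) -> bool:
    """Report whether the file content differs from the indexed hash."""
    return hashlib.sha256(path.read_bytes()).hexdigest() != doc.content_hash
```

Comparing `content_hash` rather than `last_modified` alone avoids re-indexing a file whose modification time changed but whose content did not.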

Property

Purpose: Individual key-value pairs extracted from YAML headers.

Fields:

  • id (string): Unique property identifier
  • document_id (string): Foreign key to Document
  • key (string): Property name from YAML header
  • value (string): Serialized property value
  • value_type (string): Data type (string, number, boolean, date, array, object)
  • created_at (integer): Timestamp when property was created
  • updated_at (integer): Timestamp of last property update

Query

Purpose: Saved query definitions for reuse.

Fields:

  • id (string): Unique query identifier
  • name (string): Human-readable query name
  • definition (string): Query syntax definition
  • created_at (integer): Query creation timestamp
  • last_used (integer): Timestamp of last query execution
  • use_count (integer): Number of times query has been executed

Schema

Purpose: Metadata about property types and validation rules.

Fields:

  • property_key (string): Property name across documents
  • detected_type (string): Most common data type for this property
  • validation_rules (string): JSON-encoded validation rules
  • document_count (integer): Number of documents containing this property
  • created_at (integer): Timestamp when schema entry was created

Relationships

Document ↔ Property

  • One-to-many: Each document has multiple properties
  • Cascade delete: Properties are removed when document is deleted
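
The one-to-many relationship with cascade delete could be expressed as follows. This is a sketch that assumes a SQLite backend (the model does not name a storage engine) and uses Python's standard `sqlite3` module; the table and column names follow the entity definitions, while the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is set per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE documents (
    id            TEXT PRIMARY KEY,
    file_path     TEXT NOT NULL UNIQUE,
    content_hash  TEXT NOT NULL,
    last_modified INTEGER NOT NULL,
    created_at    INTEGER NOT NULL,
    updated_at    INTEGER NOT NULL
);
CREATE TABLE properties (
    id          TEXT PRIMARY KEY,
    document_id TEXT NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
    key         TEXT NOT NULL,
    value       TEXT,
    value_type  TEXT NOT NULL,
    created_at  INTEGER NOT NULL,
    updated_at  INTEGER NOT NULL
);
""")

# Illustrative data: one document with two properties.
conn.execute(
    "INSERT INTO documents VALUES (?, ?, ?, ?, ?, ?)",
    ("d1", "/notes/a.md", "abc", 0, 0, 0),
)
conn.executemany(
    "INSERT INTO properties VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("p1", "d1", "title", "A", "string", 0, 0),
        ("p2", "d1", "tags", '["x"]', "array", 0, 0),
    ],
)
before = conn.execute("SELECT COUNT(*) FROM properties").fetchone()[0]

# Deleting the document removes its properties through the cascade.
conn.execute("DELETE FROM documents WHERE id = ?", ("d1",))
remaining = conn.execute("SELECT COUNT(*) FROM properties").fetchone()[0]
```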

Document ↔ Query

  • Many-to-many: A query can match multiple documents, and a document can appear in the results of multiple queries
  • Junction table: QueryResults stores execution history

Property ↔ Schema

  • Many-to-one: Multiple properties with same key map to one schema entry
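
One way to derive schema entries from the properties that share a key is sketched below. `build_schema` is a hypothetical helper, and taking the most frequent `value_type` as `detected_type` follows the Schema field description; tie-breaking by first occurrence is an assumption.

```python
from collections import Counter, defaultdict


def build_schema(properties):
    """Aggregate (document_id, key, value_type) tuples into schema entries."""
    type_counts = defaultdict(Counter)
    documents = defaultdict(set)
    for document_id, key, value_type in properties:
        type_counts[key][value_type] += 1
        documents[key].add(document_id)
    return {
        key: {
            # Most common type wins; ties go to the type seen first.
            "detected_type": counts.most_common(1)[0][0],
            "document_count": len(documents[key]),
        }
        for key, counts in type_counts.items()
    }


rows = [
    ("d1", "priority", "number"),
    ("d2", "priority", "number"),
    ("d3", "priority", "string"),
    ("d1", "title", "string"),
]
schema = build_schema(rows)
```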

Data Types

Supported Property Types

  • string: Text values (default type)
  • number: Numeric values (integer or float)
  • boolean: true/false values
  • date: ISO 8601 date strings
  • array: JSON-encoded arrays
  • object: JSON-encoded objects (nested structures)

Type Detection Logic

  1. Parse YAML value using native YAML parser
  2. Apply type detection rules:
    • Strings matching ISO 8601 format → date
    • Numeric strings without decimals → number (integer)
    • Numeric strings with decimals → number (float)
    • "true"/"false" (case insensitive) → boolean
    • Arrays/objects → respective types
    • Everything else → string
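
The detection rules above can be sketched as a single function that receives a value already parsed from YAML. `detect_type` is a hypothetical name, and the regular expressions accept only a common subset of ISO 8601 (extended format, with an optional time and offset); both are assumptions rather than part of the specification.

```python
import datetime
import re

# Assumption: only extended-format ISO 8601 dates and date-times are recognized.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2})?(Z|[+-]\d{2}:\d{2})?)?")
NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")


def detect_type(value):
    """Map a parsed YAML value to one of the supported property types."""
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, datetime.date):  # also covers datetime.datetime
        return "date"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        text = value.strip()
        if ISO_DATE.fullmatch(text):
            try:
                datetime.date.fromisoformat(text[:10])  # rejects e.g. month 13
                return "date"
            except ValueError:
                pass
        if NUMBER.fullmatch(text):
            return "number"
        if text.lower() in ("true", "false"):
            return "boolean"
    return "string"
```

The date rule runs before the numeric rule, matching the order of the list above, and anything that fails every check falls through to `string`.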

Indexing Strategy

Primary Indices

  • documents.file_path (unique)
  • properties.document_id (foreign key)
  • properties.key (for property-based queries)
  • queries.id (unique)

Composite Indices

  • properties(document_id, key) for fast document property lookup
  • properties(key, value_type) for type-constrained queries
  • queries(last_used) for recent query tracking
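
For illustration, the indices listed above could be declared as follows. The sketch assumes a SQLite backend and trims the tables to the indexed columns; the `idx_*` names are hypothetical. The uniqueness of `queries.id` is covered by its primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (id TEXT PRIMARY KEY, file_path TEXT NOT NULL);
CREATE TABLE properties (
    id TEXT PRIMARY KEY,
    document_id TEXT NOT NULL,
    key TEXT NOT NULL,
    value TEXT,
    value_type TEXT NOT NULL
);
CREATE TABLE queries (id TEXT PRIMARY KEY, name TEXT, last_used INTEGER);

-- Primary indices
CREATE UNIQUE INDEX idx_documents_file_path ON documents(file_path);
CREATE INDEX idx_properties_document_id ON properties(document_id);
CREATE INDEX idx_properties_key ON properties(key);

-- Composite indices
CREATE INDEX idx_properties_document_key ON properties(document_id, key);
CREATE INDEX idx_properties_key_type ON properties(key, value_type);
CREATE INDEX idx_queries_last_used ON queries(last_used);
""")

# List the explicitly created indices (auto-created primary key indices are excluded).
index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx%'"
    )
)
```

In SQLite, a composite index can also serve lookups on its leftmost column, so the single-column indices on `document_id` and `key` may turn out to be redundant with the composite ones; whether to keep them is a tuning decision.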

Validation Rules

Document Validation

  • File must exist and be readable
  • File must have a valid YAML header (delimited by --- lines)
  • YAML must parse without errors
  • File must be UTF-8 encoded
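
A sketch of the file-level checks follows. `extract_header` is a hypothetical helper that covers readability, UTF-8 decoding, and the `---` delimiters; parsing the returned header text is left to whichever YAML parser the implementation uses.

```python
from pathlib import Path


def extract_header(path):
    """Return (header_text, errors); header_text is None when validation fails."""
    try:
        raw = Path(path).read_bytes()
    except OSError as exc:
        return None, [f"file is not readable: {exc}"]
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None, ["file is not valid UTF-8"]

    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, ["missing opening --- delimiter"]
    # The header ends at the next line consisting only of ---.
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return "\n".join(lines[1:index]), []
    return None, ["missing closing --- delimiter"]
```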

Property Validation

  • Property keys must be non-empty strings
  • Property values must be serializable
  • Array/object values must be valid JSON
  • Date values must match ISO 8601 format
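
The property rules might be checked along these lines, given a value that has already been serialized to a string. `validate_property` is a hypothetical helper, and the date check only validates the leading `YYYY-MM-DD` portion, which is a simplification.

```python
import datetime
import json


def validate_property(key, value, value_type):
    """Return a list of validation errors for one serialized property."""
    errors = []
    if not isinstance(key, str) or not key.strip():
        errors.append("key must be a non-empty string")

    if value_type in ("array", "object"):
        expected = list if value_type == "array" else dict
        try:
            if not isinstance(json.loads(value), expected):
                errors.append(f"value is not a JSON {value_type}")
        except (TypeError, ValueError):  # JSONDecodeError subclasses ValueError
            errors.append("value is not valid JSON")
    elif value_type == "date":
        try:
            # Simplification: only the date portion is validated.
            datetime.date.fromisoformat(value[:10])
        except (TypeError, ValueError):
            errors.append("value is not an ISO 8601 date")
    return errors
```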

Query Validation

  • Query syntax must be valid according to defined grammar
  • Query must reference existing properties
  • Query complexity must be within performance limits