notex.nvim/specs/002-notex-is-a/data-model.md

# Data Model: Relational Document System

## Core Entities

### Document
**Purpose**: Represents a markdown file with indexed properties
**Fields**:
- id (string): Unique identifier, typically a hash of the file path
- file_path (string): Absolute path to the markdown file
- content_hash (string): SHA-256 hash of the file content, used for change detection
- last_modified (integer): Unix timestamp of last file modification
- created_at (integer): Timestamp when document was first indexed
- updated_at (integer): Timestamp of last index update
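
As an illustration of how these fields could be populated, here is a minimal Python sketch. The `Document` dataclass, `build_document`, and `needs_reindex` are hypothetical names, and using SHA-256 of the absolute path for `id` is an assumption (the spec only says "typically a hash of the file path").

```python
import hashlib
import os
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    id: str
    file_path: str
    content_hash: str
    last_modified: int
    created_at: int
    updated_at: int


def build_document(path: str, now: Optional[int] = None) -> Document:
    """Build a Document record for a markdown file on disk."""
    abs_path = os.path.abspath(path)
    with open(abs_path, "rb") as fh:
        content = fh.read()
    now = int(time.time()) if now is None else now
    return Document(
        # Assumption: the id is the SHA-256 of the absolute path.
        id=hashlib.sha256(abs_path.encode("utf-8")).hexdigest(),
        file_path=abs_path,
        content_hash=hashlib.sha256(content).hexdigest(),
        last_modified=int(os.path.getmtime(abs_path)),
        created_at=now,
        updated_at=now,
    )


def needs_reindex(stored: Document, path: str) -> bool:
    """Return True when the file content no longer matches the stored hash."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest() != stored.content_hash
```

Comparing `content_hash` rather than `last_modified` avoids reindexing files that were touched but not changed.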

### Property
**Purpose**: Individual key-value pairs extracted from YAML headers
**Fields**:
- id (string): Unique property identifier
- document_id (string): Foreign key to Document
- key (string): Property name from YAML header
- value (string): Serialized property value
- value_type (string): Data type (string, number, boolean, date, array, object)
- created_at (integer): Timestamp when property was created
- updated_at (integer): Timestamp of last property update

### Query
**Purpose**: Saved query definitions for reuse
**Fields**:
- id (string): Unique query identifier
- name (string): Human-readable query name
- definition (string): Query syntax definition
- created_at (integer): Query creation timestamp
- last_used (integer): Timestamp of last query execution
- use_count (integer): Number of times query has been executed
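
The usage fields could be maintained with a single update per execution. The sketch below assumes SQLite as the backing store and a `queries` table whose columns mirror the fields above. `record_query_use` is a hypothetical helper, and the `status = open` definition is placeholder query syntax, not the project's grammar.

```python
import sqlite3
import time


def record_query_use(conn, query_id, now=None):
    """Bump last_used and use_count; return False if the query id is unknown."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(
        "UPDATE queries SET last_used = ?, use_count = use_count + 1 WHERE id = ?",
        (now, query_id),
    )
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE queries (
        id TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        definition TEXT NOT NULL,
        created_at INTEGER NOT NULL,
        last_used INTEGER,
        use_count INTEGER NOT NULL DEFAULT 0
    )
    """
)
conn.execute(
    "INSERT INTO queries (id, name, definition, created_at) VALUES (?, ?, ?, ?)",
    ("q1", "Open tasks", "status = open", 1000),
)
record_query_use(conn, "q1", now=2000)
record_query_use(conn, "q1", now=3000)
row = conn.execute(
    "SELECT last_used, use_count FROM queries WHERE id = ?", ("q1",)
).fetchone()
print(row)  # (3000, 2)
```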

### Schema
**Purpose**: Metadata about property types and validation rules
**Fields**:
- property_key (string): Property name across documents
- detected_type (string): Most common data type for this property
- validation_rules (string): JSON-encoded validation rules
- document_count (integer): Number of documents containing this property
- created_at (integer): Timestamp when schema entry was created
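
One way to derive schema entries is to aggregate over indexed properties. In this sketch `build_schema` is a hypothetical helper, the input is assumed to be `(document_id, key, value_type)` tuples, and `validation_rules` is left as an empty JSON object because the spec does not define the rule format.

```python
from collections import Counter, defaultdict


def build_schema(properties, now):
    """Aggregate (document_id, key, value_type) tuples into schema entries."""
    type_counts = defaultdict(Counter)
    documents = defaultdict(set)
    for document_id, key, value_type in properties:
        type_counts[key][value_type] += 1
        documents[key].add(document_id)
    return {
        key: {
            "property_key": key,
            # Most common type observed for this key.
            "detected_type": counts.most_common(1)[0][0],
            # Placeholder: the rule format is not specified.
            "validation_rules": "{}",
            "document_count": len(documents[key]),
            "created_at": now,
        }
        for key, counts in type_counts.items()
    }
```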

## Relationships

### Document ↔ Property
- One-to-many: Each document has multiple properties
- Cascade delete: Properties are removed when document is deleted
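
Assuming SQLite as the storage engine, the cascade can be expressed with a foreign key constraint. The table layout below is a reduced, illustrative version of the entities above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE documents (
        id TEXT PRIMARY KEY,
        file_path TEXT NOT NULL UNIQUE
    );
    CREATE TABLE properties (
        id TEXT PRIMARY KEY,
        document_id TEXT NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
        "key" TEXT NOT NULL,
        value TEXT,
        value_type TEXT NOT NULL
    );
    """
)
conn.execute(
    "INSERT INTO documents (id, file_path) VALUES (?, ?)", ("d1", "/notes/a.md")
)
conn.executemany(
    'INSERT INTO properties (id, document_id, "key", value, value_type) '
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("p1", "d1", "status", "open", "string"),
        ("p2", "d1", "priority", "2", "number"),
    ],
)
# Deleting the document removes its properties as well.
conn.execute("DELETE FROM documents WHERE id = ?", ("d1",))
remaining = conn.execute("SELECT COUNT(*) FROM properties").fetchone()[0]
print(remaining)  # 0
```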

### Document ↔ Query
- Many-to-many: A query can match multiple documents, and a document can be matched by multiple queries
- Junction table: QueryResults stores execution history

### Property ↔ Schema
- Many-to-one: Multiple properties with same key map to one schema entry

## Data Types

### Supported Property Types
- **string**: Text values (default type)
- **number**: Numeric values (integer or float)
- **boolean**: true/false values
- **date**: ISO 8601 date strings
- **array**: JSON-encoded arrays
- **object**: JSON-encoded objects (nested structures)

### Type Detection Logic
1. Parse YAML value using native YAML parser
2. Apply type detection rules:
   - Strings matching ISO 8601 format → date
   - Numeric strings without decimals → number (integer)
   - Numeric strings with decimals → number (float)
   - "true"/"false" (case insensitive) → boolean
   - Arrays/objects → respective types
   - Everything else → string
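
The rules above can be sketched as a single function. This is an illustration only. `detect_type` is a hypothetical name, the native-type branches assume the YAML parser may already return typed values, and the date check accepts only extended-format ISO 8601 strings such as `2025-10-05` or `2025-10-05T20:16:33-04:00`.

```python
import re
from datetime import date

_INT_RE = re.compile(r"[+-]?\d+")
_FLOAT_RE = re.compile(r"[+-]?\d+\.\d+")
_ISO_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2})(T\d{2}:\d{2}(:\d{2})?(Z|[+-]\d{2}:\d{2})?)?"
)


def _looks_like_iso_date(text):
    match = _ISO_RE.fullmatch(text)
    if not match:
        return False
    try:
        # Reject well-shaped but impossible dates such as 2025-13-40.
        date.fromisoformat(match.group(1))
    except ValueError:
        return False
    return True


def detect_type(value):
    """Map a parsed YAML value to one of the supported property types."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, date):  # also covers datetime
        return "date"
    if isinstance(value, (list, tuple)):
        return "array"
    if isinstance(value, dict):
        return "object"
    text = str(value).strip()
    if _looks_like_iso_date(text):
        return "date"
    if _INT_RE.fullmatch(text) or _FLOAT_RE.fullmatch(text):
        return "number"
    if text.lower() in ("true", "false"):
        return "boolean"
    return "string"
```

The date rule is checked before the numeric rule to follow the order in which the rules are listed. Integers and floats both map to the single `number` type.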

## Indexing Strategy

### Primary Indices
- documents.file_path (unique)
- properties.document_id (foreign key)
- properties.key (for property-based queries)
- queries.id (unique)

### Composite Indices
- properties(document_id, key) for fast document property lookup
- properties(key, value_type) for type-constrained queries
- queries(last_used) for recent query tracking (single-column index)
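
If the store is SQLite, the indices listed above could be declared as follows. The `idx_*` names are hypothetical and the table definitions are trimmed to the indexed columns. `queries.id` needs no explicit index because a primary key already provides one.

```python
import sqlite3

SCHEMA = """
CREATE TABLE documents (id TEXT PRIMARY KEY, file_path TEXT NOT NULL);
CREATE TABLE properties (
    id TEXT PRIMARY KEY,
    document_id TEXT NOT NULL,
    "key" TEXT NOT NULL,
    value TEXT,
    value_type TEXT NOT NULL
);
CREATE TABLE queries (id TEXT PRIMARY KEY, name TEXT, last_used INTEGER);

CREATE UNIQUE INDEX idx_documents_file_path ON documents(file_path);
CREATE INDEX idx_properties_document_id ON properties(document_id);
CREATE INDEX idx_properties_key ON properties("key");
CREATE INDEX idx_properties_document_key ON properties(document_id, "key");
CREATE INDEX idx_properties_key_type ON properties("key", value_type);
CREATE INDEX idx_queries_last_used ON queries(last_used);
"""


def index_names(conn):
    """Return the names of the explicitly created indices, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx_%'"
    ).fetchall()
    return sorted(name for (name,) in rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for name in index_names(conn):
    print(name)
```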

## Validation Rules

### Document Validation
- File must exist and be readable
- File must have valid YAML header (--- delimiters)
- YAML must parse without errors
- File must be UTF-8 encoded
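
The encoding and delimiter checks can be sketched without a YAML dependency. `extract_front_matter` is a hypothetical helper. It returns the header text, which would then be handed to a YAML parser for the "parse without errors" check.

```python
def extract_front_matter(raw: bytes) -> str:
    """Return the YAML header text, or raise ValueError if a check fails."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("file is not valid UTF-8") from exc
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening --- delimiter")
    # The header ends at the next line consisting only of ---.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i])
    raise ValueError("missing closing --- delimiter")
```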

### Property Validation
- Property keys must be non-empty strings
- Property values must be serializable
- Array/object values must be valid JSON
- Date values must match ISO 8601 format
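
A minimal sketch of these checks, assuming each value has already been serialized to a string as described for the Property entity. `validate_property` is a hypothetical helper. `datetime.fromisoformat` accepts only a subset of ISO 8601, and the exact subset depends on the Python version.

```python
import json
from datetime import datetime


def validate_property(key, value, value_type):
    """Return a list of validation error messages (empty when valid)."""
    errors = []
    if not isinstance(key, str) or not key.strip():
        errors.append("key must be a non-empty string")
    if not isinstance(value, str):
        errors.append("value must be serialized to a string")
        return errors
    if value_type in ("array", "object"):
        try:
            decoded = json.loads(value)
        except ValueError:
            errors.append("value must be valid JSON")
        else:
            expected = list if value_type == "array" else dict
            if not isinstance(decoded, expected):
                errors.append("value must decode to a JSON " + value_type)
    elif value_type == "date":
        try:
            datetime.fromisoformat(value)
        except ValueError:
            errors.append("value must be an ISO 8601 date")
    return errors
```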

### Query Validation
- Query syntax must be valid according to defined grammar
- Query must reference existing properties
- Query complexity must be within performance limits

Initial vibecoded proof of concept 2025-10-05 20:16:33 -04:00