STXT - Semantic Text
Built for humans. Reliable for machines.

STXT Schemas (@stxt.schema)

1. Introduction
2. Terminology
3. Relationship between STXT and Schema
4. General Structure of a Schema
5. One schema per namespace
6. Node Definition (`Node:`)
7. Children (`Children:`) and cross-namespace references
8. Cardinalities
9. Types
10. Normative Examples
11. Schema Errors
12. Conformance
13. Schema of the Schema (`@stxt.schema`)
14. End of Document

1. Introduction

This document specifies the STXT Schema language, a mechanism for validating STXT documents against formal semantic rules.

A schema:

2. Terminology

The keywords "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" are to be interpreted as described in RFC 2119.

Terms such as node, indentation, namespace, inline and >> block retain their meaning in STXT-SPEC.

3. Relationship between STXT and Schema

Schema validation takes place after STXT parsing. The overall process is:

  1. Parsing the document into a hierarchical STXT structure.
  2. Resolving the effective namespace of each node.
  3. Applying the corresponding schema.

An implementation MAY apply validation during the parsing process, provided that such validation remains loosely coupled to the base parser. This allows errors to be detected before the full parsing is completed.
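
The following non-normative sketch illustrates steps 2 and 3 of the process above. It assumes that step 1 has already produced a tree, that a node without an explicit namespace inherits the namespace of its parent, and that names are compared exactly. The `Node` class, the `validate` function and the shape of `schemas` are hypothetical and are not defined by this specification.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical parsed node; step 1 (parsing) is assumed to have produced it."""
    name: str
    namespace: Optional[str] = None  # None: inherit the parent's namespace (assumption)
    children: list = field(default_factory=list)


def validate(root, schemas):
    """Resolve each node's effective namespace, then consult that schema.

    `schemas` maps a namespace to the set of node names it defines.
    """
    errors = []

    def visit(node, inherited):
        namespace = node.namespace or inherited             # step 2
        if node.name not in schemas.get(namespace, set()):  # step 3
            errors.append(f"{node.name} is not defined in {namespace}")
        for child in node.children:
            visit(child, namespace)

    visit(root, "")
    return errors


document = Node("Document", "com.example.docs", [
    Node("Metadata", "org.example.meta"),
    Node("Content"),
])
schemas = {
    "com.example.docs": {"Document", "Content"},
    "org.example.meta": {"Metadata"},
}
print(validate(document, schemas))  # prints []
```

Keeping the schema lookup in a separate function from the parser is one way to keep validation loosely coupled to parsing.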

4. General Structure of a Schema

A schema document MUST have as its root node:

Schema (@stxt.schema): <target_namespace>

Rules:

Example:

Schema (@stxt.schema): com.example.docs
    Description: Example schema
    Node: Document
        Type: GROUP
        Children:
        	Child: Author
        	Child: Date
        		Max: 1
        	Child: Content
        		Min: 1
        		Max: 1
        	Child: Metadata (org.example.meta)
        		Max: 1
    Node: Author
    Node: Date
        Type: DATE
    Node: Content
        Type: TEXT
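
As a non-normative illustration, the root line of a schema document can be checked and its target namespace extracted with a regular expression. The pattern below is an assumption made for this sketch: it expects the literal node name `Schema`, tolerates optional spaces around the parentheses and colon, and treats the target namespace as a run of non-whitespace characters. The authoritative line grammar is the one in STXT-SPEC.

```python
import re

# Assumed shape of the root line: Schema (@stxt.schema): <target_namespace>
ROOT_LINE = re.compile(r"^Schema\s*\(\s*@stxt\.schema\s*\)\s*:\s*(\S+)\s*$")


def target_namespace(first_line):
    """Return the target namespace declared by a schema root line."""
    match = ROOT_LINE.match(first_line)
    if match is None:
        raise ValueError("root node must be 'Schema (@stxt.schema): <target_namespace>'")
    return match.group(1)


print(target_namespace("Schema (@stxt.schema): com.example.docs"))  # prints com.example.docs
```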

5. One schema per namespace

For each logical namespace:

6. Node Definition (`Node:`)

6.1 Basic form

Node: Node Name
    Description: Node description
    Type: Type
    Children:
    	Child: Child Name. It may include a namespace if it is different from the target namespace
    		Min: optional, the minimum number of times this child must appear
    		Max: optional, the maximum number of times this child may appear

Rules:

6.2 Values in ENUM types

Node: Node Name
    Description: Node description
    Type: ENUM
    Children:
    	Child: Child Name. It may include a namespace if it is different from the target namespace
    		Min: optional, the minimum number of times this child must appear
    		Max: optional, the maximum number of times this child may appear
    Values:
    	Value: value 1
    	Value: value 2
    	Value: value 3

A Values node lists the allowed values through Value nodes and is permitted only on nodes of type ENUM. A Node that declares Type: ENUM MUST include Values, and Values MUST contain at least one Value node.

7. Children (`Children:`) and cross-namespace references

A node MAY have a Children entry. If Children exists, it MUST contain one or more Child nodes with the information about the allowed children.

A Child MAY belong to another namespace, in which case the namespace is given in parentheses after the name of the Child. Example:

Node: node name
	Children:
		Child: child name (child.namespace)
			Min: 0
			Max: 1
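
A non-normative sketch of how a Child value such as the one above might be split into a name and a namespace. It assumes the namespace, when present, is the final parenthesised token, and that a Child without one belongs to the target namespace of the schema. `resolve_child` is a hypothetical helper name.

```python
import re

# Assumed form of a Child value: "<name>" or "<name> (<namespace>)"
CHILD_REF = re.compile(r"^(?P<name>[^()]+?)\s*(?:\(\s*(?P<ns>[^()\s]+)\s*\))?$")


def resolve_child(value, target_namespace):
    """Return (child name, effective namespace) for a Child value."""
    match = CHILD_REF.match(value.strip())
    if match is None:
        raise ValueError(f"malformed Child reference: {value!r}")
    return match.group("name"), match.group("ns") or target_namespace


print(resolve_child("child name (child.namespace)", "com.example.docs"))
# prints ('child name', 'child.namespace')
print(resolve_child("Content", "com.example.docs"))
# prints ('Content', 'com.example.docs')
```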

7.1 Explicitly defined nodes

Every node that appears in Children MUST have its own Node: definition in its corresponding schema.

This avoids “ghost” children and guarantees that all nodes have defined semantics.
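
The check for "ghost" children can be sketched as follows (non-normative). The data layout is an assumption made for this example: each schema is a dictionary from node name to a list of `(child name, child namespace)` pairs, and names are compared exactly.

```python
def undefined_children(schemas):
    """List every Child that has no Node definition in its own schema.

    `schemas` maps namespace -> {node name: [(child name, child namespace), ...]}.
    """
    missing = []
    for namespace, nodes in schemas.items():
        for node_name, children in nodes.items():
            for child_name, child_namespace in children:
                if child_name not in schemas.get(child_namespace, {}):
                    missing.append((node_name, child_name, child_namespace))
    return missing


schemas = {
    "com.example.docs": {
        "Document": [("Metadata", "org.example.meta"), ("Content", "com.example.docs")],
        "Content": [],
    },
    "org.example.meta": {"Metadata": []},
}
print(undefined_children(schemas))  # prints []
```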

This implies:

8. Cardinalities

Cardinalities are expressed through the Min and Max nodes within each Child. They are optional non-negative integers that indicate the minimum or maximum number of allowed occurrences of that child.
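
A non-normative sketch of a cardinality check. It assumes that an absent Min imposes no lower bound and an absent Max imposes no upper bound; `check_children` and the layout of `rules` are hypothetical.

```python
from collections import Counter


def check_children(child_names, rules):
    """Compare the children actually present with the Min/Max of each Child.

    `rules` maps child name -> (min or None, max or None).
    """
    counts = Counter(child_names)
    errors = []
    for name, (minimum, maximum) in rules.items():
        found = counts[name]
        if minimum is not None and found < minimum:
            errors.append(f"{name}: expected at least {minimum}, found {found}")
        if maximum is not None and found > maximum:
            errors.append(f"{name}: expected at most {maximum}, found {found}")
    return errors


rules = {"Author": (None, None), "Date": (None, 1), "Content": (1, 1)}
print(check_children(["Author", "Author", "Content"], rules))  # prints []
```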

Rules:

9. Types

Types define:

  1. The form of the node value (inline, >> block, or none).
  2. Whether the node is compatible with children.
  3. Content validation.

They are defined within Node, through a Type element. Example:

Node: node name
	Type: NODE_TYPE
	Children:
		Child: Child Name

Other considerations:

9.1. Basic structural types

A conforming implementation MUST support these types and MUST validate their structure.

| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| INLINE | INLINE | YES | Inline text (:). Default type. It may have children. |
| BLOCK | BLOCK | NO | Only >> text block. It cannot have children. |
| TEXT | INLINE/BLOCK | NO | Generic text. It may be inline (:) or block (>>). |
| GROUP | NONE | YES | Does not allow textual value. Only structured children. |
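
The table above can be transcribed into a lookup structure, as in this non-normative sketch. The caller is assumed to have already classified the node's value form as "INLINE", "BLOCK" or "NONE"; how an empty inline value is classified is left open here. `check_structure` is a hypothetical name.

```python
# (allowed value forms, children allowed) per type, transcribed from the table above.
STRUCTURAL_TYPES = {
    "INLINE": ({"INLINE"}, True),
    "BLOCK": ({"BLOCK"}, False),
    "TEXT": ({"INLINE", "BLOCK"}, False),
    "GROUP": ({"NONE"}, True),
}


def check_structure(type_name, value_form, has_children):
    """Return the structural violations of one node against its declared type."""
    forms, children_allowed = STRUCTURAL_TYPES[type_name]
    errors = []
    if value_form not in forms:
        errors.append(f"{type_name} does not allow value form {value_form}")
    if has_children and not children_allowed:
        errors.append(f"{type_name} does not allow children")
    return errors


print(check_structure("GROUP", "NONE", True))   # prints []
print(check_structure("BLOCK", "BLOCK", True))  # prints ['BLOCK does not allow children']
```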

9.2. Basic INLINE content types

A conforming implementation MUST support these types and MUST validate their structure.

| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| BOOLEAN | INLINE | YES | true or false. |
| NUMBER | INLINE | YES | Number in JSON format. |
| ENUM | INLINE | YES | Only specified values (see 9.5). |
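
A non-normative sketch of BOOLEAN and NUMBER validation. The NUMBER pattern follows the JSON number grammar (optional minus sign, no leading zeros, optional fraction and exponent). Treating BOOLEAN as case-sensitive and assuming the value has already been trimmed are assumptions of this sketch.

```python
import re

# JSON number grammar: -? int [frac] [exp], with no leading zeros in int.
JSON_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")


def is_boolean(text):
    """Accept exactly the lowercase literals (case sensitivity is an assumption)."""
    return text in ("true", "false")


def is_number(text):
    """Accept text that matches the JSON number grammar in full."""
    return JSON_NUMBER.fullmatch(text) is not None


print(is_number("-12.5e3"), is_number("01"), is_boolean("true"))  # prints True False True
```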

9.3. Extended INLINE content types

A conforming implementation MUST support these types and SHOULD validate their structure.

| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| INTEGER | INLINE | YES | Number without decimals (positive and negative). |
| NATURAL | INLINE | YES | Numbers greater than or equal to 0 without decimals. |
| DATE | INLINE | YES | Date YYYY-MM-DD. |
| TIME | INLINE | YES | ISO 8601 time, hh:mm:ss. |
| TIMESTAMP | INLINE | YES | Full ISO 8601 timestamp. |
| UUID | INLINE | YES | Canonical UUID. |
| URL | INLINE | YES | Valid URL or URI. |
| EMAIL | INLINE | YES | Valid email address. |
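
A non-normative sketch of validators for some of the types above, using only the Python standard library. Several details are assumptions: INTEGER accepts an optional leading minus sign but no plus sign, UUID accepts both lowercase and uppercase hexadecimal digits in the 8-4-4-4-12 layout, and values are assumed to be trimmed already. TIMESTAMP, URL and EMAIL are left out because their exact accepted forms are not pinned down here.

```python
import re
from datetime import date, time

UUID_FORM = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_integer(text):
    return re.fullmatch(r"-?[0-9]+", text) is not None


def is_natural(text):
    return re.fullmatch(r"[0-9]+", text) is not None


def is_date(text):
    """YYYY-MM-DD that also names a real calendar day."""
    if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text) is None:
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def is_time(text):
    """hh:mm:ss with each field in range."""
    if re.fullmatch(r"[0-9]{2}:[0-9]{2}:[0-9]{2}", text) is None:
        return False
    try:
        time.fromisoformat(text)
    except ValueError:
        return False
    return True


def is_uuid(text):
    return UUID_FORM.fullmatch(text) is not None


print(is_date("2024-02-29"), is_date("2023-02-29"))  # prints True False
```

The shape check runs before `fromisoformat` because newer Python versions accept additional ISO 8601 spellings that the table does not list.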

9.4. Extended INLINE/BLOCK binary content types

A conforming implementation MUST support these types and MAY validate their structure.

| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| HEXADECIMAL | INLINE / BLOCK | NO | [0-9A-Fa-f]+. Hexadecimal string. |
| BINARY | INLINE / BLOCK | NO | [01]+. Binary string. |
| BASE64 | INLINE / BLOCK | NO | Valid Base64 content. |
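
A non-normative sketch of the three binary content checks. Because these types may appear in >> blocks, the sketch removes all whitespace (including line breaks) before checking. That normalization is an assumption of this example, not a rule stated in this section.

```python
import base64
import binascii
import re


def _compact(text):
    """Drop all whitespace so multi-line block content can be checked as one string."""
    return "".join(text.split())


def is_hexadecimal(text):
    return re.fullmatch(r"[0-9A-Fa-f]+", _compact(text)) is not None


def is_binary(text):
    return re.fullmatch(r"[01]+", _compact(text)) is not None


def is_base64(text):
    """Strict decode: reject characters outside the alphabet and bad padding."""
    try:
        base64.b64decode(_compact(text), validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


print(is_hexadecimal("DEAD beef"), is_binary("0121"), is_base64("SGVsbG8="))
# prints True False True
```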

9.5. ENUM type

The ENUM type allows explicitly enumerating the allowed values for a node. Rules:

Example:

Node: Node Name
    Type: ENUM
    Values:
    	Value: value 1
    	Value: value 2
    	Value: value 3

A conforming implementation MUST check ENUM types against their allowed values and MUST reject any value that does not match exactly one of them.
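
A minimal, non-normative sketch of that check. It assumes the comparison is case-sensitive and that both the document value and the allowed values are trimmed before comparison; `enum_accepts` is a hypothetical name.

```python
def enum_accepts(value, allowed):
    """True when the trimmed value equals one of the trimmed allowed values."""
    return value.strip() in {item.strip() for item in allowed}


allowed = ["value 1", "value 2", "value 3"]
print(enum_accepts(" value 2 ", allowed))  # prints True
print(enum_accepts("Value 2", allowed))    # prints False
```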

10. Normative Examples

10.1. Schema with cross-namespace references

Schema (@stxt.schema): com.example.docs
    Node: Document
        Type: GROUP
        Children:
            Child: Metadata (org.example.meta)
            	Max: 1
            Child: Content
            	Min: 1
            	Max: 1
    Node: Content
        Type: BLOCK

And in org.example.meta:

Schema (@stxt.schema): org.example.meta
    Node: Metadata
    	Type: INLINE

10.2. Valid document

Document (com.example.docs):
    Metadata (org.example.meta): info
    Content>>
        Line 1
        Line 2

11. Schema Errors

A schema is invalid if:

  1. It defines two Node entries with the same canonical name.
  2. It uses an unknown Type.
  3. It defines Children in a Node whose type does not allow children.
  4. The cardinality is invalid.
  5. A Node with Type: ENUM does not define Values, or Values does not contain any Value.
  6. Two Value entries are identical after their inline text is trimmed.
  7. Two equivalent Child nodes appear within the same Children.
  8. A Child in Children refers to a Node that is not defined in its corresponding schema.
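
Several of the conditions above (items 1, 2, 3, 5, 6 and 7) can be checked on a single schema in one pass, as in this non-normative sketch. The dictionary layout of a node definition is an assumption of the example, names are compared exactly rather than by canonical form, and two Child entries are treated as equivalent when their name and namespace are equal.

```python
KNOWN_TYPES = {
    "INLINE", "BLOCK", "TEXT", "GROUP", "BOOLEAN", "NUMBER", "ENUM",
    "INTEGER", "NATURAL", "DATE", "TIME", "TIMESTAMP", "UUID", "URL",
    "EMAIL", "HEXADECIMAL", "BINARY", "BASE64",
}
# Types marked "Compatible children: NO" in section 9.
NO_CHILDREN = {"BLOCK", "TEXT", "HEXADECIMAL", "BINARY", "BASE64"}


def schema_errors(nodes):
    """`nodes` is a list of dicts: name, optional type, children, values."""
    errors = []
    seen = set()
    for node in nodes:
        name = node["name"]
        if name in seen:
            errors.append(f"{name}: duplicate Node")
        seen.add(name)
        node_type = node.get("type", "INLINE")  # INLINE is the default type
        if node_type not in KNOWN_TYPES:
            errors.append(f"{name}: unknown Type {node_type}")
        children = node.get("children", [])  # list of (child name, namespace)
        if children and node_type in NO_CHILDREN:
            errors.append(f"{name}: {node_type} does not allow Children")
        if len(set(children)) != len(children):
            errors.append(f"{name}: duplicate Child")
        if node_type == "ENUM":
            values = [value.strip() for value in node.get("values") or []]
            if not values:
                errors.append(f"{name}: ENUM requires at least one Value")
            elif len(set(values)) != len(values):
                errors.append(f"{name}: duplicate Value")
    return errors


good = [
    {"name": "Document", "type": "GROUP", "children": [("Content", "com.example.docs")]},
    {"name": "Content", "type": "BLOCK"},
]
print(schema_errors(good))  # prints []
```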

12. Conformance

An implementation is conforming if:

13. Schema of the Schema (`@stxt.schema`)

This section defines the official schema of the schema system itself: the meta-schema that validates all documents in the @stxt.schema namespace.

13.1. Considerations

13.2. Complete Meta-Schema

Schema (@stxt.schema): @stxt.schema
    Node: Schema
        Children:
            Child: Description
                Max: 1
            Child: Node
                Min: 1
    Node: Node
        Children:
            Child: Type
                Max: 1
            Child: Children
                Max: 1
            Child: Description
                Max: 1
            Child: Values
                Max: 1
    Node: Children
        Type: GROUP
        Children:
            Child: Child
                Min: 1
    Node: Description
        Type: TEXT
    Node: Child
        Children:
            Child: Min
                Max: 1
            Child: Max
                Max: 1
    Node: Min
        Type: NATURAL
    Node: Max
        Type: NATURAL
    Node: Type
        Type: ENUM
        Values:
            Value: INLINE
            Value: BLOCK
            Value: TEXT
            Value: GROUP
            Value: BOOLEAN
            Value: NUMBER
            Value: ENUM
            Value: INTEGER
            Value: NATURAL
            Value: DATE
            Value: TIME
            Value: TIMESTAMP
            Value: UUID
            Value: URL
            Value: EMAIL
            Value: HEXADECIMAL
            Value: BINARY
            Value: BASE64
    Node: Values
        Type: GROUP
        Children:
            Child: Value
                Min: 1
    Node: Value

13.3. Quick reading

13.4. Minimal valid example

Schema (@stxt.schema): com.example.docs
    Node: Document

13.5. Complete example

Schema (@stxt.schema): com.example.docs
    Description: Example schema
    Node: Document
        Type: GROUP
        Children:
        	Child: Title
        		Min: 1
        		Max: 1
        	Child: Author
        	Child: Metadata (org.example.meta)
        		Max: 1
    Node: Title
        Type: INLINE
    Node: Author
        Type: INLINE

14. End of Document