STXT Schemas (@stxt.schema)
1. Introduction
2. Terminology
3. Relationship between STXT and Schema
4. General Structure of a Schema
5. One schema per namespace
6. Node Definition (`Node:`)
7. Children (`Children:`) and cross-namespace references
8. Cardinalities
9. Types
10. Normative Examples
11. Schema Errors
12. Conformance
13. Schema of the Schema (`@stxt.schema`)
14. End of Document
1. Introduction
This document defines the specification of the STXT Schema language, a mechanism for validating STXT documents through formal semantic rules.
A schema:
- Is an STXT document with namespace `@stxt.schema`.
- Defines the nodes, types, and cardinalities of the target namespace.
- Does not modify the base syntax of STXT; it operates on the already parsed structure.
2. Terminology
The keywords "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" are to be interpreted as described in RFC 2119.
Terms such as node, indentation, namespace, inline and >> block retain their meaning in STXT-SPEC.
3. Relationship between STXT and Schema
Validation through schema occurs after STXT parsing:
- Parsing the document into a hierarchical STXT structure.
- Resolving the effective namespace of each node.
- Applying the corresponding schema.
An implementation MAY apply validation during the parsing process, provided that such validation remains loosely coupled to the base parser. This allows errors to be detected before the full parsing is completed.
4. General Structure of a Schema
A schema document MUST have as its root node:
Schema (@stxt.schema): <target_namespace>
Rules:
- `<target_namespace>` MUST be a valid namespace according to STXT-SPEC.
- The root node `Schema` MUST belong to the `@stxt.schema` namespace.
- The schema document MAY include a `Description` node.
- The schema document MUST include one or more `Node` nodes.
Example:
Schema (@stxt.schema): com.example.docs
    Description: Example schema
    Node: Document
        Type: GROUP
        Children:
            Child: Author
            Child: Date
                Max: 1
            Child: Content
                Min: 1
                Max: 1
            Child: Metadata (org.example.meta)
                Max: 1
    Node: Author
    Node: Date
        Type: DATE
    Node: Content
        Type: TEXT
5. One schema per namespace
For each logical namespace:
- There MUST NOT be more than one effective schema simultaneously.
- If an implementation has several candidate schemas for the same namespace, it SHOULD apply a clear and deterministic selection policy.
- For a specific validation, there MUST be only one effective schema.
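One possible "clear and deterministic selection policy" is sketched below in Python. The `SchemaCandidate` record, its `priority` field, and the tie-break on `source` are assumptions made for illustration only; this specification does not prescribe any particular policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaCandidate:
    """Hypothetical record for one candidate schema (not defined by this spec)."""
    namespace: str
    source: str    # e.g. a file path or registry key (assumption)
    priority: int  # lower number wins (assumption of this sketch)


def select_effective_schemas(candidates):
    """Return exactly one candidate per namespace, chosen deterministically."""
    effective = {}
    # Sorting by (priority, source) makes the result independent of input order.
    for cand in sorted(candidates, key=lambda c: (c.priority, c.source)):
        effective.setdefault(cand.namespace, cand)
    return effective
```

Because the candidates are sorted before selection, the same set of inputs always yields the same effective schema for each namespace, regardless of discovery order.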
6. Node Definition (`Node:`)
6.1 Basic form
Node: Node Name
    Description: Node description
    Type: Type
    Children:
        Child: Child Name. It may include a namespace if it is different from the target namespace
            Min: optional, indicates the minimum number of times this child must appear
            Max: optional, indicates the maximum number of times this child may appear
Rules:
- The inline value of `Node` MUST be a valid node name according to STXT-SPEC.
- Each `Node` MUST be unique within the schema at the canonical name level.
- Each `Node` defines the semantics of the node in the schema's target namespace.
- If `Type` is omitted, the default type is `INLINE`.
- A `Node` MUST NOT contain more than one `Description` node, more than one `Type` node, more than one `Children` node, nor more than one `Values` node.
6.2 Values in ENUM types
Node: Node Name
    Description: Node description
    Type: ENUM
    Children:
        Child: Child Name. It may include a namespace if it is different from the target namespace
            Min: optional, indicates the minimum number of times this child must appear
            Max: optional, indicates the maximum number of times this child may appear
    Values:
        Value: value 1
        Value: value 2
        Value: value 3
The `ENUM` type, and only `ENUM`, MAY specify a `Values` node with the allowed values through `Value` nodes.
If `Values` exists, it MUST contain at least one `Value` node.
If a `Node` declares `Type: ENUM`, it MUST include `Values`.
7. Children (`Children:`) and cross-namespace references
A node MAY have a `Children` entry.
If `Children` exists, it MUST contain one or more `Child` nodes with the information about the allowed children.
A `Child` MAY belong to another namespace, in which case it is indicated in the name of the `Child` itself.
Example:
Node: node name
    Children:
        Child: child name (child.namespace)
            Min: 0
            Max: 1
- If the namespace is omitted, the `Child` belongs to the target namespace of the current schema.
- If an explicit namespace is indicated, the `Child` belongs to that specific namespace.
- Within the same `Children` node, an implementation MUST NOT accept two `Child` nodes that point to the same logical pair `canonical name + effective namespace`.
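The resolution of a `Child` reference into its logical pair, and the duplicate check over one `Children` node, could look roughly like the following Python sketch. The regular expression and the `canonical_name` helper are stand-ins: the real name grammar and canonicalization are defined in STXT-SPEC and may differ from the lowercase-and-collapse-whitespace rule assumed here.

```python
import re

# Matches "name" or "name (namespace)". Assumed shape; the exact grammar is in STXT-SPEC.
_CHILD_REF = re.compile(r"^\s*(?P<name>[^()]+?)\s*(?:\(\s*(?P<ns>[^()\s]+)\s*\))?\s*$")


def canonical_name(name):
    # Placeholder for the STXT-SPEC canonicalization, which may differ.
    return " ".join(name.split()).lower()


def resolve_child(ref, target_namespace):
    """Return the logical pair (canonical name, effective namespace)."""
    match = _CHILD_REF.match(ref)
    if match is None:
        raise ValueError(f"malformed Child reference: {ref!r}")
    # An omitted namespace falls back to the schema's target namespace.
    namespace = match.group("ns") or target_namespace
    return canonical_name(match.group("name")), namespace


def find_duplicate_children(refs, target_namespace):
    """Return the logical pairs that occur more than once in one Children node."""
    seen, duplicates = set(), []
    for ref in refs:
        key = resolve_child(ref, target_namespace)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates
```

Under these assumptions, `Metadata` and `Metadata (org.example.meta)` inside a schema for `com.example.docs` resolve to different pairs and are therefore not duplicates.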
7.1 Explicitly defined nodes
Every node that appears in `Children` MUST have its own definition as `Node:` in its corresponding schema.
This avoids “ghost” children and guarantees that all nodes have defined semantics.
This implies:
- If `Child: Metadata (org.example.meta)` appears, then there MUST be a schema for `org.example.meta` and within it there MUST be `Node: Metadata`.
- An implementation MAY defer this check until the moment of document validation, but the schema remains semantically incomplete until that definition exists.
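A check for "ghost" children across a set of loaded schemas might be sketched as follows. The nested-dictionary layout is a hypothetical in-memory representation chosen for this example, with names assumed to be already canonicalized; it is not a structure defined by this specification.

```python
def find_undefined_children(schemas):
    """Report Child references whose Node is not defined in its schema.

    schemas: {namespace: {node_name: [(child_name, child_namespace), ...]}}
    (hypothetical layout; names are assumed to be canonical already)
    Returns a list of (namespace, node_name, child_name, child_namespace).
    """
    missing = []
    for namespace, nodes in schemas.items():
        for node_name, children in nodes.items():
            for child_name, child_ns in children:
                # A missing schema and a missing Node are both reported.
                if child_name not in schemas.get(child_ns, {}):
                    missing.append((namespace, node_name, child_name, child_ns))
    return missing
```

An implementation that defers the check, as permitted above, could run this function only once all schemas needed for a given validation have been loaded.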
8. Cardinalities
Cardinalities are expressed through the Min and Max nodes within each Child.
They are optional non-negative integers that indicate the minimum or maximum number of allowed occurrences of that child.
Rules:
- If `Min` is omitted, the effective minimum is `0`.
- If `Max` is omitted, the effective maximum is unlimited.
- `Min` and `Max` MUST NOT appear more than once within the same `Child`.
- If both exist, `Min` MUST NOT be greater than `Max`.
- Cardinality applies per instance of the parent node.
- Cardinality counts only direct children with the same canonical name and the same effective namespace.
- A conforming implementation MUST check cardinalities.
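The counting rules above can be illustrated with a short Python sketch. The function name and the data layout (dictionaries keyed by the pair of canonical name and effective namespace) are assumptions of this example, not part of the specification.

```python
from collections import Counter


def check_cardinalities(rules, children):
    """Check one parent instance's direct children against its Child rules.

    rules:    {(canonical_name, namespace): (min_or_None, max_or_None)}
    children: list of (canonical_name, namespace), one entry per direct child
    Returns a list of error strings (empty when all cardinalities hold).
    """
    counts = Counter(children)
    errors = []
    for key, (min_count, max_count) in rules.items():
        minimum = 0 if min_count is None else min_count  # omitted Min means 0
        seen = counts[key]
        if seen < minimum:
            errors.append(f"{key}: expected at least {minimum}, found {seen}")
        if max_count is not None and seen > max_count:  # omitted Max means unlimited
            errors.append(f"{key}: expected at most {max_count}, found {seen}")
    return errors
```

The function is meant to be called once per parent node instance, since cardinality applies per instance. It only counts declared children; how undeclared children are handled is outside the scope of this sketch.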
9. Types
Types define:
- The form of the node value (inline, `>>` block, or none).
- Whether the node is compatible with children.
- Content validation.
They are defined within Node, through a Type element. Example:
Node: node name
    Type: NODE_TYPE
    Children:
        Child: Child Name
Other considerations:
- The type does NOT control requiredness; only the form and validity of the value. Requiredness of appearance is controlled through cardinality.
- The value of `Type` MUST match exactly one of the types defined in this section.
- Types that allow only `BLOCK` or `INLINE/BLOCK` form and are defined as not compatible with children MUST NOT coexist with `Children`.
- An incompatible definition between `Type` and `Children` MUST cause a schema error.
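A minimal Python sketch of the `Type` and `Children` consistency check follows. The two sets are transcribed from the "Compatible children" column of the tables in sections 9.1 to 9.4 and from the `Type` values listed in the meta-schema; the function name and the error strings are illustrative assumptions.

```python
# Types whose "Compatible children" column is NO in sections 9.1 to 9.4.
CHILDLESS_TYPES = {"BLOCK", "TEXT", "HEXADECIMAL", "BINARY", "BASE64"}
KNOWN_TYPES = CHILDLESS_TYPES | {
    "INLINE", "GROUP", "BOOLEAN", "NUMBER", "ENUM", "INTEGER", "NATURAL",
    "DATE", "TIME", "TIMESTAMP", "UUID", "URL", "EMAIL",
}


def check_type_and_children(type_name, has_children):
    """Return a schema error message, or None when the definition is consistent."""
    effective = "INLINE" if type_name is None else type_name  # default type
    if effective not in KNOWN_TYPES:  # exact, case-sensitive match
        return f"unknown Type: {effective}"
    if has_children and effective in CHILDLESS_TYPES:
        return f"Type {effective} does not allow Children"
    return None
```

For example, a `Node` with `Type: BLOCK` and a `Children` entry would be reported, while the same `Node` without `Children` would pass.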
9.1. Basic structural types
A conforming implementation MUST support these types and MUST validate their structure.
| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| INLINE | INLINE | YES | Inline text :. Default type. It may have children. |
| BLOCK | BLOCK | NO | Only >> text block. It cannot have children. |
| TEXT | INLINE/BLOCK | NO | Generic text. It may be inline : or block >>. |
| GROUP | NONE | YES | Does not allow textual value. Only structured children. |
9.2. Basic INLINE content types
A conforming implementation MUST support these types and MUST validate their structure.
| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| BOOLEAN | INLINE | YES | true or false. |
| NUMBER | INLINE | YES | Number in JSON format. |
| ENUM | INLINE | YES | Only specified values (see 9.5) |
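As an illustration, `BOOLEAN` and `NUMBER` values could be checked as in the Python sketch below. The number pattern follows the JSON number grammar (RFC 8259). Treating only lowercase `true` and `false` as valid is an assumption of this sketch, and the input is assumed to be the already trimmed inline value.

```python
import re

# JSON number grammar: optional minus, no leading zeros,
# optional fraction, optional exponent.
_JSON_NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")


def is_boolean(value):
    # Assumption: the comparison is case-sensitive.
    return value in ("true", "false")


def is_number(value):
    return _JSON_NUMBER.fullmatch(value) is not None
```

With this pattern, `1.5e10` and `-0` are accepted, while `01`, `1.` and `.5` are rejected, as in JSON.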
9.3. Extended INLINE content types
A conforming implementation MUST support these types and SHOULD validate their structure.
| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| INTEGER | INLINE | YES | Number without decimals (positive and negative). |
| NATURAL | INLINE | YES | Numbers greater than or equal to 0 without decimals. |
| DATE | INLINE | YES | Date YYYY-MM-DD. |
| TIME | INLINE | YES | ISO 8601 time, hh:mm:ss. |
| TIMESTAMP | INLINE | YES | Full ISO 8601 timestamp. |
| UUID | INLINE | YES | Canonical UUID. |
| URL | INLINE | YES | Valid URL or URI. |
| EMAIL | INLINE | YES | Valid email address. |
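Some of these types can be validated with the Python standard library alone, as the sketch below suggests for `INTEGER`, `NATURAL`, `DATE` and `UUID`. Several details are assumptions rather than rules from this document: a leading `+` is rejected for `INTEGER`, and a UUID is treated as canonical when it is in the hyphenated 8-4-4-4-12 form, in either letter case.

```python
import re
import uuid
from datetime import date

_INTEGER = re.compile(r"-?[0-9]+")   # assumption: no leading "+"
_NATURAL = re.compile(r"[0-9]+")
_DATE_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_integer(value):
    return _INTEGER.fullmatch(value) is not None


def is_natural(value):
    return _NATURAL.fullmatch(value) is not None


def is_date(value):
    # Check the YYYY-MM-DD shape first, then let the calendar reject
    # impossible dates such as 2023-02-29.
    if _DATE_SHAPE.fullmatch(value) is None:
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def is_uuid(value):
    # uuid.UUID accepts several spellings; comparing with its string form
    # keeps only the hyphenated one.
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False
```

The remaining types (`TIME`, `TIMESTAMP`, `URL`, `EMAIL`) are left out because their exact accepted forms depend on choices this section does not spell out.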
9.4. Extended INLINE/BLOCK binary content types
A conforming implementation MUST support these types and MAY validate their structure.
| Type | Text forms | Compatible children | Description / Validation |
|---|---|---|---|
| HEXADECIMAL | INLINE / BLOCK | NO | [0-9A-Fa-f]+. Hexadecimal string. |
| BINARY | INLINE / BLOCK | NO | [01]+. Binary string. |
| BASE64 | INLINE / BLOCK | NO | Valid Base64 content. |
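The three binary content types could be validated as in the following Python sketch. The patterns for `HEXADECIMAL` and `BINARY` are taken from the table above. Ignoring whitespace and line breaks before matching is an assumption made so that `>>` block values spanning several lines can be checked; STXT-SPEC may define this differently.

```python
import base64
import binascii
import re

_HEX = re.compile(r"[0-9A-Fa-f]+")
_BIN = re.compile(r"[01]+")


def _compact(value):
    # Assumption: whitespace and line breaks inside the value are not significant.
    return "".join(value.split())


def is_hexadecimal(value):
    return _HEX.fullmatch(_compact(value)) is not None


def is_binary(value):
    return _BIN.fullmatch(_compact(value)) is not None


def is_base64(value):
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        base64.b64decode(_compact(value), validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

Since validation of these types is optional (MAY), an implementation could also skip these checks entirely and still be conforming.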
9.5. ENUM type
The ENUM type allows explicitly enumerating the allowed values for a node.
Rules:
- Comparison MUST be made on the inline value already normalized by left and right trim, as defined by STXT-SPEC.
- Comparison MUST be exact and CASE-SENSITIVE.
- Comparison MUST NOT apply additional canonicalization, removal of diacritics, or normalizations equivalent to the canonical name of nodes.
- The node MUST define `Values` with `Value` nodes, which represent the allowed values.
- Each `Value` MUST be unique after inline normalization by trim.
Example:
Node: Node Name
    Type: ENUM
    Values:
        Value: value 1
        Value: value 2
        Value: value 3
A conforming implementation MUST check ENUM types against their allowed values and MUST reject any value that does not match exactly one of them.
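The two sides of the `ENUM` rules, schema-time uniqueness and document-time matching, can be sketched in Python as follows. The function names are illustrative, and `str.strip()` is used as an approximation of the trim defined by STXT-SPEC, which may treat some whitespace characters differently.

```python
def check_enum_values(values):
    """Schema side: there must be at least one Value, unique after trim."""
    trimmed = [v.strip() for v in values]
    if not trimmed:
        raise ValueError("Values must contain at least one Value")
    if len(set(trimmed)) != len(trimmed):
        raise ValueError("duplicate Value after trim")
    return frozenset(trimmed)


def enum_accepts(allowed, inline_value):
    """Document side: exact, case-sensitive match on the trimmed inline value."""
    # No lowercasing, diacritic removal or other canonicalization is applied.
    return inline_value.strip() in allowed
```

With the example above, `value 2` surrounded by spaces is accepted after trimming, whereas `Value 1` is rejected because the comparison is case-sensitive.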
10. Normative Examples
10.1. Schema with cross-namespace references
Schema (@stxt.schema): com.example.docs
    Node: Document
        Type: GROUP
        Children:
            Child: Metadata (org.example.meta)
                Max: 1
            Child: Content
                Min: 1
                Max: 1
    Node: Content
        Type: BLOCK
And in `org.example.meta`:
Schema (@stxt.schema): org.example.meta
    Node: Metadata
        Type: INLINE
10.2. Valid document
Document (com.example.docs):
    Metadata (org.example.meta): info
    Content>>
        Line 1
        Line 2
11. Schema Errors
A schema is invalid if:
- It defines two `Node` entries with the same canonical name.
- It uses an unknown `Type`.
- It defines `Children` in a `Node` whose type does not allow children.
- A `Child` has an invalid cardinality (a repeated `Min` or `Max`, or `Min` greater than `Max`).
- A `Node` of type `ENUM` does not define `Values`, or `Values` does not contain any `Value`.
- A duplicated `Value` appears after inline normalization by trim.
- Two equivalent `Child` nodes appear within the same `Children`.
- A child appears in `Children` whose `Node` is not defined in its corresponding schema.
12. Conformance
An implementation is conforming if:
- It fully implements this document.
- It validates types, value forms, cardinalities, and allowed values (ENUM).
- It applies the strict rule of mandatory definition of all nodes referenced in `Children`.
- It selects, for each validation, a single effective schema per namespace.
- It rejects invalid documents and schemas.
13. Schema of the Schema (`@stxt.schema`)
This section defines the official schema of the schema system itself: the meta-schema that validates all documents in the `@stxt.schema` namespace.
13.1. Considerations
- Every schema document is: `Schema (@stxt.schema): <target-namespace>`
- A schema contains:
  - Optionally a `Description`.
  - One or more `Node` nodes.
- Each `Node`:
  - Has an inline value (the name of the node in the target namespace).
  - May optionally have:
    - `Description`
    - `Type`
    - `Children`
    - `Values`
- Each `Child` (element of `Children`) defines the name (and optionally a different namespace) and may have:
  - `Min`: Minimum number of nodes that must appear. If the node does not exist, there is no established minimum.
  - `Max`: Maximum number of nodes that may appear. If the node does not exist, there is no established maximum.
- Each `Values`:
  - May only appear in `Node` nodes of type `ENUM`.
  - Contains one or more `Value` nodes.
- The names (`Schema`, `Node`, `Type`, `Children`, `Child`, `Description`, `Min`, `Max`, `Values`, `Value`) belong to the `@stxt.schema` namespace.
13.2. Complete Meta-Schema
Schema (@stxt.schema): @stxt.schema
    Node: Schema
        Children:
            Child: Description
                Max: 1
            Child: Node
                Min: 1
    Node: Node
        Children:
            Child: Type
                Max: 1
            Child: Children
                Max: 1
            Child: Description
                Max: 1
            Child: Values
                Max: 1
    Node: Children
        Type: GROUP
        Children:
            Child: Child
                Min: 1
    Node: Description
        Type: TEXT
    Node: Child
        Children:
            Child: Min
                Max: 1
            Child: Max
                Max: 1
    Node: Min
        Type: NATURAL
    Node: Max
        Type: NATURAL
    Node: Type
        Type: ENUM
        Values:
            Value: INLINE
            Value: BLOCK
            Value: TEXT
            Value: GROUP
            Value: BOOLEAN
            Value: NUMBER
            Value: ENUM
            Value: INTEGER
            Value: NATURAL
            Value: DATE
            Value: TIME
            Value: TIMESTAMP
            Value: UUID
            Value: URL
            Value: EMAIL
            Value: HEXADECIMAL
            Value: BINARY
            Value: BASE64
    Node: Values
        Type: GROUP
        Children:
            Child: Value
                Min: 1
    Node: Value
13.3. Quick reading
- `Schema`: Inline value = target namespace (e.g. `com.example.docs`). Children: `Description` (?), `Node` (+).
- `Node`: Inline value = name of the target node (e.g. `Document`, `Author`). Optional children:
  - `Type`: specific type (if missing ⇒ `INLINE`).
  - `Children`: node with the list of allowed `Child` entries.
  - `Description`: explanatory text.
  - `Values`: allowed values (`ENUM` type only).
- `Type`: Inline (`INLINE`), with the name of the type (`GROUP`, `INLINE`, `NUMBER`, etc.).
- `Children`: `GROUP`; contains one or more `Child` nodes.
- `Description`: `TEXT`; may be inline or multiline.
- `Values`: `GROUP`; contains one or more `Value` nodes.
- `Value`: Inline value with one of the allowed values for the `ENUM`.
13.4. Minimal valid example
Schema (@stxt.schema): com.example.docs
    Node: Document
13.5. Complete example
Schema (@stxt.schema): com.example.docs
    Description: Example schema
    Node: Document
        Type: GROUP
        Children:
            Child: Title
                Min: 1
                Max: 1
            Child: Author
            Child: Metadata (org.example.meta)
                Max: 1
    Node: Title
        Type: INLINE
    Node: Author
        Type: INLINE