diff --git a/packages/xforms-engine/ARCHITECTURE.md b/packages/xforms-engine/ARCHITECTURE.md
new file mode 100644
index 00000000..fb943ddd
--- /dev/null
+++ b/packages/xforms-engine/ARCHITECTURE.md
@@ -0,0 +1,111 @@
+# `@getodk/xforms-engine`: architecture, design, and key concepts
+
+## Guiding principles and assumptions
+
+1. `@getodk/xforms-engine` (interchangeably referred to as "the engine") is conceived as a software implementation of the data and computational models defined by the [ODK XForms Specification](https://getodk.github.io/xforms-spec/).
+
+2. The engine is conceptually similar to [JavaRosa](https://github.com/getodk/javarosa), which serves as an engine for [ODK Collect](https://github.com/getodk/collect). Alignment with JavaRosa's behavior and functionality is a primary goal, and that project is a primary point of reference, e.g.:
+
+   - for understanding, interpreting and disambiguating important details specified by ODK XForms
+
+   - for prioritization of feature support
+
+   - for identification of _bug-for-bug_ compatibility goals
+
+   In **most cases**, JavaRosa is assumed to be the "source of truth" when answering questions about spec and requirements. Cases where we deviate tend to involve increased spec support; in all cases, interoperability between the two ODK form implementations is of utmost importance.
+
+3. ODK forms are typically authored in the [XLSForms](https://xlsform.org/en/) format, which simplifies many ODK XForms concepts for form designers. The format is an _abstraction_ over ODK XForms. For the purposes of the engine:
+
+   - The XLSForms format establishes requirements and priority of features. ODK XForms features which _can be expressed_ in an XLSForm are higher priority than those which cannot.
+
+   - The format provides guidance (_but is not a source of truth_) on questions of terminology, feature semantics, user concepts, etc.
As an abstraction, XLSForms ultimately defers to ODK XForms as its underlying specification, and the engine does in kind. Especially in terms of naming and spec references, the engine's internals tend to stick close to ODK XForms; XLSForms concepts are more appropriate at the package boundary (usually called the "client interface").
+
+4. The engine is intentionally designed to be "client agnostic", in terms of:
+
+   - Presentation and interaction: the engine supports integration and presentation as a conventional form UI on the web (hence the name "ODK Web Forms"); programmatically (as is the case in the `@getodk/scenario` integration test client); hypothetically as a graphical interface for non-web (e.g. mobile or desktop native) platforms, or even as a command line or other text-first interface.
+
+   - Rendering technology: the engine is not coupled to any particular UI library or framework (web or otherwise). It is a goal of the engine to support integration with a client's choice of component framework, or even its own bespoke presentation layer.
+
+   - Model of state over time: the engine provides a minimal interface for clients to integrate their own model for observing state changes as they occur. The behavior of this interface is treated as opaque within the engine. It's generally assumed that clients will use and integrate an implementation of reactivity to handle state changes. But the interface is fully optional (for instance it is unused by the vast majority of tests in `@getodk/scenario`).
+
+5. The engine has a _synchronous, consistent computational model_:
+
+   - **Synchronous to read:** Once the engine has initialized a form's state, a client may access any of that state **by reading object properties**. These properties may be implemented internally by `get` accessors and/or `Proxy` traps, but they always produce a _synchronous value_ (i.e. not a `Promise` or other access mechanism which unblocks the main thread's event loop).
+
+   - **Synchronous to write:** APIs provided by the engine for clients to make any state change will perform that state change in the same event loop tick.
+
+   - **Consistent (on read, after every write):** The engine ensures that any computations which must be performed to produce state to a client are performed before the client accesses that state. When a client issues any state change to the engine, any state subsequently read by that client will be a product of the complete and consistent result of any computations dependent (directly or indirectly) on that state change.
+
+     The engine _may_ defer certain computations (e.g. to optimize performance), but any deferred computations **will** be performed before a client accesses the computation's result (or any other computation which depends on it). The engine may perform deferred computations either on client demand (e.g. in a `get` accessor or `Proxy` trap) or in the background (e.g. asynchrony which is not observable by a client).
+
+   - (Likely to change) **Every write method returns the complete form state**: early in the engine's design, we established a _convention_ for client-facing write method signatures where any write method for any aspect of a form's state will return the complete state of the form. This was intended as an _implied contract_ of the engine's guarantees of synchrony and consistency. While this has some philosophical value, we've found it doesn't have much _practical value_. So we will probably eliminate this convention, probably in favor of a more idiomatic convention like write methods returning the directly written state.
+
+## Engine flow (broad strokes)
+
+1. **`initializeForm` (client entrypoint):** accepts a `FormResource` (XForms XML string, or URL reference to XForms XML resource), and a client's options configuring certain form init/runtime behavior.
+
+2. **Retrieve form definition (via client-configured `fetchFormDefinition`):** a `fetch`-like function.
If the client provides the `FormResource` as a URL, the engine will use this option to request the form definition.
+
+3. **Initial/pre-parse (engine-internal `XFormDOM`):**
+
+   - Parses the form definition from its XForms XML string input into a traversable tree.
+   - Performs some minor pre-processing to normalize certain ambiguous structures specified by ODK XForms.
+   - Identifies key aspects of the form definition for reference in later stages of parsing.
+
+4. **Resolve form attachments (via client-configured `fetchFormAttachment`, `FormAttachmentResource`):** a `fetch`-like function. Form attachment references are identified in the `XFormDOM` structure. These references are then retrieved by the client-provided `fetchFormAttachment` function. The retrieved resources are represented internally by a `FormAttachmentResource`, which may be referenced for further parsing or other form functionality.
+
+5. **Parse secondary instances (`FormAttachmentResource`, `SecondaryInstanceDefinition`, etc):** Forms may reference secondary instances in computations, generally to populate `` items (for ``, etc)
+   - Structural elements (``, ``) containing those controls (or other sub-structures)
+
+   Each `BodyElementDefinition` may _also reference_ parsed representations of _supporting elements_:
+
+   - Representing text (`