The Empirical Linguistic Engine (ELE)

A Unified Architectural Framework Across the P→P→L→C→C Continuum

(Physics → Physiology → Linguistics → Cognition → Communication)


I. Introduction: The Mandate for an Empirical Linguistic Engine (ELE)

The development of the Empirical Linguistic Engine (ELE) requires a decisive move away from purely abstract linguistic formalism toward a unified, recursive, and empirically validated computational architecture.

This architecture must:

  • Integrate physical and biological constraints
  • Respect cognitive and social theories
  • Define explicit interfaces, measurable units, and operational principles
  • Honor the causal hierarchy:

Physics → Physiology → Linguistics → Cognition → Communication
(P→P→L→C→C)

The Necessity of Grounding and Unification

Historically, the study of language has been fractured:

  • Generative linguistics (Chomsky, 1950s) posits an innate Language Acquisition Device (LAD), a specialized, modular system devoted exclusively to language. [1]
  • Cognitive linguistics (1970s onward) rejects strict modularity, arguing that language and cognition share mechanisms, and that language is fundamentally embodied and situated. [3]

From this perspective, traditional analyses focused solely on phonetics, syntax, and semantics are insufficient, especially when explaining communication deficits in everyday social interaction. [5]

The ELE adopts an integrative stance:

  • It leverages the computational precision of structured representations.
  • It mandates adherence to physical and physiological grounding.
  • It directly addresses the critique that large language models (LLMs) trained only on text are ungrounded and lack “real understanding”. [6]

ELE does this by explicitly integrating empirical constraints from the physical world into its architecture.

The Computational Imperative: Recursion and Multi-Scale Architectures

Human language is:

  • Hierarchical
  • Productive
  • Capable of generating infinite expressions from finite means

The key mechanism enabling this is recursion—the embedding of structures within themselves. [8]

This functional requirement induces architectural constraints for any computational linguistic engine.

Modern neural network research shows:

  • Complex language systems spontaneously develop hierarchical temporal structures that align with linguistic levels, without explicit pre-programming. [9]
  • Multi-Timescale Recurrent Neural Networks (MTRNNs) self-organize into distinct timescales:
    • Short timescales (~0.17 words): phonological processes
    • Medium timescales (1–10 words): morphological and syntactic structures
    • Long timescales (≈360,000+ words): lexical and semantic relationships [9]

This validates a functional separation of linguistic strata and supports the adoption of Recursive Linguistic Models (RLMs). [10]
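The timescale separation above can be sketched with simple leaky-integrator units: a larger time constant yields slower dynamics that retain information longer. This is an illustrative simplification, not the MTRNN of [9]; the time constants and function names are invented for the sketch.

```python
# Sketch of multi-timescale unit dynamics (illustrative only).
# Fast units (small tau) track the input closely; slow units (large tau)
# integrate over long spans, mirroring the phonological vs. semantic split.

def leaky_step(state, drive, tau):
    """One Euler update of a leaky integrator with time constant tau."""
    return state + (drive - state) / tau

def run(inputs, tau):
    """Run a single unit over an input sequence, returning its trajectory."""
    state, trace = 0.0, []
    for x in inputs:
        state = leaky_step(state, x, tau)
        trace.append(state)
    return trace

pulse = [1.0] + [0.0] * 9
fast = run(pulse, tau=1.0)    # tracks the input exactly, forgets at once
slow = run(pulse, tau=10.0)   # still carries a trace of the pulse at the end
```

With the same brief input, the slow unit's final state remains well above the fast unit's, which has already returned to zero: a minimal picture of how one network can host several functional timescales at once.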

RLMs:

  • Depart from flat, purely sequential processing
  • Allow recursive querying of:
    • the model itself, or
    • an external environment (e.g., a REPL)

This is crucial for:

  • Managing vast context lengths
  • Modeling human-like discourse
  • Handling situated pragmatics at scale

II. Stage 1: Physics — The Aerodynamic Foundation

The linguistic chain begins in physics, not in abstract grammar.

At the foundational level:

  • Air pressure and airflow are manipulated.
  • These set material limitations and temporal boundaries for language.

Acoustic and Aerodynamic Principles

Speech production (phonation) depends on:

  • Vital Capacity (VC) — maximum exhaled air volume after maximal inhalation
  • VC and derived metrics (e.g., Phonation Quotient, Estimated Mean Airflow Rate) inform:
    • vocal stamina
    • maximum utterance length [12]

Key control variable:

  • Lung (subglottal) pressure → drives self-oscillation models of the vocal folds. [14]

Outputs at this stage:

  • Phonation Trigger
  • Acoustic waveform

Examples:

  • VC (mL) defines maximum temporal bounds of continuous utterance
  • Typical norms:
    • Males ≤ 39 years: ~3530 mL
    • Females ≤ 39 years: ~3080 mL [12]

These volumes constrain Maximum Phonation Time, which in turn:

  • Sets natural boundaries for breath groups and prosodic phrases.
  • Tethers abstract linguistic units like “phrase” to physical respiratory capacity. [15]
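These aerodynamic bounds can be computed directly. In the sketch below, the mean phonatory airflow (150 mL/s) is an assumed, typical-range figure, and the helper names are illustrative; Phonation Quotient is conventionally vital capacity divided by maximum phonation time.

```python
# Illustrative bound on continuous utterance length from respiratory capacity.

def max_phonation_time(vc_ml, mean_flow_ml_per_s):
    """Upper bound (seconds) on continuous phonation: volume / flow rate."""
    return vc_ml / mean_flow_ml_per_s

def phonation_quotient(vc_ml, mpt_s):
    """PQ (mL/s): vital capacity divided by maximum phonation time."""
    return vc_ml / mpt_s

mpt = max_phonation_time(3530, 150.0)  # male VC norm from the text
pq = phonation_quotient(3530, mpt)     # recovers the assumed flow rate
```

Any prosodic phrase the engine generates would have to fit within `mpt` seconds of phonation, tying the abstract unit "breath group" to a measurable volume.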

A realistic ELE must therefore:

  • Initialize speech generation with real-time or modeled aerodynamic data.

Historical-Conceptual Grounding: Pneuma

The linkage between air and cognition is ancient:

  • Greek Pneuma (πνεῦμα) = “breath”, “wind”, “spirit”. [16]
  • In classical thought, pneuma was seen as:
    • inhaled air transformed into a vital spirit
    • traveling through organs to the brain
    • distributed via nerves as “animal spirit” [18]

Anaximenes (6th century BC) identified the soul (psyche) with air (aer) and breath (pneuma), the medium that encompasses the world. [16]

Though biologically outdated, this framework anticipates ELE’s causal demand:

External air → physiological flow → cognitive control


III. Stage 2: Physiology — Embodied Processor and Motor Command

Here, bulk aerodynamic input becomes:

  • Precisely controlled
  • Temporally sequenced
  • Biomechanically embodied motor output

Mechanisms of Phonation and Muscle Control

Controlled airflow is converted into oscillation via the larynx, regulated by:

  • Fine-grained laryngeal muscle control
  • Low-dimensional self-oscillation models capturing:
    • body-cover differentiation of the vocal folds
    • primary vibration modes (shear and compressional) [14]

Neural programming at this stage:

  • Converts normalized activation levels of key muscles:
    • Cricothyroid (CT)
    • Thyroarytenoid (TA)
    • Lateral Cricoarytenoid (LCA)
    • Posterior Cricoarytenoid (PCA)
  • Into physical quantities:
    • vocal fold strain
    • adduction
    • glottal convergence
    • stiffness [14]

Critical output:

  • Pneuma (Motor Command) — the patterned neural command controlling glottal aerodynamics.

This is the modern physiological analog of the ancient pneuma.
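A minimal sketch of this activation-to-parameter conversion, assuming a toy linear mapping: the coefficients and function name are invented for illustration, whereas the models in [14] use empirically fitted, nonlinear rules.

```python
# Hypothetical linear mapping from normalized laryngeal muscle activations
# (0..1) to glottal control parameters. Coefficients are illustrative only.

def glottal_controls(a_ct, a_ta, a_lca, a_pca):
    """Toy rules: CT stretches the folds (strain up), TA shortens and
    stiffens the muscle body, LCA adducts, PCA abducts."""
    strain = 0.3 * a_ct - 0.1 * a_ta        # net vocal-fold elongation
    adduction = a_lca - a_pca               # > 0 closed, < 0 open glottis
    stiffness = 0.5 + 0.5 * a_ta            # TA stiffens the body layer
    return {"strain": strain, "adduction": adduction, "stiffness": stiffness}

cmd = glottal_controls(a_ct=0.8, a_ta=0.4, a_lca=0.9, a_pca=0.1)
```

The returned dictionary is the "Pneuma (Motor Command)" of this stage in miniature: a patterned set of physical quantities derived from neural activation levels.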

Neurophysiological Constraints and Embodiment

Physiology is the causal bridge between:

  • Abstract linguistic form
  • Concrete motor action

Sequential motor control:

  • Enforces precise timing
  • Shapes articulation
  • Suggests non-arbitrary foundations for:
    • phonotactic rules
    • linear order in syntax [19]

Evidence for embodied cognition:

  • Language may rely on a “language prewired brain” built upon the Mirror Neuron System (MNS). [3]
  • Language processes display interaction-dominant dynamics:
    • heavily dependent on situational variables at fine timescales
    • word recognition and lexical mapping depend on object visibility and actionfulness in context [20]

Thus:

Proper analysis of language must be conducted at the level of organisms + environment. [20]

Architecturally, ELE must:

  • Not be purely sequential (i.e., not a rigid feed-forward pipeline)
  • Be interaction-dominant and recursive at the sensory input layer
  • Dynamically link the physiological and early linguistic layers with an evolving external-environment model supplied by later cognitive stages

IV. Stage 3: Linguistics — Formal Hierarchical Structure (L-Units)

This stage:

  • Formalizes structured organization of the linguistic signal
  • Transforms physiological outputs into symbolic representation

Language’s power lies in its hierarchical structure, enabling infinite productivity. [8]

Classical Stratification and Hierarchy

Standard hierarchy:

  • Phonology
    • Studies sound systems
    • Basic unit: Phoneme — smallest sound unit that can distinguish meaning (/p/ vs /b/) [8]
  • Morphology
    • Studies the internal structure of words
    • Basic unit: Morpheme — smallest meaningful unit (e.g., plural “-s”) [8]
  • Syntax
    • Rules for combining words into larger constituents
    • Basic objects: phrases and sentences
    • Key property: Recursion — embedding phrases within phrases [8]

Generative models (e.g., Government and Binding Theory):

  • Propose “movement” and traces (silent copies) in sentence structure
  • Psycholinguistic evidence supports traces as part of mental representation. [22]
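The recursive property central to this stage can be demonstrated with a two-rule toy grammar in which a noun phrase may contain a prepositional phrase that itself contains a noun phrase. The lexicon and function names are invented for the sketch.

```python
# Minimal demonstration of recursive embedding: NP -> "the cat" (PP),
# PP -> "on" NP. Finite rules, unbounded output depth.

def np(depth):
    """Noun phrase; embeds a PP (and hence another NP) while depth remains."""
    base = "the cat"
    return base if depth == 0 else f"{base} {pp(depth - 1)}"

def pp(depth):
    """Prepositional phrase wrapping a recursively built noun phrase."""
    return f"on {np(depth)}"

print(np(0))  # the cat
print(np(2))  # the cat on the cat on the cat
```

Two rules suffice to generate arbitrarily deep structures, which is exactly the "infinite expressions from finite means" property the text attributes to recursion.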

Defining the Atomic Units of Meaning: The Sememe

Semantics sometimes requires a finer granularity than morphemes.

The Sememe (or seme):

  • An indivisible, atomic unit of meaning [21]
  • Historically: a “definite idea-content expressed in some linguistic form” [23]

Example:

  • “Triangle” = sememe “three-sided straight-lined figure” [23]
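A sememe inventory can be sketched as sets of atomic features. The feature names below extend the "triangle" example and are assumptions for illustration, not an established inventory.

```python
# Toy sememe inventory: each lexeme's sense as a set of atomic features,
# so semantic overlap falls out as set intersection.

SEMEMES = {
    "triangle": {"figure", "straight-lined", "three-sided"},
    "square":   {"figure", "straight-lined", "four-sided", "equilateral"},
}

def shared_sememes(a, b):
    """Semantic overlap of two lexemes as their shared atomic features."""
    return SEMEMES[a] & SEMEMES[b]

common = shared_sememes("triangle", "square")  # {"figure", "straight-lined"}
```

The point of the representation is that similarity judgments reduce to operations over atomic units, the granularity the Sememe is meant to provide.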

Computational Hierarchy and Embodiment

Empirical evidence:

  • ERP/MEG studies show morphologically complex words elicit larger responses than monomorphemic words — supporting active morphological decomposition during access. [24]

Neural architectures mirror this:

  • Transformer models (e.g., BERT) exhibit a systematic distribution of information across layers:
    • Early layers (1–3): phonological/morphological segmentation
    • Middle layers: syntactic structure
    • Late layers: lexical and semantic relationships [9]

This:

  • Empirically validates the functional separation of linguistic levels.
  • Confirms that Sememes (atomic meaning units) must be grounded through sensorimotor simulations (Stage 2), linking semantics with embodied cognition and action. [15]

Thus, there is a biologically plausible feedback loop between:

  • Linguistics (Stage 3)
  • Physiology (Stage 2)

V. Stage 4: Cognition — Mental Representation and Architectures

Cognition:

  • Acquires, stores, and applies linguistic knowledge
  • Interfaces formal language structure with the external world

Language–Cognition Interaction

Language and cognition:

  • Interact closely rather than being strictly identical or fully separate. [3]

Cognition:

  • Builds internal models of the world

Language:

  • Acts as a storehouse of cultural wisdom
  • Serves as a teacher, adapting cultural knowledge to concrete life situations. [3]

Psycholinguistic Mechanisms: Storage, Retrieval, Generalization

Language processing hinges on:

  • Sensory, short-term, and long-term memory coordination. [26]
  • Chunking: grouping elements into manageable units

Short-term memory:

  • Holds roughly 5–9 chunks [26]
  • Directly shapes parsing efficiency and lexical retrieval
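Chunking can be sketched as a simple regrouping of a sequence into units that fit the short-term span. The span parameter and example data below are illustrative.

```python
# Chunking sketch: a sequence longer than the ~5-9 item short-term span
# becomes manageable once regrouped into a few larger units.

def chunk(items, span=7):
    """Partition items into consecutive chunks no larger than span."""
    return [items[i:i + span] for i in range(0, len(items), span)]

digits = list("4155550123")     # 10 digits exceed the span as raw items...
chunks = chunk(digits, span=4)  # ...but 3 chunks fit comfortably within it
```

Ten raw digits strain the span; three chunks do not, which is why phone numbers are conventionally written in grouped form.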

Metarules:

  • Capture generalizations about language structure
  • Do not list structures themselves
  • Instead define classes of permissible structures [27]

In frameworks like Generalized Phrase Structure Grammar (GPSG):

  • Metarules, Feature Co-occurrence Restrictions, and the Head Feature Convention compactly encode structural generalizations [27]
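A metarule can be sketched as a function over rules rather than a list of rules. The encoding below is a heavy simplification of GPSG's passive metarule, with an invented rule format (`(lhs, rhs-tuple)`) and invented names.

```python
# Sketch of a GPSG-style metarule: instead of listing every passive rule,
# a single function maps each active VP rule to its passive counterpart.

def passive_metarule(rule):
    """VP -> ... NP  =>  VP[PAS] -> ... PP[by]: drop the object NP and
    license an optional by-phrase. Non-matching rules pass through."""
    lhs, rhs = rule
    if lhs == "VP" and "NP" in rhs:
        out = tuple(x for x in rhs if x != "NP") + ("PP[by]",)
        return ("VP[PAS]", out)
    return rule  # metarule does not apply

active = ("VP", ("V", "NP"))
passive = passive_metarule(active)  # ("VP[PAS]", ("V", "PP[by]"))
```

One function thus defines a whole class of permissible structures, which is the compactness the text attributes to metarules.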

Cognition also governs:

  • Conceptual metaphor
  • Prototype categorization [29]
  • Figurative language processing (metaphors, idioms) [28]
  • Mnemonics and associative networks for long-term recall [30]

Computational Architectures and Limits

To model long-term dependencies and context:

  • ELE must use Recursive Linguistic Models (RLMs) [10]
  • RLMs:
    • handle vast context via recursive subqueries
    • mimic attention and memory mechanisms across long discourse
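The recursive-subquery pattern can be sketched as divide-and-recombine over an oversized context. `count_mentions` is a stand-in task invented for the sketch, and the direct `.count` call stands in for a real model invocation; the actual RLM mechanism in [10] is considerably richer.

```python
# Sketch of RLM-style context handling: when the discourse exceeds the
# context window, recurse on halves and combine the subanswers.

def count_mentions(term, words, window=8):
    """Count occurrences of term, never examining more than `window`
    words at once; oversized spans are handled by recursive subqueries."""
    if len(words) <= window:
        return words.count(term)  # base case: a direct "model call"
    mid = len(words) // 2
    return (count_mentions(term, words[:mid], window)
            + count_mentions(term, words[mid:], window))

discourse = ["cat"] * 10 + ["dog"] * 5   # 15 words, far beyond the toy window
total = count_mentions("cat", discourse)  # 10, assembled from subqueries
```

No single call ever sees more than the window, yet the answer over the full discourse is exact: recursion substitutes for raw context length.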

Caution:

  • fMRI-based semantic mapping (brain regions vs. conceptual fields) suffers from:
    • signal loss in temporal regions
    • performance-related variability [32]

Thus, ELE’s cognitive validation demands:

  • Controlled performance variables
  • Multi-modal methods (EEG/MEG, behavioral data, etc.)

VI. Stage 5: Communication — Situated Pragmatics and Intentionality

This stage deals with actual use of language in:

  • Social contexts
  • Environmental situations

Here, meaning is defined not only by structure but by:

  • Intention
  • Context
  • Social norms

Situated Context and Social Layers

Pragmatics:

  • Studies how utterances communicate beyond literal meaning and grammar [34][35]
  • Focuses on:
    • implications
    • inferences
    • attitudes
    • situational context

Communication operates across social layers [28]:

  • Social Norms layer:
    • unwritten rules
    • respect, politeness, body language
    • when/how to speak
  • Friendships layer:
    • humor, empathy, in-jokes
    • slang and non-literal uses

Defining the Unit of Interaction: The Pragmeme

Analogous to the Sememe (the semantic atom), the Pragmeme:

  • Was proposed to clarify “indirect speech act” phenomena [36]
  • Is defined as:
    • a unit realized through pragmatic acts
    • tightly bound to situation and context [36]

A Pragmeme may:

  • Arise through Pragmaticalization
  • Carry its own illocutionary force [37]
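A pragmeme can be sketched as a record binding linguistic form to situation: the same utterance realizes different pragmemes, each carrying its own illocutionary force. The field names below are assumptions, not a standard formalization.

```python
# Sketch of a pragmeme as a situationally bound record. Field names are
# illustrative; [36][37] do not prescribe this representation.

from dataclasses import dataclass

@dataclass(frozen=True)
class Pragmeme:
    utterance: str
    situation: str
    illocutionary_force: str  # the pragmeme carries its own force [37]

# Identical linguistic form, two distinct pragmemes:
request = Pragmeme("It's cold in here", "guest addressing host", "request")
phatic = Pragmeme("It's cold in here", "strangers at a bus stop", "phatic")
```

Structural analysis alone cannot distinguish the two records' first field; only the situation and force fields do, which is exactly why the unit must be bound to context.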

Measuring it requires:

  • Rich pragmatic assessment protocols
  • Inclusion of:
  • linguistic
  • extra-linguistic
  • paralinguistic signals [5]

Cross-cultural work needs coding schemes that distinguish:

  • Core Code Categories (replicable, e.g., syntactic downgrading)
  • Situated Code Categories (culture-specific, context-bound) [38]

Intentionality and Theory of Mind

Successful communication depends on:

  • Inferring others’ beliefs, intentions, and perspectives
  • This is Theory of Mind (ToM) [39]

For ELE:

  • ToM-like inference is required for full pragmatic competence
  • Recent work shows advanced LLMs can pass false-belief tasks, suggesting:
  • sufficient structural complexity to model basic ToM reasoning [39]

The Pragmeme:

  • Is the highest-order grounding unit:
    • Validates and adapts cognitive constructs (Stage 4)
    • Constrains them via real-world environment and social dynamics [20][36]

This demands an interaction-dominant architecture:

  • Continuous adaptation
  • Dynamic sensitivity to context and norms

VII. The Empirical Linguistic Engine Architecture and Conclusions

The ELE unifies five domains:

  1. Physics
  2. Physiology
  3. Linguistics
  4. Cognition
  5. Communication

In this framework:

  • Physical constraints start the chain
  • Physiology embodies it
  • Linguistics structures it
  • Cognition indexes and recursively manipulates it
  • Communication situates and validates it

The Integrated P→P→L→C→C Model

The architecture must:

  • Enforce the causal hierarchy: Physics → Physiology → Linguistics → Cognition → Communication
  • Allow dynamic, recursive interaction between layers
  • In particular, connect early embodiment with the later situational models
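The causal chain with its situational feedback loop can be sketched as a cycle in which the Communication stage's output re-enters the earlier stages on the next pass. The stage bodies below are string-building stubs, purely illustrative of the data flow.

```python
# Interaction-dominant sketch of the P->P->L->C->C chain: one function per
# stage, with the situational model feeding back into physiology each cycle.

def physics(x): return f"wave({x})"
def physiology(w, s): return f"motor({w}|{s})"   # situation modulates motor control
def linguistics(m): return f"struct({m})"
def cognition(st): return f"concept({st})"
def communication(c, s): return f"situated({c})"  # emits updated situation

def run_cycle(air_input, situation):
    wave = physics(air_input)              # aerodynamics -> waveform
    motor = physiology(wave, situation)    # embodied motor command
    structure = linguistics(motor)         # symbolic structuring
    concept = cognition(structure)         # indexing and recursion
    return communication(concept, situation)

situation = "neutral"
for _ in range(2):  # feedback: each cycle's output conditions the next
    situation = run_cycle("breath", situation)
```

After the second cycle, the situation string still contains the first cycle's output nested inside it, making the feedback path (not a one-shot pipeline) explicit.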

Table 1. The ELE Architectural Hierarchy

| Stage in Continuum | Primary Scientific Domain | Operational Function / Output | Key Discrete Unit (Computational / Linguistic) | Example Empirical Measurement |
|---|---|---|---|---|
| Physics | Fluid Dynamics / Acoustics | Airflow generation and resonance initiation | Phonation Trigger / Acoustic Waveform | Vital Capacity (mL); Lung Pressure (cm H₂O) [12] |
| Physiology | Neurobiology / Motor Control | Laryngeal and articulatory muscle activation and vibration | Pneuma (Physiological / Motor Command) [14] | Cricothyroid (CT) muscle activation level [14] |
| Linguistics | Formal / Generative Theory | Structuring form and basic meaning | Morpheme (Minimal Meaningful Unit) [8] | ERP/MEG morphological decomposition effects [24] |
| Cognition | Psycholinguistics / Memory | Conceptual indexing, retrieval, and recursive structuring | Sememe (Atomic Semantic Feature) [21] | Chunking capacity (≈5–9 items); lexical retrieval time [26] |
| Communication | Pragmatics / Sociolinguistics | Situated contextual interpretation and intentionality | Pragmeme (Situationally Bound Speech Act) [36] | Pragmatic protocol score (topic coherence, appropriateness, etc.) [5] |

Table 2. Cognitive and Computational Correlates of Linguistic Layers

| Linguistic Level | Cognitive / Computational Correlate | Timescale / Locus (LLM / Human) | Evidence / Function |
|---|---|---|---|
| Phonology / Letter | Phonological processes; sound-to-letter mapping | Short timescale (~0.17 words); early layers [9] | Fast neurons encode basic acoustic sequence data [9] |
| Morphology / Word | Morphological segmentation; basic syntactic structure | Medium timescale (1–10 words); middle layers [9] | Processing minimal meaningful units and phrase structure [8] |
| Syntax / Phrase | Phrase-structure parsing and recursion (embedding) | Generative mechanism; RLM context management [8][10] | Enables infinite productivity via hierarchical organization [8] |
| Semantics / Lexical | Lexical and semantic relationships; conceptualization | Long timescale (≈360,000+ words); late layers [9] | Storage of cultural knowledge; embodied meaning indexing [3][9] |
| Pragmatics / Discourse | Situational adaptation; Theory of Mind inference; discourse-level control | External environment; interaction-dominant dynamics [20] | Defines meaning via context, intent, social norms, and ToM-based inference [28][39] |

Conclusions and Architectural Recommendations

Successful implementation of the ELE requires three architectural directives:

  1. Mandatory Embodiment and Sensorimotor Grounding
  • Semantic units (Sememes) must be indexed through simulated physical actions or “imagined manipulation.” [15]
  • Grounding is rooted in the physiological stage (mirror neuron system, embodiment) [3]
  • This directly addresses critiques of ungrounded, text-only models. [6]
  2. Adoption of Recursive, Multi-Scale Processing
  • Use architectures like RLMs to exploit functional hierarchies (fast vs. slow neurons). [9][10]
  • Handle vast contexts by recursive querying of environment and memory.
  • Essential for:
    • Cognitive generalization (Metarules, chunking)
    • Pragmatic adaptation (Pragmemes, discourse integration)
  3. Dynamic Interaction-Dominant Architecture
  • The engine must not be a rigid pipeline.
  • Empirical evidence shows situational context influences low-level processing in real time. [20]
  • Therefore:
    • Physiological and early linguistic layers must remain dynamically linked to the pragmatic situational model.
    • Meaning is continuously constrained by environment and social norms (Pragmeme layer). [36]

For empirical validation, ELE must:

  • Go beyond text-only corpora
  • Incorporate multimodal data:
    • Aerodynamic measures (VC)
    • Muscle activation signals
    • High-temporal-resolution neuroimaging (EEG/MEG)
  • Actively correct for:
    • fMRI signal loss in temporal regions
    • Performance variability across groups and tasks [32]

By anchoring all formal linguistic units to measurable physical and physiological constraints, the Empirical Linguistic Engine provides a robust, human-plausible framework for next-generation computational language models.

It is:

  • Physically grounded
  • Biologically embodied
  • Linguistically structured
  • Cognitively recursive
  • Communicatively situated

— a full P→P→L→C→C continuum rendered as a single coherent architecture.