Emergent Coupling Dynamics:

Oct 13
\documentclass[11pt]{article}
\usepackage[a4paper,margin=1in]{geometry}
\usepackage{amsmath,amssymb,amsthm,mathtools}
\usepackage{algorithm,algorithmic}
\usepackage{microtype}
\usepackage{graphicx}
\usepackage{booktabs}
\usepackage{siunitx}
\usepackage{hyperref}
\usepackage{xcolor}
\usepackage{listings}

\hypersetup{
  colorlinks=true,
  linkcolor=blue!60!black,
  citecolor=blue!60!black,
  urlcolor=blue!60!black
}

% ---------------------------------------------------------
% Code style
% ---------------------------------------------------------
\lstdefinestyle{py}{
  language=Python,
  basicstyle=\ttfamily\small,
  breaklines=true,
  showstringspaces=false,
  frame=single,
  framerule=0.3pt,
  numbers=left,
  numberstyle=\tiny,
  xleftmargin=1em
}

% ---------------------------------------------------------
% Title / Meta
% ---------------------------------------------------------
\title{\Large Emergent Coupling Dynamics (ECD):\\
\large A Bounded, Contractive, and Testable Framework with LLM Mapping and Embedded Reproducibility}
\author{C.\,L.\,Vaillant}
\date{October 2025}

% ---------------------------------------------------------
% Theorem environments
% ---------------------------------------------------------
\newtheorem{definition}{Definition}
\newtheorem{lemma}{Lemma}
\newtheorem{theorem}{Theorem}
\newtheorem{remark}{Remark}

% ---------------------------------------------------------
% Results macros (auto-overridden by results.tex after running code)
% ---------------------------------------------------------
\newcommand{\Lbound}{\ensuremath{0.49}}            % placeholder
\newcommand{\TauPred}{\ensuremath{4.9}}            % placeholder
\newcommand{\TauObsMean}{\ensuremath{\text{(TBD)}}}
\newcommand{\TauObsStd}{\ensuremath{\text{(TBD)}}}
\newcommand{\RMSEecdMean}{\ensuremath{\text{(TBD)}}}
\newcommand{\RMSEecdStd}{\ensuremath{\text{(TBD)}}}
\newcommand{\RMSEltiMean}{\ensuremath{\text{(TBD)}}}
\newcommand{\RMSEltiStd}{\ensuremath{\text{(TBD)}}}
\newcommand{\RMSEnnMean}{\ensuremath{\text{(TBD)}}}
\newcommand{\RMSEnnStd}{\ensuremath{\text{(TBD)}}}
\newcommand{\CrecovMean}{\ensuremath{\text{(TBD)}}}
\newcommand{\CrecovStd}{\ensuremath{\text{(TBD)}}}
\IfFileExists{results.tex}{\input{results.tex}}{}

\begin{document}
\maketitle

\begin{abstract}
\textbf{Objective.} Formalize the \emph{Emergent Coupling Dynamics (ECD)} framework, a bounded discrete-time dynamical system with alignment, coupling, and optional multi-timescale and stochastic extensions; integrate an optional language-model (LLM) mapping to bounded observation/target space; and make the framework empirically falsifiable and reproducible in a single, self-contained manuscript.

\textbf{Method.} We specify the ECD update, derive the Jacobian and prove boundedness, global Lipschitz continuity in the state, and convergence under contraction; provide a state-space estimation objective with regularization and identifiability guidance; define falsifiable predictions and a validation protocol; add an LLM-derived observation/target mapping with drift monitoring; and embed a modular Python reference implementation (functions for data generation, training with early stopping and validation, visualization, ablations, and multi-seed aggregation) that generates figures and a \texttt{results.tex} file to auto-populate tables and metrics.

\textbf{Results.} Theoretical contraction bound $L$ yields a predicted relaxation time $\tau_{95}\approx \ln(0.05)/\ln L$. The code fits ECD (optimizing $C$) and baselines (LTI, neural Euler-MLP) on a canonical synthetic scenario across multiple seeds, reporting mean $\pm$ std and saving figures, which \LaTeX{} includes on recompilation.

\textbf{Conclusion.} ECD is mathematically grounded, implementation-ready, and testable. \textbf{No real-world dataset is included in this version;} all quantitative results are synthetic. We state this limitation explicitly and outline how to add a real-data case study.
\end{abstract}

\section*{Reproducibility: Running the Experiment}
This single manuscript contains theory, LLM mapping, implementation, and results. To regenerate all figures and populate numerical results:
\begin{enumerate}
  \item Copy the Appendix code to \texttt{run\_ecd.py}.
  \item Install dependencies: \texttt{pip install numpy torch matplotlib}.
  \item Execute: \texttt{python run\_ecd.py}.
  \item Recompile this \LaTeX{} document to insert the updated numbers and figures.
\end{enumerate}
Expected runtime: $<1$ minute on CPU for defaults (multi-seed may take slightly longer).

\section{Introduction and Motivation}
Mathematical models of coupled dynamics are widespread (e.g., consensus, synchronization, population interactions). Many flexible sequence models (RNNs, neural ODEs) lack global stability guarantees or interpretable coupling structure. We present \emph{Emergent Coupling Dynamics (ECD)}, a bounded, contractive, and interpretable discrete-time framework merging three ingredients: (i) a convex-combination update (EMA-like), (ii) an explicit coupling matrix $C$, and (iii) an alignment branch to observations/targets. We make ECD falsifiable by (a) providing contraction-based rate predictions, (b) specifying sensitivity to coupling perturbations, and (c) enforcing bounded-state compliance. To connect with text/multimodal data, we include an optional LLM-derived mapping into the bounded observation/target space. All claims are mathematical; semantics arise only through the defined mapping and require independent validation.

\section{Model: Update, Jacobian, and Basic Properties}
\paragraph{State and bounds.}
Let $n\in\mathbb{N}$ and $[\ell,u]\subset\mathbb{R}$ with $\ell<u$. At discrete time $t=0,1,2,\dots$ the state is $\mathbf{s}_t\in[\ell,u]^n$.

\paragraph{Inputs (optional).}
Exogenous input $\mathbf{x}_t\in\mathbb{R}^m$, control/action $\mathbf{a}_t\in\mathbb{R}^p$ (set $\mathbf{a}_t=\mathbf{0}$ if absent), and an optional target/reference $\mathbf{y}_t\in[\ell,u]^n$.

\begin{definition}[ECD one-step update]
Let $\phi:\mathbb{R}^n\to[\ell,u]^n$ be Lipschitz (constant $L_\phi$), elementwise (e.g.\ scaled logistic). Let $\Pi_{[\ell,u]^n}$ denote the elementwise projection (non-expansive). For parameters
\[
W_t\in\mathbb{R}^{n\times m},\quad D_t\in\mathbb{R}^{n\times p},\quad C_t\in\mathbb{R}^{n\times n},\quad
\alpha\in(0,1],\;\gamma,\lambda\in[0,1],
\]
define
\begin{equation}
\label{eq:ecd}
\mathbf{s}_{t+1} =
\Pi_{[\ell,u]^n}\!\Big(
(1-\gamma)\big[(1-\alpha)\mathbf{s}_t + \alpha\,\phi(W_t\mathbf{x}_t + D_t\mathbf{a}_t + C_t\mathbf{s}_t)\big]
+ \gamma\big[(1-\lambda)\mathbf{s}_t + \lambda\,\mathbf{y}_t\big]\Big).
\end{equation}
\end{definition}

\paragraph{Jacobian (pre-projection).}
Let $r_t=W_t\mathbf{x}_t + D_t\mathbf{a}_t + C_t\mathbf{s}_t$ and $D_\phi(r_t)=\mathrm{diag}(\phi'(r_{t,1}),\dots,\phi'(r_{t,n}))$. The Jacobian of the pre-projection map $T$ w.r.t.\ $\mathbf{s}_t$ is
\begin{equation}
\label{eq:jac}
J_t \;=\; \frac{\partial T}{\partial \mathbf{s}_t}
\;=\; (1-\gamma)\Big[(1-\alpha)I + \alpha\,D_\phi(r_t)\,C_t\Big] \;+\; \gamma(1-\lambda)I.
\end{equation}
Since $\|\mathrm{diag}(\phi')\|_2 \le L_\phi$, we obtain $\|J_t\|_2 \le (1-\gamma)[(1-\alpha)+\alpha L_\phi\|C_t\|_2]+\gamma(1-\lambda)$, matching the Lipschitz bound below.

\begin{lemma}[Boundedness]
If $\mathbf{s}_0\in[\ell,u]^n$, then $\mathbf{s}_t\in[\ell,u]^n$ for all $t\ge 0$.
\end{lemma}
\begin{proof}
$\phi(\cdot)\in[\ell,u]^n$ elementwise by definition. The inner convex combinations keep values in $[\ell,u]^n$, and $\Pi_{[\ell,u]^n}$ is a projection onto $[\ell,u]^n$.
\end{proof}

\begin{lemma}[Global Lipschitz constant in $\mathbf{s}$]
Let $T$ denote the pre-projection map in \eqref{eq:ecd}. For the state argument,
\begin{equation}
\label{eq:Lfull}
L \;\le\; (1-\gamma)\big[(1-\alpha) + \alpha\,L_\phi\,\|C_t\|_2\big] + \gamma(1-\lambda).
\end{equation}
Here $W_t,D_t$ act on exogenous inputs and do not enter the Lipschitz constant in $\mathbf{s}$. Projection is $1$-Lipschitz.
\end{lemma}

\begin{theorem}[Contraction and convergence]
$[\ell,u]^n$ is compact (hence complete), and by Lemma~1 the map in \eqref{eq:ecd} is self-mapping. If $L<1$ in \eqref{eq:Lfull}, then $T$ is a contraction on $[\ell,u]^n$, ECD admits a unique fixed point $\mathbf{s}^\star$, and
\[
\|\mathbf{s}_t-\mathbf{s}^\star\|_2 \le L^t \|\mathbf{s}_0-\mathbf{s}^\star\|_2.
\]
\end{theorem}

\begin{remark}[Extensions (stochastic, time-varying, multi-rate)]
\emph{Stochastic:} For $\mathbf{s}_{t+1}=T(\mathbf{s}_t)+\varepsilon_t$ with $\mathbb{E}[\varepsilon_t]=0$, $\mathrm{Var}(\varepsilon_t)<\infty$, and uniform $L<1$, standard results (Meyn--Tweedie, \emph{Markov Chains and Stochastic Stability}, Theorem~15.0.1) imply a unique invariant measure and geometric ergodicity; for stochastic approximation rates, see Kushner--Yin (2003, Ch.\,2).\\
\emph{Time-varying:} If $\sup_t L_t<1$ and inter-step parameter drift $\|C_{t+1}-C_t\|_2$ is sufficiently small, uniform contraction persists (cf.\ Granas--Dugundji, fixed-point perturbations).\\
\emph{Multi-rate:} With componentwise $\alpha_i\in(0,1]$, define $\alpha_{\max}=\max_i \alpha_i$ and \emph{substitute $\alpha_{\max}$ for $\alpha$ in \eqref{eq:Lfull}}. A sufficient condition is $L(\alpha_{\max})<1$. The slowest component controls the relaxation time.
\end{remark}

\section{Estimation, Identifiability, and Baselines}
\paragraph{Observation model.} We use $\mathbf{y}_t=g_\theta(\mathbf{s}_t)+\eta_t$; for synthetic tests, $g_\theta$ is identity and $\eta_t$ is Gaussian.

\paragraph{Objective.} Given sequences $(\mathbf{x}_t,\mathbf{a}_t,\mathbf{y}_t)$, estimate $\theta=\{W,D,C,\alpha,\gamma,\lambda\}$ by minimizing
\[
\mathcal{L}(\theta)=\frac{1}{T}\sum_{t=1}^{T}\|\mathbf{y}_t-g_\theta(\mathbf{s}_t(\theta))\|_2^2+\lambda_1\|C\|_1+\lambda_2\|C\|_F^2,
\]
using backpropagation through time (BPTT). Identifiability requires sufficient excitation (observability rank conditions); regularization on $C$ aids recovery. Rule of thumb data length: $T \ge 20|\theta|$ (prefer $50|\theta|$ in high-noise regimes), consistent with generalization heuristics from statistical learning (cf.\ VC-type sample complexity).

\paragraph{Baselines.}
\emph{LTI:} $\mathbf{y}_{t+1}=A\mathbf{y}_t+B\mathbf{x}_t$. \;
\emph{Neural (Euler-MLP):} $\mathbf{y}_{t+1}=f_\xi(\mathbf{y}_t,\mathbf{x}_t)$ with a small MLP.

\paragraph{Metrics.} Test RMSE; $\tau_{95}$: steps until $\|\mathbf{s}_t-\mathbf{s}^\star\|\le0.05\|\mathbf{s}_0-\mathbf{s}^\star\|$ (for constant-input runs); coupling recovery $\|\hat{C}-C\|_F$ (when $C$ known). For robustness, report mean $\pm$ std across seeds.

\section{LLM-Derived Observation and Target Mapping (Optional)}
\label{sec:llm-mapping}
\paragraph{Text to embedding.}
Given text (or multimodal) input $\iota_t$, compute an embedding
$e_t = E_{\text{LLM}}(\iota_t) \in \mathbb{R}^{d_{\text{emb}}}$ using a fixed, pretrained encoder.\footnote{Choice of encoder is an implementation detail; the ECD math does not depend on the specific model.}

\paragraph{Dimensionality reduction.}
Project to $k \le n$ informative directions to decorrelate and denoise:
\[
h_t = \mathrm{PCA}_k(e_t)\in\mathbb{R}^{k}\quad\text{or}\quad h_t = U^\top e_t,
\]
with $U$ learned on a separate calibration set to avoid leakage. Maintain a rolling reference $\bar{h}$ via exponential averaging $\bar{h}\leftarrow(1-\rho)\bar{h}+\rho h_t$ (e.g., $\rho=0.01$).

\paragraph{Calibrated mapping to bounded space.}
Map $h_t$ to the ECD observation/target space with a learned, monotone-bounded calibrator $g_\theta$:
\[
\widehat{\mathbf{y}}_t = g_\theta(h_t) \in [\ell,u]^n,\qquad
g_\theta(z) = \ell + (u-\ell)\,\sigma( A z + b ),
\]
where $\sigma$ is applied elementwise (e.g., logistic), and $(A,b)$ are fitted on labeled calibration data. This ensures $\widehat{\mathbf{y}}_t\in[\ell,u]^n$.

\paragraph{Use within ECD.}
Set $\mathbf{y}_t = \widehat{\mathbf{y}}_t$ in the observation model or use $\widehat{\mathbf{y}}_t$ as the alignment target in Eq.~\eqref{eq:ecd}. Estimation then proceeds with $\widehat{\mathbf{y}}_t$ substituted.

\paragraph{Semantic drift monitoring.}
Define the Median Absolute Deviation $\mathrm{MAD}(h)=\mathrm{median}_\tau \|h_\tau-\mathrm{median}(h)\|_2$. Monitor
\[
\mathrm{Drift}(t) \;=\; \frac{\|h_t - \bar{h}\|_2}{\mathrm{MAD}(h)} ,\qquad
\Delta_{\text{map}}(t) \;=\; \|\widehat{\mathbf{y}}_t - g_{\theta^\star}(h_t)\|_2,
\]
where $g_{\theta^\star}$ is the frozen calibrator from initial training. Threshold exceedance triggers recalibration of $U$ and/or $(A,b)$ on held-out validation. This audits the semantics-to-state mapping without altering ECD dynamics.

\paragraph{Validation.}
Validate $g_\theta$ against external ground truth aligned to coordinates (e.g., human ratings, task scores) using correlation and calibration error on a held-out set.

\section{Falsifiable Predictions and Experimental Protocol}
\textbf{P1 (Sensitivity):} At a fixed point, with Jacobian $J=\partial T/\partial \mathbf{s}|_{\mathbf{s}^\star}$ and $\|J\|_2<1$, the linear response is
\[
\frac{\partial \mathbf{s}^\star}{\partial C_{ij}} = (I-J)^{-1}\,\frac{\partial T}{\partial C_{ij}}\Big|_{\mathbf{s}^\star}.
\]
Because $\|J\|_2<1$, $(I-J)^{-1}=\sum_{k\ge0}J^k$ is well-defined (Neumann series). Detectability threshold: changes smaller than $z_{0.95}\hat{\sigma}_j$ in coordinate $j$ are not considered significant.

\noindent\textbf{P2 (Convergence rate):} $\tau_{95}\approx \ln(0.05)/\ln L$. A deviation $>50\%$ falsifies the contraction-rate prediction.

\noindent\textbf{P3 (Boundary compliance):} Projection residual $\|\Pi(\cdot)-\cdot\|_\infty\le 10^{-6}$; larger systematic residuals falsify the bounded-state assumption (or implementation).

\paragraph{Validation protocol (synthetic).}
Data length: $T \ge 20|\theta|$ (prefer $50|\theta|$ in high noise). Train/val/test split (e.g., $60/20/20$). Baselines: LTI and Neural (Euler-MLP) with matched parameter counts. Metrics: Test RMSE, $\tau_{95}$, and $\|\hat{C}-C\|_F$. Report mean $\pm$ std over multiple random seeds.

\section{Canonical Synthetic Configuration and Theoretical Numbers}
We use $n=3$, $[\ell,u]=[0,1]$, scaled logistic ($L_\phi\le 1/4$), $(\alpha,\gamma,\lambda)=(0.7,0.2,0.3)$, and
\[
C=\begin{pmatrix}
0 & -0.45 & 0.62\\
0.51 & 0 & -0.38\\
0.33 & -0.41 & 0
\end{pmatrix},\qquad \|C\|_2\approx 1.6 \text{ (empirical)}.
\]
From \eqref{eq:Lfull}, $L\approx \Lbound$, predicting $\tau_{95}^{\mathrm{pred}}\approx \TauPred$ steps.

\section{Results (Auto-populated after running Appendix code)}
\noindent\textbf{Note.} Values below are auto-populated by \texttt{results.tex} after running the Appendix code. Before first execution, placeholders are shown.

\medskip
\noindent\textbf{Quantitative outcomes (mean $\pm$ std over seeds):}
\begin{itemize}
  \item Observed convergence (constant input): $\tau_{95}^{\mathrm{obs}}=\TauObsMean \pm \TauObsStd$.
  \item Test RMSE: ECD $=\RMSEecdMean \pm \RMSEecdStd$, LTI $=\RMSEltiMean \pm \RMSEltiStd$, Neural $=\RMSEnnMean \pm \RMSEnnStd$.
  \item Coupling recovery: $\|\hat{C}-C\|_F=\CrecovMean \pm \CrecovStd$.
\end{itemize}

\noindent\textbf{Table 1.} Test RMSE (lower is better; mean $\pm$ std).
\begin{center}
\begin{tabular}{l S S}
\toprule
Model & {RMSE (mean)} & {RMSE (std)} \\
\midrule
ECD & \RMSEecdMean & \RMSEecdStd \\
LTI & \RMSEltiMean & \RMSEltiStd \\
Neural (Euler-MLP) & \RMSEnnMean & \RMSEnnStd \\
\bottomrule
\end{tabular}
\end{center}

\begin{figure}[h]
  \centering
  \includegraphics[width=0.7\textwidth]{ecd_figs/C_heatmaps.png}
  \caption{True coupling matrix $C$ vs.\ estimated $\hat{C}$ (representative seed).}
  \label{fig:C}
\end{figure}

\begin{figure}[h]
  \centering
  \includegraphics[width=0.55\textwidth]{ecd_figs/convergence.png}
  \caption{Convergence trajectory $\|\mathbf{s}_t - \mathbf{s}^\star\|$ (log scale) under constant input (representative seed).}
  \label{fig:conv}
\end{figure}

\begin{figure}[h]
  \centering
  \includegraphics[width=0.75\textwidth]{ecd_figs/test_traj_comp1.png}
  \caption{Test set trajectory (component 1): ECD vs.\ LTI vs.\ Neural baseline (representative seed).}
  \label{fig:traj}
\end{figure}

\section{Discussion}
ECD combines bounded EMA-like updates, explicit coupling $C$, and an alignment branch into a contractive discrete-time system with falsifiable predictions and a direct estimation route. In the canonical synthetic setup, the predicted relaxation time from the contraction bound provides a quantitative target; empirical $\tau_{95}$ and RMSE vs.\ baselines validate usefulness beyond theory. A simple parameter-sensitivity sweep (Appendix code) can summarize how RMSE and $\tau_{95}$ vary with $(\alpha,\gamma,\lambda)$.

\subsection*{Ablation (implemented in code)}
\begin{itemize}
  \item Remove alignment ($\gamma=0$): expect slower or no convergence.
  \item Remove coupling ($C=0$): expect degenerate dynamics and poorer fit.
  \item Vary $\alpha$ and $\lambda$: measure sensitivity to mixing rates.
\end{itemize}

\paragraph{Limitation explicitly noted.} This version contains \emph{no real-world case study}. All quantitative results are synthetic. Adding one real dataset (affective dynamics, conceptual network, or control benchmark) is left to future work and would strengthen external validity.

\section{Related Work (Positioning, non-exhaustive)}
\textbf{Neural ODEs.} Continuous-depth modeling of residual networks.\\
\textbf{Stable/Lipschitz RNNs.} Control-theoretic constraints for global stability.\\
\textbf{Consensus/alignment (Cucker--Smale, Kuramoto).} Alignment terms yield flocking/synchronization.\\
\textbf{Lotka--Volterra.} Classical coupled dynamics with rich regimes.\\
\textbf{Contraction theory.} Unified exponential-convergence analysis for nonlinear systems.

\section{Limitations and Failure Modes}
Overparameterization (large $|\theta|$) vs.\ short sequences ($T$) harms identifiability; non-stationary regimes outside uniform contraction may violate theory; selecting $n$ (state dimension) remains application-dependent; semantic labels (if used) require external ground truth; computational scaling $O(Tn^2)$ per epoch—use sparse/structured $C$, truncated BPTT, or block-structured coupling.

% ============================================================
% Appendix: Reproducible Python (modular, early stopping, multi-seed)
% ============================================================
\appendix
\section{Reproducible Synthetic Experiment (Inline Code)}
\noindent
\textbf{How to use.} Copy the following into \texttt{run\_ecd.py} and execute with Python~3.9+. It will: (1) generate data, (2) fit ECD ($C$ only) and baselines with validation and early stopping, across multiple seeds, (3) save figures to \texttt{ecd\_figs/}, (4) write \texttt{results.tex} overriding macros used in the Results section. Recompile this \LaTeX{} to include the numbers and plots.

\begin{lstlisting}[style=py]
import os, math, sys
import numpy as np
import matplotlib.pyplot as plt
import torch, torch.nn as nn, torch.optim as optim

# --------------------- Hyperparameters (documented) ---------------------
N_SEEDS = 5                 # multiple trials for mean ± std
T, Ttrain = 200, 150        # sequence length, train cutoff
VAL_FRAC = 0.2              # split from train for early stopping
LR_C = 0.05                 # learning rate for ECD (C optimization)
EPOCHS_C = 600              # max epochs for ECD
EARLY_PATIENCE = 25         # early stopping patience (epochs)
L1_C, L2_C = 1e-3, 1e-3     # regularization strengths (sparsity, shrinkage)
LR_NN, EPOCHS_NN = 0.01, 300
HIDDEN_NN = 32
RHO_X = 0.8                 # AR(1) input smoothness
SNR_DB = 15                 # observation SNR in dB
ELL, UPP = 0.0, 1.0         # bounds
ALPHA, GAMMA, LAMBDA = 0.7, 0.2, 0.3  # ECD mix rates
ABLATE_ALIGNMENT = False    # if True, set gamma=0 during fitting
ABLATE_COUPLING = False     # if True, fit C=0 baseline ECD

# --------------------- Utility / Safety ---------------------
def safe_mkdir(path: str):
    try:
        os.makedirs(path, exist_ok=True)
    except Exception as e:
        print(f"[WARN] Could not create directory {path}: {e}", file=sys.stderr)

safe_mkdir("ecd_figs")

def sigma_np(z): return 1/(1+np.exp(-z))
def phi_np(z): return ELL + (UPP-ELL)*sigma_np(z)
def clip_bounds(s): return np.clip(s, ELL, UPP)

def torch_phi(z): return torch.sigmoid(z)  # scaled by clipping later

# --------------------- Data generation (synthetic) ---------------------
def gen_synthetic(n=3, m=3, rho=0.8, snr_db=15, T=200, C_true=None, seed=42):
    rng = np.random.default_rng(seed)
    # inputs x: AR(1)
    x = np.zeros((T, m)); xi = rng.standard_normal((T, m))
    for t in range(1, T):
        x[t] = rho*x[t-1] + xi[t]
    # latent s
    if C_true is None:
        C_true = np.array([[0.0, -0.45, 0.62],
                           [0.51, 0.0, -0.38],
                           [0.33, -0.41, 0.0]], float)
    s = np.zeros((T, n)); s[0] = rng.random(n)
    def ecd_step(s_vec, C):
        r = C @ s_vec
        u_t = phi_np(r)
        p = (1 - ALPHA)*s_vec + ALPHA*u_t
        g = 0.0 if ABLATE_ALIGNMENT else GAMMA
        q = (1 - g)*p + g*s_vec  # no target in data gen
        return clip_bounds(q)
    for t in range(T-1):
        s[t+1] = ecd_step(s[t], C_true)
    # observations y = s + noise
    sig_pow = np.mean(np.sum(s**2, axis=1))
    snr = 10**(snr_db/10)
    noise_var = sig_pow/snr
    y = s + rng.standard_normal((T, n))*np.sqrt(noise_var)
    return x, s, y, C_true

# --------------------- ECD training (optimize C only) ---------------------
def train_ecd_C(y_tr, val_frac=0.2, lr=LR_C, epochs=EPOCHS_C,
                l1=L1_C, l2=L2_C, seed=0):
    torch.manual_seed(seed)
    n = y_tr.shape[1]
    # train/val split
    Ttr = y_tr.shape[0]
    Tval = int(Ttr*val_frac)
    Y_val = torch.tensor(y_tr[-Tval:], dtype=torch.float32)
    Y_trn = torch.tensor(y_tr[:-Tval], dtype=torch.float32)
    s0 = Y_trn[0]
    # init C
    C_param = torch.randn(n, n)*0.01
    if ABLATE_COUPLING:
        C_param = torch.zeros(n, n)
    C_param = nn.Parameter(C_param)
    opt = optim.Adam([C_param], lr=lr)

    best_val = float('inf'); best_C = None; patience = 0
    g = 0.0 if ABLATE_ALIGNMENT else GAMMA

    def rollout(Ct, s0_t, steps):
        st = s0_t; traj=[st]
        for _ in range(steps):
            r = Ct @ st
            u_t = torch_phi(r)
            p = (1 - ALPHA)*st + ALPHA*u_t
            q = (1 - g)*p + g*st
            st = torch.clamp(q, ELL, UPP)
            traj.append(st)
        return torch.stack(traj, dim=0)

    for ep in range(epochs):
        opt.zero_grad()
        traj = rollout(C_param, Y_trn[0], len(Y_trn)-1)
        loss = torch.mean((traj - Y_trn)**2) + l1*torch.sum(torch.abs(C_param)) + l2*torch.sum(C_param**2)
        loss.backward(); opt.step()
        # validation
        with torch.no_grad():
            traj_v = rollout(C_param, Y_val[0], len(Y_val)-1)
            val = torch.mean((traj_v - Y_val)**2).item()
        if val < best_val - 1e-6:
            best_val = val; best_C = C_param.detach().clone(); patience = 0
        else:
            patience += 1
            if patience >= EARLY_PATIENCE: break
    return best_C.numpy()

# --------------------- Baselines ---------------------
def fit_lti(y_tr, x_tr):
    Yt = y_tr[:-1].T; Yt1 = y_tr[1:].T; Xt = x_tr[:-1].T
    Z = np.vstack([Yt, Xt])         # [y_t; x_t]
    A_B = Yt1 @ np.linalg.pinv(Z)   # solve Y_{t+1} = [A B][y_t; x_t]
    A = A_B[:, :Yt.shape[0]]; B = A_B[:, Yt.shape[0]:]
    return A, B

class EulerMLP(nn.Module):
    def __init__(self, din, hidden, dout):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(din, hidden), nn.Tanh(), nn.Linear(hidden, dout))
    def forward(self, x): return self.net(x)

def fit_nn(y_tr, x_tr, lr=LR_NN, epochs=EPOCHS_NN, hidden=HIDDEN_NN, seed=0):
    torch.manual_seed(seed)
    n = y_tr.shape[1]; m = x_tr.shape[1]
    Xtr = np.hstack([y_tr[:-1], x_tr[:-1]]).astype(np.float32)
    Ytr = y_tr[1:].astype(np.float32)
    Xtr_t = torch.tensor(Xtr); Ytr_t = torch.tensor(Ytr)
    mlp = EulerMLP(n+m, hidden, n); opt = optim.Adam(mlp.parameters(), lr=lr)
    best = float('inf'); best_sd=None; patience=0
    for ep in range(epochs):
        opt.zero_grad()
        pred = mlp(Xtr_t); loss = torch.mean((pred - Ytr_t)**2)
        loss.backward(); opt.step()
        cur = loss.item()
        if cur < best - 1e-7:
            best = cur; best_sd = {k: v.clone() for k,v in mlp.state_dict().items()}; patience=0
        else:
            patience += 1
            if patience >= EARLY_PATIENCE: break
    if best_sd is not None: mlp.load_state_dict(best_sd)
    return mlp

# --------------------- Evaluation helpers ---------------------
def rollout_ecd_numpy(C_hat, y0, steps, gamma=GAMMA):
    traj=[y0.copy()]
    st=y0.copy()
    g = 0.0 if ABLATE_ALIGNMENT else gamma
    for _ in range(steps):
        r = C_hat @ st
        u_t = phi_np(r)
        p = (1 - ALPHA)*st + ALPHA*u_t
        q = (1 - g)*p + g*st
        st = clip_bounds(q); traj.append(st.copy())
    return np.stack(traj, axis=0)

def rmse(a, b): return float(np.sqrt(np.mean((a-b)**2)))

def fixed_point(C, s0=None, maxit=5000, tol=1e-10):
    n = C.shape[0]
    st = np.zeros(n) if s0 is None else s0.copy()
    for _ in range(maxit):
        stn = rollout_ecd_numpy(C, st, 1)[-1]
        if np.linalg.norm(stn - st) < tol: return stn
        st = stn
    return st

def obs_tau95(C, s0):
    s_star = fixed_point(C, s0=s0)
    d0 = np.linalg.norm(s0 - s_star); thr = 0.05*d0
    st = s0.copy()
    for t in range(1, 2000):
        st = rollout_ecd_numpy(C, st, 1)[-1]
        if np.linalg.norm(st - s_star) <= thr:
            return float(t)
    return float('nan')

# --------------------- Main multi-seed experiment ---------------------
def run_experiment(n=3, m=3, seeds=None):
    if seeds is None: seeds = list(range(N_SEEDS))
    rmse_ecd, rmse_lti, rmse_nn = [], [], []
    crec_list, tau_obs_list = [], []
    # parameter-sensitivity example (alpha/gamma/lambda grid) kept minimal for runtime
    for seed in seeds:
        x, s, y, C_true = gen_synthetic(n=n, m=m, rho=RHO_X, snr_db=SNR_DB, T=T, seed=seed)
        x_tr, x_te = x[:Ttrain], x[Ttrain:]
        y_tr, y_te = y[:Ttrain], y[Ttrain:]
        # ECD (C only)
        C_hat = train_ecd_C(y_tr, seed=seed)
        y_pred_ecd = rollout_ecd_numpy(C_hat, y_tr[-1], len(y_te)-1)
        rmse_ecd.append(rmse(y_pred_ecd, y_te))
        crec_list.append(float(np.linalg.norm(C_hat - C_true, 'fro')))
        # LTI
        A, B = fit_lti(y_tr, x_tr)
        y_lti=[]; yt=y_tr[-1]
        for t in range(len(y_te)):
            xt = x_te[t] if t<len(x_te) else np.zeros(m)
            yt1 = A @ yt + B @ xt
            y_lti.append(yt1); yt = yt1
        rmse_lti.append(rmse(np.array(y_lti), y_te))
        # Neural
        mlp = fit_nn(y_tr, x_tr, seed=seed)
        y_nn=[]; yt=y_tr[-1]
        for t in range(len(y_te)):
            inp = np.hstack([yt, x_te[t]]).astype(np.float32)
            with torch.no_grad(): yt1 = mlp(torch.tensor(inp)).numpy()
            y_nn.append(yt1); yt = yt1
        rmse_nn.append(rmse(np.array(y_nn), y_te))
        # tau95 (constant input case) using true C for theory-vs-observation
        tau_obs = obs_tau95(C_true, s[0])
        tau_obs_list.append(tau_obs)

    # Aggregate
    def mean_std(arr):
        a = np.array(arr, float); return float(np.mean(a)), float(np.std(a, ddof=1)) if len(a)>1 else 0.0
    out = {
        "rmse_ecd": mean_std(rmse_ecd),
        "rmse_lti": mean_std(rmse_lti),
        "rmse_nn":  mean_std(rmse_nn),
        "crec":     mean_std(crec_list),
        "tau_obs":  mean_std(tau_obs_list),
    }
    return out, C_true, C_hat, y_te, y_pred_ecd, np.array(y_lti), np.array(y_nn)

# --------------------- Figures and results writer ---------------------
def plot_and_save(C_true, C_hat, y_te, y_pred_ecd, y_pred_lti, y_pred_nn):
    # C heatmaps
    fig1, ax = plt.subplots(1,2, figsize=(8,3))
    ax[0].imshow(C_true, aspect='auto'); ax[0].set_title("True C")
    ax[1].imshow(C_hat,  aspect='auto'); ax[1].set_title("Estimated C")
    plt.tight_layout(); plt.savefig("ecd_figs/C_heatmaps.png", dpi=200); plt.close(fig1)
    # Convergence (representative)
    dists=[]; st=y_te[0].copy(); C = C_true
    s_star = fixed_point(C, s0=st)
    for t in range(60):
        dists.append(np.linalg.norm(st - s_star)); st = rollout_ecd_numpy(C, st, 1)[-1]
    fig2, ax = plt.subplots(figsize=(5,3))
    ax.plot(range(len(dists)), dists); ax.set_yscale('log')
    ax.set_xlabel("t"); ax.set_ylabel(r"||s_t - s^*||")
    ax.set_title("Convergence (constant input)")
    plt.tight_layout(); plt.savefig("ecd_figs/convergence.png", dpi=200); plt.close(fig2)
    # Test trajectory (component 1)
    fig3, ax = plt.subplots(figsize=(6,3))
    ax.plot(y_te[:,0], label="True")
    ax.plot(y_pred_ecd[:,0], label="ECD", linestyle='--')
    ax.plot(y_pred_lti[:,0], label="LTI", linestyle=':')
    ax.plot(y_pred_nn[:,0], label="Neural")
    ax.set_xlabel("t (test)"); ax.set_ylabel("Comp 1")
    ax.set_title("Test Trajectory (comp 1)"); ax.legend()
    plt.tight_layout(); plt.savefig("ecd_figs/test_traj_comp1.png", dpi=200); plt.close(fig3)

def write_results_tex(L_bound, tau_pred, summary):
    try:
        with open("results.tex", "w") as f:
            f.write("% Auto-written by run_ecd.py\n")
            f.write("\\renewcommand{\\Lbound}{%.3f}\n" % L_bound)
            f.write("\\renewcommand{\\TauPred}{%.2f}\n" % tau_pred)
            f.write("\\renewcommand{\\TauObsMean}{%.2f}\n" % summary["tau_obs"][0])
            f.write("\\renewcommand{\\TauObsStd}{%.2f}\n" % summary["tau_obs"][1])
            f.write("\\renewcommand{\\RMSEecdMean}{%.4f}\n" % summary["rmse_ecd"][0])
            f.write("\\renewcommand{\\RMSEecdStd}{%.4f}\n" % summary["rmse_ecd"][1])
            f.write("\\renewcommand{\\RMSEltiMean}{%.4f}\n" % summary["rmse_lti"][0])
            f.write("\\renewcommand{\\RMSEltiStd}{%.4f}\n" % summary["rmse_lti"][1])
            f.write("\\renewcommand{\\RMSEnnMean}{%.4f}\n" % summary["rmse_nn"][0])
            f.write("\\renewcommand{\\RMSEnnStd}{%.4f}\n" % summary["rmse_nn"][1])
            f.write("\\renewcommand{\\CrecovMean}{%.4f}\n" % summary["crec"][0])
            f.write("\\renewcommand{\\CrecovStd}{%.4f}\n" % summary["crec"][1])
        print("[OK] Wrote results.tex")
    except Exception as e:
        print(f"[WARN] Could not write results.tex: {e}", file=sys.stderr)

# --------------------- Entry point ---------------------
if __name__ == "__main__":
    # Theoretical L_bound and tau_pred from canonical config
    Lphi = (UPP-ELL)/4.0
    Cnorm_assumed = 1.6   # empirical spectral norm in canonical C
    g = 0.0 if ABLATE_ALIGNMENT else GAMMA
    L_bound = (1 - g)*((1 - ALPHA) + ALPHA*Lphi*Cnorm_assumed) + g*(1 - LAMBDA)
    tau_pred = float(np.log(0.05)/np.log(L_bound))
    # Run experiment
    summary, C_true, C_hat, y_te, y_ecd, y_lti, y_nn = run_experiment(seeds=list(range(N_SEEDS)))
    # Figures
    plot_and_save(C_true, C_hat, y_te, y_ecd, y_lti, y_nn)
    # Report
    print("Results (mean ± std over seeds):")
    print(f"  L_bound             = {L_bound:.3f}")
    print(f"  tau_95 predicted    = {tau_pred:.2f} steps")
    print(f"  tau_95 observed     = {summary['tau_obs'][0]:.2f} ± {summary['tau_obs'][1]:.2f}")
    print(f"  RMSE_ECD            = {summary['rmse_ecd'][0]:.4f} ± {summary['rmse_ecd'][1]:.4f}")
    print(f"  RMSE_LTI            = {summary['rmse_lti'][0]:.4f} ± {summary['rmse_lti'][1]:.4f}")
    print(f"  RMSE_NN             = {summary['rmse_nn'][0]:.4f} ± {summary['rmse_nn'][1]:.4f}")
    print(f"  ||C_hat - C||_F     = {summary['crec'][0]:.4f} ± {summary['crec'][1]:.4f}")
    # Write LaTeX overrides
    write_results_tex(L_bound, tau_pred, summary)
\end{lstlisting}

\section{Implementation Notes}
\textbf{Code organization.} The code is modular: separate functions for data generation, ECD training (with validation and early stopping), baselines, plotting, and results writing.\\
\textbf{Error handling.} Directory creation and file writing are wrapped with basic error reporting.\\
\textbf{Early stopping and validation.} ECD and neural baselines use a validation split of the training set with patience-based stopping.\\
\textbf{Hyperparameters.} Documented at the top of the script; recommended to increase $N\_SEEDS$ for tighter CIs.\\
\textbf{Complexity.} BPTT cost scales as $O(Tn^2)$ for dense $C$; prefer sparse/structured $C$ and truncated BPTT for large $n$ or $T$.

\section*{Acknowledgments and Authorship}
C.\,L.\,Vaillant: conception, formal theory, LLM mapping specification, implementation design, writing. This version contains no real-world dataset; all results are synthetic by design.

\section*{References (indicative, non-exhaustive)}
\small
\begin{itemize}
\item Chen, R.T.Q.\ et al.\ (2018). Neural Ordinary Differential Equations.
\item Erichson, N.B.\ et al.\ (2021). Lipschitz Recurrent Neural Networks.
\item Cucker, F. \& Smale, S. (2007). Emergent behavior in flocks.
\item Kuramoto, Y. (1975/84). Self-entrainment of a population of oscillators.
\item Lotka, A. (1925); Volterra, V. (1926). Elements of physical biology / population dynamics.
\item Lohmiller, W. \& Slotine, J.-J.E. (1998). On contraction analysis for nonlinear systems.
\item Meyn, S.P. \& Tweedie, R.L. (2009). Markov Chains and Stochastic Stability, Theorem~15.0.1.
\item Kushner, H. \& Yin, G. (2003). Stochastic Approximation and Recursive Algorithms and Applications, Chapter~2.
\item Granas, A. \& Dugundji, J. (2003). Fixed Point Theory. % for perturbation perspective
\end{itemize}

\end{document}
```0
Cody Vaillant
Emergent Coupling Dynamics:

Identifying Phase Transitions in Recursive Cognition: