\documentclass[11pt,english]{article}
% ------------------------------------------------------------
% Packages
% ------------------------------------------------------------
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{lmodern}
\usepackage[a4paper,margin=1in]{geometry}
\usepackage{babel}
\usepackage{microtype}
\usepackage{amsmath,amssymb,amsthm,mathtools}
\usepackage{booktabs}
\usepackage{tabularx}
\usepackage{array}
\usepackage{enumitem}
\usepackage{xcolor}
\usepackage{hyperref}
\usepackage{tikz}
\usepackage{algorithm}
\usepackage{algpseudocode}
\usepackage{float}
\usepackage{caption}
\usetikzlibrary{arrows.meta,positioning,calc,fit,shapes.geometric}
% ------------------------------------------------------------
% Hyperref setup
% ------------------------------------------------------------
\hypersetup{
colorlinks=true,
linkcolor=blue!50!black,
citecolor=blue!50!black,
urlcolor=blue!50!black,
pdftitle={Understanding as Dependency Revision},
pdfauthor={C. L. Vaillant}
}
% ------------------------------------------------------------
% Theorem environments
% ------------------------------------------------------------
\newtheorem{definition}{Definition}
\newtheorem{hypothesis}{Hypothesis}
\newtheorem{proposition}{Proposition}
\newtheorem{claim}{Claim}
\newtheorem{assumption}{Assumption}
\newtheorem{remark}{Remark}
\newtheorem{example}{Example}
% ------------------------------------------------------------
% Custom commands
% ------------------------------------------------------------
\newcommand{\SCDD}{\mathrm{SCDD}}
\newcommand{\DCA}{\mathrm{DCA}}
\newcommand{\LossD}{\mathcal{L}_{D}}
\newcommand{\Graph}{\mathcal{G}}
\newcommand{\Types}{\mathcal{T}}
\newcommand{\Contexts}{\mathcal{C}}
\newcommand{\Obs}{\mathcal{O}}
\newcommand{\Revision}{\mathcal{R}}
\newcommand{\Support}{\mathcal{S}}
\newcommand{\Poss}{\mathcal{P}}
\newcommand{\Act}{\mathcal{A}}
\newcommand{\Cand}{\mathcal{E}_{\mathrm{cand}}}
\newcommand{\Domain}{\mathcal{D}}
\newcommand{\Ctx}{\mathcal{X}}
\newcommand{\Edges}{\mathcal{E}}
\newcommand{\Nodes}{\mathcal{V}}
\newcommand{\Evidence}{\mathcal{Z}}
\newcommand{\TrueGraph}{G^\star}
\newcommand{\Edit}{\mathsf{edit}}
\newcommand{\Cost}{\mathsf{Cost}}
\newcommand{\Complexity}{\mathsf{K}}
\newcommand{\Mismatch}{\mathsf{M}}
\newcommand{\Risk}{\mathsf{Risk}}
\newcommand{\E}{\mathbb{E}}
\DeclareMathOperator{\Compress}{Compress}
\DeclareMathOperator{\Trace}{Trace}
\DeclareMathOperator{\Project}{Project}
\DeclareMathOperator{\Recombine}{Recombine}
\DeclareMathOperator{\argmin}{arg\,min}
\DeclareMathOperator{\argmax}{arg\,max}
\DeclareMathOperator{\softmax}{softmax}
% ------------------------------------------------------------
% Title
% ------------------------------------------------------------
\title{\textbf{Understanding as Dependency Revision:}\\
A Dependency-Centric Theory of Intelligence}
\author{
C. L. Vaillant\\
\small Recursive Generative Emergence Project\\
\small \texttt{www.rgemergence.com}
}
\date{\today}
\begin{document}
\maketitle
\begin{abstract}
Contemporary theories of intelligence often define learning in terms of prediction, optimization, inference, reward maximization, or compression. These approaches have produced powerful models, but they leave open a deeper question: what is the object being updated when a system comes to understand? This paper proposes that understanding is best modeled as the maintenance and revision of dependency structures: typed graphs of enabling conditions that represent what makes outcomes possible, valid, stable, or coherent. On this view, prediction is not the fundamental aim of intelligence but a downstream consequence of a well-formed dependency model. The proposed framework, Structural-Contextual Dependency Discovery, formalizes learning as a cycle of dependency loss, dependency credit assignment, and graph revision. The theory draws from causal modeling, Bayesian network structure learning, belief revision, truth-maintenance systems, predictive processing, graph networks, mechanistic interpretability, causal representation learning, world models, and automated scientific discovery. Its central contribution is to treat dependency revision itself as the general substrate of understanding. The paper defines a typed dependency taxonomy, introduces operational forms of dependency loss and credit assignment, specifies graph revision as constrained edit selection, proves local loss-reduction and context-transfer propositions under stated assumptions, gives a numeric worked example, and proposes benchmark metrics for empirical evaluation.
\end{abstract}
\noindent\textbf{Keywords:}
dependency revision; understanding; intelligence; causal reasoning; graph learning; belief revision; mechanistic interpretability; predictive processing; scientific discovery; structural learning
\tableofcontents
\section{Introduction}
A predictive system asks what is likely to happen. An understanding system asks what makes an outcome possible.
This distinction matters because prediction can succeed without understanding. A model may learn that a dropped glass tends to break without representing the enabling structure behind the event: gravity, height, material brittleness, floor hardness, impact geometry, prior stress, and context. A dependency model, by contrast, represents the conditions under which the outcome becomes possible, impossible, likely, unlikely, robust, fragile, or transformable.
Modern machine learning often updates parameters. Bayesian systems update beliefs. Reinforcement learning systems update policies or value estimates. Causal models update structural assumptions about variables and interventions. Symbolic systems revise rules or belief sets. Each of these captures part of learning, but none alone fully captures the phenomenon targeted in this paper: the revision of the structure that defines what the system takes to be possible.
The proposal advanced here is that intelligence is the capacity to construct, evaluate, and recursively revise models of enabling conditions across scales.
This is not a rejection of prediction. Rather, prediction is treated as one expression of a deeper competence: maintaining a coherent model of dependency. A system predicts well when its dependency model captures the conditions that generate outcomes. It explains well when it can trace those dependencies. It plans well when it can project them forward. It repairs itself when it can identify which dependency failed. It creates when it recombines dependencies into novel structures.
The framework developed here is called \emph{Structural-Contextual Dependency Discovery}, abbreviated as \(\SCDD\).
\section{Core Thesis}
\begin{hypothesis}[Dependency-Centric Intelligence]
Intelligence is the capacity to construct, evaluate, and recursively revise models of enabling conditions across scales.
\end{hypothesis}
\begin{definition}[Enabling Condition]
An enabling condition is a dependency whose satisfaction contributes to the possibility, validity, generation, stability, coherence, or transformation of a state, process, action, claim, or outcome.
\end{definition}
\begin{definition}[Understanding]
Understanding is the coherent representation of enabling dependencies.
\end{definition}
The central claim can be written compactly as
\[
\boxed{
\text{Understanding is not prediction alone; it is a model of what makes outcomes possible.}
}
\]
The corresponding theory of learning is
\[
\boxed{
\text{Learning is recursive dependency revision under failure.}
}
\]
A model that only predicts attempts to estimate
\[
P(O_{t+1}\mid O_{\leq t}),
\]
where \(O_t\) denotes observed outcomes. A dependency model instead attempts to learn
\[
\Poss(O_t)=\{d_1,d_2,\ldots,d_n\},
\]
where each \(d_i\) is an enabling condition that contributes to the possibility, stability, validity, coherence, or failure of \(O_t\).
\section{Contributions}
This paper makes six contributions.
\begin{enumerate}[leftmargin=1.2cm]
\item It proposes a dependency-centric account of understanding: understanding is the coherent representation of enabling conditions.
\item It formalizes understanding as a typed dependency graph
\[
G_t=(V_t,E_t,\tau,\chi,w,s,u),
\]
whose edges encode dependency type, context, strength, support, and uncertainty.
\item It introduces dependency loss, a composite failure signal that covers prediction, explanation, coherence, support, type consistency, contextual validity, complexity, and uncertainty.
\item It defines Dependency Credit Assignment, a graph-level analogue of credit assignment that localizes failure to nodes, edges, dependency types, contexts, or missing relations.
\item It defines graph revision as constrained graph-edit selection under loss, complexity, and revision-cost terms.
\item It proposes evaluation metrics and benchmark structure for testing whether dependency-revision systems outperform prediction-only systems on counterfactual transfer, explanation repair, type correction, and context-sensitive generalization.
\end{enumerate}
\section{Central Update Rule}
Let \(G_t\) denote the system's current dependency graph at revision cycle \(t\). Let \(O_t\) denote the observed outcome produced by inference, action, or interaction with the environment. Let
\[
\LossD(G_t,O_t)
\]
denote dependency loss, a measure of the system's failure to explain, predict, support, type, contextualize, or compress the observed outcome.
Let
\[
\DCA(\LossD(G_t,O_t))
\]
denote Dependency Credit Assignment: the mechanism that identifies which nodes, edges, assumptions, types, contexts, or missing dependencies contributed to the observed failure.
Let \(R\) denote the graph revision operator. Then the dependency revision update is
\[
\boxed{
G_{t+1}
=
R\!\left(
G_t,\;
\DCA\!\left(
\LossD(G_t,O_t)
\right)
\right).
}
\]
In plain language, understanding evolves by detecting where a dependency model fails, assigning responsibility to the assumptions that produced the failure, and revising the graph so future explanations and actions become more coherent.
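The update rule can be read as a loop over revision cycles. The following sketch is illustrative only: the horizon \(T\), the intermediate symbol \(\ell_t\), and the stream of outcomes \(O_0,\ldots,O_{T-1}\) are notational assumptions introduced here, and \(\LossD\), \(\DCA\), and \(R\) are treated as black boxes.
\begin{algorithm}[H]
\caption{Dependency revision loop (illustrative sketch)}
\begin{algorithmic}
\Require initial graph \(G_0\); outcomes \(O_0,\ldots,O_{T-1}\) (hypothetical stream)
\Ensure revised graph \(G_T\)
\For{\(t=0,\ldots,T-1\)}
\State \(\ell_t \gets \LossD(G_t,O_t)\) \Comment{dependency loss}
\State \(A_t \gets \DCA(\ell_t)\) \Comment{credit assignment}
\State \(G_{t+1} \gets R(G_t,A_t)\) \Comment{graph revision}
\EndFor
\State \Return \(G_T\)
\end{algorithmic}
\end{algorithm}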
\section{Relation to Existing Work}
\subsection{Causal Models}
Structural causal models represent causal relationships among variables and support reasoning about interventions and counterfactuals \cite{pearl2009causality}. \(\SCDD\) overlaps with causal modeling but is broader. A causal edge is one kind of dependency, but not every dependency is causal in the interventionist sense.
For example,
\[
\text{oxygen} \rightarrow \text{combustion}
\]
may be causal or enabling, while
\[
\text{three sides} \rightarrow \text{triangle}
\]
is constitutive rather than causal. Likewise,
\[
\text{type consistency} \rightarrow \text{valid formal inference}
\]
is logical or formal rather than physical. \(\SCDD\) therefore generalizes from causal graphs to typed dependency graphs.
\subsection{Causal Discovery and Bayesian Network Structure Learning}
Causal discovery attempts to infer causal structure from statistical evidence under assumptions \cite{spirtes2000causation}. Bayesian network structure learning studies how probabilistic graphical structures can be inferred from data \cite{koller2009probabilistic}. These fields are close to \(\SCDD\) because they treat structure as learnable.
The difference is that \(\SCDD\) is not limited to probabilistic or causal structure. Its target is enabling structure: the typed set of dependencies that determine what must hold for a claim, action, explanation, system, or outcome to be possible or valid.
\subsection{Belief Revision}
Belief revision theory studies how rational agents should change their beliefs when new information arrives. The AGM framework is the classical reference point \cite{alchourron1985logic,gardenfors1988knowledge}. In belief revision, the object of revision is typically a set of propositions or beliefs.
\(\SCDD\) changes the unit of revision. The object is not only a proposition set, but a dependency graph. The relevant question is not merely which belief should be removed, but which dependency made the failed belief possible and how the graph of support should be revised.
\subsection{Truth-Maintenance Systems}
Truth-maintenance systems provide one of the closest historical precedents. Doyle's truth-maintenance system represented dependencies among beliefs and used justifications to maintain consistency \cite{doyle1979truth}. Dependency-directed backtracking similarly recorded dependencies so that failures could be traced to responsible assumptions \cite{stallman1977forward}.
The honest novelty claim is therefore not that dependency tracing has never existed. It has. The stronger claim is that \(\SCDD\) generalizes this repair logic into a theory of intelligence itself. Truth-maintenance systems use dependency tracing to preserve consistency in symbolic reasoning systems. \(\SCDD\) proposes dependency revision as a general mechanism underlying learning, explanation, counterfactual reasoning, planning, creativity, self-correction, and scientific discovery.
\subsection{Predictive Processing and Active Inference}
Predictive processing describes cognition as hierarchical prediction and error correction \cite{clark2013whatever}. The free-energy principle similarly frames perception, action, and learning through the minimization of free energy or prediction error \cite{friston2010free}.
\(\SCDD\) is compatible with predictive processing but shifts the explanatory emphasis. Predictive processing treats prediction error as central. \(\SCDD\) treats prediction error as a diagnostic signal that reveals where the system's dependency model is deficient.
Thus,
\[
\text{prediction error}
\rightarrow
\text{dependency loss}
\rightarrow
\text{dependency revision}.
\]
Prediction error matters because it exposes failed assumptions about what makes outcomes possible.
\subsection{Graph Networks and Relational Inductive Bias}
Graph networks encode entities and relations directly and provide a general architecture for relational reasoning \cite{battaglia2018relational}. \(\SCDD\) agrees that relational structure is fundamental. However, graph networks primarily specify a computational architecture, whereas \(\SCDD\) specifies an interpretation of the graph: it is a model of enabling conditions.
The graph is not merely a convenient data structure. It is the system's current model of possibility.
\subsection{Mechanistic Interpretability}
Mechanistic interpretability attempts to reverse-engineer learned neural systems into understandable circuits, features, and algorithms \cite{olah2020zoom}. \(\SCDD\) complements this project by suggesting a target representation: the dependency graph that explains why a model produced a result.
A mechanistic interpretability result may identify circuits, features, or attention patterns. \(\SCDD\) asks how those components participate in enabling or disabling claims, predictions, actions, or explanations.
\subsection{Causal Representation Learning and World Models}
Causal representation learning connects machine learning with causal abstraction, transfer, and generalization \cite{scholkopf2021causal}. World-model approaches in reinforcement learning learn compressed representations of environments that support control \cite{ha2018world}.
\(\SCDD\) shares their ambition but shifts emphasis again. A world model represents an environment. A causal representation captures causal structure. A dependency graph represents enabling conditions across causal, logical, constitutive, contextual, temporal, semantic, evidential, normative, and functional dimensions.
\subsection{Automated Scientific Discovery}
Automated scientific discovery and symbolic regression seek human-interpretable structures, equations, and theories from data \cite{schmidt2009distilling}. \(\SCDD\) naturally fits this domain because science is not merely prediction. Scientific understanding involves discovering the dependencies that generate a phenomenon.
A scientific theory is powerful when it exposes the enabling structure behind observations.
\section{Comparison with Neighboring Frameworks}
Table~\ref{tab:comparison} summarizes the distinction between \(\SCDD\) and adjacent approaches.
\begin{table}[H]
\centering
\small
\begin{tabularx}{\linewidth}{
>{\raggedright\arraybackslash}p{2.65cm}
>{\raggedright\arraybackslash}p{2.95cm}
>{\raggedright\arraybackslash}p{2.85cm}
>{\raggedright\arraybackslash}X}
\toprule
\textbf{Framework} & \textbf{Primary object updated} & \textbf{Failure signal} & \textbf{Relation to \(\SCDD\)} \\
\midrule
Gradient-based machine learning & Parameters & Predictive or task loss & \(\SCDD\) may use gradients, but its target is graph-level dependency revision rather than parameter adjustment alone. \\
\midrule
Causal discovery & Causal graph structure & Statistical or interventional inconsistency & \(\SCDD\) includes causal dependency but also logical, contextual, constitutive, semantic, evidential, temporal, and normative dependencies. \\
\midrule
Bayesian network learning & Probabilistic graphical structure & Likelihood, posterior score, or structure prior & \(\SCDD\) is not restricted to probabilistic dependence; it models enabling conditions and validity contexts. \\
\midrule
Belief revision & Belief set or proposition set & Inconsistency with new information & \(\SCDD\) revises the dependency structure that supports beliefs, not only the beliefs themselves. \\
\midrule
Truth-maintenance systems & Justifications and assumptions & Contradiction or invalid support & \(\SCDD\) generalizes dependency tracing from local consistency maintenance to a broad theory of learning and understanding. \\
\midrule
Predictive processing & Hierarchical generative expectations & Prediction error & \(\SCDD\) treats prediction error as a diagnostic signal for dependency loss. \\
\midrule
Graph networks & Latent relational representations & Task loss & \(\SCDD\) gives a semantic interpretation to the graph as a model of enabling conditions. \\
\midrule
Mechanistic interpretability & Circuits, features, algorithms & Explanation gap or behavioral puzzle & \(\SCDD\) asks how identified mechanisms enable or disable claims, outputs, and actions. \\
\bottomrule
\end{tabularx}
\caption{Comparison between \(\SCDD\) and adjacent research traditions.}
\label{tab:comparison}
\end{table}
\section{Formal Framework}
Let the system maintain a typed dependency graph
\[
G_t = (V_t,E_t,\tau,\chi,w,s,u),
\]
where
\[
V_t
\]
is the set of nodes,
\[
E_t
\]
is the set of dependency edges,
\[
\tau:E_t\rightarrow \Types
\]
assigns a dependency type,
\[
\chi:E_t\rightarrow \Contexts
\]
assigns a context of validity,
\[
w:E_t\rightarrow \mathbb{R}
\]
assigns edge strength,
\[
s:E_t\rightarrow [0,1]
\]
assigns evidential support or confidence, and
\[
u:E_t\rightarrow [0,1]
\]
assigns uncertainty.
A dependency edge
\[
e_{ij}:v_i\rightarrow v_j
\]
means that \(v_i\) contributes to the possibility, validity, generation, support, or stability of \(v_j\) under context \(\chi(e_{ij})\).
\begin{definition}[Typed Dependency Graph]
A typed dependency graph is a graph whose edges encode not only the existence of a relation but also the kind of dependency, the context in which it holds, its strength, its evidential support, and its uncertainty.
\end{definition}
\begin{definition}[Graph-State Understanding]
A system's understanding at revision cycle \(t\) is the graph state \(G_t\), together with the procedures that use \(G_t\) for prediction, explanation, action, counterfactual reasoning, and revision.
\end{definition}
\section{Dependency Taxonomy}
A central requirement of \(\SCDD\) is that dependencies not be collapsed into a single generic relation. Different dependency types support different forms of revision.
Let
\[
\Types =
\{
T_{\mathrm{causal}},
T_{\mathrm{logical}},
T_{\mathrm{constitutive}},
T_{\mathrm{contextual}},
T_{\mathrm{temporal}},
T_{\mathrm{semantic}},
T_{\mathrm{evidential}},
T_{\mathrm{normative}},
T_{\mathrm{functional}}
\}.
\]
The following taxonomy is not exhaustive, but gives a working basis.
\begin{table}[H]
\centering
\small
\begin{tabularx}{\linewidth}{
>{\raggedright\arraybackslash}p{3.0cm}
>{\raggedright\arraybackslash}p{4.0cm}
>{\raggedright\arraybackslash}X}
\toprule
\textbf{Dependency type} & \textbf{Meaning} & \textbf{Example} \\
\midrule
Causal & One state or process contributes to producing another. & Heat contributes to thermal expansion. \\
\midrule
Logical & One proposition or rule supports formal inference to another. & Axioms enable a theorem. \\
\midrule
Constitutive & One property partly defines what another thing is. & Three-sidedness is constitutive of triangularity. \\
\midrule
Contextual & A relation holds only within a specified context. & Water boils near \(100^\circ\mathrm{C}\) at approximately 1 atm pressure. \\
\midrule
Temporal & One event or state depends on ordering, duration, or sequence. & Fertilization precedes embryonic development. \\
\midrule
Semantic & Meaning depends on symbol use, interpretation, or reference. & The meaning of a pronoun depends on its antecedent. \\
\midrule
Evidential & A claim is supported by observations or records. & A measurement supports an empirical estimate. \\
\midrule
Normative & A conclusion depends on a rule, value, duty, or standard. & A policy judgment depends on a fairness criterion. \\
\midrule
Functional & A component contributes to the operation of a system. & A valve enables pressure regulation. \\
\bottomrule
\end{tabularx}
\caption{A working taxonomy of dependency types used by \(\SCDD\).}
\label{tab:taxonomy}
\end{table}
The taxonomy matters because a failure in one dependency type should not automatically be repaired as though it were a failure in another. For example, a moral objection to a theorem may affect the theorem's use or social interpretation, but it does not by itself refute the theorem's formal validity. Conversely, a formally valid inference may still fail normatively, contextually, or evidentially.
\section{Assumptions and Scope Conditions}
The present version of \(\SCDD\) relies on the following assumptions.
\begin{assumption}[Representability]
The system's relevant understanding can be approximated by a typed dependency graph \(G_t\).
\end{assumption}
\begin{assumption}[Failure Detectability]
At least some failures of prediction, explanation, coherence, support, type consistency, contextual validity, or complexity can be measured by a dependency loss function \(\LossD\).
\end{assumption}
\begin{assumption}[Attribution Possibility]
For a nontrivial class of failures, responsibility can be assigned to a subset of nodes, edges, types, contexts, or missing dependencies.
\end{assumption}
\begin{assumption}[Revision Locality]
Many failures can be repaired through local or semi-local graph revision rather than complete graph reconstruction.
\end{assumption}
\begin{assumption}[Compression Requirement]
Because dependency graphs can grow without bound, graph revision must include compression, pruning, abstraction, or regularization.
\end{assumption}
\begin{assumption}[Contextual Recurrence]
Some contexts recur sufficiently often that corrected dependency structure can improve future performance.
\end{assumption}
These assumptions are not guaranteed in every domain. They specify the conditions under which \(\SCDD\) should be expected to operate.
\section{Dependency Loss}
Dependency loss measures where the graph fails as a model of possibility. Define
\[
\LossD(G_t,O_t)
=
\lambda_p L_{\mathrm{pred}}
+
\lambda_e L_{\mathrm{expl}}
+
\lambda_c L_{\mathrm{coh}}
+
\lambda_s L_{\mathrm{sup}}
+
\lambda_\tau L_{\mathrm{type}}
+
\lambda_x L_{\mathrm{ctx}}
+
\lambda_k L_{\mathrm{complexity}}
+
\lambda_u L_{\mathrm{uncertainty}},
\]
where
\[
L_{\mathrm{pred}}
\]
measures predictive error,
\[
L_{\mathrm{expl}}
\]
measures explanatory failure,
\[
L_{\mathrm{coh}}
\]
measures incoherence or contradiction,
\[
L_{\mathrm{sup}}
\]
measures lack of support,
\[
L_{\mathrm{type}}
\]
measures type inconsistency,
\[
L_{\mathrm{ctx}}
\]
measures contextual invalidity,
\[
L_{\mathrm{complexity}}
\]
penalizes unnecessary graph complexity, and
\[
L_{\mathrm{uncertainty}}
\]
penalizes overconfident unsupported dependencies.
A useful form of the uncertainty term is
\[
L_{\mathrm{uncertainty}}
=
\sum_{e\in E_t}
s(e)\,u(e)\,|w(e)|,
\]
which penalizes edges that are simultaneously strong, uncertain, and treated as supported.
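As a hedged numeric illustration, with values that are hypothetical rather than measured, consider a single edge \(e\) with \(|w(e)|=0.9\) and \(s(e)=0.8\). If \(u(e)=0.5\), its contribution to the uncertainty term is
\[
s(e)\,u(e)\,|w(e)| = 0.8\times 0.5\times 0.9 = 0.36,
\]
whereas reducing the uncertainty to \(u(e)=0.1\) lowers the contribution to
\[
0.8\times 0.1\times 0.9 = 0.072.
\]
Under this form, the penalty on a strong, supported edge scales linearly with its remaining uncertainty.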
Dependency loss thus departs from ordinary predictive loss in a key way. The system is penalized not only for being wrong, but also for reaching an answer, right or wrong, through structurally bad reasons.
A system may predict correctly but explain poorly. In \(\SCDD\), that still produces dependency loss. A system may produce a plausible answer from an invalid dependency. In \(\SCDD\), that also produces dependency loss. A system may generate a coherent local explanation that violates context. In \(\SCDD\), that too is a graph failure.
\section{Dependency Credit Assignment}
Dependency Credit Assignment identifies which nodes, edges, types, contexts, or abstractions contributed to dependency loss.
Let
\[
A_t = \DCA(\LossD(G_t,O_t)),
\]
where \(A_t\) is an attribution structure over the graph:
\[
A_t =
\{
a_v(v_i),
a_e(e_{ij}),
a_\tau(e_{ij}),
a_\chi(e_{ij}),
a_m(m_k)
\}.
\]
Here,
\[
a_v(v_i)
\]
estimates the contribution of node \(v_i\) to the failure,
\[
a_e(e_{ij})
\]
estimates the contribution of edge \(e_{ij}\),
\[
a_\tau(e_{ij})
\]
estimates the contribution of an incorrect dependency type,
\[
a_\chi(e_{ij})
\]
estimates the contribution of an incorrect context assignment, and
\[
a_m(m_k)
\]
estimates the contribution of a missing dependency candidate \(m_k\).
\subsection{Differentiable Credit Assignment}
In differentiable systems, \(\DCA\) may use gradients:
\[
a_e(e_{ij})
\propto
\left|
\frac{\partial \LossD}{\partial w(e_{ij})}
\right|.
\]
A large derivative indicates that small changes in the edge weight would strongly affect dependency loss.
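As a minimal worked instance, assume that only the uncertainty term of \(\LossD\) depends on \(w(e_{ij})\) and that \(\lambda_u\geq 0\); both are simplifying assumptions made here for illustration. Using \(L_{\mathrm{uncertainty}}=\sum_{e\in E_t}s(e)\,u(e)\,|w(e)|\), for \(w(e_{ij})\neq 0\),
\[
\frac{\partial \LossD}{\partial w(e_{ij})}
=
\lambda_u\,s(e_{ij})\,u(e_{ij})\,\operatorname{sign}\bigl(w(e_{ij})\bigr),
\qquad\text{so}\qquad
a_e(e_{ij})\propto \lambda_u\,s(e_{ij})\,u(e_{ij}).
\]
In this restricted case, gradient-based credit concentrates on edges that are both treated as supported and highly uncertain.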
\subsection{Symbolic Credit Assignment}
In symbolic systems, \(\DCA\) may use proof tracing, justification tracking, contradiction provenance, or dependency-directed backtracking. If a contradiction depends on assumptions \(a_1,a_2,a_3\), then those assumptions and their supporting edges receive credit for the failure.
\subsection{Counterfactual Credit Assignment}
A system may also estimate credit through graph interventions:
\[
a_e(e)
=
\LossD(G_t,O_t)
-
\LossD(G_t\setminus\{e\},O_t).
\]
If removing edge \(e\) reduces dependency loss, so that \(a_e(e)>0\), then \(e\) is likely contributing to failure.
Similarly, if adding a candidate dependency \(m\) reduces loss,
\[
a_m(m)
=
\LossD(G_t,O_t)
-
\LossD(G_t\cup \{m\},O_t),
\]
then \(m\) is a candidate missing dependency.
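The two ablation scores above can be computed by a simple leave-one-out and add-one-in sweep. The sketch below is one possible procedure, not a prescribed implementation; the candidate set \(E_{\mathrm{candidate}}\) is assumed to be supplied by some external proposal mechanism, and \(\ell\) is a shorthand introduced here.
\begin{algorithm}[H]
\caption{Counterfactual credit by single-edge ablation (illustrative sketch)}
\begin{algorithmic}
\Require graph \(G_t\) with edges \(E_t\); outcome \(O_t\); candidates \(E_{\mathrm{candidate}}\)
\Ensure credits \(a_e(e)\) for \(e\in E_t\) and \(a_m(m)\) for \(m\in E_{\mathrm{candidate}}\)
\State \(\ell \gets \LossD(G_t,O_t)\)
\For{each \(e\in E_t\)}
\State \(a_e(e) \gets \ell-\LossD(G_t\setminus\{e\},O_t)\) \Comment{positive if removal helps}
\EndFor
\For{each \(m\in E_{\mathrm{candidate}}\)}
\State \(a_m(m) \gets \ell-\LossD(G_t\cup\{m\},O_t)\) \Comment{positive if addition helps}
\EndFor
\State \Return \(\{a_e\},\{a_m\}\)
\end{algorithmic}
\end{algorithm}
This sweep requires \(1+|E_t|+|E_{\mathrm{candidate}}|\) loss evaluations and scores each edit in isolation, so it does not capture interactions between edits.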
\subsection{Bayesian Credit Assignment}
A probabilistic version of \(\DCA\) treats graph edits as hypotheses. Let \(h_r\) be the hypothesis that revision \(r\) is the correct repair. Then
\[
P(h_r\mid O_t,G_t)
\propto
P(O_t\mid h_r,G_t)P(h_r\mid G_t).
\]
The revision system can then prioritize edits by posterior plausibility:
\[
r_t^*
=
\argmax_{r\in\Act(G_t)}
P(h_r\mid O_t,G_t).
\]
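As a hedged numeric illustration, with all values hypothetical, suppose only two candidate repairs are considered: \(r_1\), which lowers the weight of an edge, and \(r_2\), which revises its context. Assume priors \(P(h_{r_1}\mid G_t)=0.7\) and \(P(h_{r_2}\mid G_t)=0.3\), and likelihoods \(P(O_t\mid h_{r_1},G_t)=0.1\) and \(P(O_t\mid h_{r_2},G_t)=0.6\). The unnormalized posteriors are \(0.1\times 0.7=0.07\) and \(0.6\times 0.3=0.18\), so
\[
P(h_{r_1}\mid O_t,G_t)=\frac{0.07}{0.07+0.18}=0.28,
\qquad
P(h_{r_2}\mid O_t,G_t)=\frac{0.18}{0.07+0.18}=0.72.
\]
In this toy case the context revision is preferred despite its lower prior, because it accounts for the observation far better.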
\subsection{Hybrid Credit Assignment}
In hybrid systems, \(\DCA\) may combine gradient attribution, ablation testing, counterfactual intervention, symbolic provenance, human correction, and uncertainty estimates:
\[
A_t
=
\alpha A_{\mathrm{grad}}
+
\beta A_{\mathrm{symbolic}}
+
\gamma A_{\mathrm{cf}}
+
\delta A_{\mathrm{human}}
+
\zeta A_{\mathrm{uncertainty}}.
\]
The purpose is not merely to assign blame. It is to localize the structural source of failure.
\section{Graph Revision}
The graph revision operator updates the dependency graph:
\[
G_{t+1}=R(G_t,A_t).
\]
Revision may include weakening unsupported edges,
\[
w(e) \leftarrow w(e)-\eta a_e(e),
\]
strengthening supported dependencies,
\[
s(e) \leftarrow s(e)+\eta \Delta s(e),
\]
pruning unsupported dependencies,
\[
E_{t+1}\leftarrow E_t\setminus\{e:s(e)<\epsilon\},
\]
adding candidate dependencies,
\[
E_{t+1}\leftarrow E_t\cup E_{\mathrm{candidate}},
\]
revising dependency types,
\[
\tau(e)\leftarrow \tau'(e),
\]
revising contexts,
\[
\chi(e)\leftarrow \chi'(e),
\]
increasing uncertainty when evidence is insufficient,
\[
u(e)\leftarrow \min(1,u(e)+\eta_u),
\]
and compressing redundant structure,
\[
G_{t+1}\leftarrow \Compress(G_{t+1}).
\]
\subsection{Revision as Constrained Edit Selection}
Revision can be formalized as an action-selection problem over possible graph edits. Let
\[
\Act(G_t)
\]
be the set of available graph actions, including adding, deleting, weakening, strengthening, retyping, contextualizing, merging, and compressing.
The system chooses
\[
r_t^*
=
\argmin_{r\in \Act(G_t)}
\mathbb{E}
\left[
\LossD(r(G_t),O_t)
+
\lambda_k K(r(G_t))
+
\lambda_r C(r)
+
\lambda_u U(r(G_t))
\right],
\]
where \(K(r(G_t))\) measures graph complexity, \(C(r)\) measures revision cost, and \(U(r(G_t))\) measures unresolved uncertainty. Then
\[
G_{t+1}=r_t^*(G_t).
\]
This prevents revision from becoming arbitrary. The preferred revision is the one expected to reduce dependency loss while controlling complexity, edit cost, and uncertainty.
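When \(\Act(G_t)\) is small enough to enumerate, the selection rule can be sketched as an exhaustive search. The sketch makes two assumptions that are not part of the definition above: the expectation is replaced by a hypothetical point estimate \(\widehat{J}(r)\) evaluated on \(O_t\), and \(\Act(G_t)\) contains a no-op edit so that the graph is left unchanged when no edit improves the objective.
\begin{algorithm}[H]
\caption{Constrained edit selection by enumeration (illustrative sketch)}
\begin{algorithmic}
\Require graph \(G_t\); outcome \(O_t\); finite action set \(\Act(G_t)\); weights \(\lambda_k,\lambda_r,\lambda_u\)
\Ensure revised graph \(G_{t+1}\)
\State \(r^*\gets\) no-op;\quad \(J^*\gets\infty\)
\For{each \(r\in\Act(G_t)\)}
\State \(\widehat{J}(r)\gets \LossD(r(G_t),O_t)+\lambda_k K(r(G_t))+\lambda_r C(r)+\lambda_u U(r(G_t))\)
\If{\(\widehat{J}(r)<J^*\)}
\State \(r^*\gets r\);\quad \(J^*\gets\widehat{J}(r)\)
\EndIf
\EndFor
\State \Return \(r^*(G_t)\)
\end{algorithmic}
\end{algorithm}
For large action sets, enumeration is impractical, and the credit scores \(A_t\) could be used to restrict the search to edits near high-credit edges; that restriction is a design suggestion rather than part of the formal definition.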
\section{Compression and Graph Explosion}
A dependency graph can grow indefinitely if every failure creates new nodes and edges. \(\SCDD\) therefore requires explicit compression.
Compression can be posed as a regularized selection problem:
\[
\Compress(G)
=
\argmin_{G'}
\left[
\LossD(G',O)
+
\lambda_k |E'|
+
\lambda_v |V'|
+
\lambda_i I(G;G')
\right],
\]
where \(G'\) ranges over candidate compressed graphs with node set \(V'\) and edge set \(E'\), \(|E'|\) penalizes edge count, \(|V'|\) penalizes node count, and \(I(G;G')\) measures information loss relative to the original graph.
The goal is not minimality alone. A graph that is too simple may fail to explain. A graph that is too complex may overfit. The aim is structured sufficiency: the smallest graph that preserves the dependencies needed for prediction, explanation, intervention, and revision.
\section{Worked Example: Revising a Breakage Model}
This section gives a small numeric example. It is not an empirical result. It is an operational demonstration of the proposed update cycle.
\subsection{Initial Graph}
Suppose a system initially learns a dependency model \(G_t\) with edge set
\[
E_t =
\{
e_1,e_2,e_3
\},
\]
where
\[
e_1:
\text{dropped from height}
\rightarrow
\text{high impact energy},
\]
\[
e_2:
\text{glass material}
\rightarrow
\text{brittleness},
\]
and
\[
e_3:
\text{high impact energy}+\text{brittleness}
\rightarrow
\text{breakage}.
\]
Assign initial weights and support:
\[
w(e_1)=0.90,\quad s(e_1)=0.85,
\]
\[
w(e_2)=0.95,\quad s(e_2)=0.90,
\]
\[
w(e_3)=0.92,\quad s(e_3)=0.88.
\]
The context assignment for \(e_3\) is initially too broad:
\[
\chi(e_3)=\text{general surface}.
\]
The system predicts breakage:
\[
\hat{O}_t=\text{breakage}.
\]
Now suppose the observed outcome is:
\[
O_t:
\text{a glass cup dropped from one meter onto thick carpet does not break.}
\]
\subsection{Dependency Loss}
The system's prediction fails. A prediction-only model may simply reduce the probability of breakage. \(\SCDD\), however, asks which dependency structure failed.
Let the loss components be
\[
L_{\mathrm{pred}}=1.0,
\quad
L_{\mathrm{expl}}=0.8,
\quad
L_{\mathrm{ctx}}=0.9,
\quad
L_{\mathrm{type}}=0.0,
\quad
L_{\mathrm{complexity}}=0.2.
\]
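With unit weights on the prediction, explanation, and context terms (\(\lambda_p=\lambda_e=\lambda_x=1\)), a smaller complexity weight (\(\lambda_k=0.1\)), and the remaining terms either zero or omitted for simplicity,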
\[
\LossD(G_t,O_t)
=
1.0+0.8+0.9+0.1(0.2)
=
2.72.
\]
Dependency Credit Assignment identifies that \(e_3\) is undercontextualized. The edge
\[
\text{high impact energy}+\text{brittleness}
\rightarrow
\text{breakage}
\]
is not universally valid. It depends on transferred impact energy, which itself depends on surface hardness and energy absorption.
\subsection{Candidate Missing Dependencies}
The system introduces candidate dependencies:
\[
m_1:
\text{carpet}
\rightarrow
\text{energy absorption},
\]
\[
m_2:
\text{energy absorption}
\rightarrow
\text{reduced transferred impact},
\]
\[
m_3:
\text{reduced transferred impact}
\rightarrow
\text{lower breakage probability}.
\]
The candidate revision is
\[
G_{t+1}
=
G_t
\cup
\{m_1,m_2,m_3\}
\]
with the original breakage dependency recontextualized as
\[
e_3':
\text{transferred impact energy}+\text{brittleness}
\rightarrow
\text{breakage}.
\]
The revised edge has
\[
\chi(e_3')=\text{insufficient energy absorption}.
\]
\subsection{Numeric Revision}
Suppose \(\DCA\) assigns the following failure credits:
\[
a_\chi(e_3)=0.85,
\quad
a_m(m_1)=0.70,
\quad
a_m(m_2)=0.65,
\quad
a_m(m_3)=0.60.
\]
The revision operator chooses the following edits:
\[
\chi(e_3)\leftarrow \text{insufficient energy absorption},
\]
\[
E_{t+1}\leftarrow E_t\cup\{m_1,m_2,m_3\},
\]
\[
w(e_3)\leftarrow 0.78,
\]
\[
s(e_3)\leftarrow 0.80.
\]
After revision, suppose the loss components on the same case become
\[
L_{\mathrm{pred}}=0.2,
\quad
L_{\mathrm{expl}}=0.1,
\quad
L_{\mathrm{ctx}}=0.1,
\quad
L_{\mathrm{type}}=0.0,
\quad
L_{\mathrm{complexity}}=0.5.
\]
Then
\[
\LossD(G_{t+1},O_t)
=
0.2+0.1+0.1+0.1(0.5)
=
0.45.
\]
Thus,
\[
\LossD(G_{t+1},O_t)<\LossD(G_t,O_t).
\]
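The reduction can be itemized using only the component values stated above. The symbol \(\Delta\) is introduced here for illustration and is not part of the framework's notation:
\[
\begin{aligned}
\Delta_{\mathrm{pred}} &= 1.0-0.2 = 0.8,\\
\Delta_{\mathrm{expl}} &= 0.8-0.1 = 0.7,\\
\Delta_{\mathrm{ctx}} &= 0.9-0.1 = 0.8,\\
\Delta_{\mathrm{complexity}} &= 0.1(0.2-0.5) = -0.03,
\end{aligned}
\]
so that
\[
\LossD(G_t,O_t)-\LossD(G_{t+1},O_t)
=
0.8+0.7+0.8-0.03
=
2.27
=
2.72-0.45.
\]
The revision therefore pays a small weighted complexity cost (\(0.03\)) for the added structure in exchange for a much larger drop in the prediction, explanation, and context terms (\(2.30\)).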
The system has not merely updated a probability. It has revised its model of what makes breakage possible.
\subsection{Before-and-After Graph}
\begin{figure}[H]
\centering
\begin{tikzpicture}[
node distance=1.3cm,
box/.style={
rectangle,
rounded corners,
draw=black,
align=center,
minimum width=2.5cm,
minimum height=0.8cm,
font=\small
},
arrow/.style={-{Latex[length=2.5mm]}, thick},
weak/.style={-{Latex[length=2.5mm]}, thick, dashed}
]
\node[box] (h1) {Dropped\\from height};
\node[box, below=of h1] (i1) {High impact\\energy};
\node[box, right=of i1] (b1) {Breakage};
\node[box, above=of b1] (g1) {Glass\\material};
\node[box, below=of b1] (label1) {Before revision};
\draw[arrow] (h1) -- (i1);
\draw[arrow] (i1) -- (b1);
\draw[arrow] (g1) -- (b1);
\node[box, right=4.1cm of h1] (h2) {Dropped\\from height};
\node[box, below=of h2] (i2) {High impact\\energy};
\node[box, right=of i2] (ti2) {Transferred\\impact energy};
\node[box, right=of ti2] (b2) {Breakage};
\node[box, above=of b2] (g2) {Glass\\material};
\node[box, below=of i2] (c2) {Carpet};
\node[box, below=of ti2] (a2) {Energy\\absorption};
\node[box, below=of b2] (label2) {After revision};
\draw[arrow] (h2) -- (i2);
\draw[arrow] (i2) -- (ti2);
\draw[arrow] (ti2) -- (b2);
\draw[arrow] (g2) -- (b2);
\draw[arrow] (c2) -- (a2);
\draw[weak] (a2) -- node[right,font=\scriptsize] {reduces} (ti2);
\end{tikzpicture}
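\caption{Before and after dependency revision in the cup-breakage example. For compactness, the brittleness node of \(e_2\) is folded into the glass-material node. The revised graph inserts surface-mediated energy absorption as a condition that reduces transferred impact energy, which in turn enables breakage.}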
\label{fig:cup_graph}
\end{figure}
This example shows the central distinction. A prediction-centered system learns that breakage was less likely than expected. A dependency-centered system learns that surface-mediated energy transfer is an enabling condition for breakage.
\section{Proposition 1: Local Loss Reduction}
The following proposition is intentionally modest. It does not prove \(\SCDD\) as a general theory of intelligence. It states a local condition under which dependency revision reduces dependency loss.
\begin{proposition}[Local Loss Reduction Under Correct Dependency Credit]
Let \(G_t\) contain an edge \(e\) whose strength \(w(e)\) increases dependency loss monotonically in a repeated context \(c\). Suppose \(\DCA\) assigns positive failure credit to \(e\), and suppose the revision operator weakens \(e\) by
\[
w_{t+1}(e)=w_t(e)-\eta a_e(e),
\]
where \(\eta>0\) and \(a_e(e)>0\), leaving the rest of the graph unchanged. If \(\LossD\) is locally differentiable in \(w(e)\) and the update direction is aligned with the negative local gradient of \(\LossD\), that is, \(\partial\LossD/\partial w(e)>0\) at \(w_t(e)\), then for sufficiently small \(\eta\),
\[
\LossD(G_{t+1},O_t)<\LossD(G_t,O_t).
\]
\end{proposition}
\begin{proof}
By local differentiability, a first-order expansion gives
\[
\LossD(G_{t+1},O_t)
=
\LossD(G_t,O_t)
+
\frac{\partial \LossD}{\partial w(e)}
\Delta w(e)
+
o\bigl(|\Delta w(e)|\bigr).
\]
The update is
\[
\Delta w(e)=-\eta a_e(e).
\]
If \(\DCA\) is aligned with the loss-increasing contribution of \(e\), then
\[
\frac{\partial \LossD}{\partial w(e)}a_e(e)>0.
\]
Therefore,
\[
\frac{\partial \LossD}{\partial w(e)}
\Delta w(e)
=
-\eta
\frac{\partial \LossD}{\partial w(e)}
a_e(e)
<0.
\]
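The first-order term is linear in \(\eta\), whereas the remainder of the expansion vanishes faster than \(\eta\) as \(\eta\to 0\). There is therefore some \(\eta_0>0\) such that, for all \(0<\eta<\eta_0\), the remainder is smaller in magnitude than the first-order decrease. Hence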
\[
\LossD(G_{t+1},O_t)<\LossD(G_t,O_t).
\]
This proves local loss reduction under the stated assumptions.
\end{proof}
\begin{remark}
The proposition is deliberately local. It does not guarantee global convergence, correct ontology, or complete understanding. It shows that if dependency credit assignment correctly identifies a failure-producing dependency and the revision operator modifies it in the right direction, then dependency loss can decrease.
\end{remark}
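\begin{remark}
The phrase ``sufficiently small \(\eta\)'' can be made explicit under one additional assumption that the proposition itself does not make. Suppose \(\LossD\) is twice differentiable in \(w(e)\) along the update segment, with second derivative bounded above by a constant \(M>0\) (a hypothetical curvature bound introduced only for this sketch). Writing \(g=\partial \LossD/\partial w(e)>0\) for the derivative at \(w_t(e)\), Taylor's theorem with the Lagrange remainder gives
\[
\LossD(G_{t+1},O_t)
\leq
\LossD(G_t,O_t)
-\eta\, g\, a_e(e)
+\frac{M}{2}\,\eta^2 a_e(e)^2,
\]
so the loss strictly decreases whenever
\[
0<\eta<\frac{2g}{M\,a_e(e)}.
\]
As an illustration with made-up values \(g=0.5\), \(M=2\), and \(a_e(e)=0.8\), the bound is \(\eta<0.625\). Choosing \(\eta=0.1\) then guarantees a decrease of at least
\[
0.1(0.5)(0.8)-\tfrac{2}{2}(0.1)^2(0.8)^2
=
0.04-0.0064
=
0.0336.
\]
\end{remark}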
\section{Proposition 2: Context-Transfer Advantage}
The next proposition connects \(\SCDD\) to its primary empirical promise: improved transfer under context shift.
\begin{proposition}[Context-Transfer Improvement Under Invariant Dependency Repair]
Let \(\Contexts\) be a family of contexts, and suppose there exists an invariant enabling dependency \(e^\star\) that is relevant across contexts \(c\in\Contexts'\subseteq\Contexts\). Let \(G_t\) omit or miscontextualize \(e^\star\), and let \(G_{t+1}\) be produced by a revision \(r\) that adds or correctly recontextualizes \(e^\star\). Suppose expected dependency loss decomposes as
\[
\E_{c\sim \Contexts'}
[\LossD(G,O_c)]
=
B
+
\alpha \Mismatch(G,\TrueGraph)
+
\lambda_k K(G),
\]
where \(B\) is irreducible noise, \(\Mismatch(G,\TrueGraph)\) is structural mismatch with the target dependency graph, \(K(G)\) is graph complexity, and \(\alpha,\lambda_k>0\). If revision reduces structural mismatch by \(\Delta_M>0\) and increases complexity by \(\Delta_K\geq 0\), then expected loss decreases whenever
\[
\alpha \Delta_M > \lambda_k \Delta_K.
\]
That is,
\[
\E_{c\sim \Contexts'}
[\LossD(G_{t+1},O_c)]
<
\E_{c\sim \Contexts'}
[\LossD(G_t,O_c)].
\]
\end{proposition}
\begin{proof}
By assumption,
\[
\E[\LossD(G_t,O_c)]
=
B+\alpha \Mismatch(G_t,\TrueGraph)+\lambda_k K(G_t),
\]
and
\[
\E[\LossD(G_{t+1},O_c)]
=
B+\alpha \Mismatch(G_{t+1},\TrueGraph)+\lambda_k K(G_{t+1}).
\]
Let
\[
\Delta_M
=
\Mismatch(G_t,\TrueGraph)-\Mismatch(G_{t+1},\TrueGraph)>0
\]
and
\[
\Delta_K
=
K(G_{t+1})-K(G_t)\geq 0.
\]
Then
\[
\E[\LossD(G_t,O_c)]
-
\E[\LossD(G_{t+1},O_c)]
=
\alpha \Delta_M-\lambda_k \Delta_K.
\]
If
\[
\alpha \Delta_M > \lambda_k \Delta_K,
\]
then the difference is positive, so
\[
\E[\LossD(G_{t+1},O_c)]
<
\E[\LossD(G_t,O_c)].
\]
This proves the claim.
\end{proof}
\begin{remark}
The proposition does not claim that any added dependency improves transfer. It states that transfer improves when the revision captures an invariant enabling dependency and the structural benefit exceeds the complexity cost. This is the formal version of the claim that understanding supports generalization when it captures what remains stable across changing contexts.
\end{remark}
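\begin{example}
The condition \(\alpha\Delta_M>\lambda_k\Delta_K\) can be checked on illustrative values. The numbers below are assumptions chosen for this example, not measurements. Take \(\alpha=1\), \(\lambda_k=0.1\), a mismatch reduction \(\Delta_M=0.4\), and a complexity increase \(\Delta_K=3\) (for instance, three added edges if \(K\) counts edges). Then
\[
\alpha\Delta_M-\lambda_k\Delta_K
=
1(0.4)-0.1(3)
=
0.1
>0,
\]
so expected loss over \(\Contexts'\) falls by \(0.1\). With the same \(\alpha\), \(\lambda_k\), and \(\Delta_M\), the revision stops paying for itself once
\[
\Delta_K\geq\frac{\alpha\Delta_M}{\lambda_k}=4.
\]
The trade-off is thus explicit: a repair that removes the same mismatch with fewer added dependencies is preferred.
\end{example}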
\section{Algorithmic Form}
\begin{algorithm}[H]
\caption{Structural-Contextual Dependency Discovery}
\begin{algorithmic}[1]
\Require Initial dependency graph \(G_0\)
\Require Observed outcomes \(O_t\)
\Require Dependency loss function \(\LossD\)
\Require Dependency Credit Assignment operator \(\DCA\)
\Require Graph revision operator \(R\)
\Ensure Revised dependency graph \(G_T\) after \(T\) revision cycles
\For{each revision cycle \(t=0,1,\ldots,T-1\)}
\State Use \(G_t\) to infer, explain, predict, or act.
\State Observe outcome \(O_t\).
\State Compute dependency loss \(L_t \gets \LossD(G_t,O_t)\).
\State Assign dependency credit \(A_t \gets \DCA(L_t)\).
\State Use \(A_t\) to generate candidate graph revisions \(\Act(G_t)=\{r_1,r_2,\ldots,r_n\}\).
\State Select \(r_t^*=\argmin_{r\in \Act(G_t)} \E[\LossD(r(G_t),O_t)+\lambda_k K(r(G_t))+\lambda_r C(r)+\lambda_u U(r(G_t))]\).
\State Revise graph \(G_{t+1}\gets r_t^*(G_t)\).
\State Compress or regularize \(G_{t+1}\) when appropriate.
\EndFor
\State \Return \(G_T\)
\end{algorithmic}
\end{algorithm}
The important feature is that the system does not merely update an output distribution. It updates the structure by which outputs are made intelligible.
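To illustrate the selection step, consider three hypothetical candidate revisions for the cup example: \(r_1\) weakens \(w(e_3)\) only, \(r_2\) recontextualizes \(e_3\) and adds \(m_1,m_2,m_3\), and \(r_3\) deletes \(e_3\). The sketch below scores them with assumed weights \(\lambda_k=0.1\), \(\lambda_r=0.05\), \(\lambda_u=0.1\). The shorthand \(J(r)\) for the bracketed objective, the replacement of the expectation by a point estimate, and all numeric values other than the post-revision loss \(0.45\) are assumptions introduced for illustration:
\[
\begin{aligned}
J(r_1) &= 1.60+0.1(3)+0.05(1)+0.1(0.5)=2.00,\\
J(r_2) &= 0.45+0.1(6)+0.05(4)+0.1(0.6)=1.31,\\
J(r_3) &= 1.90+0.1(2)+0.05(1)+0.1(0.3)=2.18.
\end{aligned}
\]
Under these assumed values,
\[
r_t^*=\operatorname*{arg\,min}_{r\in\{r_1,r_2,r_3\}} J(r)=r_2,
\]
so the structural repair is selected even though it has the highest complexity and edit cost, because its loss reduction outweighs both penalties.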
\section{Dependency Revision Cycle}
The revision cycle can be represented schematically as follows.
\begin{figure}[H]
\centering
\begin{tikzpicture}[
node distance=1.45cm,
box/.style={
rectangle,
rounded corners,
draw=black,
align=center,
minimum width=3.1cm,
minimum height=0.95cm,
font=\small
},
arrow/.style={-{Latex[length=3mm]}, thick}
]
\node[box] (graph) {Current Dependency\\Graph \(G_t\)};
\node[box, right=of graph] (act) {Inference, Action,\\or Explanation};
\node[box, right=of act] (obs) {Observed Outcome\\\(O_t\)};
\node[box, below=of obs] (loss) {Dependency Loss\\\(\LossD(G_t,O_t)\)};
\node[box, left=of loss] (dca) {Dependency Credit\\Assignment};
\node[box, left=of dca] (revise) {Graph Revision\\\(R(G_t,A_t)\)};
\node[box, below=of graph] (newgraph) {Revised Dependency\\Graph \(G_{t+1}\)};
\draw[arrow] (graph) -- (act);
\draw[arrow] (act) -- (obs);
\draw[arrow] (obs) -- (loss);
\draw[arrow] (loss) -- (dca);
\draw[arrow] (dca) -- (revise);
\draw[arrow] (revise) -- (newgraph);
\draw[arrow] (newgraph.west) .. controls +(-1.0,0) and +(-1.0,0) .. (graph.west);
\end{tikzpicture}
\caption{The dependency revision cycle. Observed failure is converted into dependency loss, assigned to responsible graph components, and used to revise the system's model of enabling conditions.}
\label{fig:revision_cycle}
\end{figure}
\section{Implementation Blueprint}
A practical \(\SCDD\) system can be built as a modular architecture.
\begin{enumerate}[leftmargin=1.2cm]
\item \textbf{Graph store.} Maintains \(G_t=(V_t,E_t,\tau,\chi,w,s,u)\).
\item \textbf{Inference substrate.} Uses the graph to predict, explain, plan, or act. This may be symbolic, neural, probabilistic, or hybrid.
\item \textbf{Outcome evaluator.} Compares predicted, explained, or intended outcomes with observed outcomes.
\item \textbf{Dependency-loss module.} Computes \(\LossD(G_t,O_t)\).
\item \textbf{Credit-assignment module.} Localizes failure to nodes, edges, types, contexts, or missing dependencies.
\item \textbf{Revision proposer.} Generates candidate graph edits.
\item \textbf{Revision selector.} Chooses edits by expected loss reduction, complexity, uncertainty, and edit cost.
\item \textbf{Compression module.} Removes redundant, unsupported, or over-specific graph structure.
\item \textbf{Audit layer.} Records why each revision occurred, preserving interpretability and revision provenance.
\end{enumerate}
This architecture can sit above a neural model, inside a neuro-symbolic system, or alongside a causal-discovery engine. \(\SCDD\) is therefore not committed to replacing deep learning. It specifies a structural layer that makes the object of revision explicit.
\section{What \(\SCDD\) Contributes}
The most important contribution of \(\SCDD\) is the following reframing:
\[
\boxed{
\text{Intelligence may be fundamentally about revising models of what makes things possible.}
}
\]
This shifts the primary object of intelligence from prediction to dependency structure.
Prediction asks
\[
\text{What will happen?}
\]
Dependency understanding asks
\[
\text{What must be true for this to happen?}
\]
The second question is more general. It supports prediction, explanation, counterfactual reasoning, intervention, planning, repair, design, and discovery.
This gives a unified account of several capacities that are often treated separately:
\[
\begin{aligned}
\text{Reasoning} &= \text{dependency traversal},\\
\text{Learning} &= \text{dependency revision},\\
\text{Explanation} &= \text{dependency tracing},\\
\text{Planning} &= \text{dependency projection},\\
\text{Creativity} &= \text{dependency recombination},\\
\text{Diagnosis} &= \text{dependency fault localization},\\
\text{Science} &= \text{dependency restructuring},\\
\text{Self-reflection} &= \text{dependency modeling applied to the model itself}.
\end{aligned}
\]
That is the central unification.
\section{Distinction from Existing Theories}
\(\SCDD\) does not claim that all of its components are new; most are drawn from existing work. Its novelty lies in the combination of four claims.
First, understanding is represented as a graph of enabling conditions.
Second, failure is measured not only as predictive error but as dependency loss.
Third, credit assignment is applied to dependencies rather than only to parameters, actions, or propositions.
Fourth, intelligence is defined as the recursive revision of this dependency graph.
In compact form,
\[
\text{Intelligence}\neq \text{prediction alone}.
\]
Rather,
\[
\text{Intelligence}=
\text{recursive revision of enabling structure}.
\]
This makes \(\SCDD\) closest to truth-maintenance systems, causal discovery, and graph-based learning, but not identical to any of them. Truth-maintenance systems localize contradictions. Causal discovery learns causal structure. Graph networks compute over relational structures. Predictive processing minimizes prediction error. \(\SCDD\) integrates these into a broader dependency-revision cycle.
\section{Testable Predictions}
\(\SCDD\) becomes scientifically valuable only if it produces testable differences. The framework predicts that systems with explicit dependency revision should outperform prediction-only systems on tasks where structure matters more than surface regularity.
\subsection{Counterfactual Transfer}
A dependency-revision system should generalize better when surface features change but enabling conditions remain stable.
For example, a system trained on cases where glass cups break when dropped on tile should distinguish among ceramic cups on concrete, rubber cups on tile, and glass cups on carpet. A predictive model may rely on surface correlations. A dependency model should identify impact energy, material brittleness, surface hardness, geometry, and context.
\subsection{Explanation Repair}
When an explanation fails, an \(\SCDD\) system should revise the failed dependency rather than merely output a different explanation.
For example, suppose the initial explanation is
\[
\text{bridge failure} \leftarrow \text{material weakness}.
\]
If correction reveals that the material was within tolerance but support geometry was compromised, the system should weaken the material-weakness edge, add or strengthen the geometry-instability edge, and revise the context of the failure explanation.
\subsection{Type-Consistency Advantage}
Because \(\SCDD\) tracks dependency type, it should reduce category errors.
For example, the claim
\[
\text{moral value} \rightarrow \text{mathematical validity}
\]
should be rejected or revised because moral value may affect acceptance, use, interpretation, or institutional priority, but not formal validity itself.
\subsection{Contextual Validity}
\(\SCDD\) should outperform systems that preserve globally stated rules when those rules only apply locally.
For example,
\[
\text{water boils at }100^\circ\mathrm{C}
\]
should be revised to
\[
\text{water boils at approximately }100^\circ\mathrm{C}\text{ at }1\text{ atm pressure}.
\]
The revision is not merely numerical. It attaches a context of validity.
\subsection{Scientific Discovery}
In scientific reasoning tasks, \(\SCDD\) systems should produce better mechanistic hypotheses because they search for enabling dependencies rather than merely fitting equations. Scientific understanding involves discovering which dependencies generate a phenomenon, not only describing the observed pattern.
\section{Evaluation Metrics}
The key evaluation should not be only whether the model answered correctly. It should also ask whether the model revised the right dependency after failure.
One possible metric is Dependency Revision Accuracy:
\[
\operatorname{DRA}
=
\frac{
\text{correctly revised dependencies}
}{
\text{total dependencies requiring revision}
}.
\]
Another is Dependency Fault Localization:
\[
\operatorname{DFL}
=
\frac{
\text{correctly localized failed dependencies}
}{
\text{total failure-relevant dependencies}
}.
\]
A third is Contextual Generalization Stability:
\[
\operatorname{CGS}
=
\frac{
\text{performance under changed context}
}{
\text{performance under original context}
}.
\]
A fourth is Explanation Dependency Fidelity:
\[
\operatorname{EDF}
=
\frac{
\text{explanatory dependencies supported by the graph}
}{
\text{total dependencies used in explanation}
}.
\]
A fifth is Type Revision Precision:
\[
\operatorname{TRP}
=
\frac{
\text{correctly retyped dependencies}
}{
\text{total retyped dependencies}
}.
\]
A sixth is Compression Retention:
\[
\operatorname{CR}
=
\frac{
\text{task-relevant dependencies preserved after compression}
}{
\text{task-relevant dependencies before compression}
}.
\]
Together, these metrics test the theory directly.
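As a worked illustration with hypothetical counts (not results), suppose an evaluation run contains \(10\) dependencies requiring revision, of which the system correctly revises \(7\), and suppose task performance is \(0.90\) in the original context and \(0.72\) under the changed context. Then
\[
\operatorname{DRA}=\frac{7}{10}=0.70,
\qquad
\operatorname{CGS}=\frac{0.72}{0.90}=0.80.
\]
The count-based ratios (\(\operatorname{DRA}\), \(\operatorname{DFL}\), \(\operatorname{EDF}\), \(\operatorname{TRP}\), \(\operatorname{CR}\)) lie in \([0,1]\) because each numerator counts a subset of its denominator. \(\operatorname{CGS}\) is a ratio of performances and can exceed \(1\) if the system performs better under the changed context. Each metric is undefined when its denominator is zero, so reporting the underlying counts alongside the ratios avoids ambiguity.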
\section{Experimental Design}
A minimal experiment would compare three systems:
\begin{enumerate}[label=\textbf{System \Alph*:},leftmargin=2cm]
\item a prediction-only model;
\item a prediction model with explanation generation;
\item an \(\SCDD\) model with an explicit dependency graph, dependency loss, dependency credit assignment, and graph revision.
\end{enumerate}
Evaluation should cover prediction accuracy, counterfactual generalization, explanation fidelity, dependency fault localization, context-sensitive revision, resistance to spurious correlations, graph compression quality, and recovery after correction.
The distinguishing question is not only
\[
\text{Did the system get the answer right?}
\]
but
\[
\text{Did the system revise the dependency structure correctly after failure?}
\]
If \(\SCDD\) is correct, System C should outperform Systems A and B on tasks requiring structural repair, explanation transfer, context revision, and counterfactual reasoning.
\subsection{Minimal Benchmark Structure}
A benchmark for \(\SCDD\) should include paired cases:
\[
(X_{\mathrm{train}},O_{\mathrm{train}})
\quad\text{and}\quad
(X_{\mathrm{shift}},O_{\mathrm{shift}}),
\]
where surface features change but relevant enabling conditions remain stable, or where surface features remain stable but relevant enabling conditions change.
For each case, the benchmark should provide:
\begin{enumerate}
\item the observed outcome;
\item the correct dependency explanation;
\item one or more plausible but incorrect dependency explanations;
\item the dependency that must be revised after feedback;
\item a context shift that tests whether revision transfers.
\end{enumerate}
This allows direct measurement of dependency revision rather than only answer accuracy.
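For illustration, the cup example above could be encoded as one benchmark item. The entries below are a sketch assembled from that example, and the shifted case is a hypothetical choice:
\begin{enumerate}
\item \emph{Observed outcome:} a glass cup dropped from one meter onto thick carpet does not break.
\item \emph{Correct dependency explanation:} carpet \(\rightarrow\) energy absorption \(\rightarrow\) reduced transferred impact \(\rightarrow\) lower breakage probability.
\item \emph{Plausible but incorrect explanation:} the glass material is not brittle, which would wrongly revise \(e_2\).
\item \emph{Dependency to revise:} the context \(\chi(e_3)\), from ``general surface'' to ``insufficient energy absorption.''
\item \emph{Context shift:} the same cup dropped from the same height onto a foam mat (hypothetical), where a system that revised \(\chi(e_3)\) should again predict a lower breakage probability.
\end{enumerate}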
\subsection{Proposed Benchmark Domains}
Suitable domains include:
\begin{enumerate}[leftmargin=1.2cm]
\item physical reasoning under changed material or environmental conditions;
\item medical-style diagnosis with confounders and missing enabling conditions;
\item legal or policy reasoning where rules are context-dependent;
\item mathematical reasoning where logical dependencies must not be confused with empirical or normative dependencies;
\item engineering fault diagnosis where functional dependencies must be localized;
\item scientific-discovery tasks where the goal is to infer hidden generative structure.
\end{enumerate}
\section{Limits and Risks}
\(\SCDD\) is not yet a proven theory of intelligence. It is a formal proposal. Several risks remain.
First, dependency graphs may become too large. Without compression, the system could accumulate dependencies faster than it can revise them.
Second, \(\DCA\) may be difficult. Assigning responsibility to the correct dependency is often as hard as solving the original problem.
Third, not all intelligence may require explicit graph representation. Some forms of skill, perception, or motor control may operate through implicit dynamics.
Fourth, dependency types may be difficult to define cleanly. Causal, constitutive, semantic, and contextual dependencies can overlap.
Fifth, \(\SCDD\) may reproduce a classic problem in symbolic AI: hand-building too much structure. Sutton's ``Bitter Lesson'' argues that general methods that scale with computation have historically beaten systems that rely too heavily on human-designed knowledge \cite{sutton2019bitter}. \(\SCDD\) must therefore be implemented as a learnable, scalable revision process, not as a manually curated ontology.
Sixth, graph revision may introduce false structure. A system can overfit dependencies just as a statistical model can overfit data. This is why complexity penalties, uncertainty tracking, and transfer evaluation are required.
The best version of \(\SCDD\) is not anti-deep-learning. It is a structural layer that can sit above, inside, or alongside neural systems.
\section{Strongest Defensible Claim}
The strongest defensible claim is
\begin{center}
\fbox{
\begin{minipage}{0.88\linewidth}
\(\SCDD\) proposes that understanding is the recursive revision of dependency structures that model enabling conditions.
\end{minipage}
}
\end{center}
The strongest conceptual claim is
\begin{center}
\fbox{
\begin{minipage}{0.88\linewidth}
Prediction is not the essence of intelligence; prediction is one consequence of modeling what makes outcomes possible.
\end{minipage}
}
\end{center}
The strongest testable claim is
\begin{center}
\fbox{
\begin{minipage}{0.88\linewidth}
Systems that explicitly revise dependency graphs should outperform prediction-only systems on explanation, counterfactual transfer, causal repair, context-sensitive generalization, and scientific discovery tasks.
\end{minipage}
}
\end{center}
\section{Conclusion}
This paper proposed a dependency-centric theory of intelligence.
The central question was:
\[
\text{What is the object being updated when a system understands?}
\]
The answer proposed here is:
\[
\text{a graph of enabling conditions.}
\]
Under \(\SCDD\), intelligence is not merely the optimization of predictions. It is the recursive construction, evaluation, and revision of the structures that define what the system believes to be possible.
This view unifies prediction, explanation, reasoning, planning, diagnosis, creativity, scientific discovery, and self-reflection. Each becomes a different operation over dependency structure.
The framework is grounded in existing traditions: causal models, Bayesian network learning, belief revision, truth-maintenance systems, predictive processing, graph networks, mechanistic interpretability, causal representation learning, world models, and automated scientific discovery. Its central contribution is to make dependency revision the general object of understanding.
In plain terms, a system understands when it knows not only what happens, but what makes it possible. It becomes more intelligent when it can revise that model after failure.
\section*{Condensed Thesis Statement}
\begin{center}
\fbox{
\begin{minipage}{0.9\linewidth}
Understanding is not prediction alone.
\vspace{0.5em}
Understanding is the coherent representation of enabling conditions.
\vspace{0.5em}
Intelligence is the recursive process by which a system detects failures in that representation, assigns responsibility to the dependencies that produced the failure, and revises the graph of possibility.
\end{minipage}
}
\end{center}
\section*{Canonical Equation}
\[
\boxed{
G_{t+1}
=
R\!\left(
G_t,\;
\DCA\!\left(
\LossD(G_t,O_t)
\right)
\right)
}
\]
\section*{One-Sentence Contribution}
\begin{center}
\fbox{
\begin{minipage}{0.9\linewidth}
\(\SCDD\) reframes intelligence as dependency revision: the continual reconstruction of a graph representing what makes outcomes possible.
\end{minipage}
}
\end{center}
% ------------------------------------------------------------
% Bibliography
% ------------------------------------------------------------
\begin{thebibliography}{99}
\bibitem{alchourron1985logic}
Alchourrón, C. E., Gärdenfors, P., \& Makinson, D. (1985).
On the logic of theory change: Partial meet contraction and revision functions.
\emph{The Journal of Symbolic Logic}, 50(2), 510--530.
\bibitem{battaglia2018relational}
Battaglia, P. W., Hamrick, J. B., Bapst, V., Sanchez-Gonzalez, A., Zambaldi, V., Malinowski, M., Tacchetti, A., Raposo, D., Santoro, A., Faulkner, R., Gulcehre, C., Song, F., Ballard, A., Gilmer, J., Dahl, G., Vaswani, A., Allen, K., Nash, C., Langston, V., Dyer, C., Heess, N., Wierstra, D., Kohli, P., Botvinick, M., Vinyals, O., Li, Y., \& Pascanu, R. (2018).
Relational inductive biases, deep learning, and graph networks.
\emph{arXiv preprint arXiv:1806.01261}.
\bibitem{clark2013whatever}
Clark, A. (2013).
Whatever next? Predictive brains, situated agents, and the future of cognitive science.
\emph{Behavioral and Brain Sciences}, 36(3), 181--204.
\bibitem{doyle1979truth}
Doyle, J. (1979).
A truth maintenance system.
\emph{Artificial Intelligence}, 12(3), 231--272.
\bibitem{friston2010free}
Friston, K. (2010).
The free-energy principle: A unified brain theory?
\emph{Nature Reviews Neuroscience}, 11(2), 127--138.
\bibitem{gardenfors1988knowledge}
Gärdenfors, P. (1988).
\emph{Knowledge in Flux: Modeling the Dynamics of Epistemic States}.
MIT Press.
\bibitem{ha2018world}
Ha, D., \& Schmidhuber, J. (2018).
World models.
\emph{arXiv preprint arXiv:1803.10122}.
\bibitem{koller2009probabilistic}
Koller, D., \& Friedman, N. (2009).
\emph{Probabilistic Graphical Models: Principles and Techniques}.
MIT Press.
\bibitem{olah2020zoom}
Olah, C., Cammarata, N., Schubert, L., Goh, G., Petrov, M., \& Carter, S. (2020).
Zoom in: An introduction to circuits.
\emph{Distill}.
\bibitem{pearl2009causality}
Pearl, J. (2009).
\emph{Causality: Models, Reasoning, and Inference}.
Second edition.
Cambridge University Press.
\bibitem{schmidt2009distilling}
Schmidt, M., \& Lipson, H. (2009).
Distilling free-form natural laws from experimental data.
\emph{Science}, 324(5923), 81--85.
\bibitem{scholkopf2021causal}
Schölkopf, B., Locatello, F., Bauer, S., Ke, N. R., Kalchbrenner, N., Goyal, A., \& Bengio, Y. (2021).
Toward causal representation learning.
\emph{Proceedings of the IEEE}, 109(5), 612--634.
\bibitem{spirtes2000causation}
Spirtes, P., Glymour, C., \& Scheines, R. (2000).
\emph{Causation, Prediction, and Search}.
Second edition.
MIT Press.
\bibitem{stallman1977forward}
Stallman, R. M., \& Sussman, G. J. (1977).
Forward reasoning and dependency-directed backtracking in a system for computer-aided circuit analysis.
\emph{Artificial Intelligence}, 9(2), 135--196.
\bibitem{sutton2019bitter}
Sutton, R. S. (2019).
The bitter lesson.
\emph{Incomplete Ideas Blog}.
\end{thebibliography}
\end{document}