\documentclass[11pt]{article}
\usepackage[margin=1in]{geometry}
\usepackage{amsmath,amssymb,amsthm,mathtools}
\usepackage{enumitem}
\usepackage{hyperref}
\usepackage{booktabs}
\usepackage{microtype}
\usepackage{listings}
\usepackage{xcolor}
\usepackage{array}
\usepackage{longtable}
% ---------------------------------------------------------------------
% Theorem environments
% ---------------------------------------------------------------------
\newtheorem{definition}{Definition}
\newtheorem{assumption}{Assumption}
\newtheorem{proposition}{Proposition}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{remark}{Remark}
\newtheorem{corollary}{Corollary}
\newtheorem{example}{Example}
% ---------------------------------------------------------------------
% Commands
% ---------------------------------------------------------------------
\newcommand{\RDD}{\mathrm{RDD}}
\newcommand{\SCDD}{\mathrm{SCDD}}
\newcommand{\Poss}{\operatorname{Poss}}
\newcommand{\doop}{\operatorname{do}}
\newcommand{\Val}{\operatorname{Val}}
\newcommand{\Conf}{\operatorname{Conf}}
\newcommand{\Gen}{\operatorname{Gen}}
\newcommand{\Eval}{\operatorname{Eval}}
\newcommand{\Ctx}{\mathcal{C}}
\newcommand{\Obs}{\mathcal{O}}
\newcommand{\Adm}{\mathcal{A}}
\newcommand{\Graph}{\mathcal{G}}
\newcommand{\Loss}{\mathcal{L}}
\newcommand{\Evidence}{\mathcal{E}}
\newcommand{\Types}{\mathcal{T}}
\newcommand{\Rev}{\mathcal{R}}
\newcommand{\Ind}{\mathbb{I}}
\newcommand{\Rset}{\mathbb{R}}
\newcommand{\Nset}{\mathbb{N}}
\newcommand{\Prb}{\mathbb{P}}
\newcommand{\Ex}{\mathbb{E}}
\newcommand{\supp}{\operatorname{supp}}
\newcommand{\clip}{\operatorname{clip}}
\newcommand{\argmax}{\operatorname*{argmax}}
\newcommand{\argmin}{\operatorname*{argmin}}
% ---------------------------------------------------------------------
% Listings
% ---------------------------------------------------------------------
\lstdefinestyle{pythonstyle}{
language=Python,
basicstyle=\ttfamily\small,
breaklines=true,
columns=fullflexible,
frame=single,
showstringspaces=false,
tabsize=2
}
% ---------------------------------------------------------------------
% Hyperref
% ---------------------------------------------------------------------
\hypersetup{
colorlinks=true,
linkcolor=black,
citecolor=black,
urlcolor=black
}
% ---------------------------------------------------------------------
% Title
% ---------------------------------------------------------------------
\title{\textbf{Recursive Dependency Discovery:}\\
A Typed, Context-Indexed Graph Formalism for Possibility, Explanation, and Cross-Domain Dependency Reasoning}
\author{C. L. Vaillant\\
Recursive Generative Emergence Project\\
\texttt{rgemergence.com}}
\date{\today}
\begin{document}
\maketitle
\begin{abstract}
Knowledge systems are commonly organized by category, chronology, citation, lexical similarity,
statistical association, or predefined semantic relations. These structures support retrieval,
classification, and link prediction, but they do not directly answer a distinct explanatory
question: what must be in place for a concept, process, system, capacity, or phenomenon to be
possible, operational, or intelligible? This paper introduces Recursive Dependency Discovery
(RDD), a typed, weighted, context-indexed graph formalism for constructing, testing, traversing,
and revising dependency claims. In an RDD graph, an edge \(A \rightarrow B\) means that \(A\)
contributes to the possibility, operation, intelligibility, or instantiation of \(B\) under an
explicit context and observer model. The framework is built around two operators: the backward
operator \(B_G(X)\), which asks what makes \(X\) possible, and the forward operator \(F_G(X)\),
which asks what \(X\) makes possible.
The central contribution is methodological rather than metaphysical. RDD does not claim that
there is one universal dependency graph for all domains. It claims that dependency can be
represented as an auditable graph object with source, target, type, strength, confidence,
context, observer model, evidence, and revision history. The paper defines three basis dependency
types, ontological, functional, and epistemic, and treats causal, semantic, emergent, historical,
and observer-relative relations as subtypes, annotations, or indices over this basis. It further
distinguishes hard necessity, defeasible necessity, contributory support, sufficiency, redundant
support, essentiality, substitutability, and mutual dependency.
The main formal addition is a semantic layer beneath the graph syntax. RDD edges are interpreted
inside semantic frames over admissible configurations, node-status functions, weighting measures,
and observer-indexed interpretation maps. This allows dependency to be evaluated not only as an
edge \(A \rightarrow B\), but also as a support relation \(\Gamma \Rightarrow B\) between a set
of enabling conditions and a target. The framework introduces validity predicates, anti-vacuity
conditions, mediator screening, context-restricted revision, evidence-tiered confidence,
observer translation, graph equivalence, and bounded recursive traversal. Theoretical results are
stated as structural properties under explicit assumptions, not as guarantees that generated
edges are true. A minimal toy demonstration shows executability, while the main empirical
contribution is a preregisterable validation program: hidden-edge recovery, functional ablation
prediction, epistemic prerequisite testing, context restriction under counterexamples, and
expert-rated explanation quality. If validated, RDD would provide a common formal language for
explanation across ontology construction, curriculum design, scientific reasoning, mechanistic
interpretability, and human-AI collaboration.
\end{abstract}
\tableofcontents
\newpage
% ---------------------------------------------------------------------
\section{Introduction}
% ---------------------------------------------------------------------
Most knowledge systems organize information by resemblance, topic, sequence, citation, or
co-occurrence. A search engine asks which documents match a query. A citation graph asks which
works refer to which other works. A knowledge graph asks which entities stand in predefined
relations. A language model often encodes knowledge through distributed statistical regularities.
These systems are powerful, but they often leave a deeper explanatory question under-specified:
\begin{quote}
What makes this possible?
\end{quote}
This question is not equivalent to asking what something resembles, what category it belongs to,
or which terms tend to appear near it. To ask what makes prediction possible is not merely to list
related terms such as forecasting, uncertainty, planning, probability, or inference. It is to ask
which conditions, operations, representations, or constraints must be available for prediction to
occur at all. Prediction may require information, a state space, a transition model, access to
present conditions, or memory of prior states. Those requirements form a dependency structure.
Recursive Dependency Discovery begins from this observation. Many concepts, systems, and
phenomena sit inside networks of enabling conditions. Information depends on distinction.
Measurement depends on distinguishable outcomes and comparison procedures. Planning often depends
on prediction. Scientific modeling depends on measurement, prediction, explanation, and
correction. Self-modeling depends on modeling, memory, boundary formation, and reflexive
representation. These are not merely associative relations. They are claims about what enables,
constrains, supports, or makes intelligible what.
RDD represents such relations as directed, typed, weighted, context-sensitive graph edges. For any
target \(X\), the framework asks two primitive questions:
\[
B_G(X)=\{Y\in V:Y\rightarrow X\in E\},
\]
\[
F_G(X)=\{Z\in V:X\rightarrow Z\in E\}.
\]
The backward operator asks what makes \(X\) possible. The forward operator asks what \(X\) makes
possible. Repeated application produces a branching dependency graph. The graph may be partial,
uncertain, observer-relative, and revisable.
\subsection{Core Claim}
The core claim is:
\begin{quote}
A class of explanatory, curricular, scientific, and mechanistic reasoning tasks can be improved by
representing knowledge as a typed graph of enabling conditions rather than by similarity,
category membership, chronology, or citation structure alone.
\end{quote}
This is a testable methodological claim, not a claim that reality contains one final dependency
graph. RDD asserts that explicit dependency graphs can be constructed, compared, challenged, and
revised, and that doing so can expose hidden prerequisites, missing assumptions, unsupported
explanatory jumps, false equivalences, and overgeneralized claims.
\subsection{Research Problem}
The motivating problem is that standard graph structures often conflate several distinct
relations:
\[
\begin{gathered}
\text{association},\quad \text{causation},\quad \text{prerequisite},\quad \text{constitution},\\
\text{semantic dependence},\quad \text{functional support}.
\end{gathered}
\]
A concept map may show that two ideas are related. A knowledge graph may record a predicate
between two entities. A causal graph may represent intervention-sensitive influence. A curriculum
graph may encode prerequisite order. But explanation often requires a broader relation:
\[
A \text{ is part of what makes } B \text{ possible, operational, or intelligible}.
\]
RDD formalizes this broader relation while preserving distinctions among its subtypes.
\subsection{Scope and Non-Claims}
Because the language of dependency can invite overstatement, this paper makes four limitations
explicit.
First, RDD is not a replacement for causal discovery. Causal discovery concerns causal
identifiability under assumptions such as intervention, statistical dependence, faithfulness, and
causal sufficiency. RDD includes causal claims as one important subtype of functional dependency,
but it also represents epistemic, ontological, semantic, and context-relative dependencies that
need not be manipulable causal variables.
Second, RDD is not a proof that any generated dependency is correct. It is a representation and
validation scheme. Edge truth remains domain-dependent and evidence-dependent.
Third, RDD is not a complete semantics for all dependency. This paper supplies basis types,
strength categories, validity predicates, confidence tiers, structural revision operations, and
empirical protocols. A full theory of dependency is expected to grow through formal refinement
and domain validation.
Fourth, the toy demonstration in this paper establishes executability only. It does not establish
general empirical superiority. The stronger empirical claims are intentionally deferred to larger,
preregistered benchmark programs.
\subsection{Contributions}
This paper makes the following contributions:
\begin{enumerate}
\item It defines RDD as a typed, weighted, context-indexed dependency graph formalism.
\item It defines dependency claims as auditable objects with type, strength, confidence,
context, observer model, evidence, and revision history.
\item It defines a semantic frame for dependency claims over admissible configurations,
node-status functions, and observer-indexed interpretation maps.
\item It treats individual edges \(A\rightarrow B\) as projections of set-based support
relations \(\Gamma\Rightarrow B\).
\item It defines observer models as general evaluators rather than merely human judges.
\item It supplies validity conditions for ontological, functional, and epistemic dependency.
\item It distinguishes hard necessity, defeasible necessity, contributory support, sufficiency,
redundancy, essentiality, substitutability, and mutual dependency.
\item It reduces a broader dependency taxonomy to three basis types while preserving causal,
semantic, emergent, historical, and observer-relative distinctions as refinements.
\item It separates candidate generation from edge validation and confidence assignment.
\item It distinguishes direct from indirect dependency through mediator screening.
\item It states structural properties for context restriction, hidden-edge recovery,
direct-edge sparsity, observer aggregation, semantic revision, and bounded traversal.
\item It gives a reproducible validation program with metrics, baselines, failure conditions,
and preregistration-ready experiment designs.
\end{enumerate}
% ---------------------------------------------------------------------
\section{Related Work}
\label{sec:related}
% ---------------------------------------------------------------------
RDD intersects several established areas but is not identical to any one of them.
\paragraph{Causal graph theory.}
In Pearl-style causal models, an edge \(A\rightarrow B\) usually means that intervening on
\(A\) may change \(B\) under a specified causal model~\cite{pearl}. RDD includes this kind of
claim but is broader. An RDD edge may be ontological, epistemic, or functional without being a
standard manipulable causal variable.
\paragraph{Causal discovery.}
Causal discovery methods infer causal structure from statistical or experimental data under
assumptions such as faithfulness, causal sufficiency, and constraints on interventions
\cite{spirtes,peters}. RDD is not a competitor to causal discovery in the narrow sense. It can
use causal discovery as an evidence source when the dependency type is causal-functional, but it
also represents claims such as ``a scale is required for measurement'' or ``function is an
epistemic prerequisite for derivative,'' which are not naturally handled as causal discovery
problems.
\paragraph{Knowledge graphs.}
Knowledge graphs represent entities and relations, often through triples such as subject,
predicate, and object~\cite{brachman,sowa}. Knowledge graph completion methods optimize link
prediction over fixed relation types~\cite{bordes,nickel,sun}. RDD differs by privileging a
family of relations: enabling conditions. It is not merely a graph of facts; it is a graph of
claims about what makes other nodes possible, operational, or intelligible.
\paragraph{Ontology engineering.}
Formal ontology provides methods for representing concepts, categories, and relations
\cite{guarino}. RDD may be used as an ontology-construction procedure, but it is not itself a
fixed ontology. It is a method for discovering and revising dependency structures across
ontologies.
\paragraph{Prerequisite and curriculum learning.}
Educational systems often ask what must be learned before a later concept can be learned.
Prerequisite skill modeling and concept prerequisite recovery provide empirical templates for
RDD validation~\cite{liu,liang,talukdar}. RDD generalizes prerequisite structure beyond education
by treating prerequisite relations as one observer-indexed instance of dependency.
\paragraph{Relation extraction.}
Relation extraction identifies structured relations from corpora. Pattern-based methods,
including Hearst patterns, show that language can reveal conceptual relations~\cite{hearst}.
RDD can use relation extraction as a candidate-generation route, but linguistic evidence alone is
insufficient for edge acceptance.
\paragraph{Neuro-symbolic and mechanistic approaches.}
Neuro-symbolic approaches combine learned representations with explicit symbolic structure
\cite{garcez}. Mechanistic interpretability attempts to identify internal components and circuits
that support model behavior. RDD can be interpreted as a symbolic scaffolding layer for systems
that need explicit dependency tracking rather than latent association alone.
\paragraph{Explanation and philosophy of science.}
RDD also belongs to the tradition of analyzing explanation as more than description. It treats
explanation as a structured account of enabling conditions, constraints, and admissible
transformations. Its novelty lies not in claiming that dependency matters, but in making
dependency a typed, testable, revisable graph object.
% ---------------------------------------------------------------------
\section{The Dependency Question}
\label{sec:dependency-question}
% ---------------------------------------------------------------------
The dependency question is:
\[
\text{What makes } X \text{ possible?}
\]
Its forward complement is:
\[
\text{What does } X \text{ make possible?}
\]
These questions define two directions of explanation. Backward dependency identifies
prerequisites, enabling conditions, substrates, operational supports, or interpretive
requirements. Forward dependency identifies consequences, affordances, capacities, higher-order
structures, or systems made possible by the node.
RDD treats dependency as broader than causation and narrower than association:
\begin{quote}
Association says that two items appear together. Dependency says that one item contributes to the
possibility, operation, intelligibility, or instantiation of the other.
\end{quote}
The distinction is directional. Prediction and planning are associated, but RDD asks whether one
supports the possibility of the other. In many planning systems,
\[
\text{Prediction}\rightarrow\text{Planning},
\]
because planning requires anticipatory evaluation of possible outcomes. Prediction does not, in
general, require planning. Directionality is therefore part of the explanatory claim.
% ---------------------------------------------------------------------
\section{Formal Setting}
\label{sec:formal-setting}
% ---------------------------------------------------------------------
Let \(\Ctx\) be a set of contexts and \(\Obs\) a set of observer models. A context \(c\in\Ctx\)
specifies the domain, scale, system boundary, task, and interpretive frame. An observer model
\(o\in\Obs\) specifies the representational, computational, instrumental, or epistemic standpoint
from which dependency is assessed.
For each pair \((c,o)\), let
\[
\Adm_{c,o}
\]
be the set of admissible configurations under context \(c\) and observer model \(o\). A
configuration may be a physical state, computational system, curriculum state, conceptual model,
biological system, experimental arrangement, or explanatory structure.
A node is not a raw word. It is a meaning-resolved object:
\[
v=(\ell,c,o,b),
\]
where \(\ell\) is a label, \(c\) is context, \(o\) is observer model, and \(b\) is an ontology
branch or disambiguation index. Let
\[
\iota_{c,o}(v,a)\in\{0,1\}
\]
indicate whether node \(v\) is instantiated, available, operational, or understood in
configuration \(a\in\Adm_{c,o}\).
The possibility of \(X\) under \((c,o)\) is:
\[
\Poss_{c,o}(X)=1
\quad\Longleftrightarrow\quad
\exists a\in\Adm_{c,o}:\iota_{c,o}(X,a)=1.
\]
The impossibility of \(X\) without \(A\) is:
\[
\neg\Poss_{c,o}(X\wedge \neg A)
\quad\Longleftrightarrow\quad
\nexists a\in\Adm_{c,o}:
\iota_{c,o}(X,a)=1
\ \text{and}\
\iota_{c,o}(A,a)=0.
\]
\subsection{Scoped Dependency Claims}
A dependency claim is written:
\[
A\rightarrow_{c,o,t,s,w}B,
\]
where \(A\) is the dependency source, \(B\) is the target, \(c\) is context, \(o\) is observer
model, \(t\) is dependency type, \(s\) is dependency strength, and \(w\in[0,1]\) is confidence.
\begin{definition}[Scoped dependency claim]
A scoped dependency claim is an edge object:
\[
e=(A,B,t,s,w,c,o,\phi,r),
\]
where \(\phi\) records evidence and provenance, and \(r\) records revision history.
\end{definition}
\begin{remark}
RDD edges are not bare metaphysical assertions. They are indexed claims. Changing the context,
observer model, strength, or evidence can change the status of the edge.
\end{remark}
\subsection{Compatibility of Contexts and Observers}
Because RDD edges are indexed, graph traversal requires compatibility.
\begin{definition}[Index compatibility]
Two indexed claims \(e_i\) and \(e_j\) are compatible when their contexts and observer models
can be composed without changing the intended interpretation of the nodes:
\[
\kappa(e_i,e_j)=1.
\]
If \(\kappa(e_i,e_j)=0\), then paths through \(e_i\) and \(e_j\) are not automatically licensed.
\end{definition}
For example, a path from a biological ``cell'' to a spreadsheet ``cell'' is invalid unless an
explicit analogy or cross-domain mapping has been introduced.
% ---------------------------------------------------------------------
\section{Formal Semantics of Dependency}
\label{sec:formal-semantics}
% ---------------------------------------------------------------------
The graph formalism defines how dependency claims are represented. This section defines their
semantic interpretation. The purpose is to specify when an edge is true, false, weakly supported,
necessary, sufficient, redundant, substitutable, mutually dependent, or context-relative under an
explicit model.
\subsection{Semantic Frame}
An RDD semantic frame is a tuple:
\[
\mathcal{M}_{c,o}=(\Adm_{c,o},V,\iota,m,\mu,\preceq),
\]
where:
\begin{itemize}
\item \(\Adm_{c,o}\) is the set of admissible configurations under context \(c\) and observer
model \(o\);
\item \(V\) is the set of interpreted nodes;
\item \(\iota_{c,o}:V\times\Adm_{c,o}\rightarrow\{0,1\}\) indicates whether a node is
instantiated, available, operational, or understood in a configuration;
\item \(m_{c,o}:V\times\Adm_{c,o}\rightarrow[0,1]\) assigns a graded status value to a node;
\item \(\mu_{c,o}\) is a probability, plausibility, or weighting measure over admissible
configurations;
\item \(\preceq\) is a dependency-strength preorder over edge claims.
\end{itemize}
The binary indicator \(\iota\) is used for hard possibility claims. The graded measure \(m\) is
used for functional, probabilistic, epistemic, and partial-support claims.
\subsection{Node Status and Conditional Possibility}
For a node \(X\), define its expected status under context \(c\) and observer \(o\) as:
\[
\bar m_{c,o}(X)
=
\Ex_{a\sim\mu_{c,o}}
\left[
m_{c,o}(X,a)
\right].
\]
Given a set of active or available nodes \(\Gamma\subseteq V\), define the conditional status of
\(X\) as:
\[
\bar m_{c,o}(X\mid \Gamma)
=
\Ex_{a\sim\mu_{c,o}}
\left[
m_{c,o}(X,a)
\mid
\forall G\in\Gamma,\ \iota_{c,o}(G,a)=1
\right].
\]
The hard possibility of \(X\) relative to \(\Gamma\) is:
\[
\Poss_{c,o}(X\mid\Gamma)=1
\quad\Longleftrightarrow\quad
\exists a\in\Adm_{c,o}:
\iota_{c,o}(X,a)=1
\ \text{and}\
\forall G\in\Gamma,\ \iota_{c,o}(G,a)=1.
\]
\subsection{Set-Based Dependency}
Dependency is often not a relation between one source and one target, but between a support set
and a target. RDD therefore treats the individual edge \(A\rightarrow B\) as a projection of a
more general relation:
\[
\Gamma\Rightarrow B,
\]
where \(\Gamma\subseteq V\) is a dependency set.
\begin{definition}[Dependency set]
A set \(\Gamma\subseteq V\) is a dependency set for \(B\) under \((c,o)\) if the status of \(B\)
depends on the availability, operation, interpretation, or instantiation of members of
\(\Gamma\).
\end{definition}
Individual edges are recovered by marginalizing over the set:
\[
A\rightarrow B
\quad\text{when}\quad
A\in\Gamma
\ \text{and}\
\Gamma\Rightarrow B.
\]
This prevents RDD from treating every dependency as isolated. Many targets require combinations
of conditions rather than single enabling nodes.
\subsection{Necessity}
\begin{definition}[Hard necessity]
A node \(A\) is hard necessary for \(B\) under \((c,o)\) if:
\[
\forall a\in\Adm_{c,o},
\quad
\iota_{c,o}(B,a)=1
\Rightarrow
\iota_{c,o}(A,a)=1.
\]
Equivalently:
\[
\nexists a\in\Adm_{c,o}:
\iota_{c,o}(B,a)=1
\ \text{and}\
\iota_{c,o}(A,a)=0.
\]
\end{definition}
\begin{definition}[Defeasible necessity]
A node \(A\) is defeasibly necessary for \(B\) under \((c,o)\) at tolerance \(\epsilon\) if:
\[
\mu_{c,o}
\left(
\left\{
a\in\Adm_{c,o}:
\iota_{c,o}(B,a)=1
\ \text{and}\
\iota_{c,o}(A,a)=0
\right\}
\right)
\leq \epsilon.
\]
When \(\epsilon=0\), defeasible necessity reduces to hard necessity.
\end{definition}
This allows RDD to distinguish strict formal dependency from high-confidence empirical
dependency.
\subsection{Sufficiency}
\begin{definition}[Hard sufficiency]
A set \(\Gamma\subseteq V\) is sufficient for \(B\) under \((c,o)\) if:
\[
\forall a\in\Adm_{c,o},
\quad
\left(
\forall G\in\Gamma,\ \iota_{c,o}(G,a)=1
\right)
\Rightarrow
\iota_{c,o}(B,a)=1.
\]
\end{definition}
\begin{definition}[Defeasible sufficiency]
A set \(\Gamma\subseteq V\) is defeasibly sufficient for \(B\) under \((c,o)\) at tolerance
\(\epsilon\) if:
\[
\mu_{c,o}
\left(
\left\{
a\in\Adm_{c,o}:
\forall G\in\Gamma,\ \iota_{c,o}(G,a)=1
\ \text{and}\
\iota_{c,o}(B,a)=0
\right\}
\right)
\leq \epsilon.
\]
\end{definition}
Necessity and sufficiency are not equivalent. \(A\) may be necessary for \(B\) without being
sufficient for \(B\), and a set \(\Gamma\) may be sufficient for \(B\) without each member being
individually necessary.
\subsection{Contributory Support}
\begin{definition}[Contributory support]
A node \(A\) contributes to \(B\) under \((c,o)\) if:
\[
\Delta_{c,o}(A;B)
=
\bar m_{c,o}(B\mid A)
-
\bar m_{c,o}(B\mid \neg A)
>0.
\]
\end{definition}
For interventionist contexts, define:
\[
\Delta^{\doop}_{c,o}(A;B)
=
\bar m_{c,o}(B\mid \doop(A))
-
\bar m_{c,o}(B\mid \doop(\neg A)).
\]
A functional dependency is stronger when:
\[
\Delta^{\doop}_{c,o}(A;B)
\]
is positive, stable, and larger than the corresponding passive association estimate, because
intervention-sensitive support is stronger than co-occurrence.
\subsection{Minimal Dependency Sets}
\begin{definition}[Minimal sufficient dependency set]
A set \(\Gamma\subseteq V\) is a minimal sufficient dependency set for \(B\) under \((c,o)\) if:
\[
\Gamma\Rightarrow B
\]
and no proper subset \(\Gamma'\subsetneq\Gamma\) is sufficient for \(B\).
\end{definition}
Minimality prevents unnecessary graph inflation. If:
\[
\Gamma=\{A_1,A_2,A_3\}
\]
is sufficient for \(B\), but
\[
\{A_1,A_2\}
\]
is also sufficient, then \(A_3\) is not part of the minimal sufficient set in that context.
\subsection{Redundancy}
\begin{definition}[Redundant dependency]
A node \(A\in\Gamma\) is redundant in dependency set \(\Gamma\Rightarrow B\) if:
\[
\Gamma\Rightarrow B
\quad\text{and}\quad
\Gamma\setminus\{A\}\Rightarrow B.
\]
\end{definition}
A node can be contributory without being non-redundant. Redundancy means that alternate support
paths preserve the target even when \(A\) is removed.
\subsection{Essentiality}
\begin{definition}[Essential dependency]
A node \(A\in\Gamma\) is essential for \(B\) relative to \(\Gamma\) if:
\[
\Gamma\Rightarrow B
\quad\text{and}\quad
\Gamma\setminus\{A\}\not\Rightarrow B.
\]
\end{definition}
Essentiality is local to a dependency set. \(A\) may be essential inside one support set while
being replaceable by another support set.
\subsection{Substitutability}
\begin{definition}[Substitutable dependency]
Two nodes \(A\) and \(A'\) are substitutable for target \(B\) under \((c,o)\) if there exists a
dependency set \(\Gamma\) such that:
\[
\Gamma\cup\{A\}\Rightarrow B
\quad\text{and}\quad
\Gamma\cup\{A'\}\Rightarrow B.
\]
\end{definition}
Substitutability allows RDD to represent multiple realizability. A target may remain possible
when one dependency is replaced by another functionally equivalent dependency.
\subsection{Mutual Dependency and Cycles}
Some systems contain mutual enabling relations:
\[
A\rightarrow B
\quad\text{and}\quad
B\rightarrow A.
\]
RDD distinguishes valid mutual dependency from circular explanation.
\begin{definition}[Mutual dependency]
Nodes \(A\) and \(B\) are mutually dependent under \((c,o)\) if:
\[
A\rightarrow_{c,o}B
\quad\text{and}\quad
B\rightarrow_{c,o}A
\]
are both valid under their respective edge types and strengths.
\end{definition}
\begin{definition}[Circular explanatory defect]
A cycle is explanatorily defective when every edge in the cycle depends for its justification on
the conclusion that the cycle itself is supposed to establish.
\end{definition}
Thus cycles are not automatically invalid. Biological, cognitive, social, and computational
systems may contain legitimate feedback loops. RDD rejects only unsupported circular
justification, not cyclic structure itself.
\subsection{Graph Equivalence}
Two RDD graphs may differ syntactically while preserving the same dependency content.
\begin{definition}[Reachability equivalence]
Two graphs \(G_1\) and \(G_2\) are reachability-equivalent under \((c,o)\) if for every pair
\((A,B)\):
\[
A\leadsto_{G_1}B
\quad\Longleftrightarrow\quad
A\leadsto_{G_2}B,
\]
where \(\leadsto_G\) denotes an accepted directed path in \(G\).
\end{definition}
\begin{definition}[Typed semantic equivalence]
Two graphs \(G_1\) and \(G_2\) are typed-semantically equivalent under \((c,o)\) if they preserve
the same accepted dependency paths together with their dependency types and strengths.
\end{definition}
Reachability equivalence is weaker than typed semantic equivalence. Two graphs may preserve the
same paths while disagreeing about whether those paths are ontological, functional, or epistemic.
\subsection{Observer Translation}
Different observers may represent the same dependency at different levels of abstraction. RDD
therefore requires a translation relation between observer-indexed graphs.
\begin{definition}[Observer translation map]
An observer translation map from \(o_i\) to \(o_j\) is a partial function:
\[
\tau_{i\rightarrow j}:V_{o_i}\rightarrow V_{o_j}
\]
that maps nodes interpreted by observer \(o_i\) into corresponding nodes interpreted by observer
\(o_j\), when such a mapping exists.
\end{definition}
\begin{definition}[Observer-preserving dependency]
A dependency edge \(A\rightarrow B\) is preserved under observer translation
\(\tau_{i\rightarrow j}\) if:
\[
A\rightarrow_{c,o_i}B
\]
implies:
\[
\tau_{i\rightarrow j}(A)\rightarrow_{c,o_j}\tau_{i\rightarrow j}(B),
\]
provided both translated nodes are defined.
\end{definition}
This allows RDD to compare dependency graphs produced by novices, experts, instruments, models,
or formal systems without assuming that their node vocabularies are identical.
\subsection{Dependency Strength Ordering}
The dependency strengths form a partial order:
\[
\text{hard necessity}
\succ
\text{defeasible necessity}
\succ
\text{contributory support}.
\]
Sufficiency and redundancy are not simply weaker or stronger forms of necessity. They are
orthogonal semantic properties:
\[
\text{necessity}\not\equiv\text{sufficiency},
\]
\[
\text{redundancy}\not\equiv\text{irrelevance}.
\]
A dependency can be:
\begin{itemize}
\item necessary but not sufficient;
\item sufficient but not necessary;
\item contributory but redundant;
\item non-redundant but defeasible;
\item mutually dependent without being circularly defective.
\end{itemize}
\subsection{Semantic Edge Acceptance}
An edge \(A\rightarrow B\) is accepted when three conditions hold:
\[
\Val_t(A,B,c,o)=1,
\]
\[
S(A,B,c,o)\in\mathcal{S}_{accepted},
\]
\[
W(A,B,c,o)\geq\tau.
\]
Thus acceptance is not based on graph form alone. It requires type validity, strength assignment,
and sufficient confidence.
\begin{definition}[Accepted dependency edge]
An edge \(e=(A,B,t,s,w,c,o,\phi,r)\) is accepted at threshold \(\tau\) if:
\[
\Val_t(e)=1
\quad\text{and}\quad
w\geq\tau.
\]
It is provisional if:
\[
\Val_t(e)=\star
\quad\text{or}\quad
w<\tau.
\]
It is rejected if:
\[
\Val_t(e)=0.
\]
\end{definition}
\subsection{Semantic Revision}
A counterexample is a configuration:
\[
a^\ast\in\Adm_{c,o}
\]
such that an accepted edge predicts that \(B\) should fail without \(A\), but:
\[
\iota_{c,o}(B,a^\ast)=1
\quad\text{and}\quad
\iota_{c,o}(A,a^\ast)=0.
\]
RDD responds by one of four operations:
\[
\text{delete},\quad
\text{weaken},\quad
\text{context-restrict},\quad
\text{retype}.
\]
The preferred operation is the one that minimizes semantic loss:
\[
r^\ast
=
\argmin_{r\in\Rev}
\Loss(G,r\mid a^\ast),
\]
where \(\Rev\) is the set of available revision operations.
Context restriction is preferred when there exists \(c'\subset c\) such that:
\[
\Val_t(A,B,c',o)=1
\]
and:
\[
a^\ast\notin\Adm_{c',o}.
\]
This formalizes the principle that a counterexample should narrow an overbroad dependency before
destroying a valid local dependency.
\subsection{Semantic Summary}
The syntax of RDD specifies the graph:
\[
G=(V,E,T,S,W,C,O,\Phi,R).
\]
The semantics specifies when the graph's edges are true, weakly supported, necessary, sufficient,
redundant, substitutable, context-limited, or observer-relative. The key shift is from individual
edges alone to dependency sets:
\[
\Gamma\Rightarrow B.
\]
Individual edges remain useful, but the deeper semantic object is the support relation between a
set of enabling conditions and a target node.
% ---------------------------------------------------------------------
\section{Dependency Strength}
\label{sec:dependency-strength}
% ---------------------------------------------------------------------
The word ``dependency'' can be too broad unless its strength is specified. RDD therefore
distinguishes five strength categories.
\begin{definition}[Hard necessity]
\(A\) is a hard necessary dependency of \(B\) under \((c,o)\) if:
\[
\nexists a\in\Adm_{c,o}:
\iota_{c,o}(B,a)=1
\ \text{and}\
\iota_{c,o}(A,a)=0.
\]
Hard necessity should be reserved for definitional, formal, constitutive, or strongly established
cases.
\end{definition}
\begin{definition}[Defeasible necessity]
\(A\) is a defeasible necessary dependency of \(B\) under \((c,o)\) when available evidence
supports the claim that \(B\) fails without \(A\), but the claim remains open to counterexample,
confidence revision, and context restriction.
\end{definition}
\begin{definition}[Contributory support]
\(A\) is a contributory dependency of \(B\) under \((c,o)\) when changing, removing, or degrading
\(A\) changes the probability, quality, stability, performance, intelligibility, or likelihood of
\(B\), without requiring that \(B\) be impossible without \(A\).
\end{definition}
\begin{definition}[Sufficient support]
\(A\) is sufficient support for \(B\) under \((c,o)\) if the presence of \(A\), together with the
background conditions encoded in \(c\), is enough to instantiate, operate, or explain \(B\).
\end{definition}
\begin{definition}[Redundant support]
\(A\) is redundant support for \(B\) under \((c,o)\) if \(A\) supports \(B\), but alternate
dependency sets can support \(B\) when \(A\) is absent.
\end{definition}
Where a probabilistic or performance measure is available, contributory support can be represented
by:
\[
D_{c,o}(A,B)=m_{c,o}(B\mid A)-m_{c,o}(B\mid \neg A),
\]
where \(m_{c,o}\) is a domain-specific measure such as accuracy, mastery, survival, explanatory
score, or stability. For interventionist cases:
\[
D^{\doop}_{c,o}(A,B)
=
m_{c,o}(B\mid \doop(A))
-
m_{c,o}(B\mid \doop(\neg A)).
\]
\begin{remark}
This distinction prevents weak support from being overstated as necessity. Many useful RDD edges
are contributory rather than necessary.
\end{remark}
% ---------------------------------------------------------------------
\section{Observer Models}
\label{sec:observer-models}
% ---------------------------------------------------------------------
In RDD, an observer is not restricted to a human subject. An observer is any system that can
resolve states, apply interpretive constraints, register distinctions, evaluate evidence, and
update dependency representations. An observer may be a human reasoner, scientific instrument,
trained model, symbolic architecture, experimental protocol, collective research community, or AI
system acting as a structured evaluator.
Formally, an observer model is a tuple:
\[
o=(\Sigma_o,I_o,M_o,E_o,U_o),
\]
where:
\begin{itemize}
\item \(\Sigma_o\) is the observer's representational state space;
\item \(I_o\) is its interpretation map;
\item \(M_o\) is its measurement or evidence-access procedure;
\item \(E_o\) is its edge-evaluation procedure;
\item \(U_o\) is its update rule.
\end{itemize}
\paragraph{Observer-indexed example.}
For a novice calculus learner,
\[
\text{Function}\xrightarrow{epi}\text{Derivative}
\]
is a strong epistemic prerequisite. For a mathematics professor, the same dependency may be
trivial background rather than an active learning constraint. For an automated theorem checker,
the dependency may be transformed into a formal definitional relation over a specified theory.
RDD preserves these differences by indexing dependency to observer model rather than treating all
forms of understanding as identical.
% ---------------------------------------------------------------------
\section{Dependency Types and Validity Conditions}
\label{sec:dependency-types}
% ---------------------------------------------------------------------
RDD uses three basis types:
\[
\Types=\{\mathrm{ontological},\mathrm{functional},\mathrm{epistemic}\}.
\]
An edge may carry more than one type, but each type must be justified separately.
\subsection{Ontological Dependency}
\begin{definition}[Ontological dependency]
An edge \(A\xrightarrow{ont}B\) is an ontological dependency under \((c,o)\) when \(A\) is part
of the constitution, identity condition, substrate, or admissibility condition of \(B\). Strong
acceptance requires:
\begin{enumerate}[label=(\roman*)]
\item no admissible configuration instantiates or preserves \(B\) without \(A\);
\item the impossibility follows from constitution or identity conditions, not merely observed
co-occurrence;
\item the dependency is non-vacuous, meaning that \(A\) was not inserted into the definition of
\(B\) solely to make the edge true.
\end{enumerate}
\end{definition}
\begin{example}
\[
\text{Distinction}\xrightarrow{ont}\text{Information}.
\]
Information, in Shannon's formal treatment and in broader distinction-based accounts, requires
discriminable alternatives~\cite{shannon,spencerbrown}. A system with no possible distinction has
no information-bearing contrast.
\end{example}
\subsection{Functional Dependency}
\begin{definition}[Functional dependency]
An edge \(A\xrightarrow{fun}B\) is a functional dependency under \((c,o)\) when the operation,
performance, stability, or task-relevant behavior of \(B\) in a system \(S\) depends on the
activity, output, structure, or availability of \(A\). Strong acceptance requires:
\begin{enumerate}[label=(\roman*)]
\item an explicitly specified system \(S\) and task-relevant measure \(m_B\);
\item evidence that intervening on, ablating, disabling, masking, or degrading \(A\) degrades
\(B\):
\[
D^{\doop}_{c,o}(A,B)>0;
\]
\item an argument that degradation follows from \(A\)'s role in \(S\), not from unrelated damage
or coincidental correlation.
\end{enumerate}
\end{definition}
\begin{example}
\[
\text{Memory}\xrightarrow{fun}\text{Prediction}
\]
holds in contexts where temporally extended prediction requires retained state. Recurrent
networks and long short-term memory architectures make this dependency explicit in artificial
systems~\cite{rumelhart,hochreiter}. Memory systems are also central to flexible cognition and
future-oriented behavior in biological systems~\cite{scoville,squire}.
\end{example}
\subsection{Epistemic Dependency}
\begin{definition}[Epistemic dependency]
An edge \(A\xrightarrow{epi}B\) is an epistemic dependency for observer model \(o\) under context
\(c\) when understanding, learning, explaining, or correctly using \(B\) requires prior access to
\(A\) for the specified observer class. Strong acceptance requires:
\begin{enumerate}[label=(\roman*)]
\item a specified observer class;
\item a measurable proxy for understanding, mastery, explanation, or correct use;
\item evidence or argument that omitting \(A\) impairs acquisition or intelligibility of \(B\);
\item a distinction between genuine prerequisite structure and mere historical teaching order.
\end{enumerate}
\end{definition}
\begin{example}
\[
\text{Function}\xrightarrow{epi}\text{Derivative}.
\]
For introductory calculus learners, a derivative cannot be understood as the rate of change of a
function unless the concept of function is already available.
\end{example}
% ---------------------------------------------------------------------
\section{Anti-Vacuity and Edge Discipline}
\label{sec:anti-vacuity}
% ---------------------------------------------------------------------
Dependency formalisms are vulnerable to trivialization. If one defines \(B\) as ``\(B\) plus
\(A\),'' then \(A\rightarrow B\) becomes true by construction. RDD prevents this with
anti-vacuity conditions.
\begin{definition}[Anti-vacuity condition]
A dependency edge \(A\rightarrow B\) is non-vacuous only if at least one of the following holds:
\begin{enumerate}[label=(\roman*)]
\item \(A\) can be independently specified without presupposing \(B\);
\item removing or altering \(A\) changes an observable, formal, functional, or epistemic status
of \(B\);
\item independent observers can evaluate the edge without simply restating the definition of
\(B\);
\item the edge distinguishes at least one admissible case from one inadmissible case.
\end{enumerate}
\end{definition}
\begin{remark}
Anti-vacuity is necessary for RDD to remain explanatory rather than merely definitional.
\end{remark}
% ---------------------------------------------------------------------
\section{Precedence, Prediction, and Contradiction}
\label{sec:three-way-test}
% ---------------------------------------------------------------------
Each accepted edge should answer three questions.
\paragraph{Precedence.}
For epistemic dependency, does \(A\) need to precede \(B\) for the relevant observer or learning
sequence? This is not merely chronological priority. It is a claim that omitting \(A\) impairs
acquisition, explanation, or use of \(B\).
\paragraph{Prediction.}
For functional dependency, does changing \(A\) predict a change in \(B\)? An accepted functional
edge should support an ablation, intervention, degradation, or counterfactual test where such a
test is possible.
\paragraph{Contradiction.}
What observation would weaken, restrict, or falsify the edge? A dependency claim that cannot
state any possible counterexample should remain provisional unless it is explicitly definitional
or formal.
\paragraph{Example.}
The unrestricted edge
\[
\text{Memory}\xrightarrow{fun}\text{Prediction}
\]
is too broad. A fully observable one-step Markov predictor can predict the next state from the
present state alone. RDD therefore restricts the context:
\[
\text{Memory}\xrightarrow{fun}_{c=\mathrm{hidden\text{-}state\ temporal\ prediction}}
\text{Prediction}.
\]
The contradiction does not destroy the local edge. It reveals that the original edge was
overgeneralized.
% ---------------------------------------------------------------------
\section{Taxonomy Reduction}
\label{sec:taxonomy}
% ---------------------------------------------------------------------
A broad candidate taxonomy includes ontological, functional, epistemic, causal, semantic,
emergent, historical, and observer-relative dependencies. The formal basis reduces to:
\[
\{\mathrm{ontological},\mathrm{functional},\mathrm{epistemic}\}.
\]
Causal dependency is treated as a subtype of functional dependency with an interventionist
criterion. Semantic dependency is treated as a subtype of epistemic dependency over meaning and
interpretation. Emergent dependency is treated as ontological dependency across levels of
description. Historical dependency is treated as a temporal annotation unless paired with one of
the basis types. Observer-relative dependency is an index, because all edges already carry an
observer model.
\begin{proposition}[Basis sufficiency]
For the purposes of RDD graph construction, the three basis types are sufficient to encode the
larger taxonomy as primitive types, subtypes, annotations, or indices.
\end{proposition}
\begin{proof}
Causal edges require an intervention-sensitive operational relation and therefore satisfy the
functional form with an additional causal constraint. Semantic edges require prior meaning
availability for interpretation and therefore satisfy the epistemic form for a specified observer
model. Emergent edges require lower-level enabling structures for higher-level instantiation and
therefore satisfy the ontological form across levels. Historical relations specify temporal order
but do not by themselves establish possibility dependence, so they are annotations unless paired
with one of the three basis types. Observer-relative relations specify the observer index under
which an edge is evaluated, not a separate edge predicate. Thus the larger taxonomy can be
represented without adding additional primitive basis types.
\end{proof}
% ---------------------------------------------------------------------
\section{Graph Formalism}
\label{sec:graph-formalism}
% ---------------------------------------------------------------------
An RDD graph is:
\[
G=(V,E,T,S,W,C,O,\Phi,R),
\]
where:
\begin{itemize}
\item \(V\) is the set of interpreted nodes;
\item \(E\subseteq V\times V\) is the set of directed dependency edges;
\item \(T(e)\subseteq\Types\) assigns one or more basis types;
\item \(S(e)\) assigns dependency strength;
\item \(W(e)\in[0,1]\) assigns confidence;
\item \(C(e)\in\Ctx\) assigns context;
\item \(O(e)\in\Obs\) assigns observer model;
\item \(\Phi(e)\) stores evidence, justification, source route, and provenance;
\item \(R(e)\) stores revision history.
\end{itemize}
The edge object is:
\[
e=(u,v,t,s,w,c,o,\phi,r).
\]
The graph is a directed multigraph. Multiple edges may connect the same ordered pair under
different types, strengths, contexts, observer models, or evidential justifications.
\subsection{Validity Predicate}
For each type \(t\), define a validity predicate:
\[
\Val_t(e)\in\{0,1,\star\},
\]
where \(1\) means accepted, \(0\) means rejected, and \(\star\) means provisional. A provisional
edge may be retained as a hypothesis, but it should not be used as a high-confidence premise.
\subsection{Confidence Tiers}
The confidence weight \(W(e)\) is grounded by evidential tier.
\begin{center}
\begin{tabular}{p{0.12\linewidth}p{0.62\linewidth}p{0.16\linewidth}}
\toprule
Tier & Description & Default range \\
\midrule
1 & Formal, definitional, or logically necessary under explicit assumptions & 0.95--1.00 \\
2 & Strong experimental, ablation, formal, or expert-consensus evidence & 0.75--0.94 \\
3 & Theoretically supported with no known counterexample in the stated context & 0.50--0.74 \\
4 & Plausible by analogy, weak evidence, or limited expert intuition & 0.25--0.49 \\
5 & Speculative, contradicted, underspecified, or insufficiently grounded & 0.00--0.24 \\
\bottomrule
\end{tabular}
\end{center}
These ranges are calibration targets, not universal probabilities. A mature implementation should
learn or calibrate them against domain-specific evidence.
% ---------------------------------------------------------------------
\section{Direct and Indirect Dependency}
\label{sec:direct-indirect}
% ---------------------------------------------------------------------
\begin{definition}[Indirect dependency]
Node \(A\) is an indirect dependency of \(B\) if there exists a directed path
\[
A=v_0\rightarrow v_1\rightarrow\cdots\rightarrow v_k=B
\]
with \(k\geq 2\), evaluated under compatible contexts and observer models.
\end{definition}
\begin{definition}[Direct dependency]
Node \(A\) is a direct dependency of \(B\) under type \(t\) and context \((c,o)\) if
\(A\rightarrow B\) satisfies the validity predicate for \(t\), and there is no accepted mediator
set \(M\) such that \(A\)'s contribution to \(B\) is fully screened off by paths through \(M\).
\end{definition}
If
\[
\text{Distinction}\rightarrow\text{Information}\rightarrow\text{Prediction},
\]
then distinction is an indirect dependency of prediction. It should not automatically be inserted
as a direct dependency unless a separate direct validity argument is supplied.
\begin{proposition}[Path inheritance for indirect dependency]
If \(A\) is a dependency of \(B\), and \(B\) is a dependency of \(C\), then \(A\) is an indirect
dependency of \(C\), provided all edges are evaluated under compatible contexts and observer
models.
\end{proposition}
\begin{proof}
The accepted edges \(A\rightarrow B\) and \(B\rightarrow C\) form a directed path of length two
from \(A\) to \(C\). By definition, \(A\) is an indirect dependency of \(C\). Compatibility of
context and observer model is required because RDD edges are indexed claims. A path joining
incompatible indexed claims does not license inheritance.
\end{proof}
% ---------------------------------------------------------------------
\section{Structural Properties}
\label{sec:structural-properties}
% ---------------------------------------------------------------------
The following results are structural properties of the formalism under stated assumptions. They
do not prove that generated dependency claims are true. They show how RDD handles revision,
mediator screening, hidden-edge recovery, observer aggregation, semantic revision, and bounded
traversal once candidate edges and validity predicates are specified.
\subsection{Context Restriction}
Let a broad context \(c\) be partitioned into subcontexts:
\[
c=c_1\cup c_2\cup\cdots\cup c_k.
\]
Let \(e:A\rightarrow B\) satisfy its validity predicate on a non-empty subset
\(S\subseteq\{c_1,\ldots,c_k\}\) and fail on the complement \(\bar S\).
\begin{proposition}[Context restriction dominance]
Replacing the broad edge \(A\rightarrow_cB\) with the restricted edge
\[
A\rightarrow_{c'=\cup S}B
\]
preserves the valid subcontext instances and excludes the invalid subcontext instances, whereas
unrestricted acceptance preserves false instances and total deletion discards true instances.
\end{proposition}
\begin{proof}
Unrestricted acceptance asserts the edge over every \(c_i\), including cases in \(\bar S\) where
it fails. Total deletion rejects the edge over every \(c_i\), including cases in \(S\) where it
holds. Context restriction asserts the edge only over \(c'=\cup S\). By assumption the edge
satisfies the validity predicate on \(S\) and fails outside \(S\). Therefore restriction preserves
valid local claims while excluding known invalid cases.
\end{proof}
\subsection{Mediator Screening and Sparsity}
Let \(G=(V,E)\) be an acyclic dependency graph. A candidate direct edge \(A\rightarrow B\) is
mediator-screened if there exists a set \(M\subset V\) such that all accepted explanatory paths
from \(A\) to \(B\) pass through \(M\), and \(A\)'s contribution to \(B\) is fully accounted for by
those paths.
\begin{proposition}[Mediator-screened sparsity]
In a directed acyclic RDD graph, rejecting mediator-screened edges as direct edges preserves
reachability while reducing direct-edge density. If all mediation relations are correctly
identified, the accepted direct-edge graph approaches the transitive reduction of the relevant
reachability graph.
\end{proposition}
\begin{proof}
The transitive closure contains an edge for every reachable pair and therefore includes indirect
relations. If \(A\)'s contribution to \(B\) is fully mediated by accepted paths through \(M\), the
edge \(A\rightarrow B\) is redundant as a direct edge. Removing it preserves reachability through
\(M\). In an acyclic graph, the transitive reduction is the minimal graph preserving reachability.
Thus correct mediator screening reduces direct-edge density toward the transitive reduction.
\end{proof}
\subsection{Hidden-Edge Recovery}
Let \(H\) be the set of hidden true dependency edges in a benchmark. Let
\[
\rho=P(\text{generator proposes a hidden true edge}),
\]
\[
\sigma=P(\text{validator accepts a proposed true edge}),
\]
\[
\pi=P(\text{accepted true edge passes the confidence threshold}).
\]
Let \(F\) be the set of possible false candidate edges, and let
\[
\begin{aligned}
\alpha&=P(\text{generator proposes a false edge}),\\
\varphi&=P(\text{validator incorrectly accepts a false edge}),\\
\kappa&=P(\text{accepted false edge passes the threshold}).
\end{aligned}
\]
\begin{proposition}[Expected recovery under independence]
Under independent proposal, validation, and threshold events, expected recall on hidden true edges
is
\[
\Ex[\mathrm{Recall}]=\rho\sigma\pi,
\]
and expected precision is
\[
\Ex[\mathrm{Precision}]
=
\frac{|H|\rho\sigma\pi}{|H|\rho\sigma\pi+|F|\alpha\varphi\kappa}.
\]
\end{proposition}
\begin{proof}
A hidden true edge is recovered only if it is proposed, accepted, and retained after thresholding.
Under independence, that probability is \(\rho\sigma\pi\). Expected true recoveries are
\(|H|\rho\sigma\pi\). Expected false accepted edges are \(|F|\alpha\varphi\kappa\). Substituting
these expectations into the definition of precision gives the stated expression.
\end{proof}
\subsection{Observer Aggregation}
Let \(o_1,\ldots,o_k\) be independent observer models evaluating the same binary edge decision.
Let each observer have probability \(p_i>1/2\) of making the correct decision.
\begin{theorem}[Majority reliability under independence]
If observer errors are independent and each observer is better than chance, then majority vote has
lower error probability than a single observer in the limit as \(k\) increases.
\end{theorem}
\begin{proof}
For identical observer accuracy \(p_i=p>1/2\), the probability of majority correctness is
\[
\sum_{j=\lceil(k+1)/2\rceil}^{k}\binom{k}{j}p^j(1-p)^{k-j},
\]
which tends to \(1\) as \(k\rightarrow\infty\) by the standard Condorcet-style aggregation
argument. The non-identical case follows under the same independence and better-than-chance
conditions by standard extensions of the same result.
\end{proof}
\begin{remark}
This theorem is not a license to average arbitrary opinions. Its assumptions are strong:
observer competence, approximate independence, and a shared target decision. If observers share
the same bias, use incompatible edge definitions, or evaluate different contexts, aggregation can
amplify error rather than reduce it.
\end{remark}
\subsection{Bounded Traversal}
\begin{proposition}[Finite termination under bounded traversal]
If recursive traversal is bounded by finite depth \(d\), finite branching factor \(b\), and a
finite candidate set at each expansion step, then the RDD traversal procedure terminates.
\end{proposition}
\begin{proof}
At each level, at most \(b\) candidates are generated per frontier node. With finite depth \(d\),
the total number of generated candidates is bounded above by a finite geometric sum:
\[
1+b+b^2+\cdots+b^d.
\]
Because candidate validation and graph update are finite operations by assumption, the traversal
terminates.
\end{proof}
\subsection{Minimal Set Sparsity}
\begin{proposition}[Minimal support sparsity]
If \(B\) has a minimal sufficient dependency set \(\Gamma\), then any proper superset
\(\Gamma'\supsetneq\Gamma\) is non-minimal, even if it also supports \(B\).
\end{proposition}
\begin{proof}
By definition, \(\Gamma\) is sufficient for \(B\), and no proper subset of \(\Gamma\) is
sufficient. If \(\Gamma'\supsetneq\Gamma\), then \(\Gamma\subsetneq\Gamma'\) is a proper subset of
\(\Gamma'\) that is already sufficient for \(B\). Therefore \(\Gamma'\) is not minimal.
\end{proof}
\begin{remark}
This result prevents dependency inflation at the set level. RDD should prefer minimal support
sets when the explanatory task is to identify necessary structure rather than all possible
supporting background.
\end{remark}
% ---------------------------------------------------------------------
\section{Meaning Resolution}
\label{sec:meaning-resolution}
% ---------------------------------------------------------------------
RDD requires meaning resolution before traversal. The same lexical item may refer to different
nodes. ``Cell'' may mean a biological cell, spreadsheet cell, prison cell, battery cell, or
cellular automaton unit. Meaning resolution maps a query \(q\) to:
\[
q\mapsto(X,c,o,b).
\]
For example, the query ``What makes a cell possible?'' has different resolutions. Under a
biological context:
\[
\text{Membrane}\rightarrow\text{Cell},\quad
\text{Metabolism}\rightarrow\text{Cell},\quad
\text{Replication Machinery}\rightarrow\text{Cell}.
\]
Under a spreadsheet context:
\[
\text{Grid}\rightarrow\text{Cell},\quad
\text{Addressing Scheme}\rightarrow\text{Cell}.
\]
Neither graph is globally correct. Each is locally correct under its resolved context.
% ---------------------------------------------------------------------
\section{Recursive Operators}
\label{sec:operators}
% ---------------------------------------------------------------------
The backward operator is:
\[
B_G(X)=\{Y\in V:Y\rightarrow X\in E\}.
\]
The forward operator is:
\[
F_G(X)=\{Z\in V:X\rightarrow Z\in E\}.
\]
Recursive traversal is:
\[
B_G^n(X)=B_G(B_G^{n-1}(X)),
\qquad
F_G^n(X)=F_G(F_G^{n-1}(X)).
\]
Because the operators return sets, traversal produces branching subgraphs rather than single
chains. Traversal is bounded by maximum depth \(d\), confidence threshold \(\tau\), context
\(c\), observer model \(o\), dependency type \(t\), and strength \(s\).
\subsection{Set-Based Backward Operator}
Because formal semantics treats dependency as a support-set relation, the backward operator can
also be lifted from edges to sets:
\[
\mathcal{B}_G(X)=\{\Gamma\subseteq V:\Gamma\Rightarrow X\}.
\]
This operator returns dependency sets rather than individual predecessors.
Similarly, the forward set operator is:
\[
\mathcal{F}_G(\Gamma)=\{Z\in V:\Gamma\Rightarrow Z\}.
\]
The edge-based operators remain useful for graph traversal. The set-based operators are more
semantically precise when the target depends on combinations of conditions.
% ---------------------------------------------------------------------
\section{Candidate Edge Generation}
\label{sec:generation}
% ---------------------------------------------------------------------
RDD separates candidate generation from validation. Candidate generation asks which edges should
be considered. Validation asks which proposed edges deserve acceptance.
\subsection{Generation Routes}
\paragraph{Definitional decomposition.}
If \(B\) is defined by component conditions \(\{A_1,\ldots,A_k\}\), each \(A_i\rightarrow B\) is
proposed as an ontological or epistemic candidate.
\paragraph{Support-set decomposition.}
If a target \(B\) appears to require a set of conditions \(\Gamma\), RDD proposes both the
set-based relation:
\[
\Gamma\Rightarrow B
\]
and candidate projected edges:
\[
A_i\rightarrow B
\quad\text{for each}\quad
A_i\in\Gamma.
\]
The set relation receives priority over isolated edge interpretation.
\paragraph{Intervention and ablation reasoning.}
If disabling \(A\) is expected to degrade \(B\), RDD proposes \(A\xrightarrow{fun}B\).
\paragraph{Expert and prerequisite graph extraction.}
If expert annotations or prerequisite datasets identify \(A\) as required for \(B\), RDD proposes
\(A\xrightarrow{epi}B\).
\paragraph{Corpus-supported relation mining.}
Textual patterns such as ``requires,'' ``depends on,'' ``enables,'' and ``presupposes'' generate
hypotheses only. They do not validate edges.
\paragraph{Model-generated hypotheses.}
A human reasoner, symbolic system, language model, or recursive architecture may propose
candidate dependencies. The generator is a hypothesis source, not an authority.
\subsection{Generation-to-Validation Pipeline}
The pipeline is:
\begin{enumerate}
\item Generate candidate edge \(A\rightarrow B\) or support-set claim \(\Gamma\Rightarrow B\).
\item Resolve meanings of all nodes.
\item Assign context \(c\) and observer model \(o\).
\item Assign candidate type \(T(e)\) and strength \(S(e)\).
\item Apply type-specific validity predicates.
\item Evaluate necessity, sufficiency, redundancy, substitutability, and essentiality where
applicable.
\item Assign evidence tier and confidence weight.
\item Store source route and evidence in \(\Phi(e)\).
\item Accept, reject, weaken, context-restrict, retype, or mark provisional.
\end{enumerate}
% ---------------------------------------------------------------------
\section{Algorithm}
\label{sec:algorithm}
% ---------------------------------------------------------------------
\noindent\textbf{Input:} query \(q\), depth \(d\), context \(c\), observer model \(o\), threshold
\(\tau\).\\
\textbf{Output:} dependency graph \(G_q\).
\begin{enumerate}
\item Resolve \(q\mapsto(X,c,o,b)\).
\item Initialize graph \(G_q\) with target node \(X\).
\item Generate backward candidates for \(X\).
\item Generate forward candidates for \(X\).
\item Generate candidate support sets where available.
\item Validate candidate edges and support-set claims.
\item Assign types, strengths, weights, contexts, observer models, and provenance.
\item Recursively expand accepted nodes to bounded depth.
\item Preserve rejected and provisional edges in audit history.
\item Mark unresolved lowest nodes as provisional primitives.
\item Evaluate graph coherence, compression, semantic equivalence, and contradiction status.
\item Revise graph when evidence, counterexamples, missing dependencies, or redundant support
sets appear.
\end{enumerate}
\begin{lstlisting}[style=pythonstyle,caption={High-level RDD pseudocode.}]
def RDD(query, depth, context, observer, tau):
X = resolve_meaning(query, context, observer)
G = initialize_graph(X)
frontier_back = {X}
frontier_fwd = {X}
for level in range(depth):
for node in frontier_back:
candidates = generate_backward_candidates(node, context, observer)
support_sets = generate_support_sets(node, context, observer)
for gamma in support_sets:
claim = create_support_claim(gamma, node)
claim = annotate_claim(claim, context, observer)
claim.validity = test_support_set_validity(claim)
claim.redundancy = test_redundancy(claim)
claim.minimality = test_minimality(claim)
update_graph_with_support_claim(G, claim, tau)
for p in candidates:
e = create_edge(p, node)
e = annotate_edge(e, context, observer)
e.type = classify_basis_type(e)
e.strength = classify_strength(e)
e.validity = test_validity(e)
e.evidence = record_evidence(e)
update_graph_with_candidate(G, e, tau)
for node in frontier_fwd:
candidates = generate_forward_candidates(node, context, observer)
for z in candidates:
e = create_edge(node, z)
e = annotate_edge(e, context, observer)
e.type = classify_basis_type(e)
e.strength = classify_strength(e)
e.validity = test_validity(e)
e.evidence = record_evidence(e)
update_graph_with_candidate(G, e, tau)
frontier_back = newly_accepted_prerequisites(G)
frontier_fwd = newly_accepted_consequences(G)
mark_provisional_primitives(G)
coherence_check(G)
compression_check(G)
semantic_equivalence_check(G)
return G
\end{lstlisting}
% ---------------------------------------------------------------------
\section{Graph Revision}
\label{sec:revision}
% ---------------------------------------------------------------------
RDD revises graphs under seven conditions:
\begin{enumerate}
\item a proposed edge fails a validity predicate;
\item new evidence changes confidence;
\item a counterexample appears;
\item a missing dependency is detected;
\item complexity increases without explanatory gain;
\item a support set is found to be non-minimal;
\item observer translation reveals incompatible node meanings.
\end{enumerate}
A simple confidence update is:
\[
w_i^{new}=\clip_{[0,1]}(w_i^{old}+\Delta_i),
\]
where \(\Delta_i\) is determined by new evidence. Revision may also alter context, observer
model, edge type, dependency strength, support-set membership, or evidence tier:
\[
C(e)\leftarrow C'(e),
\quad
O(e)\leftarrow O'(e),
\quad
T(e)\leftarrow T'(e),
\quad
S(e)\leftarrow S'(e).
\]
\subsection{Non-Destructive Context Revision}
\begin{definition}[Context restriction]
Given an edge \(e:A\rightarrow B\) accepted under context \(c\), context restriction replaces
\(c\) with a narrower context \(c'\subset c\) when the edge fails in \(c\) but remains valid in
\(c'\).
\end{definition}
\begin{proposition}[Non-destructive revision]
If an edge fails under broad context \(c\) but satisfies its validity predicate under narrower
context \(c'\subset c\), RDD should revise \(C(e)=c\) to \(C(e)=c'\) rather than delete the edge.
\end{proposition}
\begin{proof}
The failure under \(c\) shows that the universal claim over \(c\) is false. It does not show that
the edge is false under every subcontext. If the validity predicate is satisfied under \(c'\),
deletion discards a true narrower dependency. Context restriction preserves valid information
while eliminating overgeneralization.
\end{proof}
\subsection{Support-Set Revision}
If a set \(\Gamma\Rightarrow B\) is accepted but later a smaller set
\(\Gamma'\subsetneq\Gamma\) is found sufficient for \(B\), then RDD revises:
\[
\Gamma\Rightarrow B
\quad\longrightarrow\quad
\Gamma'\Rightarrow B,
\]
while preserving the larger set as a non-minimal or redundant support structure.
This prevents the system from treating background conditions as essential dependencies.
% ---------------------------------------------------------------------
\section{Worked Example: Prediction}
\label{sec:prediction-example}
% ---------------------------------------------------------------------
Let
\[
X=\text{Prediction}.
\]
Meaning resolution:
\[
\text{Prediction}=\text{capacity to anticipate future or unobserved states from present or prior information}.
\]
A candidate backward dependency set is:
\[
B_G(\text{Prediction})=
\{
\text{Information},
\text{Dynamics},
\text{State Space},
\text{Memory}
\}.
\]
A support-set interpretation may be:
\[
\Gamma_{\text{prediction}}
=
\{\text{Information},\text{State Space},\text{Dynamics}\}
\Rightarrow
\text{Prediction}.
\]
This is stronger than treating each edge as individually sufficient. Information alone may not be
enough for prediction. A state space alone may not be enough. Dynamics alone may not be enough.
Together, under a suitable observer model and context, they form a candidate sufficient support
set.
The edge
\[
\text{Information}\xrightarrow{fun,epi}\text{Prediction}
\]
is high-confidence in most contexts because prediction requires some informative relation to
current, prior, or latent states.
The edge
\[
\text{Dynamics}\xrightarrow{fun}\text{Prediction}
\]
means that prediction requires an assumed or learned transition relation, law, tendency, or model
of change.
The edge
\[
\text{Memory}\xrightarrow{fun}\text{Prediction}
\]
requires context restriction. It is strong for hidden-state, temporally extended, or
history-dependent prediction, but weak or false for fully observable one-step prediction. RDD
therefore records:
\[
\text{Memory}\xrightarrow{fun}_{c=\mathrm{history\text{-}dependent\ prediction}}
\text{Prediction}.
\]
Forward traversal gives:
\[
F_G(\text{Prediction})=
\{
\text{Planning},
\text{Control},
\text{Learning},
\text{Scientific Modeling}
\}.
\]
A compact path is:
\[
\text{Distinction}\rightarrow\text{Information}\rightarrow\text{Prediction}\rightarrow\text{Planning}.
\]
This path should not be collapsed into a direct edge from distinction to planning unless a direct
validity argument is supplied.
% ---------------------------------------------------------------------
\section{Worked Example: Derivative}
\label{sec:derivative-example}
% ---------------------------------------------------------------------
Let:
\[
X=\text{Derivative}.
\]
A single-edge graph may propose:
\[
\text{Function}\rightarrow\text{Derivative},
\quad
\text{Limit}\rightarrow\text{Derivative},
\quad
\text{Difference Quotient}\rightarrow\text{Derivative}.
\]
The semantic view is more precise:
\[
\Gamma_{\text{derivative}}
=
\{
\text{Function},
\text{Limit},
\text{Difference Quotient}
\}
\Rightarrow
\text{Derivative}.
\]
For an introductory calculus learner, each member of \(\Gamma_{\text{derivative}}\) may be an
epistemic dependency. But none of the individual nodes is necessarily sufficient by itself. A
student may understand functions without understanding derivatives. A student may understand
limits without understanding their application to rates of change. A student may manipulate
difference quotients procedurally without understanding the derivative concept. The support-set
relation captures the combined prerequisite structure.
This example illustrates why \(\Gamma\Rightarrow B\) is the deeper semantic object and why
edge-only dependency graphs should be interpreted carefully.
% ---------------------------------------------------------------------
\section{Toy Demonstration}
\label{sec:toy-demo}
% ---------------------------------------------------------------------
The purpose of the toy demonstration is executability, not empirical proof. Consider a small
expert graph of conceptual dependencies:
\[
\text{Distinction}\rightarrow\text{Information}\rightarrow\text{Prediction}\rightarrow\text{Planning},
\]
\[
\text{State Space}\rightarrow\text{Prediction},
\quad
\text{Dynamics}\rightarrow\text{Prediction},
\]
\[
\text{Function}\rightarrow\text{Derivative},
\quad
\text{Limit}\rightarrow\text{Derivative},
\quad
\text{Difference Quotient}\rightarrow\text{Derivative}.
\]
Suppose a benchmark hides the edge:
\[
\text{Dynamics}\rightarrow\text{Prediction}.
\]
An RDD generator receives the target ``Prediction'' and proposes candidates from definitional,
functional, and corpus-supported cues. A validator then checks whether each candidate satisfies
type and strength criteria. If the hidden edge is proposed and accepted, it counts as a recovered
dependency.
Metrics include:
\[
\mathrm{Precision}=\frac{TP}{TP+FP},
\qquad
\mathrm{Recall}=\frac{TP}{TP+FN},
\qquad
F_1=\frac{2PR}{P+R}.
\]
This toy case verifies that the representation can encode nodes, edges, types, strengths,
confidence, support sets, and revision operations. It does not show that RDD outperforms
baselines. That question belongs to the validation program.
% ---------------------------------------------------------------------
\section{Validation Program}
\label{sec:validation}
% ---------------------------------------------------------------------
The empirical claim of RDD should be tested against baselines. A preregisterable validation
program can be organized around five tasks.
\subsection{Hidden-Edge Recovery}
Given expert dependency graphs with some edges hidden, compare RDD against association-based,
embedding-based, citation-based, and ordinary knowledge-graph baselines. The task is to recover
hidden enabling relations.
\paragraph{Metrics.}
Precision, recall, \(F_1\), calibration error, edge-type accuracy, and context-restriction
accuracy.
\paragraph{Failure condition.}
RDD fails this task if it does not recover hidden dependencies better than simpler association or
embedding baselines.
\subsection{Functional Ablation Prediction}
Given a system with components and performance measures, RDD predicts which components support
which functions. The predictions are tested by ablation, disabling, perturbation, masking, or
controlled degradation.
\paragraph{Metrics.}
Ablation-prediction accuracy, effect-size rank correlation, false-positive dependency rate, and
false necessity rate.
\paragraph{Failure condition.}
RDD fails if it labels components as necessary when ablation shows no relevant degradation, or if
it consistently misses components whose removal strongly degrades performance.
\subsection{Epistemic Prerequisite Testing}
Given educational or explanatory domains, RDD predicts prerequisite relations. These are tested
against learner performance, expert curriculum graphs, prerequisite datasets, or controlled
instructional interventions.
\paragraph{Metrics.}
Prerequisite precision, learner-outcome prediction, mastery transfer, and expert agreement.
\paragraph{Failure condition.}
RDD fails if its predicted prerequisites merely reproduce teaching chronology without improving
prediction of learning or explanation quality.
\subsection{Context Restriction Under Counterexamples}
Given overbroad dependency claims, RDD should revise them into narrower valid contexts rather
than simply delete them.
\paragraph{Example.}
The claim:
\[
\text{Memory}\rightarrow\text{Prediction}
\]
should be weakened or restricted when a fully observable Markov predictor is introduced.
\paragraph{Metrics.}
Correct restriction rate, over-deletion rate, false universalization rate, and post-revision
validity.
\subsection{Expert-Rated Explanation Quality}
Experts compare RDD-generated dependency explanations against baseline explanations. Raters
evaluate correctness, usefulness, missing assumptions, overclaiming, directness, and context
sensitivity.
\paragraph{Metrics.}
Mean expert rating, inter-rater reliability, unsupported-edge count, and false-direct-edge count.
\paragraph{Failure condition.}
RDD fails if experts judge its explanations less correct, less useful, or more overgeneralized
than simpler alternatives.
% ---------------------------------------------------------------------
\section{Applications}
\label{sec:applications}
% ---------------------------------------------------------------------
\subsection{Ontology Construction}
RDD can be used to build ontologies by asking what makes each node possible, what it makes
possible, which edges are direct, which are indirect, and which support sets are minimal. This
helps prevent ontologies from becoming mere category trees.
\subsection{Curriculum Design}
RDD can identify prerequisite structures and distinguish genuine conceptual prerequisites from
historical or conventional teaching order. This supports adaptive learning systems and curriculum
diagnostics.
\subsection{Scientific Reasoning}
In scientific reasoning, RDD can expose hidden assumptions behind models, measurements, and
explanations. A model can be evaluated by asking which measurements, abstractions, boundary
conditions, and theoretical constructs make it operational.
\subsection{Mechanistic Interpretability}
In mechanistic interpretability, RDD can represent relations between components, circuits,
features, behaviors, and tasks. A component-to-behavior edge should be treated as functional only
when intervention, ablation, or mechanistic evidence supports it.
\subsection{Human-AI Collaboration}
RDD provides a shared audit format for collaborative reasoning. A human or AI system may propose
candidate dependencies, but acceptance requires type assignment, context indexing, observer
modeling, evidence, and revision history. This discourages hallucinated explanation while
preserving exploratory hypothesis generation.
% ---------------------------------------------------------------------
\section{Limitations}
\label{sec:limitations}
% ---------------------------------------------------------------------
RDD has important limits.
First, the formalism does not make false edges true. It only makes dependency claims explicit and
auditable. Poor candidate generation, poor validation, or weak evidence can still produce bad
graphs.
Second, context and observer models can be difficult to specify. If they are vague, dependency
claims become vague. The framework therefore shifts ambiguity into explicit indices rather than
eliminating it.
Third, strong necessity claims are rare outside formal or constitutive domains. Many practical
dependencies are defeasible, contributory, or context-bound. Overuse of necessity is a failure
mode.
Fourth, support sets can become combinatorially large. Minimality, redundancy detection, and
bounded traversal are necessary but may not be sufficient in large domains.
Fifth, observer aggregation assumes competence and partial independence. In practice, observers
may share biases, training data, institutions, instruments, or conceptual blind spots.
Sixth, corpus evidence is weak evidence. Language patterns can suggest dependency, but they often
reflect convention, rhetoric, teaching order, or correlation rather than genuine enablement.
Seventh, RDD does not replace specialized methods. Causal discovery, formal proof, experimental
ablation, ontology engineering, and educational measurement remain necessary in their respective
domains.
% ---------------------------------------------------------------------
\section{Open Problems}
\label{sec:open-problems}
% ---------------------------------------------------------------------
Several open problems remain.
\paragraph{Automated validity predicates.}
Can domain-specific validators be built for ontological, functional, and epistemic dependency
without collapsing into simple relation extraction?
\paragraph{Support-set search.}
How can RDD efficiently discover minimal sufficient support sets in large graphs?
\paragraph{Confidence calibration.}
How should confidence weights be calibrated across evidence tiers, observer models, and domains?
\paragraph{Observer translation.}
What formal conditions make observer translation reliable across humans, instruments, models, and
formal systems?
\paragraph{Graph compression.}
When are two dependency graphs semantically equivalent, and how can redundant or indirect edges
be compressed without losing explanatory information?
\paragraph{Mechanistic validation.}
In neural or biological systems, what counts as strong evidence that \(A\) functionally supports
\(B\)?
\paragraph{Counterexample handling.}
When should a counterexample cause deletion, weakening, retyping, or context restriction?
% ---------------------------------------------------------------------
\section{Conclusion}
\label{sec:conclusion}
% ---------------------------------------------------------------------
Recursive Dependency Discovery formalizes a simple but powerful question:
\[
\text{What makes this possible?}
\]
It turns this question into a typed, weighted, context-indexed graph formalism. In RDD, an edge
is not merely a relation between terms. It is an auditable claim about possibility, operation,
intelligibility, or instantiation under a specified context and observer model.
The framework distinguishes ontological, functional, and epistemic dependency; separates direct
from indirect dependency; treats edge truth as evidence-dependent; and introduces semantic
support sets of the form:
\[
\Gamma\Rightarrow B.
\]
This support-set layer captures the fact that many phenomena are enabled by structured
combinations of conditions rather than isolated predecessors.
RDD does not claim that there is one final graph of reality. It claims that dependency can be
made explicit, tested, revised, and compared. If validated, RDD can support ontology construction,
curriculum design, scientific reasoning, mechanistic interpretability, and human-AI collaboration.
The deepest shift is from association to enablement. Association asks what appears together. RDD
asks what makes something possible. That shift turns knowledge organization into a theory of
prerequisites, constraints, operations, support sets, and emergence. It makes explanation
inspectable.
\appendix
% ---------------------------------------------------------------------
\section{Reference Implementation Sketch}
\label{appendix:implementation}
% ---------------------------------------------------------------------
The following minimal sketch demonstrates the structure of an RDD hidden-edge recovery pipeline.
It is intentionally simple and does not claim benchmark-level performance.
\begin{lstlisting}[style=pythonstyle,caption={Minimal hidden-edge recovery sketch.}]
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Set, FrozenSet
Edge = Tuple[str, str]
SupportClaim = Tuple[FrozenSet[str], str]
@dataclass
class CandidateEdge:
source: str
target: str
edge_type: str
strength: str
weight: float
evidence: str
status: str = "provisional"
@dataclass
class CandidateSupportSet:
support_set: FrozenSet[str]
target: str
edge_type: str
strength: str
weight: float
evidence: str
minimal: bool = False
status: str = "provisional"
@dataclass
class RDDGraph:
nodes: Set[str] = field(default_factory=set)
accepted: List[CandidateEdge] = field(default_factory=list)
provisional: List[CandidateEdge] = field(default_factory=list)
rejected: List[CandidateEdge] = field(default_factory=list)
support_sets: List[CandidateSupportSet] = field(default_factory=list)
def add_edge(self, edge: CandidateEdge, tau: float = 0.50):
self.nodes.add(edge.source)
self.nodes.add(edge.target)
if edge.status == "accepted" and edge.weight >= tau:
self.accepted.append(edge)
elif edge.status == "rejected":
self.rejected.append(edge)
else:
self.provisional.append(edge)
def add_support_set(self, claim: CandidateSupportSet, tau: float = 0.50):
self.nodes.update(claim.support_set)
self.nodes.add(claim.target)
if claim.status == "accepted" and claim.weight >= tau:
self.support_sets.append(claim)
def lexical_generator(target: str, descriptions: Dict[str, str]) -> List[Edge]:
cues = [
"requires",
"defined as",
"depends on",
"rate of change",
"limit",
"sum",
"initial condition",
"composite"
]
candidates = []
target_text = descriptions.get(target, "").lower()
for node, text in descriptions.items():
if node == target:
continue
joined = (node + " " + text).lower()
if any(cue in target_text and cue in joined for cue in cues):
candidates.append((node, target))
return candidates
def validate(edge: Edge, expert_visible_edges: Set[Edge]) -> CandidateEdge:
source, target = edge
if edge in expert_visible_edges:
return CandidateEdge(
source=source,
target=target,
edge_type="epistemic",
strength="defeasible_necessity",
weight=0.85,
evidence="visible expert edge",
status="accepted"
)
return CandidateEdge(
source=source,
target=target,
edge_type="epistemic",
strength="contributory",
weight=0.45,
evidence="candidate generated by cues",
status="provisional"
)
def is_minimal_support_set(
claim: SupportClaim,
known_sufficient_sets: Set[SupportClaim]
) -> bool:
support_set, target = claim
for other_set, other_target in known_sufficient_sets:
if other_target != target:
continue
if other_set < support_set:
return False
return True
def precision_recall(recovered: Set[Edge], hidden: Set[Edge]):
tp = len(recovered & hidden)
precision = tp / len(recovered) if recovered else 0.0
recall = tp / len(hidden) if hidden else 0.0
f1 = (
2 * precision * recall / (precision + recall)
if precision + recall
else 0.0
)
return precision, recall, f1
\end{lstlisting}
% ---------------------------------------------------------------------
\section{Notation Summary}
\label{appendix:notation}
% ---------------------------------------------------------------------
\begin{center}
\begin{tabular}{ll}
\toprule
Symbol & Meaning \\
\midrule
\(V\) & Set of interpreted nodes \\
\(E\) & Directed dependency edges \\
\(\Ctx\) & Set of contexts \\
\(\Obs\) & Set of observer models \\
\(\Adm_{c,o}\) & Admissible configurations under context and observer \\
\(\iota_{c,o}\) & Instantiation, availability, operation, or understanding indicator \\
\(m_{c,o}\) & Graded node-status measure \\
\(\mu_{c,o}\) & Weighting or probability measure over configurations \\
\(T(e)\) & Edge type \\
\(S(e)\) & Edge strength \\
\(W(e)\) & Edge confidence weight \\
\(C(e)\) & Edge context \\
\(O(e)\) & Edge observer model \\
\(\Phi(e)\) & Evidence and provenance \\
\(R(e)\) & Revision history \\
\(B_G(X)\) & Backward dependency operator \\
\(F_G(X)\) & Forward dependency operator \\
\(\mathcal{B}_G(X)\) & Set-based backward dependency operator \\
\(\mathcal{F}_G(\Gamma)\) & Set-based forward dependency operator \\
\(\Gamma\Rightarrow B\) & Support-set dependency relation \\
\(\Val_t(e)\) & Type-specific validity predicate \\
\(\kappa(e_i,e_j)\) & Index compatibility predicate \\
\(\tau_{i\rightarrow j}\) & Observer translation map \\
\(\leadsto_G\) & Accepted directed path in graph \(G\) \\
\bottomrule
\end{tabular}
\end{center}
% ---------------------------------------------------------------------
\begin{thebibliography}{99}
\bibitem{bordes}
Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., and Yakhnenko, O.
Translating embeddings for modeling multi-relational data.
\textit{Advances in Neural Information Processing Systems}, 2013.
\bibitem{brachman}
Brachman, R. J., and Levesque, H. J.
\textit{Knowledge Representation and Reasoning}.
Morgan Kaufmann, 2004.
\bibitem{garcez}
Garcez, A. d., Gori, M., Lamb, L. C., Serafini, L., Spranger, M., and Tran, S. N.
Neural-symbolic computing: An effective methodology for principled integration of machine learning and reasoning.
\textit{Journal of Applied Logics}, 2019.
\bibitem{guarino}
Guarino, N.
Formal ontology and information systems.
In \textit{Proceedings of FOIS}, 1998.
\bibitem{hearst}
Hearst, M. A.
Automatic acquisition of hyponyms from large text corpora.
In \textit{Proceedings of COLING}, 1992.
\bibitem{hochreiter}
Hochreiter, S., and Schmidhuber, J.
Long short-term memory.
\textit{Neural Computation}, 9(8):1735--1780, 1997.
\bibitem{liang}
Liang, C., Ye, J., Wu, Z., Pursel, B., and Giles, C. L.
Recovering concept prerequisite relations from university course dependencies.
In \textit{Proceedings of AAAI}, 2017.
\bibitem{liu}
Liu, H., Ma, W., Yang, Y., Carbonell, J., and Ros\'e, C. P.
Learning concept graphs from online educational data.
\textit{Journal of Educational Data Mining}, 2016.
\bibitem{nickel}
Nickel, M., Murphy, K., Tresp, V., and Gabrilovich, E.
A review of relational machine learning for knowledge graphs.
\textit{Proceedings of the IEEE}, 104(1):11--33, 2016.
\bibitem{pearl}
Pearl, J.
\textit{Causality: Models, Reasoning, and Inference}.
Cambridge University Press, 2nd edition, 2009.
\bibitem{peters}
Peters, J., Janzing, D., and Sch\"olkopf, B.
\textit{Elements of Causal Inference}.
MIT Press, 2017.
\bibitem{rumelhart}
Rumelhart, D. E., Hinton, G. E., and Williams, R. J.
Learning representations by back-propagating errors.
\textit{Nature}, 323:533--536, 1986.
\bibitem{scoville}
Scoville, W. B., and Milner, B.
Loss of recent memory after bilateral hippocampal lesions.
\textit{Journal of Neurology, Neurosurgery, and Psychiatry}, 20(1):11--21, 1957.
\bibitem{shannon}
Shannon, C. E.
A mathematical theory of communication.
\textit{Bell System Technical Journal}, 27:379--423 and 623--656, 1948.
\bibitem{spencerbrown}
Spencer-Brown, G.
\textit{Laws of Form}.
George Allen and Unwin, 1969.
\bibitem{spirtes}
Spirtes, P., Glymour, C., and Scheines, R.
\textit{Causation, Prediction, and Search}.
MIT Press, 2nd edition, 2000.
\bibitem{sowa}
Sowa, J. F.
\textit{Knowledge Representation: Logical, Philosophical, and Computational Foundations}.
Brooks/Cole, 2000.
\bibitem{squire}
Squire, L. R.
Memory and the hippocampus: A synthesis from findings with rats, monkeys, and humans.
\textit{Psychological Review}, 99(2):195--231, 1992.
\bibitem{sun}
Sun, Z., Deng, Z.-H., Nie, J.-Y., and Tang, J.
RotatE: Knowledge graph embedding by relational rotation in complex space.
In \textit{International Conference on Learning Representations}, 2019.
\bibitem{talukdar}
Talukdar, P. P., and Cohen, W. W.
Crowdsourced comprehension: Predicting prerequisite structure in Wikipedia.
In \textit{Proceedings of the Seventh Workshop on Building Educational Applications Using NLP}, 2012.
\end{thebibliography}
\end{document}