Mar 23, 2026

Deterministic first

Every time I caught myself reaching for an LLM call in a deterministic pipeline, I asked one question: could I write a script for this? The honest answer was usually yes.

Context

Why does every “agentic” pipeline I look at these days seem to reach for an LLM call at every junction? moitumi is a personal assistant built on top of Claude, and the temptation is real and constant — make every interesting decision an LLM call. “Should this node be archived?” “Which nodes are relevant to this query?” “Is this action safe?” “Where should I attach this incoming data?”

Every one of those questions can be answered by an LLM. None of them should be. That’s the principle. The CLAUDE.md captures it in one line:

Every pipeline step must pass through a deterministic check before reaching the LLM. If a script, algorithm, or scoring function can solve it — use that. Claude only handles the genuinely ambiguous residual.

This post is about why the principle exists, the five pipelines where it shows up, and the meta-rule that makes the whole thing tractable.

Why a principle is needed (and not just judgment)

Why bother codifying this as a principle rather than leaving it to taste? Three reasons “just use the LLM” is the wrong default:

Latency. An LLM call takes hundreds of milliseconds at minimum. A hash-map lookup takes nanoseconds. If the pipeline runs on every keystroke (search), every notification (ingestion routing), or every tick (state evaluation), every LLM call becomes a UI hang or a thermal event.
Cost. Cents per call add up faster than you’d think when the pipeline runs 10,000 times a day. Personal use isn’t free use.
Inspectability. A deterministic decision can be unit-tested, audit-logged with its inputs, and reproduced. An LLM decision is a black box — when it’s wrong, the only recourse is “prompt better.” Effectively, you’re debugging by vibes.

However, the LLM is genuinely better at one thing: resolving ambiguity in unstructured text. That’s it. For everything that can be expressed as a typed schema, a graph traversal, or arithmetic, deterministic code wins on every axis.

The five pipelines

1. Context assembly

The question: “For this query, what subset of the cortex (within a token budget) should Claude see?”

The naive answer: ask the LLM to pick. It would work. It would also burn tokens to decide which tokens to send… which is kind of absurd when you stop to think about it.

moitumi’s answer: a deterministic multi-component scoring function.

score(node) = 0.30 * graph_distance_from_focal_node
            + 0.25 * temporal_decay_factor
            + 0.20 * degree_centrality
            + 0.15 * recency
            + 0.10 * edge_count_weight
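
A minimal Python sketch of the weighted sum above. The coefficients are copied from the formula; everything else is an assumption. In particular, I assume each component is already normalized to [0, 1] with higher meaning more relevant, so graph distance would enter as a proximity value rather than a raw hop count. `score_node` and the dict-of-features shape are hypothetical names, not moitumi’s actual API.

```python
# Weights copied from the formula above; they sum to 1.0.
WEIGHTS = {
    "graph_distance_from_focal_node": 0.30,
    "temporal_decay_factor": 0.25,
    "degree_centrality": 0.20,
    "recency": 0.15,
    "edge_count_weight": 0.10,
}


def score_node(features: dict[str, float]) -> float:
    """Weighted sum of per-node signals.

    Assumption: every feature is pre-normalized to [0, 1], higher = more
    relevant. A missing feature counts as 0.0.
    """
    return sum(weight * features.get(name, 0.0) for name, weight in WEIGHTS.items())
```

Because the function is pure arithmetic over named inputs, a unit test can pin any single component and check its exact contribution.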

Then RRF (Reciprocal Rank Fusion) combines rankings from different signals:

combined_score = Σ over signals s of (1 / (rank_in_s + 1))
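
A small sketch of that fusion step, assuming zero-based ranks so the top item in each list contributes 1 / (0 + 1). The commonly cited form of RRF is 1 / (k + rank) with a smoothing constant k (60 in the original paper); the `k` parameter below is my addition and defaults to 1 so the code matches the formula as written here. `rrf_combine` is a hypothetical name.

```python
from collections import defaultdict


def rrf_combine(rankings: list[list[str]], k: int = 1) -> list[tuple[str, float]]:
    """Fuse several best-first rankings of node ids into one ranked list.

    With zero-based ranks and k=1, each appearance adds 1 / (rank + 1).
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, node_id in enumerate(ranking):
            scores[node_id] += 1.0 / (rank + k)
    # Sort by fused score, then by id, so ties resolve reproducibly.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

The explicit tie-break on node id matters more than it looks: without it, two runs over the same inputs could order equal scores differently, which defeats the point of a deterministic step.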

The result is a ranked list. The token budget is enforced by sliding down the list until the next node would overflow. Simple.
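
A sketch of that budget walk, under the assumption that each ranked node already carries its token count. I read “until the next node would overflow” as “stop at the first node that does not fit” rather than “skip it and keep looking for smaller ones”; that reading is mine. `take_within_budget` is a hypothetical name.

```python
def take_within_budget(
    ranked: list[tuple[str, int]], token_budget: int
) -> tuple[list[str], int]:
    """Walk a best-first list of (node_id, token_count) pairs.

    Stops at the first node that would push the total over the budget.
    Returns the selected ids and the number of tokens used.
    """
    selected: list[str] = []
    used = 0
    for node_id, tokens in ranked:
        if used + tokens > token_budget:
            break  # the next node would overflow, so stop here
        selected.append(node_id)
        used += tokens
    return selected, used
```

Stopping early keeps the selection a strict prefix of the ranking, which makes the result easy to explain in an audit log: everything above the cut made it in, everything below did not.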

The LLM only enters when the deterministic budget enforcement hits a genuinely ambiguous trim decision, usually a content node where the next paragraph might or might not be relevant. That residual is small enough to handle with a single targeted call, rather than a broad request like “decide which 200 of these 2000 nodes matter.”

2. Graph traversal and node selection

The question: “Find me nodes related to this one.”

The naive answer: embedding similarity, or just ask the LLM.

moitumi’s answer: BFS with metadata scoring. Per-edge-type weights. A deterministic radius. Embedding similarity is added later as a discovery lens (per ADR-011), but the result still materializes as an explicit edge the user confirms. The LLM is never the traversal algorithm.
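
A rough sketch of a radius-bounded BFS with per-edge-type weights. The edge types and their weights are invented for illustration (the post does not list the real ones), and scoring a node by the product of edge weights along the first path found is one simple choice among several.

```python
from collections import deque

# Hypothetical per-edge-type weights; the real values are not given in the post.
EDGE_WEIGHTS = {"parent": 1.0, "reference": 0.6, "mention": 0.3}


def related_nodes(
    graph: dict[str, list[tuple[str, str]]], start: str, radius: int
) -> list[tuple[str, float]]:
    """Breadth-first search out to `radius` hops from `start`.

    graph maps node_id -> [(neighbor_id, edge_type), ...].
    A node's score is the product of edge weights along the first
    (fewest-hops) path that reaches it.
    """
    scores = {start: 1.0}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == radius:
            continue  # do not expand past the radius
        for neighbor, edge_type in graph.get(node, []):
            if neighbor in scores:
                continue  # already reached by a path with no more hops
            scores[neighbor] = scores[node] * EDGE_WEIGHTS.get(edge_type, 0.0)
            queue.append((neighbor, depth + 1))
    del scores[start]
    # Highest score first; ties broken by id for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Given the same graph, start node, and radius, this returns the same list every time, so a surprising result can be replayed and inspected.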

3. Token counting and budget enforcement

The question: “Will this fit in the context window?”

The naive answer: ask the LLM to estimate.

moitumi’s answer: count. Tokenizers exist. They run in microseconds. The token budget is arithmetic; there is no ambiguity to resolve.

The reason this even needs to be stated: I have seen pipelines where the LLM is asked “is this prompt under the budget?” The model guesses. The guess is wrong with a frequency that surprises people. This is not a judgment call. Just count.
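
A sketch of “just count”, with the tokenizer passed in as a plain function. The whitespace counter below is a stand-in so the example runs without dependencies; real code would call the tokenizer that matches the target model, and its counts would differ. `fits_budget` and `whitespace_count` are hypothetical names.

```python
from typing import Callable, Sequence


def fits_budget(
    parts: Sequence[str], budget: int, count_tokens: Callable[[str], int]
) -> bool:
    """Return True if the summed token count of all parts is within budget."""
    return sum(count_tokens(part) for part in parts) <= budget


def whitespace_count(text: str) -> int:
    """Stand-in tokenizer for the example only: counts whitespace-separated words."""
    return len(text.split())
```

Usage is a single call, for example `fits_budget([system_prompt, context, query], 8000, count_tokens)`, where the three strings and the budget are whatever the caller has on hand.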

4. Action validation

The question: “Should this action be allowed?”

The naive answer: ask the LLM to evaluate the action against the user’s guardrails.

moitumi’s answer: a typed schema check. The hard guardrail in moitumi is “no financial transactions.” That’s a deterministic check against the action type. It’s not a judgment call; if the action’s type or parameters match the schema for “moves money,” it’s rejected. End.

LLM-as-policy-engine is the worst place to use an LLM. The whole point of a guardrail is that nobody can argue their way past it. An LLM can, in principle, always be talked into an exception. That’s not a guardrail, it’s a suggestion.
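
A minimal sketch of a type-level guardrail. The action types, the list of forbidden parameter names, and the `Action` shape are all assumptions made for the example; only the rule itself (“no financial transactions”) comes from the text above.

```python
from dataclasses import dataclass, field
from enum import Enum


class ActionType(Enum):
    # Hypothetical action types for illustration.
    READ_FILE = "read_file"
    SEND_MESSAGE = "send_message"
    TRANSFER_FUNDS = "transfer_funds"
    PURCHASE = "purchase"


# Assumed definition of "moves money": these types, or any of these parameters.
FORBIDDEN_TYPES = frozenset({ActionType.TRANSFER_FUNDS, ActionType.PURCHASE})
FORBIDDEN_PARAMS = frozenset({"amount", "currency", "iban"})


@dataclass(frozen=True)
class Action:
    type: ActionType
    params: dict = field(default_factory=dict)


def validate(action: Action) -> tuple[bool, str]:
    """Reject on type or parameter match. No model call, no negotiation."""
    if action.type in FORBIDDEN_TYPES:
        return False, f"forbidden action type: {action.type.value}"
    hits = sorted(FORBIDDEN_PARAMS.intersection(action.params))
    if hits:
        return False, f"forbidden parameters: {', '.join(hits)}"
    return True, "ok"
```

The returned reason string is what goes in the audit log, so a rejected action can be traced to the exact rule that fired.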

5. Data ingestion and routing

The question: “This new piece of data arrived. Where should it go?”

The naive answer: ask the LLM.

moitumi’s answer: rule-based matching first. Location type (file:, http:, git:, calendar:) determines the adapter. Adapter-level rules determine the parent node (a journal entry’s path becomes journal/2026-03/{filename}). The LLM is invoked only after deterministic routing places the data — and only to enrich it (extract entities, attach to additional related nodes), never to place it.
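
A sketch of scheme-based routing using the standard library’s `urlsplit`. The adapter names, the “path contains /journal/” rule, and the `inbox/` fallback are placeholders I made up; only the four schemes and the `journal/2026-03/{filename}` shape come from the paragraph above. The month is passed in by the caller so the function stays a pure function of its inputs.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical adapter names, keyed by the location types listed above.
ADAPTERS = {
    "file": "FileAdapter",
    "http": "HttpAdapter",
    "git": "GitAdapter",
    "calendar": "CalendarAdapter",
}


def route(location: str, year_month: str) -> tuple[str, str]:
    """Pick an adapter by URI scheme, then a parent path by adapter rule."""
    parts = urlsplit(location)
    adapter = ADAPTERS.get(parts.scheme)
    if adapter is None:
        raise ValueError(f"no adapter for scheme {parts.scheme!r}")
    filename = PurePosixPath(parts.path).name
    # Assumed adapter-level rule: files under a journal directory are journal entries.
    if adapter == "FileAdapter" and "/journal/" in parts.path:
        return adapter, f"journal/{year_month}/{filename}"
    # Assumed fallback location for everything else.
    return adapter, f"inbox/{adapter}/{filename}"
```

An unknown scheme raises instead of falling through to a model, which keeps placement failures loud and reproducible.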

The pattern

The pattern across all five pipelines:

flowchart LR
    In[Input] --> D{Deterministic check<br/>can resolve?}
    D -- Yes --> Out1[Done]
    D -- No --> Reduce[Reduce search space<br/>deterministically]
    Reduce --> L[LLM handles<br/>ambiguous residual]
    L --> Validate{Deterministic<br/>validation}
    Validate -- Pass --> Out2[Done]
    Validate -- Fail --> Reject[Reject / retry]

    style D fill:#dcfce7,stroke:#16a34a
    style L fill:#fef3c7,stroke:#d97706
    style Validate fill:#dcfce7,stroke:#16a34a

Three rules visible in the flow:

Try to resolve deterministically first. Most queries never reach the LLM.
If you must call the LLM, reduce the problem deterministically before the call. Don’t ask the model to find a needle in a haystack — find the haystack section deterministically and ask about the needle.
Validate the LLM’s output deterministically. Schema-check the response. Type-check the arguments. Re-run the budget arithmetic. The LLM is a generator; the validator is code.
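
The three rules can be sketched as one generic step. Every callable here is supplied by the caller and is hypothetical: `solve` returns an answer or `None`, `reduce` shrinks the problem, `ask_llm` stands in for whatever model client is in use, and `validate` is the deterministic check on the model’s output. The retry count is also an assumption.

```python
from typing import Callable, Optional


def run_step(
    task: str,
    solve: Callable[[str], Optional[str]],
    reduce: Callable[[str], str],
    ask_llm: Callable[[str], str],
    validate: Callable[[str], bool],
    max_attempts: int = 2,
) -> str:
    """Deterministic first, LLM for the residual, deterministic validation after."""
    # Rule 1: try to resolve without the model.
    answer = solve(task)
    if answer is not None:
        return answer
    # Rule 2: shrink the problem before the model sees it.
    reduced = reduce(task)
    # Rule 3: never trust the output without a code-level check.
    for _ in range(max_attempts):
        candidate = ask_llm(reduced)
        if validate(candidate):
            return candidate
    raise ValueError("LLM output failed deterministic validation")
```

Keeping the model behind a plain callable also makes the step testable: a stub can stand in for the LLM, and the deterministic parts get exercised without a network call.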

The agent vs. process distinction

There’s a useful sharpening of the principle, captured in a separate DEVLOG entry: agents are not processes.

An agent works on a non-deterministic problem. An agent can be composed of deterministic actions but must have at least one non-deterministic component — judgment, ambiguity resolution, open-ended interpretation. If a sequence only does deterministic actions, it’s a process, not an agent. Just run it directly.

So when designing an agent, the first question is: what is the non-deterministic decision point? If you can’t identify one, you’re not building an agent. You’re building a process. Implement it as a deterministic pipeline.

This is the same principle, one level up. Deterministic-first applies inside each pipeline step. Agent-vs-process applies at the boundary of whether a pipeline needs an LLM at all.

The routing decision is itself deterministic

The meta-rule that makes this whole thing tractable:

The routing decision — “is this deterministically solvable?” — is itself deterministic.

If you can write a script that classifies a task as deterministically solvable, take the deterministic route. If the task definition is itself ambiguous enough that you can’t write the classifier, the LLM gets the whole task.

This sounds tautological but it’s load-bearing. The alternative, “ask the LLM whether to use the LLM,” has been the actual implementation strategy in some agentic systems I’ve reviewed. It moves the latency and cost into the routing step without saving any of it. A trainwreck, basically.
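
One way to picture the meta-rule is a registry lookup: if a task kind has a registered deterministic handler, it takes that route, and anything unclassifiable goes to the model whole. The registry contents and the task shape below are invented for the example.

```python
from typing import Callable

# Hypothetical registry of task kinds that have a deterministic handler.
DETERMINISTIC_HANDLERS: dict[str, Callable[[dict], object]] = {
    "count_words": lambda task: len(task["text"].split()),
    "sum_sizes": lambda task: sum(task["sizes"]),
}


def route_task(task: dict) -> str:
    """The routing decision is a dictionary lookup, never a model call."""
    return "deterministic" if task.get("kind") in DETERMINISTIC_HANDLERS else "llm"


def run(task: dict, ask_llm: Callable[[dict], object]) -> object:
    """Dispatch to the registered handler, or hand the whole task to the LLM."""
    if route_task(task) == "deterministic":
        return DETERMINISTIC_HANDLERS[task["kind"]](task)
    return ask_llm(task)
```

Adding a new deterministic capability then means adding one registry entry, and the set of tasks that can reach the model shrinks by exactly that entry.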

Where this lives in moitumi

ContextAssembly::assemble() — deterministic scoring, RRF combination, budget arithmetic. LLM only for final ambiguous trim.
BoostCortexAnalytics — graph traversal, degree centrality, bridge detection. Pure algorithms, no LLM.
PlasticityEngine — Hebbian co-activation, STDP, decay. Arithmetic over synapse weights.
ActionValidation — typed schema checks. Financial-tx rejected at the type level.
GenericDataSource — deterministic routing by URI scheme. LLM only for content enrichment after placement.

Overall, the LLM shows up where the input is unstructured text and the question is “what does this mean?” Everywhere else, code. At least, that’s the line I keep coming back to whenever the temptation creeps in.

Source pointers

CLAUDE.md — “Design Principle: Deterministic First”
DEVLOG.md — entries of 2026-04-06 (“Insight: Agent vs Process distinction”), 2026-03-16 (context assembly patterns from analysis of 10 OSS projects)
PLAN.md — ADR-011 (Explicit Graph over Implicit Learning) is the philosophical cousin
docs/research/graph-traversal-and-search.md — the deterministic scoring breakdown

Cross-posted from the moitumi dev blog.