The Superset Language

1 day ago

Direct translation is the wrong mental model for serious language conversion.

If the system goes straight from TypeScript to Rust, or from TSX to HTML, CSS, and JavaScript, it has to do too many things at once. It has to parse source, infer identity, preserve behavior, choose target idioms, account for missing features, and decide whether the result is trustworthy.

A single arrow is too small to hold all of that.

The better shape is a superset language.

Not a language humans necessarily write by hand. Not a new runtime. Not a claim that every language can be flattened into one perfect syntax.

A superset language is an interlingua: a richer semantic form that can carry more meaning than any one output language needs.

[Diagram: RUST, JSX, CSS -> LIFT -> SUPERSET -> LOWER -> TARGET, LOSS LOG]

Source languages lift into a richer semantic record. Target languages are lowering decisions, not the place where meaning is first discovered.

Direct Translation Hides Too Much

A direct translator looks clean:

TypeScript -> Rust
TSX -> HTML + CSS + JS
Python -> JavaScript

But the clean arrow hides the useful questions.

Which source span produced the output? Which exported symbol stayed the same? Which type fact survived? Which effect changed shape? Which runtime behavior was proved? Which target feature could not express the original meaning?

If the translator cannot answer those questions, the output might still run, but the system cannot explain what it preserved.

For agent work, that is not enough.

The coordinator needs to know what can be merged, rebased, reviewed, regenerated, or blocked.

Lift First

The first step is lifting.

Lifting turns source code into a semantic record:

source text
source spans
comments and trivia
bindings and declarations
module edges
type facts
effect boundaries
runtime surfaces
package contracts
proof obligations

That record is deliberately bigger than the source language.

TypeScript can express some type facts directly. JavaScript may need inference or weaker claims. CSS has selectors, cascade, variables, media state, and asset references. Rust has ownership and trait facts. HTML has document structure, ids, ARIA relationships, and form behavior.

The superset has to hold all of those facts without pretending they are the same fact.
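One way to picture that is a record whose facts stay tagged by origin. The sketch below is only an illustration under assumptions: the names `SemanticRecord`, `Fact`, `Span`, and `factKinds` are hypothetical, and the four fact kinds are a small sample rather than a complete schema.

```typescript
// Hypothetical shape for a lifted record; every name here is illustrative.
type Span = { file: string; start: number; end: number };

// Each fact keeps the tag of the language that produced it, so a Rust
// ownership fact is never silently treated as a TypeScript type fact.
type Fact =
  | { kind: "ts-type"; symbol: string; shape: string }
  | { kind: "rust-ownership"; symbol: string; mode: "owned" | "borrowed" }
  | { kind: "css-cascade"; selector: string; order: number }
  | { kind: "html-aria"; id: string; relation: string };

interface SemanticRecord {
  sourceText: string;
  spans: Span[];
  bindings: string[];
  facts: Fact[];
  proofObligations: string[];
}

// Lists which kinds of fact a record carries, without merging them.
function factKinds(record: SemanticRecord): string[] {
  return record.facts
    .map((f) => f.kind)
    .filter((kind, index, all) => all.indexOf(kind) === index)
    .sort();
}

const example: SemanticRecord = {
  sourceText: "export function label(user: User) { return user.fullName; }",
  spans: [{ file: "user.ts", start: 0, end: 59 }],
  bindings: ["label"],
  facts: [
    { kind: "ts-type", symbol: "label", shape: "(user: User) => string" },
    { kind: "css-cascade", selector: ".label", order: 1 },
  ],
  proofObligations: ["label returns user.fullName unchanged"],
};
```

The tagged union is the design choice that matters here: a lowering step has to match on `kind` before it can use a fact, so it cannot read a cascade fact as a type fact by accident.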

[Diagram layers: TEXT, SPANS, IDENTITY, TYPES, EFFECTS, RUNTIME, PROOF]

The interlingua keeps text, identity, types, effects, runtime meaning, and proof as separate layers.

One Source, Many Lowerings

The source should lift once.

Different targets can then lower different parts of the same record.

A Semantic Record, Not A Direct Rewrite

export interface User {
  id: string;
  fullName: string;
}

export function label(user: User) {
  return user.fullName;
}

Source language: The TypeScript syntax is only one surface over identity, type shape, export shape, and behavior.

The useful intermediate form is richer than each output. Rust, JavaScript, or another target can lower from the same semantic record with different proof obligations.

The superset is not a prettier AST.

It is the place where the system keeps the facts that would otherwise disappear between languages.
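A rough sketch of "lift once, lower many" for the `User` shape might look like the following. Everything in it is an assumption made for illustration: `ShapeRecord`, `lowerToTypeScript`, `lowerToRust`, the snake_case renaming, and the `number` to `f64` mapping are hypothetical choices, not a description of any real tool.

```typescript
// Hypothetical lifted form of the `User` interface (illustrative names only).
type Primitive = "string" | "number" | "boolean";

interface FieldFact { name: string; type: Primitive }
interface ShapeRecord { symbol: string; exported: boolean; fields: FieldFact[] }

const userRecord: ShapeRecord = {
  symbol: "User",
  exported: true,
  fields: [
    { name: "id", type: "string" },
    { name: "fullName", type: "string" },
  ],
};

// Lowering 1: back to a TypeScript declaration.
function lowerToTypeScript(r: ShapeRecord): string {
  const body = r.fields.map((f) => `  ${f.name}: ${f.type};`).join("\n");
  return `${r.exported ? "export " : ""}interface ${r.symbol} {\n${body}\n}`;
}

// Assumed type mapping; `number` to `f64` is a choice a real system
// would need to record as a decision, not take for granted.
const rustTypes: Record<Primitive, string> = {
  string: "String",
  number: "f64",
  boolean: "bool",
};

function toSnakeCase(name: string): string {
  return name.replace(/[A-Z]/g, (c) => `_${c.toLowerCase()}`);
}

// Lowering 2: a Rust struct shape from the same record.
function lowerToRust(r: ShapeRecord): string {
  const body = r.fields
    .map((f) => `    pub ${toSnakeCase(f.name)}: ${rustTypes[f.type]},`)
    .join("\n");
  return `${r.exported ? "pub " : ""}struct ${r.symbol} {\n${body}\n}`;
}
```

Note that the Rust lowering renames `fullName` to `full_name`. In the terms of this article, that rename is an identity fact the record should keep, so the two fields can still be traced to the same source span.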

Loss Accounting Is The Point

The important feature is not that the superset can magically represent everything.

The important feature is that it can represent the difference between:

preserved exactly
preserved through an adapter
lowered with a runtime proof
lowered with a known loss
not representable in this target

That is the difference between a code generator and a review system.

A code generator emits code.

A review system says what the emitted code means, where it came from, and where it stopped being equivalent.
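The five outcomes listed above fit naturally into a discriminated union. This is a minimal sketch, assuming hypothetical names (`Preservation`, `LoweredFact`, `describe`, `lossLog`) and assuming that each outcome carries just enough detail to explain itself.

```typescript
// Hypothetical encoding of the five preservation outcomes.
type Preservation =
  | { status: "exact" }
  | { status: "adapter"; adapter: string }
  | { status: "runtime-proof"; probe: string }
  | { status: "known-loss"; lost: string; acceptedBy: string }
  | { status: "not-representable"; reason: string };

interface LoweredFact {
  fact: string;
  preservation: Preservation;
}

// Turns one outcome into a reviewable sentence.
function describe(p: Preservation): string {
  switch (p.status) {
    case "exact":
      return "preserved exactly";
    case "adapter":
      return `preserved through adapter ${p.adapter}`;
    case "runtime-proof":
      return `lowered with runtime proof ${p.probe}`;
    case "known-loss":
      return `lowered with known loss: ${p.lost} (accepted by ${p.acceptedBy})`;
    case "not-representable":
      return `not representable: ${p.reason}`;
  }
}

// The loss log is everything that was not preserved exactly.
function lossLog(facts: LoweredFact[]): string[] {
  return facts
    .filter((f) => f.preservation.status !== "exact")
    .map((f) => `${f.fact}: ${describe(f.preservation)}`);
}
```

A code generator would only need the output text. The extra `Preservation` value per fact is what lets a reviewer see where equivalence stopped.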

| Status | Surface | Claim | Required proof | Current evidence | Route |
| --- | --- | --- | --- | --- | --- |
| Proved | Source span | Output can be traced back | Span and hash binding | File hash, byte range, generated span map | Preserve |
| Required | Identity | The same thing moved languages | Stable symbol or region identity | Export, binding, selector, id, key, trait, route | Preserve or rebase |
| Required | Type shape | The public contract survived | Target type or declaration check | Assignable shape, generated interface, adapter type | Gate |
| Missing | Effect boundary | Side effects stayed bounded | Effect graph or runtime trace | IO, mutation, async, DOM, storage, network | Require proof |
| Missing | Runtime meaning | The output behaves under declared conditions | Runtime capsule | DOM, layout, events, canvas, API trace | Probe |

A superset language is useful when it carries meaning and also records which claims still need evidence.
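The claim rows above can be held as plain data so a coordinator can ask which ones still lack evidence. This sketch assumes a hypothetical `Claim` type and reads the three status labels (Proved, Required, Missing) as "has evidence", "proof required", and "evidence missing"; that reading is an interpretation, not something the rows state.

```typescript
// Hypothetical claim record mirroring the rows above.
type ClaimStatus = "proved" | "required" | "missing";

interface Claim {
  surface: string;
  claim: string;
  status: ClaimStatus;
  route: string;
}

const claims: Claim[] = [
  { surface: "Source span", claim: "Output can be traced back", status: "proved", route: "Preserve" },
  { surface: "Identity", claim: "The same thing moved languages", status: "required", route: "Preserve or rebase" },
  { surface: "Type shape", claim: "The public contract survived", status: "required", route: "Gate" },
  { surface: "Effect boundary", claim: "Side effects stayed bounded", status: "missing", route: "Require proof" },
  { surface: "Runtime meaning", claim: "The output behaves under declared conditions", status: "missing", route: "Probe" },
];

// Claims that cannot yet be relied on (assumed rule: anything not proved).
function needsEvidence(all: Claim[]): Claim[] {
  return all.filter((c) => c.status !== "proved");
}

// Assumed gate: output is admissible only when every claim is proved.
function isAdmissible(all: Claim[]): boolean {
  return needsEvidence(all).length === 0;
}
```

With the rows as given, `needsEvidence(claims)` returns the four claims other than "Source span", and `isAdmissible(claims)` is false until they are proved.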

Lowering Is A Decision

Once the source has been lifted, lowering is not just code generation.

Lowering asks what the target can honestly express.

[Diagram labels: RECORD, TARGET DIALECT, EXACT, ADAPTER, LOSSY, BLOCK, EVIDENCE, OUTPUT]

Lowering is an admission route: exact output, adapter output, lossy output, or no output.

Some facts lower cleanly.

TypeScript interface -> Rust struct shape
named export -> module export edge
JSX static attribute -> HTML attribute
CSS module class -> generated class map

Some facts need an adapter.

Promise boundary -> async runtime adapter
optional property -> option-like wrapper
DOM event handler -> generated listener
CSS variable contract -> theme token binding

Some facts are target-specific and should not be silently erased.

Rust lifetime information
TypeScript conditional type behavior
CSS cascade order
React component lifecycle
browser focus behavior
canvas draw sequence

If those facts matter to the claim, the output should carry proof, a loss record, or a refusal.

| Route | Signal | Requires | Produces |
| --- | --- | --- | --- |
| Exact | Target can express the semantic record directly | Span map, identity map, target syntax gate | Generated target code |
| Adapter | Target can express the behavior with a helper boundary | Adapter contract and focused proof | Target code plus adapter evidence |
| Runtime proof | Meaning depends on execution, browser state, or effects | Probe capsule bound to source and output | Admissible bounded claim |
| Loss record | Target cannot preserve a fact but the loss is accepted | Explicit non-claim and owner decision | Lowered output with known limitation |
| Block | Required meaning cannot be represented or proved | No silent erasure | Question, redesign, or target-specific implementation |

A conversion system becomes trustworthy when every output path records what it preserved and what it did not.
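The routing decision can be sketched as a small function over what is known about one fact and one target. The names (`LoweringFacts`, `chooseRoute`) and the order of the checks are assumptions for illustration; a real system would likely need a richer signal than a few booleans and strings.

```typescript
type Route = "exact" | "adapter" | "runtime-proof" | "loss-record" | "block";

// Hypothetical inputs for one fact being lowered to one target.
interface LoweringFacts {
  directlyExpressible: boolean; // target syntax can carry the fact as-is
  runtimeDependent: boolean;    // meaning depends on execution or effects
  adapterContract?: string;     // named helper boundary, if one exists
  probeCapsule?: string;        // runtime probe bound to source and output
  lossAcceptedBy?: string;      // owner who accepted an explicit loss
}

// Assumed ordering: runtime-dependent meaning needs a probe, static meaning
// tries exact then adapter, and only an accepted loss avoids a block.
function chooseRoute(f: LoweringFacts): Route {
  if (f.runtimeDependent) {
    if (f.probeCapsule) return "runtime-proof";
  } else {
    if (f.directlyExpressible) return "exact";
    if (f.adapterContract) return "adapter";
  }
  if (f.lossAcceptedBy) return "loss-record";
  return "block";
}
```

The default at the bottom is `block`. That keeps silent erasure from being the fallthrough case: a lossy output only appears when someone has put their name on the loss.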

Translation Helps Merge

This is where language conversion connects back to semantic merging.

If every language lifts into the same kind of semantic record, merge does not have to be invented separately for every syntax.

The syntax adapters stay language-specific.

The core questions become shared:

what identity changed?
what public contract changed?
what effect boundary changed?
what runtime claim changed?
what proof admits the output?
what loss was accepted?

That does not make all languages equivalent.

It gives the coordinator one place to compare meaning before writing target text back to disk.
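As a sketch of what that comparison could look like, the function below diffs two record summaries along some of the shared questions. `RecordSummary`, `SemanticDiff`, and `compare` are hypothetical names, and contracts are reduced to strings purely to keep the example short.

```typescript
// Hypothetical, language-neutral summary of a lifted record.
interface RecordSummary {
  exports: string[];
  contracts: { [symbol: string]: string };
  effects: string[];
  acceptedLosses: string[];
}

interface SemanticDiff {
  identityAdded: string[];
  identityRemoved: string[];
  contractChanged: string[];
  effectsAdded: string[];
  lossesAdded: string[];
}

// Items present in `after` but not in `before`.
function added(before: string[], after: string[]): string[] {
  return after.filter((x) => before.indexOf(x) === -1);
}

function compare(base: RecordSummary, next: RecordSummary): SemanticDiff {
  return {
    identityAdded: added(base.exports, next.exports),
    identityRemoved: added(next.exports, base.exports),
    // Only symbols present on both sides can have a changed contract.
    contractChanged: Object.keys(next.contracts).filter(
      (s) => s in base.contracts && base.contracts[s] !== next.contracts[s]
    ),
    effectsAdded: added(base.effects, next.effects),
    lossesAdded: added(base.acceptedLosses, next.acceptedLosses),
  };
}
```

Nothing in `compare` knows whether the summaries came from TypeScript, Rust, or CSS. The language-specific work happens earlier, in the adapters that produce each `RecordSummary`.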

Not Concept Merge

This should stop below taste, intention, and vibe.

The system should not decide that one branch made the product more minimal and another made it more art deco, then synthesize a new aesthetic.

That is too high in the stack.

The useful boundary is lower and stricter:

source identity
program structure
public contracts
effect boundaries
runtime probes
declared layout or behavior claims
explicit loss records

Humans can still choose direction.

The system should make the consequences of that choice portable, inspectable, and mergeable.

The Mental Model

The superset language is not the destination.

It is the customs form between languages.

The source language declares what it had. The interlingua records what was understood. The target language receives only what it can express. Anything else becomes proof, adapter, loss, question, or refusal.

That is the shape that makes universal semantic merging and cross-language conversion feel like the same project.
