Generation as Inverse Compression: Same Predictor, Opposite Goal

The compression model demonstrated that prediction enables compression. Now we reverse the principle: the same predictor enables generation.

The Symmetry

Compression:

Goal: Minimize entropy
Strategy: Perfect prediction → Store only deviations
Result: Small file (20.5% of original)

Generation:

Goal: Control entropy
Strategy: Sample from predictions → Creative variation
Result: New text matching style

Same n-gram predictor. Opposite use of entropy.

Implementation

Trained on 5.5MB blog corpus (371 posts):

  • Context length: 14 characters (auto-determined)
  • Patterns learned: 3.6M n-gram transitions
  • Training time: 30 seconds on CPU
  • Generation speed: ~100 chars/second

No neural network. No GPU. Pure statistical patterns from the data.
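For concreteness, here is a minimal sketch of the kind of character-level n-gram table described above. The fixed CONTEXT_LEN, the train function, and the corpus path are illustrative assumptions; the actual model in generative-model/ determines the context length from the data.

from collections import defaultdict, Counter

# Minimal sketch of a character-level n-gram trainer.
# CONTEXT_LEN is fixed here for clarity; the real model chooses it from the data.
CONTEXT_LEN = 14

def train(text: str) -> dict:
    """Map each 14-character context to counts of the characters that follow it."""
    table = defaultdict(Counter)
    for i in range(len(text) - CONTEXT_LEN):
        context = text[i : i + CONTEXT_LEN]
        next_char = text[i + CONTEXT_LEN]
        table[context][next_char] += 1
    return table

# Usage (hypothetical path):
#   table = train(open("corpus.txt", encoding="utf-8").read())
# Each entry is one learned transition: context -> {char: count}.

Each of the 3.6M transitions mentioned above corresponds to one such table entry.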

Temperature: The Creativity Dial

The predictor gives a probability distribution over the next character. Temperature controls how we sample from it (a minimal sampling sketch follows the list below):

Temperature = 0: Deterministic

  • Always pick most likely character
  • Follows patterns exactly
  • Repetitive but coherent

Temperature = 0.7: Balanced (default)

  • Samples proportionally to probabilities
  • Explores variations while respecting patterns
  • Natural mix of structure and creativity

Temperature = 1.0: Creative

  • Flattens the distribution, approaching uniform sampling
  • Maximum exploration
  • Chaotic but surprising
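A minimal sampling sketch, assuming the conventional temperature form (weights proportional to p^(1/T), so 0 is deterministic and larger values flatten the distribution); the actual dial in generate.py may map its 0–1 range differently.

import random
from collections import Counter

def sample_next(counts: Counter, temperature: float) -> str:
    """Pick the next character from one context's counts.

    Conventional temperature scaling (an assumed form): weight ∝ p ** (1/T).
    T -> 0 approaches argmax; larger T flattens the distribution toward uniform.
    """
    chars = list(counts)
    total = sum(counts.values())
    probs = [counts[c] / total for c in chars]
    if temperature <= 0:
        return chars[probs.index(max(probs))]   # deterministic pick
    weights = [p ** (1.0 / temperature) for p in probs]
    return random.choices(chars, weights=weights, k=1)[0]

def generate(table: dict, seed: str, length: int,
             context_len: int = 14, temperature: float = 0.7) -> str:
    """Slide the context window forward, sampling one character at a time."""
    out = seed
    while len(out) < length:
        context = out[-context_len:]
        if context not in table:
            break   # unseen context; the real model may back off instead
        out += sample_next(table[context], temperature)
    return out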

Example Output

Seed: “The universal formula”

Generated (temperature=0.7):

The universal formula Sₙ₊₁ = f(Sₙ) + entropy(p) running across multiple
consciousness scenarios while potentially eliminating inefficient paths.
The scene emphasizing both the precision of the systematic closure as
universal transformation algorithms
- Trust in forgetting enables participation rather than denied by
  cultural narratives
- Consciousness distribution eliminating need for belief-based authority
  recognizing the logical inconsistency in government paper money.

Notice:

  • Mathematical notation (Sₙ₊₁)
  • Philosophical vocabulary (“consciousness scenarios”, “systematic closure”)
  • Bullet point formatting
  • Coherent within 14-character context window

Why This Works

Advantages over neural LLMs:

  1. Zero GPU: CPU training in seconds
  2. Interpretable: See exact n-grams learned
  3. Data-driven: Context length adapts to corpus
  4. Style capture: Perfect vocabulary match for single author

Limitations vs neural models:

  1. Character-level: Best at local coherence
  2. Short context: 14 chars vs thousands of tokens
  3. No planning: Cannot structure multi-paragraph arguments

When n-grams win: Generating text in a specific author’s style with limited compute.

When neural wins: General understanding and long-range coherence.

The Universal Principle

Compression and generation are duals:

Aspect           Compression         Generation
f(State)         N-gram predictor    Same predictor
entropy(p)       Minimize            Control
Goal             Exploit structure   Explore structure
Temperature      Not used            Creativity knob
Success metric   Smaller file        Style match

Key insight: The predictor captures the structure. Entropy determines whether we compress (exploit) or generate (explore) that structure.

Both compression and generation emerge from the same statistical patterns in the data. The universal formula describes both (sketched in code after the list):

  • Compression: actual - predicted (minimize)
  • Generation: predicted + sample(temperature) (control)
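One way to see the duality in code. This is schematic only: predict and sample are hypothetical stand-ins for the n-gram lookup and temperature sampling above, and the rank-based residual encoding is illustrative, not the compressor's actual on-disk format.

# Schematic only: the same predictor drives both directions.
# predict(context) is assumed to return candidate characters ranked most-likely
# first (covering every character seen in training); sample(ranked) applies
# temperature as in the sketch above.

def compress(text, predict, context_len=14):
    """Exploit structure: record how far each actual character deviates from
    the prediction (rank 0 means the predictor was exactly right).  Small,
    repetitive ranks are what make the residual stream highly compressible."""
    residuals = []
    for i in range(context_len, len(text)):
        ranked = predict(text[i - context_len : i])
        residuals.append(ranked.index(text[i]))
    return residuals

def generate(seed, predict, length, sample, context_len=14):
    """Explore structure: instead of recording deviations, sample them."""
    out = seed
    while len(out) < length:
        out += sample(predict(out[-context_len:]))
    return out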

Why No Neural Network Needed

Neural LLMs are universal function approximators. But for a single author’s style on a 5.5MB corpus, n-grams already capture the patterns:

  • Character sequences (“the ”, “ and ”, “coordination”)
  • Word transitions (common phrases)
  • Formatting (bullets, mathematical notation)
  • Vocabulary (technical terms, philosophical concepts)

The data isn’t complex enough to require billions of parameters. A 3.6M n-gram table is sufficient.

Trade-off: Neural models generalize across domains. N-grams specialize within domain. We chose specialization.

Try It Yourself

# Conservative (follows blog style closely)
python generative-model/generate.py content/gallery/ \
  --seed "Bitcoin fails because" \
  --length 300 \
  --temperature 0.3

# Creative (explores variations)
python generative-model/generate.py content/gallery/ \
  --seed "Coordination" \
  --length 500 \
  --temperature 1.0

Code: generative-model/

The same patterns that enable compression enable generation. The universal formula works in both directions—we just flip whether entropy is signal (generation) or noise (compression).

#UniversalFormula #TextGeneration #NGrams #Compression #DataDriven #NoGPU
