The compression model demonstrated that prediction enables compression. Now we reverse the principle: the same predictor enables generation.
**Compression:**
- Goal: Minimize entropy
- Strategy: Perfect prediction → Store only deviations
- Result: Small file (20.5% of original)
**Generation:**
- Goal: Control entropy
- Strategy: Sample from predictions → Creative variation
- Result: New text matching the style
Same n-gram predictor. Opposite use of entropy.
Trained on a 5.5 MB blog corpus (371 posts):
No neural network. No GPU. Pure statistical patterns from the data.
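For concreteness, here is a minimal sketch of what "training" amounts to here, assuming a character-level model; the order `n=5` is an illustrative placeholder, and names like `build_ngram_table` and `predict` are sketches rather than the actual code in `generative-model/`. Training is just counting which character follows which context.

```python
from collections import Counter, defaultdict

def build_ngram_table(text, n=5):
    """Count character-level n-gram continuations.

    Returns {context (n-1 chars): Counter of next-character frequencies}.
    Normalizing each Counter gives the predictor's probability
    distribution over the next character.
    """
    table = defaultdict(Counter)
    for i in range(len(text) - n + 1):
        context, nxt = text[i:i + n - 1], text[i + n - 1]
        table[context][nxt] += 1
    return table

def predict(table, context):
    """Probability distribution over the next character for a given context."""
    counts = table.get(context, Counter())
    total = sum(counts.values())
    return {char: count / total for char, count in counts.items()}
```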
The predictor gives a probability distribution over the next character. Temperature controls how we sample from it (a minimal sampling sketch follows the list):

- Temperature = 0: Deterministic (always pick the most likely character)
- Temperature = 0.7: Balanced (default)
- Temperature = 1.0: Creative
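A minimal sketch of how temperature enters the sampling step, reusing `predict` from the sketch above (again illustrative, not the repo's API): rescale each predicted probability to p^(1/T), then draw from the rescaled weights.

```python
import math
import random

def sample_next_char(probs, temperature=0.7):
    """Sample the next character from the predictor's distribution.

    probs: dict mapping candidate characters to probabilities.
    temperature == 0 collapses to the most likely character (deterministic);
    higher temperatures flatten the distribution (more creative).
    """
    if temperature == 0:
        return max(probs, key=probs.get)
    # p^(1/T): rescale each probability, then sample from the rescaled weights.
    chars = list(probs)
    weights = [math.exp(math.log(probs[c]) / temperature) for c in chars]
    return random.choices(chars, weights=weights, k=1)[0]

def generate(table, seed, length=300, temperature=0.7, n=5):
    """Generate text by repeatedly predicting and sampling, starting from a seed.

    n must match the order used to build the table.
    """
    out = seed
    while len(out) < len(seed) + length:
        context = out[-(n - 1):]           # last n-1 characters
        probs = predict(table, context)    # reuse predict() from the sketch above
        if not probs:                      # unseen context: stop (a real model would back off)
            break
        out += sample_next_char(probs, temperature)
    return out
```

At temperature 1.0 this reproduces the predictor's distribution unchanged; lower values sharpen it toward the most likely continuation, higher values flatten it.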
Seed: “The universal formula”
Generated (temperature=0.7):
```text
The universal formula Sₙ₊₁ = f(Sₙ) + entropy(p) running across multiple
consciousness scenarios while potentially eliminating inefficient paths.
The scene emphasizing both the precision of the systematic closure as
universal transformation algorithms
- Trust in forgetting enables participation rather than denied by
cultural narratives
- Consciousness distribution eliminating need for belief-based authority
recognizing the logical inconsistency in government paper money.
```
**Notice:** the output reuses the blog's vocabulary and even the Sₙ₊₁ = f(Sₙ) + entropy(p) notation, and it reads locally fluent while the meaning drifts over longer spans.
**Advantages over neural LLMs:**
- No GPU and no training run: the model is built by counting n-grams.
- Tiny footprint: a 3.6M n-gram table instead of billions of parameters.
- Close match to the source author's style.

**Limitations vs neural models:**
- No general understanding; coherence fades beyond the n-gram window.
- No generalization outside the training corpus's domain.
**When n-grams win:** Generating text in a specific author's style with limited compute.

**When neural wins:** General understanding and long-range coherence.
Compression and generation are duals:
| Aspect | Compression | Generation |
|---|---|---|
| f(Sₙ) | N-gram predictor | Same predictor |
| entropy(p) | Minimize | Control |
| Goal | Exploit structure | Explore structure |
| Temperature | Not used | Creativity knob |
| Success metric | Smaller file | Style match |
Key insight: The predictor captures the structure. Entropy determines whether we compress (exploit) or generate (explore) that structure.
Both compression and generation emerge from the same statistical patterns in the data. The universal formula describes both:
- Compression: actual - predicted (minimize)
- Generation: predicted + sample(temperature) (control)

Neural LLMs are universal function approximators. But for a single author's style on a 5.5 MB corpus, n-grams already capture the patterns:
The data isn’t complex enough to require billions of parameters. A 3.6M n-gram table is sufficient.
Trade-off: Neural models generalize across domains. N-grams specialize within domain. We chose specialization.
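To make the two directions concrete, a sketch reusing `predict` and `sample_next_char` from the earlier sketches (illustrative, not the actual implementation): compression consults the predictor and stores only how far the real character deviates from it, while generation consults the same predictor and samples from it.

```python
def compress_step(table, context, actual_char):
    """Compression direction: record how far the actual character deviates
    from the prediction. Well-predicted characters get small ranks, and a
    stream of small ranks is highly compressible (store only deviations)."""
    probs = predict(table, context)                       # same predictor
    ranked = sorted(probs, key=probs.get, reverse=True)   # most likely first
    return ranked.index(actual_char) if actual_char in ranked else None

def generate_step(table, context, temperature=0.7):
    """Generation direction: sample the next character from the same prediction."""
    probs = predict(table, context)                        # same predictor
    return sample_next_char(probs, temperature) if probs else None
```

Decompression reverses `compress_step`: given the same table and context, a stored rank maps back to exactly one character, which is why only the deviations need to be kept.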
```bash
# Conservative (follows blog style closely)
python generative-model/generate.py content/gallery/ \
  --seed "Bitcoin fails because" \
  --length 300 \
  --temperature 0.3

# Creative (explores variations)
python generative-model/generate.py content/gallery/ \
  --seed "Coordination" \
  --length 500 \
  --temperature 1.0
```
Code: `generative-model/`
The same patterns that enable compression enable generation. The universal formula works in both directions—we just flip whether entropy is signal (generation) or noise (compression).
#UniversalFormula #TextGeneration #NGrams #Compression #DataDriven #NoGPU