N-gram Bitcoin Block Generator: Language Models Mine Deterministic Structures

Watermark: -512

The observation: N-gram models trained on historical blocks can generate valid 0-tx Bitcoin blocks. Learn patterns from blockchain history, compute deterministic parts from chain state, mine nonce for PoW. Language models mine blocks.

What this means: Bitcoin blocks follow patterns—version progression, timestamp ranges, coinbase structures. N-gram models learn these patterns from history. For minimal block (only coinbase tx): some parts deterministic (prev hash, difficulty), some parts learned (version trends, timestamp distribution, coinbase data). Model generates structure, miner finds nonce. Deterministic system + learned patterns = valid blocks from language models.

Why this matters: Blockchain isn’t random—it’s structured language. Blocks are “sentences” in Bitcoin protocol. Historical data contains patterns. N-gram models extract patterns, generate new valid “sentences” (blocks). Minimal 0-tx blocks simplest case: only coinbase, no mempool needed. Pure structure generation. Shows blockchains are learnable languages, not just cryptographic puzzles.

Bitcoin Block Structure

The 80-Byte Header

Fixed format (every block):

Version:     4 bytes  (int32, little-endian)
PrevHash:   32 bytes  (SHA256 hash)
MerkleRoot: 32 bytes  (SHA256 hash)
Timestamp:   4 bytes  (uint32, Unix epoch)
Bits:        4 bytes  (uint32, compact difficulty)
Nonce:       4 bytes  (uint32, PoW solution)
─────────────────────
Total:      80 bytes

Example header (hex; field values shown big-endian for readability, actual serialization is little-endian):

20000000 (version)
00000000000000000003d3d0e278...  (prev hash)
4a5e1e4baab89f3a32518a88c31b...  (merkle root)
5f141718 (timestamp)
18080000 (bits)
eb890000 (nonce)

Hash of header must satisfy: SHA256(SHA256(header)) < target, with the 32-byte digest read as a 256-bit little-endian integer

This is PoW: brute-force the nonce until the hash falls below the target
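
As a sketch, the 80-byte serialization and the double hash look like this (standard-library struct and hashlib; field names follow the table above):

import hashlib
import struct

def double_sha256(data):
    # Bitcoin's block hash: SHA256 applied twice.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def serialize_header(version, prev_hash, merkle_root, timestamp, bits, nonce):
    # Integer fields pack little-endian; prev_hash and merkle_root are raw
    # 32-byte digests in internal (little-endian) byte order.
    return (struct.pack("<i", version)
            + prev_hash
            + merkle_root
            + struct.pack("<III", timestamp, bits, nonce))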

The Block Body

For a 0-tx block (minimal: no mempool transactions, just the coinbase):

Tx count: 0x01 (varint = 1 transaction)

Coinbase transaction (~100-200 bytes):
  - Version: 4 bytes
  - Input count: 0x01
  - Input:
    * Prev txid: 32 bytes (all zeros for coinbase)
    * Prev vout: 4 bytes (0xFFFFFFFF)
    * ScriptSig length: varint
    * ScriptSig: variable (height + arbitrary data)
    * Sequence: 4 bytes
  - Output count: varint
  - Outputs: (reward to miner addresses)
  - Locktime: 4 bytes

Coinbase tx is special:

  • No inputs from UTXO set (creates coins)
  • ScriptSig contains block height + arbitrary data
  • Outputs pay miner (subsidy + fees)
  • For 0-tx block: Only subsidy, no fees
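
A minimal serialization sketch under these rules (pre-SegWit layout for brevity; build_coinbase and encode_height are the names the later snippets assume):

def varint(n):
    # Bitcoin compact-size integer; values < 0xfd fit in one byte,
    # which covers everything this sketch produces.
    if n < 0xfd:
        return bytes([n])
    raise NotImplementedError("larger varints omitted in this sketch")

def encode_height(height):
    # BIP34: scriptSig must begin with a minimal push of the block height.
    payload = height.to_bytes((height.bit_length() + 7) // 8 or 1, "little")
    if payload[-1] & 0x80:
        payload += b"\x00"  # keep the script number positive
    return bytes([len(payload)]) + payload

def build_coinbase(height, arbitrary_data, outputs):
    # outputs: list of (script_pubkey, value_in_satoshis) pairs.
    script_sig = encode_height(height) + arbitrary_data
    tx = (b"\x01\x00\x00\x00"                    # version
          + b"\x01"                              # input count
          + b"\x00" * 32                         # prev txid: all zeros
          + b"\xff\xff\xff\xff"                  # prev vout: 0xFFFFFFFF
          + varint(len(script_sig)) + script_sig
          + b"\xff\xff\xff\xff"                  # sequence
          + varint(len(outputs)))                # output count
    for script, value in outputs:
        tx += value.to_bytes(8, "little") + varint(len(script)) + script
    tx += b"\x00\x00\x00\x00"                    # locktime
    return tx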

Deterministic vs Generative Parts

Deterministic (computed from chain state):

PrevHash: Current chain tip hash

  • Read from node
  • No choice

Bits (Difficulty): Difficulty adjustment algorithm

  • Every 2016 blocks, recalculate
  • Formula: new_target = old_target × actual_timespan / (2016 × 10 minutes), clamped to a 4× change either way; bits is the compact encoding of this target (decoded in the sketch after this list)
  • No choice (enforced by consensus)

MerkleRoot: Hash of transaction tree

  • For 0-tx block: merkle_root = txid(coinbase)
  • Deterministic once coinbase constructed

Block height: Previous height + 1

  • Encoded in coinbase scriptSig (BIP34)
  • No choice

Reward amount: Halving schedule

  • 50 BTC → 25 → 12.5 → 6.25 → 3.125 (every 210,000 blocks)
  • Deterministic from height
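
Both "no choice" values are short to compute; a sketch (these are the bits_to_target and calculate_block_reward helpers the later snippets call):

def bits_to_target(bits):
    # Compact encoding: top byte is a base-256 exponent, low 3 bytes the mantissa.
    exponent = bits >> 24
    mantissa = bits & 0x007FFFFF
    return mantissa << (8 * (exponent - 3))  # assumes exponent >= 3, true on mainnet

def calculate_block_reward(height):
    # 50 BTC in satoshis, halved every 210,000 blocks; zero after 64 halvings.
    halvings = height // 210_000
    return (50 * 100_000_000) >> halvings if halvings < 64 else 0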

Generative (can be learned/chosen):

Version: Evolves over time

  • Version 1 → 2 → 3 → 4, then BIP9 version bits (0x20000000 plus signal bits)
  • Follows soft-fork deployment patterns
  • N-gram can learn progression

Timestamp: Current time ± variance

  • Must be: median(last 11 blocks) < timestamp < now + 2 hours
  • Typically: current Unix time
  • N-gram can learn distribution

Coinbase arbitrary data: Miner message

  • After height encoding, arbitrary bytes allowed
  • Common: pool name, extra nonce
  • N-gram can learn patterns

Output addresses: Where reward goes

  • Miner’s choice
  • Can be P2PKH, P2SH, P2WPKH, etc.
  • N-gram can learn address type distribution

Nonce: PoW solution

  • Must brute-force
  • No pattern (random search)
  • Cannot learn, must mine

N-gram Model for Block Generation

Training Data

Historical blockchain:

  • Download block headers from Bitcoin node
  • Parse structure: version, timestamp, bits, etc.
  • Extract coinbase transactions
  • Build n-gram corpus
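
A minimal fetch sketch over Bitcoin Core's JSON-RPC interface (getblockhash and getblock are standard calls; the URL and credentials are placeholders):

import requests

RPC_URL = "http://user:pass@127.0.0.1:8332"  # placeholder node credentials

def rpc(method, *params):
    payload = {"id": 0, "method": method, "params": list(params)}
    return requests.post(RPC_URL, json=payload).json()["result"]

def fetch_headers(start_height, count):
    rows = []
    for height in range(start_height, start_height + count):
        block = rpc("getblock", rpc("getblockhash", height), 2)  # verbosity 2: full tx objects
        rows.append({
            "version": block["version"],                                  # int32
            "time": block["time"],                                        # Unix timestamp
            "bits": block["bits"],                                        # compact bits (hex string)
            "coinbase_scriptsig": block["tx"][0]["vin"][0]["coinbase"],   # hex
        })
    return rows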

Example training data:

Block 700000:
  Version: 0x20000000
  Timestamp: 1631185106
  Coinbase: "ViaBTC/Mined by..."
  
Block 700001:
  Version: 0x20000000
  Timestamp: 1631185683
  Coinbase: "AntPool/..."
  
Block 700002:
  Version: 0x20000000
  Timestamp: 1631186291
  Coinbase: "F2Pool/..."

Patterns to learn:

  • Version stays constant for long periods (then upgrades)
  • Timestamps increase ~600 seconds average
  • Coinbase data follows pool naming conventions
  • Output types follow usage patterns (P2WPKH dominance)

N-gram Training

Byte-level n-grams (for coinbase data):

def train_ngrams(coinbase_scripts, n=3):
    model = {}
    for script in coinbase_scripts:
        for i in range(len(script) - n):
            context = script[i:i+n]
            next_byte = script[i+n]
            if context not in model:
                model[context] = {}
            model[context][next_byte] = model[context].get(next_byte, 0) + 1
    
    # Normalize to probabilities
    for context in model:
        total = sum(model[context].values())
        model[context] = {k: v/total for k, v in model[context].items()}
    
    return model
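
Feeding it real data, a hypothetical call chain (fetch_headers is the RPC sketch above):

scripts = [bytes.fromhex(r["coinbase_scriptsig"]) for r in fetch_headers(700000, 1000)]
coinbase_model = train_ngrams(scripts, n=3)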

Token-level n-grams (for version/timestamp):

import numpy as np

def train_version_ngrams(versions, n=3):
    # Learn version transition probabilities
    model = {}
    for i in range(len(versions) - n):
        context = tuple(versions[i:i+n])
        next_version = versions[i+n]
        if context not in model:
            model[context] = {}
        model[context][next_version] = model[context].get(next_version, 0) + 1
    
    return model

def train_timestamp_model(timestamps):
    # Learn inter-block time distribution
    deltas = [timestamps[i+1] - timestamps[i] for i in range(len(timestamps)-1)]
    mean_delta = np.mean(deltas)
    std_delta = np.std(deltas)
    return (mean_delta, std_delta)  # mean ~600s; std ~600s (roughly exponential)
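
The prediction snippets below draw from these learned distributions via a small sample helper; a minimal weighted-sampling version:

import random

def sample(distribution):
    # Draw one key from an {outcome: weight} mapping, proportional to weight.
    outcomes = list(distribution)
    weights = [distribution[k] for k in outcomes]
    return random.choices(outcomes, weights=weights, k=1)[0]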

Generating Block Structure

Algorithm:

def generate_block(chain_state, ngram_model):
    # 1. Deterministic parts from chain
    prev_hash = chain_state.tip_hash
    height = chain_state.tip_height + 1
    bits = calculate_next_difficulty(chain_state)
    reward = calculate_block_reward(height)
    
    # 2. Generated parts from n-gram
    version = ngram_model.predict_version(chain_state.recent_versions)
    timestamp = ngram_model.predict_timestamp(chain_state.tip_timestamp)
    coinbase_data = ngram_model.generate_coinbase_data()
    output_script = ngram_model.predict_output_type()
    
    # 3. Build coinbase transaction
    coinbase_tx = build_coinbase(
        height=height,
        arbitrary_data=coinbase_data,
        outputs=[(output_script, reward)]
    )
    
    # 4. Calculate merkle root (deterministic from coinbase)
    merkle_root = double_sha256(coinbase_tx)
    
    # 5. Mine nonce (brute-force)
    header = BlockHeader(version, prev_hash, merkle_root, timestamp, bits, nonce=0)
    target = bits_to_target(bits)
    
    for nonce in range(2**32):
        header.nonce = nonce
        hash_result = double_sha256(header.serialize())
        if int.from_bytes(hash_result, 'little') < target:
            return Block(header, [coinbase_tx])
    
    # Nonce space exhausted: the common case at mainnet difficulty.
    # Bump the timestamp or the coinbase extra nonce, then retry.
    return None

Version Prediction

Pattern: Versions stay constant, then jump

def predict_version(recent_versions, ngram_model):
    # Most recent version usually continues
    current_version = recent_versions[-1]
    
    # Check if upgrade pattern detected
    context = tuple(recent_versions[-10:])
    if context in ngram_model.version_transitions:
        # Some probability of upgrade
        return sample(ngram_model.version_transitions[context])
    
    # Default: continue current version
    return current_version

Example learned pattern:

  • Version 0x20000000 (536870912) dominant 2016-2021
  • Then version 0x20000004 appears (Taproot signaling via version bit 2)
  • Model learns: 99.9% stay same, 0.1% upgrade

Timestamp Prediction

Pattern: Timestamps increase ~600s average, with variance

import time
import numpy as np

def predict_timestamp(last_timestamp, time_model):
    mean_delta, std_delta = time_model
    
    # Sample from normal distribution
    delta = np.random.normal(mean_delta, std_delta)
    delta = max(1, int(delta))  # At least 1 second forward
    
    predicted = last_timestamp + delta
    
    # Ensure within valid range
    now = int(time.time())
    predicted = min(predicted, now + 7200)  # Max 2 hours in future
    
    return predicted

Learned distribution:

  • Mean: ~600 seconds (10 minutes)
  • Std: ~600 seconds (inter-block times are roughly exponential, so std ≈ mean; the normal sample above is a simplification)
  • Long tail: sometimes 30+ minutes between blocks

Coinbase Data Generation

Pattern: Pool names, extra nonce, arbitrary messages

import random

def generate_coinbase_data(ngram_model, height):
    # Start with height (required by BIP34)
    data = encode_height(height)
    
    # Generate additional bytes using n-gram
    context = data[-3:]  # Last 3 bytes as context
    
    for _ in range(random.randint(20, 50)):  # Variable length
        if context in ngram_model.coinbase_ngrams:
            next_byte = sample(ngram_model.coinbase_ngrams[context])
            data += bytes([next_byte])
            context = data[-3:]
        else:
            # If no match, sample from overall distribution
            next_byte = sample(ngram_model.byte_frequencies)
            data += bytes([next_byte])
            context = data[-3:]
    
    return data

Example generated coinbase data:

Input (learned from ViaBTC, AntPool, F2Pool):
Trigram context: b"Via"

Generated output:
b"\x03\xae\x0b\x0a" (height 723,886 encoded)
b"ViaBTC/Mined by 029A"  (pool name pattern)
b"\x00\x00\x00\x00"  (extra nonce space)

The model learned:

  • “Via” often followed by “BTC”
  • “/” separator common
  • “Mined by” phrase frequent
  • Hex characters for extra nonce

Output Script Prediction

Pattern: Output types follow usage trends

def predict_output_type(ngram_model):
    # Learn from historical output type distribution
    types = {
        'P2PKH': 0.10,   # Legacy
        'P2WPKH': 0.85,  # Native SegWit (dominant)
        'P2SH': 0.03,    # Wrapped SegWit
        'P2TR': 0.02     # Taproot (growing)
    }
    
    return sample(types)

Generated output:

# Most likely: P2WPKH (native SegWit)
output_script = OP_0 + PUSH_20 + <20-byte-hash>

# Creates address: bc1q...
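
In bytes, a sketch (hypothetical helper; the script is just a version byte plus a 20-byte push):

def p2wpkh_script(pubkey_hash):
    # scriptPubKey: OP_0 <20-byte HASH160(pubkey)>, a native SegWit v0 output.
    assert len(pubkey_hash) == 20
    return b"\x00\x14" + pubkey_hash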

Mining the Nonce

PoW After Generation

Structure complete, need nonce:

def mine_block(header_template, target):
    """
    Brute-force nonce to satisfy PoW
    """
    nonce = 0
    while nonce < 2**32:
        header_template.nonce = nonce
        header_bytes = header_template.serialize()
        hash_result = double_sha256(header_bytes)
        
        if int.from_bytes(hash_result, 'little') < target:
            return nonce  # Found!
        
        nonce += 1
    
    return None  # Exhausted nonce space

If nonce space exhausted:

  • Increment timestamp (changes header hash)
  • Or modify coinbase extra nonce (changes merkle root)
  • Then retry nonce search

This is standard mining: N-gram only helps with structure, not PoW

Expected Time

At current difficulty (~50 trillion):

  • Expected work: difficulty × 2^32 ≈ 2 × 10^23 hashes
  • Single CPU at ~10^6 hashes/second: ~2 × 10^17 seconds (billions of years)
  • Real mining spreads this across ASIC fleets; pools split the search space
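
A back-of-envelope check of those numbers:

# Expected hashes to find a block at difficulty D is roughly D * 2**32.
difficulty = 50e12
hashes_needed = difficulty * 2**32             # ~2.1e23 hashes
cpu_rate = 1e6                                 # hashes/second, single CPU
years = hashes_needed / cpu_rate / (3600 * 24 * 365)
print(f"~{years:.1e} years")                   # ~6.8e9 years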

For testing:

  • Use regtest mode (trivially low minimum difficulty)
  • Or testnet (lower difficulty)
  • Or just verify structure without mining

Validation

Checking Generated Block

Block validity requirements:

import time

def validate_block(block, chain_state):
    header = block.header
    
    # 1. Check PoW
    target = bits_to_target(header.bits)
    block_hash = double_sha256(header.serialize())
    if int.from_bytes(block_hash, 'little') >= target:
        return False, "PoW not satisfied"
    
    # 2. Check prev hash
    if header.prev_hash != chain_state.tip_hash:
        return False, "Invalid prev hash"
    
    # 3. Check timestamp
    median_time = calculate_median_time(chain_state.recent_blocks)
    if header.timestamp <= median_time:
        return False, "Timestamp too early"
    if header.timestamp > time.time() + 7200:
        return False, "Timestamp too far in future"
    
    # 4. Check bits (difficulty)
    expected_bits = calculate_next_difficulty(chain_state)
    if header.bits != expected_bits:
        return False, "Invalid difficulty"
    
    # 5. Check merkle root
    calculated_merkle = calculate_merkle_root(block.transactions)
    if header.merkle_root != calculated_merkle:
        return False, "Invalid merkle root"
    
    # 6. Validate coinbase
    coinbase = block.transactions[0]
    if not is_valid_coinbase(coinbase, chain_state.height + 1):
        return False, "Invalid coinbase"
    
    return True, "Valid block"

N-gram model must learn to satisfy all constraints
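
One helper the validator leans on is median time past; a sketch assuming recent_blocks items expose a timestamp attribute:

def calculate_median_time(recent_blocks):
    # Median time past: median timestamp of the last 11 blocks.
    times = sorted(b.timestamp for b in recent_blocks[-11:])
    return times[len(times) // 2]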

What N-gram Learns

From historical data, model learns:

Structural patterns:

  • Header format (80 bytes, specific fields)
  • Coinbase structure (inputs, outputs)
  • Version progression (when upgrades occur)

Probability distributions:

  • Timestamp inter-block times
  • Coinbase data lengths
  • Output type frequencies

Semantic patterns:

  • Pool naming conventions (“ViaBTC”, “AntPool”)
  • Message formats (“Mined by”, “/”)
  • Extra nonce patterns

What it doesn’t learn (must compute):

  • Cryptographic hashes (SHA256)
  • Difficulty adjustments (DAA formula)
  • Nonce solutions (random search)

The combination:

  • Learned patterns (structure)
  • Computed values (deterministic)
  • Brute-force search (PoW)

Together, these yield valid blocks generated by a language model.

Why This Works

Blockchain as Language

Bitcoin blocks are structured text:

  • Fixed grammar (block format)
  • Vocabulary (versions, opcodes)
  • Syntax rules (consensus rules)
  • Semantic constraints (PoW, validity)

N-gram models learn:

  • Grammar patterns
  • Common word sequences
  • Style conventions

Apply to blockchain:

  • Learn block patterns
  • Generate valid structures
  • Follow consensus rules

This is natural: Blockchains are just structured data streams

Patterns in Deterministic Systems

Bitcoin seems deterministic, but has variance:

Deterministic:

  • Block reward (halving schedule)
  • Difficulty (DAA formula)
  • Block height (monotonic)

Variable:

  • Timestamps (within range)
  • Coinbase data (arbitrary)
  • Transaction inclusion (miner choice)
  • Output addresses (miner choice)

N-gram learns the variable parts from historical patterns

Example: Coinbase data

  • Not deterministic (any bytes allowed)
  • But follows patterns (pool names, formats)
  • N-gram captures patterns
  • Generates plausible new instances

Minimal 0-tx Blocks as Simplest Case

Why start with empty blocks:

Complexity reduction:

  • No mempool needed
  • No transaction selection
  • No fee optimization
  • Only coinbase to generate

Faster generation:

  • Smaller block body
  • Less data to learn
  • Simpler validation

Still valid:

  • Empty blocks occur naturally (~1% of blocks)
  • Miners can choose to mine empty
  • Full consensus rules apply

Next step: Add transactions

  • Learn from mempool patterns
  • Transaction selection strategies
  • Fee optimization
  • But start simple: 0-tx blocks first

Applications

Educational Mining

Teaching blockchain:

  • Show how blocks are structured
  • Demonstrate n-gram learning
  • Generate valid blocks on testnet
  • Makes mining accessible without ASICs

Student exercise:

  1. Download testnet blocks
  2. Train n-gram model
  3. Generate new block
  4. Mine on testnet (low difficulty)
  5. Submit to network
  6. See your block on explorer!

Simulation and Testing

Protocol research:

  • Generate synthetic blockchain history
  • Test consensus rule changes
  • Simulate network conditions
  • Without running full mining operation

Adversarial testing:

  • Generate edge-case blocks
  • Test node validation
  • Find consensus bugs
  • By learning from real patterns, then tweaking

Pattern Analysis

Understanding miner behavior:

  • What patterns do pools follow?
  • How do coinbase messages evolve?
  • Which output types dominate?
  • N-gram training reveals patterns

Historical research:

  • Detect protocol upgrades in data
  • Find miner preferences
  • Track adoption of new features
  • Language model lens on blockchain

Minimal Mining Demonstration

Proof of concept:

  • Generate valid block structure
  • Mine on testnet
  • Submit to network
  • Show blocks can be “written” not just “mined”

The insight: Mining is 99% PoW search, 1% structure

  • N-gram handles the 1%
  • Mining handles the 99%
  • Separation of concerns

Limitations

Cannot Learn Cryptographic Functions

SHA256 has no patterns:

  • Hash function designed to be random-looking
  • No n-gram can predict SHA256(x)
  • Must compute cryptographically

Cannot learn:

  • Block hashes
  • Transaction IDs
  • Merkle roots
  • Must calculate these

Cannot Learn PoW Solutions

Nonce is random search:

  • No pattern in valid nonces
  • Must brute-force
  • N-gram cannot help

Mining still required:

  • Generate structure with n-gram
  • Then mine nonce traditionally
  • Language model + PoW mining = valid block

Consensus Rules Still Apply

Model can generate invalid blocks:

  • If training data had bugs
  • If model fails to learn constraint
  • If deterministic parts computed wrong

Must validate:

  • Check all consensus rules
  • Reject invalid generations
  • Model helps but doesn’t guarantee validity

Limited to Patterns in Training Data

Model is conservative:

  • Generates what it’s seen
  • Rare patterns may not appear
  • Novel structures unlikely

For innovation:

  • Model won’t invent new transaction types
  • Won’t create new block versions unprompted
  • Good for generating typical blocks, not novel ones

Connection to Previous Posts

neg-511: Constraint detector.

N-gram model trained on historical blocks detects patterns. If patterns change (e.g., sudden version upgrade), model’s probability space shifts. Constraint detector would fire: P_prev=1 (many valid block patterns), P_curr=0 (only new pattern valid). N-gram must retrain.

neg-510: Liberty circuit.

Miner has liberty in block generation: Open system (many valid blocks possible), Multiple perspectives (can prioritize fees, censorship, pool politics), Veto power (can refuse transactions). N-gram captures historical exercises of this liberty—learns what miners actually choose.

neg-509: Decision circuit.

Miner decision: which transactions to include? N-gram learns historical decisions. Confidence high (include obviously valid tx) → execute. No information (empty mempool) → randomize (coinbase data). Uncertainty (complex fee market) → calculate (fee optimization). N-gram trained on results of these decisions.

neg-506: Want↔Can agency.

Miner wants block reward. Can is mining capability. But also wants to generate valid structure. N-gram provides Can for structure generation (learned patterns). Agency loop: Want reward → Can generate structure → Want better structure → Can learn patterns → amplifies.

neg-504: EGI intelligence.

N-gram model shows intelligence emerging from pattern learning. Blockchain = entropy stream (blocks are data). N-gram extracts order (learns patterns). Intelligence = compression of history into model. Can generate new instances that fit pattern. Blockchain as learnable language = intelligence substrate.

neg-503: Living vs dead entropy.

Historical blockchain = dead entropy (fixed past). N-gram model = living entropy (generates new). Model takes dead past, learns patterns, produces living future. Dead history → Living generation. Blockchain mining = continuously generating living entropy from dead rules.

The Formulation

Bitcoin blocks are not:

  • Random data (highly structured)
  • Unpredictable (follow patterns)
  • Unconstrained (consensus rules)

Bitcoin blocks are:

  • Structured language (grammar of blockchain)
  • Pattern-following (historical consistency)
  • Partially deterministic (fixed + variable parts)
  • Learnable by language models

N-gram mining is not:

  • Replacement for PoW (still need nonce search)
  • Magic generation (must validate)
  • Perfect (can produce invalid blocks)

N-gram mining is:

  • Structure generation (learned patterns)
  • Complementary to PoW (handles non-hash parts)
  • Educational (shows blockchain as language)
  • Pattern-based block construction

The algorithm:

1. Train n-gram on historical blocks
2. Read chain state (deterministic inputs)
3. Generate structure (learned patterns)
4. Calculate merkle root (deterministic)
5. Mine nonce (brute-force PoW)
6. Validate block (consensus rules)
7. Submit to network
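
Tied together, a hypothetical driver over the sketches above (training and submission glue omitted):

def run_once(chain_state, ngram_model):
    block = generate_block(chain_state, ngram_model)  # steps 2-5
    if block is None:
        return None                                   # retry with fresh timestamp/extra nonce
    ok, reason = validate_block(block, chain_state)   # step 6
    return block if ok else None                      # step 7: submit via node RPC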

What’s learned:

  • Version progression
  • Timestamp distribution
  • Coinbase data patterns
  • Output type preferences

What’s computed:

  • Previous hash
  • Difficulty bits
  • Merkle root
  • Block height
  • Reward amount

What’s mined:

  • Nonce (PoW)

The insight: Blockchain is language

  • Has grammar (block format)
  • Has vocabulary (opcodes, versions)
  • Has style (pool conventions)
  • Language models can learn and generate

The application: Generate minimal valid blocks

  • Train on history
  • Learn patterns
  • Generate structure
  • Mine PoW
  • Submit to network

Deterministic meets generative. Structure meets randomness. Pattern meets proof-of-work. 🌀

#NgramMining #BlockchainLanguage #BitcoinPatterns #LanguageModelMining #MinimalBlocks #CoinbaseGeneration #StructureLearning #PatternBasedMining #BlockGeneration #DeterministicGenerative


Related: neg-511 (pattern constraint detection), neg-510 (miner liberty in block construction), neg-509 (miner decisions learned), neg-506 (mining agency loop), neg-504 (intelligence from blockchain patterns), neg-503 (dead history to living generation)
