A Theoretical Model Characterizing Tangle Evolution in IOTA Blockchain Network

Abstract

The IOTA blockchain system has a lightweight architecture that avoids resource-intensive proof-of-work mining, which makes it a promising platform for Internet of Things (IoT) applications. Unlike traditional blockchains with linear chain structures, IOTA organizes ledger data into a directed acyclic graph (DAG) called the Tangle. This article presents the first theoretical model that analyzes the evolution of IOTA's Tangle using stochastic methods. The model shows that the Tangle's degree distribution follows a double Pareto Lognormal (dPLN) distribution, a finding that diverges from conventional power-law or exponential models. We further propose an Expectation-Maximization (EM)-based algorithm for accurate parameter estimation, validated against real-world IOTA mainnet data.

Keywords

IOTA blockchain
Network dynamics
Tangle evolution
Double Pareto Lognormal (dPLN)
EM algorithm

Introduction

Launched in 2016 by the IOTA Foundation, IOTA replaces traditional blockchain structures with a DAG-based Tangle, where each vertex represents a transaction or data payload. Its feeless, lightweight design suits IoT environments that require high-throughput microtransactions. While prior research focused on applications and protocol extensions, this work pioneers the stochastic modeling of the Tangle's topological evolution, addressing gaps in the understanding of its network dynamics.

Key Challenges in Modeling:

Batch Arrival Dynamics: Multiple vertices attach concurrently across distributed nodes, contrasting sequential single-vertex models.
Complex Attachment Process: Vertex selection depends on subtangle topology, not just degree-based preferential attachment.

IOTA Preliminaries

Tangle Mechanics:

Message Attachment: New transactions ("messages") reference existing tips (unapproved vertices), fostering decentralization.
- IOTA 1.0: Weighted random walk for tip selection.
- IOTA 2.0: UTXO-based non-conflict tip identification.
Tangle Consolidation: Nodes propagate messages to merge local ledgers, ensuring consensus (Figure 2).
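
The attachment mechanics above can be illustrated with a toy simulation. This is a minimal sketch, not the IOTA protocol: it assumes uniform random tip selection in place of the weighted random walk or UTXO-based selection, two parents per message, and a fixed batch size to mimic concurrent arrivals. The function name `grow_tangle` and all parameters are hypothetical, and "degree" is taken here to mean the number of direct approvers of a vertex.

```python
import random
from collections import Counter


def grow_tangle(num_messages, batch_size=5, parents_per_message=2, seed=0):
    """Grow a toy Tangle; return (approver counts per vertex, current tips)."""
    rng = random.Random(seed)
    approvers = {0: 0}  # vertex id -> number of direct approvers; 0 is genesis
    tips = [0]          # unapproved vertices
    next_id = 1
    while next_id <= num_messages:
        # All messages in one batch see the same tip set (concurrent arrival).
        visible_tips = list(tips)
        newly_approved = set()
        batch = []
        for _ in range(min(batch_size, num_messages - next_id + 1)):
            k = min(parents_per_message, len(visible_tips))
            # Assumption: tips are chosen uniformly at random.
            for parent in rng.sample(visible_tips, k):
                approvers[parent] += 1
                newly_approved.add(parent)
            approvers[next_id] = 0
            batch.append(next_id)
            next_id += 1
        # Approved vertices stop being tips; the new batch becomes tips.
        tips = [t for t in tips if t not in newly_approved] + batch
    return approvers, tips


approvers, tips = grow_tangle(1000)
degree_histogram = Counter(approvers.values())  # degree -> number of vertices
```

Because every message in a batch samples from the same tip set, several messages can approve the same tip at once, which is the batch effect that sequential single-vertex models do not capture.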

Theoretical Modeling

Stochastic Differential Equation (SDE) Approach

The Tangle’s degree evolution is modeled via:

\frac{ds_k(t)}{s_k(t)} = \omega(t)dt + \sigma(t)dB(t)

where:

s_k(t): Size of degree group k at time t.
ω(t): Growth rate (drift) coefficient.
σ(t): Fluctuation (volatility) coefficient that scales the random term.
B(t): Standard Brownian motion.
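
To make the equation concrete, the following sketch simulates one sample path of a degree-group size. It assumes constant coefficients ω and σ (the article allows them to vary with time), in which case the equation is a geometric Brownian motion with a known exact one-step update. The function name `simulate_group_size` and the chosen parameter values are illustrative only.

```python
import math
import random


def simulate_group_size(s0, omega, sigma, horizon, steps, rng):
    """Simulate ds/s = omega*dt + sigma*dB with constant omega and sigma."""
    dt = horizon / steps
    path = [s0]
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)
        # Exact update for geometric Brownian motion; the -sigma^2/2 term
        # is the Ito correction that appears when solving for log s.
        log_growth = (omega - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(log_growth))
    return path


path = simulate_group_size(s0=1.0, omega=0.5, sigma=0.3, horizon=2.0,
                           steps=200, rng=random.Random(42))
```

With σ = 0 the path reduces to deterministic exponential growth s0·exp(ωt), which is a convenient sanity check. With σ > 0 the logarithm of the size at a fixed time is normally distributed, so the size itself is lognormal.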

Derived Degree Distribution: dPLN

The solution yields a double Pareto Lognormal distribution, combining:

Multiplicative growth, which produces the lognormal (LN) body.
An exponentially distributed stopping time, which produces the Pareto tails.
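
The two ingredients suggest a simple way to draw dPLN samples. In the standard construction from the dPLN literature, the logarithm of a dPLN variable is the sum of a normal variable and the difference of two independent exponential variables. The sketch below uses that representation with the parameter names (α, β, μ, σ) that this article uses for estimation. Whether the paper adopts exactly this parameterization is an assumption, and `sample_dpln` is a hypothetical helper name.

```python
import math
import random


def sample_dpln(alpha, beta, mu, sigma, rng):
    """Draw one dPLN sample: exp(Normal(mu, sigma) + Exp(alpha) - Exp(beta))."""
    normal_part = rng.gauss(mu, sigma)      # lognormal body
    upper_tail = rng.expovariate(alpha)     # rate alpha -> upper Pareto tail
    lower_tail = rng.expovariate(beta)      # rate beta  -> lower Pareto tail
    return math.exp(normal_part + upper_tail - lower_tail)


rng = random.Random(7)
samples = [sample_dpln(alpha=3.0, beta=2.0, mu=0.0, sigma=0.5, rng=rng)
           for _ in range(20000)]
mean_log = sum(math.log(x) for x in samples) / len(samples)
```

Under this representation the mean of log X is μ + 1/α − 1/β, so `mean_log` should land close to −1/6 for the values used here.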

Parameter Estimation via EM Algorithm

Challenges:

Generic gradient-based solvers often fail to converge when jointly fitting the four dPLN parameters (α, β, μ, σ²).
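
For reference, the objective that any such solver works with is the dPLN log-likelihood. The sketch below writes out the standard closed-form dPLN density from the literature and sums its logarithm over observations. It assumes the (α, β, μ, σ) parameterization in which log X is a normal variable plus the difference of two exponentials, and it treats observations as continuous positive values. The paper's exact likelihood for integer degrees may differ, and the function names are hypothetical.

```python
import math


def _norm_cdf(t):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def _norm_sf(t):
    """Standard normal survival function, 1 - CDF."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def dpln_pdf(x, alpha, beta, mu, sigma):
    """Density of the double Pareto Lognormal distribution at x > 0."""
    log_x = math.log(x)
    s2 = sigma * sigma
    # Term that dominates the upper tail (decays like x^(-alpha-1)).
    upper = (math.exp(alpha * mu + 0.5 * alpha ** 2 * s2 - (alpha + 1) * log_x)
             * _norm_cdf((log_x - mu - alpha * s2) / sigma))
    # Term that dominates the lower tail (behaves like x^(beta-1)).
    lower = (math.exp(-beta * mu + 0.5 * beta ** 2 * s2 + (beta - 1) * log_x)
             * _norm_sf((log_x - mu + beta * s2) / sigma))
    return alpha * beta / (alpha + beta) * (upper + lower)


def dpln_log_likelihood(data, alpha, beta, mu, sigma):
    """Sum of log densities; raises ValueError if a density underflows to 0."""
    return sum(math.log(dpln_pdf(x, alpha, beta, mu, sigma)) for x in data)
```

A solver has to search over all four parameters at once, with α, β, and σ constrained to be positive, which is part of what makes direct maximization fragile.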

EM-Based Solution:

E-Step: Compute expectations for latent variables (vertex degrees).
M-Step: Maximize likelihood via closed-form updates (Appendix D).

Performance: The EM algorithm achieves a 60% convergence rate versus 15% for the BFGS solver, with an average fitting time of 40–60 s per Tangle snapshot.

Evaluation & Results

Dataset:

112 Tangle snapshots from IOTA mainnet (2016–2020), averaging 1.7M messages/snapshot.

Model Comparison (rMSLE, root mean squared logarithmic error; lower is better):

| Model | Overall Fit | Header (deg. 1–2) | Middle (deg. 3–5) | Rear (deg. ≥6) |
|-------------|-------------|-------------------|-------------------|----------------|
| dPLN | 0.20 | 0.10 | 0.08 | 0.15 |
| LN | 0.55 | 0.12 | 0.09 | 0.50 |
| Power-Law | 1.20 | 0.45 | 0.30 | 0.10 |
| Exponential | 0.90 | 0.60 | 0.25 | 0.80 |

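Key Insight: dPLN is the only model whose rMSLE stays at or below 0.15 in every degree range. LN fits the header and middle well but degrades in the rear (0.50), Power-Law fits the rear (0.10) but misses the header (0.45), and Exponential fits poorly in the header and rear.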
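
The comparison metric can be sketched as follows, assuming rMSLE follows the common definition of root mean squared logarithmic error with a log(1 + x) transform. The paper's exact definition may differ. The helper `rmsle_over_degrees`, which restricts the error to a degree range such as the header or rear, is a hypothetical illustration of how the per-range columns could be computed.

```python
import math


def rmsle(observed, predicted):
    """Root mean squared logarithmic error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(math.log1p(p) - math.log1p(o)) ** 2
               for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmsle_over_degrees(observed_counts, fitted_counts, degrees):
    """rMSLE restricted to the given degrees; counts are dicts keyed by degree."""
    obs = [observed_counts.get(d, 0) for d in degrees]
    fit = [fitted_counts.get(d, 0) for d in degrees]
    return rmsle(obs, fit)


# Made-up counts, only to show the call shape.
observed_counts = {1: 500, 2: 300, 3: 120, 4: 50, 5: 20, 6: 10}
fitted_counts = {1: 480, 2: 310, 3: 115, 4: 55, 5: 18, 6: 12}
header_error = rmsle_over_degrees(observed_counts, fitted_counts, [1, 2])
```

Working on the log scale keeps the large header counts from dominating the score, so errors in the sparse rear of the distribution remain visible.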

Conclusion

This work:

Derives the first SDE-based model for IOTA Tangle evolution, identifying its dPLN degree distribution.
Proposes an EM algorithm outperforming generic solvers in parameter estimation.
Validates findings with real-world IOTA data, revealing stable network dynamics.

Future Directions:

Simulator development using dPLN-sampled topologies.
Extension to other DAG-based ledgers.


FAQs

Q: Why does IOTA’s Tangle not follow power-law distributions?
A: Concurrent vertex attachments and subtangle-based selection disrupt preferential attachment assumptions, leading to dPLN’s hybrid lognormal-Pareto behavior.

Q: How does the EM algorithm improve fitting accuracy?
A: It alternates between refining latent variable estimates (E-step) and applying closed-form parameter updates (M-step). In the reported experiments this converged more reliably than a generic gradient-based solver (60% versus 15% for BFGS).

Q: What practical insights does this model offer?
A: It aids in predicting Tangle growth patterns, optimizing tip selection algorithms, and detecting anomalies like parasite chain attacks.