Announcing Triton VM v0.33.0

⚓ Triton VM    📅 2023-08-10    👤 jfs    👁️ 1969      



Triton VM v0.33.0 is now live. The main highlights are:

Non-deterministic Initialization of RAM

In all prior versions, the only way for programs to receive input was through public or secret input, i.e., using the instructions read_io, divine, or divine_sibling. As of this version, RAM can be initialized before Triton VM starts execution. Elements that might need to be processed therefore no longer waste precious clock cycles when they end up unused: they can simply live in RAM without ever being read.

In future versions of Triton VM, non-deterministic RAM initialization might also be useful to place objects in RAM that depend on the current state of the VM. The zero-knowledge proof system already supports this behavior.
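To make the cycle savings concrete, here is a minimal sketch using a toy machine. It is not Triton VM's API: `ToyVm`, `read_input`, `read_mem`, `fetch_via_input`, and `fetch_via_ram` are hypothetical names, and the model assumes every instruction costs exactly one clock cycle. It compares fetching one element from a sequential input stream with fetching it from RAM that was filled before execution started.

```rust
use std::collections::{HashMap, VecDeque};

/// Toy machine for illustration only; this is NOT Triton VM's API.
/// Assumption: every executed instruction costs exactly one clock cycle.
struct ToyVm {
    input: VecDeque<u64>,   // sequential input stream
    ram: HashMap<u64, u64>, // address -> value
    cycles: u64,
}

impl ToyVm {
    /// RAM is populated before execution starts, so filling it costs no cycles.
    fn new(input: &[u64], initial_ram: HashMap<u64, u64>) -> Self {
        ToyVm {
            input: input.iter().copied().collect(),
            ram: initial_ram,
            cycles: 0,
        }
    }

    /// Stand-in for an input instruction: pops the next element of the stream.
    fn read_input(&mut self) -> u64 {
        self.cycles += 1;
        self.input.pop_front().expect("input exhausted")
    }

    /// Stand-in for a memory read; unset addresses read as 0 in this toy model.
    fn read_mem(&mut self, address: u64) -> u64 {
        self.cycles += 1;
        self.ram.get(&address).copied().unwrap_or(0)
    }
}

/// Fetch `data[wanted]` from the input stream: every earlier element
/// has to be read (and paid for) first. Returns (value, cycles used).
fn fetch_via_input(data: &[u64], wanted: usize) -> (u64, u64) {
    let mut vm = ToyVm::new(data, HashMap::new());
    for _ in 0..wanted {
        vm.read_input(); // skipped element, still costs a cycle
    }
    let value = vm.read_input();
    (value, vm.cycles)
}

/// Fetch `data[wanted]` from pre-initialized RAM: one read suffices,
/// and the untouched elements cost nothing. Returns (value, cycles used).
fn fetch_via_ram(data: &[u64], wanted: usize) -> (u64, u64) {
    let initial_ram: HashMap<u64, u64> = data
        .iter()
        .enumerate()
        .map(|(i, &v)| (i as u64, v))
        .collect();
    let mut vm = ToyVm::new(&[], initial_ram);
    let value = vm.read_mem(wanted as u64);
    (value, vm.cycles)
}

fn main() {
    let data: [u64; 4] = [10, 20, 30, 40];
    println!("via input stream: {:?}", fetch_via_input(&data, 2));
    println!("via initial RAM:  {:?}", fetch_via_ram(&data, 2));
}
```

In this toy model the stream-based fetch pays for every element before the one it wants, while the RAM-based fetch pays for a single read. The real cost model of Triton VM is more involved, so treat the numbers as an illustration of the idea only.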

Improved Speed

Through changes to the proof system, better use of parallelization, and more efficient initialization of certain memory objects, the speed of the VM has increased noticeably. Depending on the hardware, the speedup factor should lie somewhere between 3 and 4 compared to the previous version, and almost 10 compared to the version that shipped with the initial release of Neptune’s alphanet.

🏷️ release 🏷️ triton-vm

jfs    2023-08-10 👍 1 👎 [op]

In terms of speed, here’s a breakdown of where Triton VM’s prover spends its time.

### Prove Fibonacci 12800                  8.29s    Share  Category
├─Fiat-Shamir: claim                      11.10µs   0.00% (hash –  0.00%)
├─derive additional parameters            57.89µs   0.00% 
├─base tables                              4.37s   52.71% 
│ ├─create                                73.84ms   0.89% 
│ ├─pad                                    1.09s   13.10% 
│ ├─LDE                                    1.91s   23.08% (LDE – 58.95%)
│ ├─Merkle tree                          545.86ms   6.59% (hash – 58.95%)
│ │ ├─leafs                              503.21ms   6.07% 
│ │ └─Merkle tree                         42.65ms   0.51% 
│ ├─Fiat-Shamir                           35.22µs   0.00% (hash –  0.00%)
│ └─extend                               750.07ms   9.05% 
├─ext tables                               1.02s   12.35% 
│ ├─LDE                                  733.51ms   8.85% (LDE – 22.60%)
│ └─Merkle tree                          289.83ms   3.50% (hash – 31.30%)
│   ├─leafs                              248.08ms   2.99% 
│   └─Merkle tree                         41.74ms   0.50% 
├─quotient-domain codewords                1.77µs   0.00% 
├─quotient codewords                     545.26ms   6.58% 
│ ├─malloc                                28.80µs   0.00% 
│ ├─initial                              155.29ms   1.87% (AIR – 28.48%)
│ ├─consistency                          120.84ms   1.46% (AIR – 22.16%)
│ ├─transition                           234.47ms   2.83% (AIR – 43.00%)
│ └─terminal                              34.62ms   0.42% (AIR –  6.35%)
├─linearly combine quotient codewords    150.89ms   1.82% (CC – 38.75%)
├─commit to quotient codeword segments   688.96ms   8.31% 
│ ├─LDE                                  598.75ms   7.22% (LDE – 18.45%)
│ ├─hash rows of quotient segments        42.50ms   0.51% (hash –  4.59%)
│ └─Merkle tree                           47.70ms   0.58% (hash –  5.15%)
├─out-of-domain rows                      77.43ms   0.93% 
├─Fiat-Shamir                             94.12µs   0.00% (hash –  0.01%)
├─linear combination                      56.12ms   0.68% 
│ ├─base                                  31.86ms   0.38% (CC –  8.18%)
│ ├─ext                                   16.54ms   0.20% (CC –  4.25%)
│ └─quotient                               5.11ms   0.06% (CC –  1.31%)
├─DEEP                                   480.98ms   5.80% 
│ ├─interpolate                          297.78ms   3.59% 
│ ├─base&ext next row                     56.47ms   0.68% 
│ ├─base&ext next row                     62.39ms   0.75% 
│ └─segmented quotient                    64.33ms   0.78% 
├─combined DEEP polynomial               185.00ms   2.23% 
│ ├─Fiat-Shamir                            4.31µs   0.00% (hash –  0.00%)
│ ├─sum                                  166.61ms   2.01% (CC – 42.79%)
│ └─add randomizer codeword               18.38ms   0.22% (CC –  4.72%)
├─FRI                                    191.11ms   2.31% 
└─open trace leafs                         1.35ms   0.02% 

### Categories
LDE      3.25s  39.16%
hash   926.04ms 11.17%
AIR    545.22ms  6.58%
CC     389.39ms  4.70%

Clock frequency is 15446 Hz (128009 clock cycles / 8287 ms)
Optimal clock frequency is 15816 Hz (131072 padded height / 8287 ms)
FRI domain length is 2^20
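
The two clock frequency figures above can be reproduced with a few lines of arithmetic. The sketch below rests on two assumptions: that the padded height is the cycle count rounded up to the next power of two (consistent with 131072 = 2^17 here), and that the frequency is truncated to whole hertz. The function name `frequency_hz` is made up for this example and is not part of Triton VM.

```rust
/// Cycles per second, truncated to whole hertz.
/// Truncation is an assumption; it matches the figures reported above.
fn frequency_hz(cycles: u64, elapsed_ms: u64) -> u64 {
    cycles * 1000 / elapsed_ms
}

fn main() {
    let clock_cycles: u64 = 128_009; // from the profile output
    let elapsed_ms: u64 = 8_287; // from the profile output

    // Assumption: tables are padded to the next power of two.
    let padded_height = clock_cycles.next_power_of_two();

    println!("padded height: {padded_height}");
    println!(
        "clock frequency: {} Hz",
        frequency_hz(clock_cycles, elapsed_ms)
    );
    println!(
        "optimal clock frequency: {} Hz",
        frequency_hz(padded_height, elapsed_ms)
    );
}
```

The gap between the two numbers reflects how much of the padded height consists of padding rows: the closer the cycle count is to a power of two, the closer the actual frequency gets to the optimal one.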