Triton VM v0.33.0 is now live. The main highlights are:
In all prior versions, the only way for programs to receive input was through public or secret input, i.e. using instructions `read_io`, `divine`, or `divine_sibling`. As of this version, RAM can be initialized before Triton VM starts execution. This means that elements that might need to be processed don’t waste precious clock cycles if they don’t end up being processed, as they can happily live in RAM without ever being read.
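To make the cycle savings concrete, here is a minimal toy model in Python. It is not Triton VM's API or instruction set: the functions `run_with_input_stream` and `run_with_initialized_ram` are hypothetical, and the cost model (one cycle per element read) is an assumption chosen purely for illustration.

```python
# Toy cost model only. This is NOT Triton VM's API; every name here is hypothetical.
# Assumption: reading one element (from input or from RAM) costs one cycle.

def run_with_input_stream(stream, wanted_addresses):
    """All elements arrive via input, so each one is read and stored,
    whether or not the program ends up using it."""
    cycles = 0
    ram = {}
    for address, value in enumerate(stream):
        ram[address] = value  # read from input, store in RAM
        cycles += 1
    total = 0
    for address in wanted_addresses:
        total += ram[address]
        cycles += 1
    return total, cycles


def run_with_initialized_ram(initial_ram, wanted_addresses):
    """RAM is populated before execution starts, so only the elements
    that are actually processed cost cycles."""
    cycles = 0
    total = 0
    for address in wanted_addresses:
        total += initial_ram[address]
        cycles += 1
    return total, cycles


data = list(range(100, 200))  # 100 elements that *might* be needed
wanted = [3, 42]              # only two of them are actually processed

streamed = run_with_input_stream(data, wanted)
preloaded = run_with_initialized_ram(dict(enumerate(data)), wanted)
print(streamed)   # (245, 102)
print(preloaded)  # (245, 2)
```

Both runs compute the same result, but in this toy model the pre-initialized variant only pays for the two elements it touches.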
In future versions of Triton VM, non-deterministic RAM initialization might also be useful to place objects in RAM that depend on the current state of the VM. The zero-knowledge proof system already supports this behavior.
Through changes to the proof system, better use of parallelization, and more efficient initialization of certain memory objects, the speed of the VM has increased noticeably. Depending on the hardware, the speedup factor should lie somewhere between 3 and 4 compared to the previous version, and almost 10 compared to the version that shipped with the initial release of Neptune’s alphanet.
In terms of speed, here’s a breakdown of where Triton VM’s prover spends its time.

    ### Prove Fibonacci 12800                  8.29s   Share  Category
    ├─Fiat-Shamir: claim                     11.10µs   0.00%  (hash – 0.00%)
    ├─derive additional parameters           57.89µs   0.00%
    ├─base tables                              4.37s  52.71%
    │ ├─create                               73.84ms   0.89%
    │ ├─pad                                    1.09s  13.10%
    │ ├─LDE                                    1.91s  23.08%  (LDE – 58.95%)
    │ ├─Merkle tree                         545.86ms   6.59%  (hash – 58.95%)
    │ │ ├─leafs                             503.21ms   6.07%
    │ │ └─Merkle tree                        42.65ms   0.51%
    │ ├─Fiat-Shamir                          35.22µs   0.00%  (hash – 0.00%)
    │ └─extend                              750.07ms   9.05%
    ├─ext tables                               1.02s  12.35%
    │ ├─LDE                                 733.51ms   8.85%  (LDE – 22.60%)
    │ └─Merkle tree                         289.83ms   3.50%  (hash – 31.30%)
    │   ├─leafs                             248.08ms   2.99%
    │   └─Merkle tree                        41.74ms   0.50%
    ├─quotient-domain codewords               1.77µs   0.00%
    ├─quotient codewords                    545.26ms   6.58%
    │ ├─malloc                               28.80µs   0.00%
    │ ├─initial                             155.29ms   1.87%  (AIR – 28.48%)
    │ ├─consistency                         120.84ms   1.46%  (AIR – 22.16%)
    │ ├─transition                          234.47ms   2.83%  (AIR – 43.00%)
    │ └─terminal                             34.62ms   0.42%  (AIR – 6.35%)
    ├─linearly combine quotient codewords   150.89ms   1.82%  (CC – 38.75%)
    ├─commit to quotient codeword segments  688.96ms   8.31%
    │ ├─LDE                                 598.75ms   7.22%  (LDE – 18.45%)
    │ ├─hash rows of quotient segments       42.50ms   0.51%  (hash – 4.59%)
    │ └─Merkle tree                          47.70ms   0.58%  (hash – 5.15%)
    ├─out-of-domain rows                     77.43ms   0.93%
    ├─Fiat-Shamir                            94.12µs   0.00%  (hash – 0.01%)
    ├─linear combination                     56.12ms   0.68%
    │ ├─base                                 31.86ms   0.38%  (CC – 8.18%)
    │ ├─ext                                  16.54ms   0.20%  (CC – 4.25%)
    │ └─quotient                              5.11ms   0.06%  (CC – 1.31%)
    ├─DEEP                                  480.98ms   5.80%
    │ ├─interpolate                         297.78ms   3.59%
    │ ├─base&ext next row                    56.47ms   0.68%
    │ ├─base&ext next row                    62.39ms   0.75%
    │ └─segmented quotient                   64.33ms   0.78%
    ├─combined DEEP polynomial              185.00ms   2.23%
    │ ├─Fiat-Shamir                           4.31µs   0.00%  (hash – 0.00%)
    │ ├─sum                                 166.61ms   2.01%  (CC – 42.79%)
    │ └─add randomizer codeword              18.38ms   0.22%  (CC – 4.72%)
    ├─FRI                                   191.11ms   2.31%
    └─open trace leafs                        1.35ms   0.02%

    ### Categories
    LDE                                        3.25s  39.16%
    hash                                    926.04ms  11.17%
    AIR                                     545.22ms   6.58%
    CC                                      389.39ms   4.70%

    Clock frequency is 15446 Hz (128009 clock cycles / 8287 ms)
    Optimal clock frequency is 15816 Hz (131072 padded height / 8287 ms)
    FRI domain length is 2^20
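The two clock-frequency figures at the end of the profile can be reproduced with a few lines of Python. This is a sketch under two assumptions that are consistent with the reported numbers but not confirmed by the post: that the padded height is the smallest power of two at or above the cycle count, and that the frequencies are rounded down to whole hertz. The helper names are hypothetical.

```python
# Reproduces the summary figures from the profile above.
# Assumptions (not stated in the post): the padded height is the next power
# of two at or above the cycle count, and frequencies are floored to whole Hz.

def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def clock_frequency_hz(cycles: int, elapsed_ms: int) -> int:
    """Cycles per second, rounded down."""
    return cycles * 1000 // elapsed_ms


cycles = 128_009     # clock cycles reported by the profiler
elapsed_ms = 8_287   # total proving time in milliseconds

padded_height = next_power_of_two(cycles)
actual_hz = clock_frequency_hz(cycles, elapsed_ms)
optimal_hz = clock_frequency_hz(padded_height, elapsed_ms)

print(padded_height)  # 131072
print(actual_hz)      # 15446
print(optimal_hz)     # 15816
```

The "optimal" figure is what the clock frequency would be if the program used every row of the padded trace. The gap between the two values is small here because 128009 cycles fill most of the 131072 padded rows.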