Scripting language

⚓ Triton VM    📅 2024-08-14    👤 chancehudson    👁️ 336      

chancehudson


Hi, my name is Chance and I work on the Privacy and Scaling Explorations team at the EF. I’ve been looking for a solid STARK proving system for a while, and I think Triton VM is probably the most advanced and well-built STARK-based zkVM out right now.

To use this VM I have been designing a simple scripting language to express mathematical relationships. I’m trying to take learnings from the pain points in circom, the simplicity of JavaScript, and the safety/usability of Rust. I’ve held off on sharing this for a while because I wasn’t sure the language could be better than writing raw assembly, but I’m now confident that it can be.

One important feature of this language (ashlang) is that each file is a distinct function, and the function name is the file stem. This made it easy to assemble files that are written in different languages. In an ashlang project, functions can be written either in ashlang or tasm. Ashlang functions can call tasm functions, and tasm functions can call ashlang functions, provided they manually manipulate the stack in the correct way.
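For illustration only, here is a rough Rust sketch of how a resolver might map a function name to a source file under that convention (one file per function, the stem is the name, either a .ash or a .tasm extension). The directory names mirror the -i include flags shown later in the thread; none of this is the actual ashlang compiler code.

```rust
use std::path::PathBuf;

/// Look for `<name>.ash` or `<name>.tasm` in the include directories.
/// Hypothetical sketch, not ashlang's real resolution logic.
fn resolve_function(name: &str, include_dirs: &[&str]) -> Option<PathBuf> {
    for &dir in include_dirs {
        for ext in ["ash", "tasm"] {
            let candidate = PathBuf::from(dir).join(format!("{name}.{ext}"));
            if candidate.is_file() {
                return Some(candidate);
            }
        }
    }
    None
}

fn main() {
    // e.g. an ashlang call to io_test(...) resolves to one of these files
    match resolve_function("io_test", &["./stdlib", "./test-vectors"]) {
        Some(path) => println!("io_test is defined in {}", path.display()),
        None => eprintln!("no .ash or .tasm file defines io_test"),
    }
}
```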

I had significant doubts about my ability to write an assembler that can produce tasm as efficient as handwritten tasm, but it turns out these doubts are almost irrelevant. Ashlang can be used to manage the control flow and write the high-level structure, then specific functions can be implemented efficiently in tasm.

To achieve this, tasm source files specify a function signature as the first non-whitespace line. This allows the assembler to statically analyze the program and confirm that tasm function calls are being passed the correct arguments. You can see an example of this in the shift left circular implementation; that function accepts two scalars and returns a single scalar.
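As a sketch of that static check, here is a small Rust example that reads the first non-whitespace line of a .tasm file as a signature and checks call arity. The signature syntax and the file path shown are invented for illustration; they are not necessarily ashlang’s actual format.

```rust
use std::fs;

/// Parsed signature, e.g. from a first line like "rotl(v, bits) -> scalar"
/// (illustrative syntax only).
struct Signature {
    name: String,
    num_args: usize,
}

fn parse_signature(path: &str) -> Option<Signature> {
    let source = fs::read_to_string(path).ok()?;
    // The first non-whitespace line is treated as the function signature.
    let line = source.lines().find(|l| !l.trim().is_empty())?.trim();
    let (name, rest) = line.split_once('(')?;
    let (args, _ret) = rest.split_once(')')?;
    let num_args = args.split(',').filter(|a| !a.trim().is_empty()).count();
    Some(Signature { name: name.trim().to_string(), num_args })
}

/// Static check at a call site: the number of provided arguments must match.
fn check_call(sig: &Signature, provided: usize) -> Result<(), String> {
    if provided == sig.num_args {
        Ok(())
    } else {
        Err(format!(
            "{} expects {} argument(s), got {}",
            sig.name, sig.num_args, provided
        ))
    }
}

fn main() {
    // "stdlib/rotl.tasm" is a hypothetical path used only for illustration.
    if let Some(sig) = parse_signature("stdlib/rotl.tasm") {
        // e.g. an ashlang call site passing two scalars
        check_call(&sig, 2).unwrap();
    }
}
```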

The language still has some oddities with argument handling and return types. Specifically, the arguments are reversed on the stack during calls (so a swap 1 is necessary before executing lt), and an extra argument is always passed to account for the case of a function returning a memory-based variable. Both of these are simple changes that can be addressed immediately.

You can see the larger standard library here; there are some example programs in the test-vectors directory in that repo as well. My immediate next steps are refactoring out some technical debt and fixing the issues I described above. After that I’m planning to make a STARK proof where a user decrypts data, operates on the data, and re-encrypts it (using chacha).

I’d love to get feedback from the team. I imagine learnings from tasm-lib would be almost directly applicable. I’d also like to make the language useful to the team if possible.

Thanks

🏷️ language, assembler, dx

aszepieniec    2024-08-16 👍 1 👎

Hi Chance, welcome to the forum. This is a crazy-awesome project! Thanks for sharing and please let us know what we can do to help.

“I’m planning to make a STARK proof where a user decrypts data, operates on the data, and re-encrypts it (using chacha).”

I’m sure you are aware but for completeness of record, please note that sponge constructions have an authenticated encryption mode [1]. Triton VM exposes its natively supported hash function, Tip5, through a sponge interface, and so you could use that for encryption and decryption.

Reasons why you might not want to use this option:

  • The counterparty to this protocol needs to possess an implementation of Tip5 and use it in the right mode of operation. ChaCha is so much more widespread.
  • Triton VM does not expose the state of the sponge. This limitation implies in particular that you can’t store the state, do some other things with the sponge (like hashing or decrypting something else) and then resume with the stored state.
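For intuition, here is a rough Rust sketch of the general duplex-style pattern such a sponge-based mode follows: squeeze keystream words, xor them with the message, absorb the ciphertext, and squeeze a tag at the end. The permutation below is a toy placeholder, not Tip5, and none of this is Triton VM’s actual sponge API.

```rust
const RATE: usize = 4;        // words absorbed/squeezed per permutation call (assumed)
const STATE_SIZE: usize = 8;  // toy state size, not Tip5's

/// Toy permutation: a stand-in for a real one such as Tip5.
fn permute(state: &mut [u64; STATE_SIZE]) {
    for i in 0..STATE_SIZE {
        let prev = state[(i + STATE_SIZE - 1) % STATE_SIZE];
        state[i] = state[i].wrapping_mul(0x9e3779b97f4a7c15).rotate_left(17) ^ prev ^ (i as u64 + 1);
    }
}

struct DuplexSponge {
    state: [u64; STATE_SIZE],
}

impl DuplexSponge {
    fn new(key: &[u64], nonce: &[u64]) -> Self {
        let mut sponge = DuplexSponge { state: [0; STATE_SIZE] };
        sponge.absorb(key);
        sponge.absorb(nonce);
        sponge
    }

    fn absorb(&mut self, input: &[u64]) {
        for chunk in input.chunks(RATE) {
            for (s, w) in self.state.iter_mut().zip(chunk) {
                *s ^= w; // toy mixing; a field-based sponge would add field elements
            }
            permute(&mut self.state);
        }
    }

    fn squeeze(&mut self, n: usize) -> Vec<u64> {
        let mut out = Vec::with_capacity(n);
        while out.len() < n {
            let take = RATE.min(n - out.len());
            out.extend_from_slice(&self.state[..take]);
            permute(&mut self.state);
        }
        out
    }
}

/// Squeeze keystream, xor with plaintext, absorb the ciphertext, then squeeze a tag.
fn encrypt(key: &[u64], nonce: &[u64], plaintext: &[u64]) -> (Vec<u64>, Vec<u64>) {
    let mut sponge = DuplexSponge::new(key, nonce);
    let keystream = sponge.squeeze(plaintext.len());
    let ciphertext: Vec<u64> = plaintext.iter().zip(&keystream).map(|(p, k)| p ^ k).collect();
    sponge.absorb(&ciphertext);
    let tag = sponge.squeeze(RATE);
    (ciphertext, tag)
}

fn decrypt(key: &[u64], nonce: &[u64], ciphertext: &[u64], tag: &[u64]) -> Option<Vec<u64>> {
    let mut sponge = DuplexSponge::new(key, nonce);
    let keystream = sponge.squeeze(ciphertext.len());
    let plaintext: Vec<u64> = ciphertext.iter().zip(&keystream).map(|(c, k)| c ^ k).collect();
    sponge.absorb(ciphertext);
    if sponge.squeeze(RATE) == tag { Some(plaintext) } else { None }
}

fn main() {
    let (ct, tag) = encrypt(&[1, 2, 3, 4], &[5, 6], &[42, 43, 44]);
    assert_eq!(decrypt(&[1, 2, 3, 4], &[5, 6], &ct, &tag), Some(vec![42u64, 43, 44]));
}
```

See the sponge handbook reference in the footnote for the real construction; a production mode would also interleave absorbing and squeezing per block and compare tags in constant time.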
[1] See the sponge handbook, chapter 4: https://keccak.team/files/CSF-0.1.pdf


sword_smith    2024-08-16 👍 1 👎

Very, very cool project!

I just cloned the repo and ran io_test.ash with the command that I could pattern match on from test.sh:

cargo run --release -- $(basename test-vectors/io_test.ash | sed "s/.ash//") -i ./stdlib -i ./test-vectors --asm -p 1 -s 1

By doing that, I could see the assembly being produced by the compiler, and it looks quite reasonable :)

Things we’ve learned from tasm-lib:

  • Don’t use the call instruction as a replacement for a “goto” instruction, as it mutates the call stack and will prevent you from using your code in a wider context
  • Be explicit about your stack manipulation. If you know how each function call mutates the stack, then things become quite manageable.
  • Performance: Control flow doesn’t have to be too performant if you can call performant helper functions. This basically means that you can identify the expensive parts of your program and hand-write those in TASM; the rest can be handled by a compiler (which is exactly what you state above)

Those points are probably very obvious to you, but for someone who’s new to assembler or compiler programming they might not be.

I can think of a few feature requests:

  • The ability to output the compiled TASM as a file
  • The ability to use the compiler output in connection with the debugger triton-tui, i.e. that ashlang can output files that triton-tui understands. And maybe also produce output that’s compatible with your pull request to Triton VM here.
  • The ability to call snippets that are defined in tasm-lib.

The latter might be challenging with respect to memory conventions. But tasm-lib follows certain memory conventions, and we use a Rust trait implementation to define all the snippets; the trait called Closure is guaranteed to only modify the stack, not memory. If integration with tasm-lib isn’t practical or possible, please just ignore this request.

What you could consider is to follow the memory convention that we have laid out for tasm-lib. It’s basically that addresses in the range $[0, 2^{32})$ ($2^{32}$ not included) are reserved for non-deterministically initialized memory, and that the address range $[2^{32}, 2^{63}-1]$ is reserved for dynamically allocated memory. The rest is for statically allocated memory. Or, if you disagree with this convention, come up with a better system. Agreeing on a memory layout would make it easier to make code from tasm-lib, triton-vm (which exposes the TASM for evaluating the AIR), and ashlang compatible.


chancehudson    2024-08-16 👍 👎 [op]

“please note that sponge constructions have an authenticated encryption mode”

I did not know this, thanks for the link! I went with chacha because it’s simple and secure. It’s also a decent test of the language’s abilities.

“Don’t use the call instruction as a replacement for a “goto” instruction, as it mutates the call stack and will prevent you from using your code in a wider context”

Good recommendations, I made this mistake in my first go at loops haha. Clearly defined stack manipulations are super important as well.

“The ability to use the compiler output in connection with the debugger triton-tui”

Yes, writing the assembly and automatically launching it into triton-tui are planned in the immediate future. Right now I copy it into a file in /tmp and keep a shell there to run triton-tui. Automating this will get rid of a bunch of friction.

“The ability to call snippets that are defined in tasm-lib.”

I’ll look into this. The scripting language is statically laid out in memory so each function gets its own distinct memory space (right now ~16 GB/function). My understanding is that we want to minimize the number of distinct memory slots that are in use, e.g. if I assign 100 times to 100 distinct memory slots, this performs much worse than assigning 100 times to 10 distinct memory slots (because the memory table is 10x smaller?). The key here is figuring out when memory can be re-used. In ashlang it’s very simple: once a function invocation ends, the function releases its memory (total static memory scales with the max call stack depth). So functions can easily re-use the same memory regions, and the memory footprint is still statically determined.
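To make that concrete, here is a small Rust sketch of the frame-per-call-depth scheme described above: each live invocation owns a fixed-size region that is reused as soon as the invocation returns, so the total footprint is frame size × maximum call-stack depth. The frame size and base address are placeholders, not ashlang’s actual constants.

```rust
const FRAME_WORDS: u64 = 1 << 31; // placeholder per-function region size
const BASE_ADDR: u64 = 1 << 35;   // placeholder start of ashlang's static region

struct FrameAllocator {
    depth: u64,
    max_depth: u64,
}

impl FrameAllocator {
    fn new() -> Self {
        FrameAllocator { depth: 0, max_depth: 0 }
    }

    /// Called when a function invocation begins: its memory starts here.
    fn enter(&mut self) -> u64 {
        let base = BASE_ADDR + self.depth * FRAME_WORDS;
        self.depth += 1;
        self.max_depth = self.max_depth.max(self.depth);
        base
    }

    /// Called when the invocation returns: the region is free for reuse.
    fn exit(&mut self) {
        self.depth -= 1;
    }

    /// Statically determined footprint of the whole program.
    fn total_words(&self) -> u64 {
        self.max_depth * FRAME_WORDS
    }
}

fn main() {
    let mut frames = FrameAllocator::new();
    let _outer = frames.enter();  // e.g. main()
    let inner = frames.enter();   //   calls helper()
    frames.exit();                //   helper returns; its frame is reusable
    let sibling = frames.enter(); //   another call reuses the same region
    assert_eq!(inner, sibling);
    frames.exit();
    frames.exit();
    assert_eq!(frames.total_words(), 2 * FRAME_WORDS);
}
```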

“What you could consider is to follow the memory convention that we have laid out for tasm-lib”

If I understand correctly the statically allocated memory is a single page at the end of the memory array (e.g. [~2^64-2^32, 2^64)). If this is true I can put the memory used by an ashlang program before the tasm-lib static memory. Then we’re assuming that dynamic memory + ashlang static memory + tasm-lib static memory < 2^64, which should be safe.

Other than that I just need to figure out a way to bridge the tasm-lib functions to the compiler. But that shouldn’t be much of a problem.


sword_smith    2024-08-16 👍 👎

“My understanding is that we want to minimize the number of distinct memory slots that are in use, e.g. if I assign 100 times to 100 distinct memory slots, this performs much worse than assigning 100 times to 10 distinct memory slots (because the memory table is 10x smaller?)”

I don’t think this is correct. The only thing that matters for the proving time is how many words you read/write from/to memory. Where in memory the words are located and whether they’ve already been written to does not matter for proving times. I apologize if we’ve somehow given another impression or been unclear in our communication.

“If I understand correctly the statically allocated memory is a single page at the end of the memory array”

It’s even simpler than that: Addresses in the range $[2^{32},2^{63}-1]$ are dynamically allocated. Addresses $[2^{63}, 2^{64}-2^{32}]$ are statically allocated. Addresses $[0, 2^{32}-1]$ are reserved for nondeterministically allocated memory.

It’s true that the kmalloc method in tasm-lib only allocates within a range of $2^{32}$ words, but that’s just because we don’t need any more for the snippets. We consider half of the available memory (~$2^{63}$ words) available for static allocation.
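Written out as constants, the layout described above looks like this (a minimal Rust sketch; the region names are illustrative, not tasm-lib identifiers; the top address follows from RAM addresses being B-field elements, whose modulus is $2^{64} - 2^{32} + 1$):

```rust
#[derive(Debug, PartialEq)]
enum Region {
    NonDeterministic, // [0, 2^32)
    Dynamic,          // [2^32, 2^63)
    Static,           // [2^63, 2^64 - 2^32]
}

/// Largest valid RAM address: the B-field modulus is 2^64 - 2^32 + 1,
/// so the top address is 2^64 - 2^32.
const MAX_ADDR: u64 = u64::MAX - ((1u64 << 32) - 1);
const ND_END: u64 = 1 << 32;  // exclusive
const DYN_END: u64 = 1 << 63; // exclusive

fn region(addr: u64) -> Option<Region> {
    if addr < ND_END {
        Some(Region::NonDeterministic)
    } else if addr < DYN_END {
        Some(Region::Dynamic)
    } else if addr <= MAX_ADDR {
        Some(Region::Static)
    } else {
        None // not a valid B-field element, hence not a RAM address
    }
}

fn main() {
    assert_eq!(region(0), Some(Region::NonDeterministic));
    assert_eq!(region(1 << 32), Some(Region::Dynamic));
    assert_eq!(region(1 << 63), Some(Region::Static));
    assert_eq!(region(MAX_ADDR), Some(Region::Static));
    assert_eq!(region(MAX_ADDR + 1), None);
}
```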


sword_smith    2024-08-16 👍 👎

Also: If you allow the imports of tasm-lib snippets, you can call the mother of all SNARK features: recursion.


chancehudson    2024-08-17 👍 👎 [op]

“I apologize if we’ve somehow given another impression or been unclear in our communication.”

Not at all, it was just an incorrect/unchecked inference on my part.

“We consider half of the available memory (~$2^{63}$ words) available for static allocation.”

Yep, I’ll just stick ashlang memory somewhere in there. I can determine the amount of memory needed and shift all the program memory after assembling (e.g. shift the program’s position in RAM left/right). I’m going to use a similar approach to conditionally allow memory returns to maintain polymorphism in ashlang. If I bridge tasm-lib I’ll write it such that tasm-lib has 2^61 words of dynamic memory and use the rest for ashlang (except the static section at the end).

“Also: If you allow the imports of tasm-lib snippets, you can call the mother of all SNARK features: recursion.”

Yes! I’ve read some of the discussion about it. I’m somewhat more interested in compiling to many different proving systems. I’m trying to keep the language as generic/simple as possible so that I can easily compile for other proving systems and have a common ashlang library. I want to do the two-language approach for each backend. For triton-vm I write tasm for the optimization layer. If I were compiling for R1CS I might write a simple language for expressing equations more directly using variables shared with ashlang (then bind sets of equations to functions in ashlang), etc.

I was introduced to the QuickSilver proving system by a colleague recently. I think this will be the second backend I add support for. It’s an interesting scheme because proving time is independent of number of multiplications. It scales just based on the number of variables and the degree of the polynomial.

I think it would be interesting to recursively verify triton-vm proofs in other systems like Groth16, PLONK, or QuickSilver. There could be some interesting performance tradeoffs. For example, if a triton-vm proof can be verified inside a QuickSilver proof, it would run in ~constant memory, so a user’s computer could do it, even if it takes a while.


chancehudson    2024-08-17 👍 👎 [op]

“It’s an interesting scheme because proving time is independent of number of multiplications”

This is erroneous. I meant proving memory is ~independent of the number of multiplications. Proving time is improved by parallelism but is still determined by the number of multiplications. Every time the number of cores is quadrupled there’s a ~2x proving speedup (based on results in the paper), i.e. the speedup grows roughly with the square root of the core count.
