fuzzinglabs / thoth

Cairo/Starknet security toolkit (bytecode analyzer, disassembler, decompiler, symbolic execution, SBMC)

Home Page: https://fuzzinglabs.com/

License: GNU Affero General Public License v3.0

Python 63.27% Cairo 32.61% TypeScript 3.69% Rust 0.42%
analysis cairo-lang callflow cfg disassembler reversing security starknet decompiler sierra

thoth's Introduction

Thoth, the Cairo/Starknet security toolkit (analyzer, disassembler and decompiler)

Thoth (pronounced "taut" or "toss") is a Cairo/Starknet security toolkit written in Python 3 that includes analyzers, disassemblers, and decompilers. It can generate the call graph, the control-flow graph (CFG), and the data-flow graph for a given Sierra file or Cairo/Starknet compilation artifact. It also includes more advanced tools: a symbolic execution engine and a symbolic bounded model checker.

Learn more about Thoth internals here: Demo video, StarkNetCC 2022 slides

Features

Installation

sudo apt install graphviz
git clone https://github.com/FuzzingLabs/thoth && cd thoth
pip install .
thoth -h

Decompile the contract's compilation artifact (JSON)

# Remote contract deployed on Starknet (mainnet/goerli)
thoth remote --address 0x0323D18E2401DDe9aFFE1908e9863cbfE523791690F32a2ff6aa66959841D31D --network mainnet -d
# Local contract compiled locally (JSON file)
thoth local tests/json_files/cairo_0/cairo_test_addition_if.json -d

Example 1 with strings:

[image: source code]

[image: decompiler code]

Example 2 with function call:

[image: source code]

[image: decompiler code]

Print the contract's call graph

The call flow graph represents the calling relationships between the contract's functions. It includes as much information as possible, such as entry-point functions, imports, and decorators.

thoth local tests/json_files/cairo_0/cairo_array_sum.json -call -view
# For a specific output format (pdf/svg/png):
thoth local tests/json_files/cairo_0/cairo_array_sum.json -call -view -format png

The output file (pdf/svg/png) and the dot file are inside the output-callgraph folder. If needed, you can also visualize dot files online using this website. The legend can be found here.
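If you prefer to render a dot file locally instead of using a website, Graphviz (installed in the Installation step) can do it. The following is a minimal Python sketch, not part of Thoth: the helper names and the `.gv` file name are hypothetical, and only the standard `dot -T<format> <input> -o <output>` invocation is assumed.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_dot_command(dot_file: str, fmt: str = "png") -> List[str]:
    """Build the Graphviz command that renders `dot_file` to `fmt`."""
    if fmt not in {"pdf", "svg", "png"}:
        raise ValueError(f"unsupported format: {fmt}")
    # The rendered file is written next to the dot file, with a new suffix.
    out_file = str(Path(dot_file).with_suffix(f".{fmt}"))
    return ["dot", f"-T{fmt}", dot_file, "-o", out_file]


def render(dot_file: str, fmt: str = "png") -> None:
    """Render the graph if the Graphviz `dot` binary is on PATH."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz is not installed (see Installation)")
    subprocess.run(build_dot_command(dot_file, fmt), check=True)


# Hypothetical file name inside the output-callgraph folder.
print(build_dot_command("output-callgraph/cairo_array_sum.gv", "svg"))
```

Calling `render(...)` on an existing dot file writes the image next to it.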

A more complex call graph:

Run the static analysis

The static analysis is performed using analyzers which can be either informative or security/optimization related.

Analyzer | Command-line argument | Description | Impact | Precision | Category | Bytecode / Sierra
ERC20 | erc20 | Detect if a contract is an ERC20 token | Informational | High | Analytics | ✔️
ERC721 | erc721 | Detect if a contract is an ERC721 token | Informational | High | Analytics | ✔️
Strings | strings | Detect strings inside a contract | Informational | High | Analytics | ✔️ ✔️
Functions | functions | Retrieve information about the contract's functions | Informational | High | Analytics | ✔️ ✔️
Statistics | statistics | General statistics about the contract | Informational | High | Analytics | ✔️ ✔️
Test cases generator | tests | Automatically generate test cases for each function of the contract | Informational | High | Analytics | ✔️
Assignations | assignations | List variable assignments | Informational | High | Optimization | ✔️
Integer overflow | int_overflow | Detect direct integer overflow/underflow | High (direct) / Medium (indirect) | Medium | Security | ✔️ ✔️
Function naming | function_naming | Detect function names that are not in snake case | Informational | High | Security | ✔️
Variable naming | variable_naming | Detect variable names that are not in snake case | Informational | High | Security | ✔️
Delegate calls detector | delegate_call | Detect delegate calls | Informational | High | Security | ✔️
Dead code detector | dead_code | Detect dead code | Informational | High | Security | ✔️
Unused arguments detector | unused_arguments | Detect unused arguments | Informational | High | Security | ✔️
User defined function call detector | user_defined | Detect calls of user defined functions | Informational | High | Security | ✔️
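To give an idea of what a naming check such as function_naming or variable_naming looks for, here is a small Python sketch. It is an illustration only: the regular expression and the helper name are assumptions, not Thoth's actual rule, and the function names in the example are made up.

```python
import re

# Assumed definition of snake case: lowercase letters, digits and underscores,
# not starting with a digit.
SNAKE_CASE = re.compile(r"^[a-z_][a-z0-9_]*$")


def not_snake_case(names):
    """Return the names whose last dotted component is not snake case."""
    return [n for n in names if not SNAKE_CASE.match(n.split(".")[-1])]


# Hypothetical fully qualified function names.
print(not_snake_case(["__main__.main", "__main__.getBalance", "__main__.array_sum"]))
```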

Run all the analyzers

thoth local tests/json_files/cairo_0/cairo_array_sum.json -a

Select which analyzers to run

thoth local tests/json_files/cairo_0/cairo_array_sum.json -a erc20 erc721

Only run a specific category of analyzers

thoth local tests/json_files/cairo_0/cairo_array_sum.json -a security
thoth local tests/json_files/cairo_0/cairo_array_sum.json -a optimization
thoth local tests/json_files/cairo_0/cairo_array_sum.json -a analytics

Print a list of all the available analyzers

thoth local tests/json_files/cairo_0/cairo_array_sum.json --analyzers-help

Use the symbolic execution

Detailed documentation for the symbolic execution is available here.

Print the contract's data-flow graph (DFG)

thoth local tests/json_files/cairo_0/cairo_double_function_and_if.json -dfg -view
# For a specific output format (pdf/svg/png):
thoth local tests/json_files/cairo_0/cairo_double_function_and_if.json -dfg -view -format png
# For tainting visualization:
thoth remote --address 0x069e40D2c88F479c86aB3E379Da958c75724eC1d5b7285E14e7bA44FD2f746A8 -n mainnet  -dfg -view --taint

The output file (pdf/svg/png) and the dot file are inside the output-dfg folder.

Disassemble the contract's compilation artifact (JSON)

# Remote contract deployed on Starknet (mainnet/goerli)
thoth remote --address 0x0323D18E2401DDe9aFFE1908e9863cbfE523791690F32a2ff6aa66959841D31D --network mainnet -b
# Local contract compiled locally (JSON file)
thoth local tests/json_files/cairo_0/cairo_array_sum.json -b
# To get a pretty colored version:
thoth local tests/json_files/cairo_0/cairo_array_sum.json -b -color
# To get a verbose version with more details about decoded bytecodes:
thoth local tests/json_files/cairo_0/cairo_array_sum.json -vvv

Print the contract's control-flow graph (CFG)

thoth local tests/json_files/cairo_0/cairo_double_function_and_if.json -cfg -view
# For a specific function:
thoth local tests/json_files/cairo_0/cairo_double_function_and_if.json -cfg -view -function "__main__.main"
# For a specific output format (pdf/svg/png):
thoth local tests/json_files/cairo_0/cairo_double_function_and_if.json -cfg -view -format png

The output file (pdf/svg/png) and the dot file are inside the output-cfg folder.

Generate inputs for the Cairo fuzzer

You can generate inputs for the Cairo fuzzer using this command:

thoth local ./tests/json_files/cairo_0/cairo_test_symbolic_execution_2.json -a fuzzer

F.A.Q

How to find a Cairo/Starknet compilation artifact (JSON file)?

Thoth supports Cairo and Starknet compilation artifacts (JSON files) generated by cairo-compile or starknet-compile. Thoth also supports the JSON file returned by starknet get_full_contract.

How to run the tests?

python3 tests/test.py

How to build the documentation?

# Install Sphinx
apt-get install python3-sphinx

# Create the docs folder and move into it
mkdir docs && cd docs

# Initialize the current folder
sphinx-quickstart .

# Modify the `conf.py` file by adding
import thoth

# Generate the .rst files before the .html files
sphinx-apidoc -f -o . ..

# Generate the .html files
make html

# Run a Python HTTP server
cd _build/html; python3 -m http.server

Why is my bytecode empty?

First, verify that your JSON is correct and that it contains a data section. Second, verify that your JSON is not a contract interface. Finally, it is possible that your contract does not generate any bytecode, for example:

%lang starknet

from starkware.cairo.common.cairo_builtins import HashBuiltin

@storage_var
func balance() -> (res : felt):
end
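The first two checks above can be scripted. The sketch below is not part of Thoth; it assumes that a cairo-compile artifact stores `data` at the top level and that a starknet-compile artifact nests it under `program`, and the helper names are hypothetical.

```python
import json


def find_bytecode(artifact: dict):
    """Return the `data` list of a compilation artifact, or None if absent."""
    # Assumption: cairo-compile puts `data` at the top level...
    if isinstance(artifact.get("data"), list):
        return artifact["data"]
    # ...while starknet-compile nests it under `program`.
    program = artifact.get("program")
    if isinstance(program, dict) and isinstance(program.get("data"), list):
        return program["data"]
    return None


def diagnose(artifact: dict) -> str:
    """Summarize why a disassembly of this artifact might be empty."""
    data = find_bytecode(artifact)
    if data is None:
        return "no data section"
    if not data:
        return "data section is empty"
    return f"{len(data)} bytecode words"


# Inline examples; a real file would be read with json.load(open(path)).
print(diagnose(json.loads('{"program": {"data": []}}')))
print(diagnose({"data": ["0x480680017fff8000", "0x1"]}))
```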

Acknowledgments

Thoth is inspired by several security tools developed by friends, such as Octopus, Slither, and Mythril.

License

Thoth is licensed and distributed under the AGPLv3 license. Contact us if you're looking for an exception to the terms.

thoth's People

Contributors

lint-action, omahs, pventuzelo, raefko, rog3rsm1th, sebfuzzinglabs


thoth's Issues

big_struct

The disassembler does not show objects (structs) and their attributes (members):
big_struct_disass
while we have:
big_struct
big_struct_json

Decorators

The disassembler does not handle decorators.
For the following source files (decorators1.cairo, decorators2.cairo, decorators3.cairo, constructor.cairo, and l1_default.cairo):
decorators1
decorators2
decorators3
constructor1
l1_default

And the corresponding bytecode:

(decorators1.json)
storage_var:
read
write

view:
view

external:
external
or
externals

(decorators2.json)
raw_input and raw_output:
raw_input

(decorators3.json)
event:
event

(constructor.json)
constructor:
constructor
or
constructor2

(l1_default.json)
l1_handler:
l1_handler
l1_handler_json

print of APUpdate

APUpdate should be shown only after an ASSERT_EQ.
The bug is fixed in the decompiler; it still needs to be fixed in the disassembler.

[DISAS] flag color

Add a flag to print the disassembly with color for:

  • builtins/struct
  • function name
  • call and return
  • jump

[DISAS] handle more properly contract interface

In this test: tests/json_files/starknet_contract_interface.json

there is no bytecode because it's a contract interface.
For the moment we just quit and inform the user, but we should do something else ;)

image

Dissa

When generating the call flow graph of cairo_direct_recursion.json, the direct recursion is missing.
Command:
python3 __main__.py -file tests/json_files/cairo_direct_recursion.json -call
direct_recursion_cfg_fail
Result:
cfg
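One way to express the expected behavior is that a directly recursive function should appear as a self-loop in the call graph. The Python sketch below illustrates this on a hand-written graph; the dictionary representation and the function names are hypothetical and do not reflect Thoth's internal data structures.

```python
def direct_recursions(call_graph):
    """Return the functions that call themselves (self-loops)."""
    return sorted(f for f, callees in call_graph.items() if f in callees)


def to_edges(call_graph):
    """Flatten the graph into (caller, callee) edges, self-loops included."""
    return [(f, c) for f, callees in call_graph.items() for c in callees]


# Hypothetical call graph: main calls factorial, factorial calls itself.
graph = {
    "__main__.main": ["__main__.factorial"],
    "__main__.factorial": ["__main__.factorial"],
}
print(direct_recursions(graph))
print(to_edges(graph))
```

A renderer that drops edges where caller and callee are equal would lose exactly the information described in this issue.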

support indirect call

In some cases, we can have indirect calls

image

It's not supported yet.

On the current codebase, we need to add:

  • proper disassembly print i.e. call abs [fp + 4], call rel [fp + 4]
  • add indirect call info inside the callgraph (dashed circle?)

imports with parentheses

python3 __main__.py -file tests/json_files/starknet_imports_with_parentheses.json

Traceback (most recent call last):
  File "__main__.py", line 69, in <module>
    main()
  File "__main__.py", line 50, in main
    disassembler = Disassembler(args.file)
  File "/home/fuzz/cairo_disassembler/disassembler.py", line 24, in __init__
    self.analyze()
  File "/home/fuzz/cairo_disassembler/disassembler.py", line 31, in analyze
    self.json = parseToJson(self.file)
  File "/home/fuzz/cairo_disassembler/jsonParser.py", line 126, in parseToJson
    data, func_offset, func_identifiers = extractData(path)
  File "/home/fuzz/cairo_disassembler/jsonParser.py", line 82, in extractData
    json_data = json.load(f)
  File "/usr/lib/python3.8/json/__init__.py", line 293, in load
    return loads(fp.read(),
  File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid control character at: line 84067 column 109 (char 4194304)
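Python's json module raises this error when a raw control character (for example a literal newline) appears inside a JSON string. The sketch below reproduces the message on a tiny input and shows that `strict=False` accepts such characters. Whether this is the actual cause for this file, and whether lenient parsing is the right fix for the disassembler, are assumptions; `load_lenient` is a hypothetical helper.

```python
import json


def load_lenient(text: str):
    """Parse JSON, allowing raw control characters inside strings."""
    return json.loads(text, strict=False)


# The \n below is a real newline character placed inside the JSON string,
# which strict parsing rejects.
raw = '{"hint": "first\nsecond"}'

try:
    json.loads(raw)
except json.JSONDecodeError as err:
    print("strict parsing failed:", err.msg)

print(load_lenient(raw))
```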

remove dep to cairo-lang

The only place where we need to import from the cairo-lang library is decode_instruction.

It could be worthwhile to copy decode_instruction and the Instruction class directly into this program, to remove the cairo-lang dependency and avoid potential issues for people who are not using a venv.

[CFG] print in textual form

We need to find a way to print the CFG in textual form, like the disassembly output.

Ideally, something similar to radare2 would be nice.

image

Format of the implicit argument disass

We have:
python3 __main__.py -file tests/json_files/cairo_implicit_parameters.json
image
But implicit arguments are written with braces { } while regular arguments use parentheses ( ):
cairo_implicit_parameters.cairo
image

[CALL] [DISAS] extract and show event info

The information that a function is an event can be found inside the abi section.
For the file starknet_decorators3.json:

image

We need to extract it and print it in the disassembler and the call graph.
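The extraction step could look like the Python sketch below. It assumes that entries of the abi section carry a `type` field equal to `"event"` for events; the sample ABI and its names are made up for illustration and are not taken from starknet_decorators3.json.

```python
def event_names(artifact: dict):
    """Return the names of all ABI entries declared as events."""
    return [
        entry["name"]
        for entry in artifact.get("abi", [])
        if entry.get("type") == "event"
    ]


# Hypothetical ABI with one event and one function.
sample = {
    "abi": [
        {"type": "event", "name": "message_received", "keys": [], "data": []},
        {"type": "function", "name": "increase_balance", "inputs": [], "outputs": []},
    ]
}
print(event_names(sample))
```

The resulting names could then be used to tag the matching functions in the disassembly and in the call graph.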

[CFG] [DISAS] implement label in disassembly & cfg

We can find all labels by looking at relative jump offsets (JUMP_REL 9) and relative calls (CALL rel 3145).

Once done, we should have an output like:

offset 2458:  ADD            AP, 1          
offset 2459:  ASSERT_EQ      [AP], [FP]     
offset 2459:  ADD            AP, 1          
offset 2460:  CALL           rel 4870       
offset 2460:  ADD            AP, 2          

label_2462:

offset 2462:  ASSERT_EQ      [AP], [FP-4] + [FP]
offset 2462:  ADD            AP, 1          
offset 2463:  ASSERT_EQ      [FP-3], [[AP-1]]
offset 2464:  ASSERT_EQ      [AP], [FP] + 1 
offset 2464:  ADD            AP, 1          
offset 2466:  ASSERT_EQ      [AP], [FP] + 1 
offset 2466:  ADD            AP, 1          
offset 2468:  ASSERT_EQ      [AP], [AP-4]   
offset 2468:  ADD            AP, 1          
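The label pass described in this issue can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the disassembler's code: instructions are modeled as hypothetical `(offset, mnemonic, operand)` tuples, `CALL_REL` is a made-up mnemonic for a relative call, and the target of a relative jump or call is assumed to be the instruction's own offset plus its operand.

```python
def collect_labels(instructions):
    """Collect the target offsets of relative jumps and relative calls."""
    labels = set()
    for offset, mnemonic, operand in instructions:
        if mnemonic in ("JUMP_REL", "CALL_REL"):
            # Assumption: target = offset of this instruction + operand.
            labels.add(offset + operand)
    return labels


def render(instructions):
    """Render the listing, emitting `label_N:` before each target offset."""
    labels = collect_labels(instructions)
    lines = []
    for offset, mnemonic, operand in instructions:
        if offset in labels:
            lines.append(f"label_{offset}:")
        lines.append(f"offset {offset}:  {mnemonic:<14} {operand}".rstrip())
    return lines


# Hypothetical four-instruction program.
program = [
    (0, "ASSERT_EQ", "[AP], 1"),
    (2, "JUMP_REL", 4),
    (4, "ASSERT_EQ", "[AP], 2"),
    (6, "RET", ""),
]
print("\n".join(render(program)))
```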

get_code

We cannot detect the start and end of a function.

Python Package

The project should allow the user to install it as a Python package.

CALL ABS/REL

The disassembler does not distinguish between CALL ABS and CALL REL.
image
image

I fixed the bug for the decompiler; the same fix is still needed for the disassembler.
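The distinction matters because the same operand resolves to different targets. A minimal Python sketch, assuming that an absolute call jumps to the operand itself and a relative call jumps to the current offset plus the operand (the helper names and output format are hypothetical):

```python
def call_target(pc: int, operand: int, relative: bool) -> int:
    """Resolve the offset a CALL instruction transfers control to."""
    return pc + operand if relative else operand


def format_call(pc: int, operand: int, relative: bool) -> str:
    """Print the call with its addressing mode and resolved target."""
    kind = "rel" if relative else "abs"
    target = call_target(pc, operand, relative)
    return f"CALL {kind} {operand}  # target offset {target}"


print(format_call(2460, 4870, relative=True))
print(format_call(2460, 4870, relative=False))
```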

Implicit Arguments

The disassembler does not show the implicit arguments of functions.
For this code: cairo_implicit_parameters.cairo
builtins_in_functions
And this bytecode: cairo_implicit_parameters.json
buitlins_in_functions_json
We have:
builtins_in_functions_disass

APupdate - if/else

In this example:
image

We assign:

[AP] = 0

and then we update AP.
So in the if statement, the AP used is not the same as the one assigned before.
What should we do in this case?

Cairo source code:

image

[DISAS] add support for references

Some identifiers (with type = "reference") actually contain a value that we can print as a comment during disassembly.

(Warning: this is not about the "reference manager" section.)

image

image
