Giter Site home page Giter Site logo

morpho-org / morpho-data-structures Goto Github PK

View Code? Open in Web Editor NEW
109.0 5.0 10.0 3.57 MB

Data structures tested and used by the Morpho Protocol.

License: GNU Affero General Public License v3.0

Shell 0.47% Solidity 88.45% Makefile 0.78% Ruby 10.30%
data-structures solidity heap doubly-linked-list

morpho-data-structures's Introduction

Morpho Data Structures 🦋

This repository contains the data structures that are used in Morpho's matching engine. The data structures are built to be secure and gas efficient.

Double Linked List

The current implementation of the double-linked list is based on this article written by Alberto Cuesta Cañada. You can find the repository here. Note that the code has been modified to meet our own needs and to allow us to sort the first accounts of the double-linked list. Our implementation is not a generalized one. What you can do with it:

  • Insert an address sorted by a value passed along this address.
  • Insert (and its value) before an account.

Red Black Binary Tree

A Red Black Binary Tree is a kind of binary tree that allows insertion/deletion/search in O(log(n)). Our implementation is a modified version of the OrderStatisticsTree repository written by Rob Hitechn which is also based on BokkyPooBahsRedBlackTreeLibrary repository written by bokkypoobah. Our modified version makes keys unique items instead of just (key, value) unique pairs.

In order to manipulate a binary tree and visualize how it manages to stay balanced, this tool is very useful. You can also find here the pseudo-code logic of the tree's function.

Heap based ordering

This implementation is based on a heap data structure and refines it by adding an unsorted portion after it. This gives us an approximation of a heap, and thus operations are performed in constant time.

At least the first maxSortedUsers / 2 addresses are sorted in the Heap. To keep constant time operation, we divide by two the size of the Heap once the size overtakes the maxSortedUsers number. It means that we remove all leaves of the heap.

The choice of this implementation is explained by the desire to store a maximum of high-value nodes in the heap to use them for peer-to-peer matching. Indeed, a naive implementation that would remove the tail and insert new values at maxSortedUsers (once the heap is full), would end up concentrating all new values on the same single path from the leaf to the root node because the shiftUp function will be always called from the same node. The risk is that low-value nodes stay in the Heap and that all the liquidity will be concentrated on the path from the leaf of index 'maxSortedUsers' to the root. Removing all the leaves enables the protocol to remove low-value nodes and to call the shiftUp function at different locations in the Heap. This process is meant to keep a maximum of liquidity available in the heap for peer-to-peer lending. The main entry point is the update function, calling internally either insert, increase, decrease or remove.

Other data structures

Other data structures may be explored in the future and we are open to any suggestions or optimization of current implementations ⚡️

Contributing

In this section, you will find some guidelines to read before contributing to the project.

Setup

Update git submodules:

git submodule update --init --recursive

Run yarn:

yarn

Testing

The tests can be run with Foundry.

For the RedBlackBinaryTree, you can run the tests with hardhat with yarn test.

Creating issues and PRs

For issues and PR, please use conventional naming and add Morpho's contributors to review your code or tackle your issue.

Before merging a PR

Before merging a PR:

  • PR must have been reviewed by reviewers. They must deliver a complete report on the smart contracts (see the section below).
  • Comments and requested changes must have been resolved.
  • PR must have been approved by every reviewer.
  • CI must pass.

Code Formatting

We use Husky hook to format code before being pushed to any remote branch to enforce coding style among all developers.

Audits

The code concerning the heap based ordering data-structure has been audited by Omniscia and the report can be found online or in the file Morpho_Omniscia.

Questions

For any questions, you can send an email to [email protected] 😊

morpho-data-structures's People

Contributors

daejunpark avatar dependabot[bot] avatar gabrielastieres avatar julien-devatom avatar makcandrov avatar mathisgd avatar merlinegalite avatar morphojo avatar qgarchery avatar rubilmax avatar tristan22400 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

morpho-data-structures's Issues

Document the Heap Ordering

Document the max size, how in effect only half of it is expected to be in the heap part of the data-structure. Explain the size, and how it tries to solve the issue of having all the liquidity in the same path

Create new tree data structure

requirements:

  • should be possible to query .last() which returns the account associated to the highest value.
  • should be possible to query .prev(account) which returns the account just before according to our new relationship order.

TODO:

  • Implement the new tree structure in another contract
  • Implement tests like in the older tree contract

HOG-02S: Repetitive Value Literal

HOG-02S: Repetitive Value Literal

Type Severity Location
Code Style HeapOrdering.sol:L128, L285, L304

Description:

The linked literal 1 is utilized in the linked portions to signify the head of the heap array, i.e. the root of the heap tree.

Example:

/// @notice Returns the address coming before `_id` in accounts.
/// @dev The account associated to the returned address does not necessarily have a lower value than the one of the account associated to `_id`.
/// @param _heap The heap to search in.
/// @param _id The address of the account.
/// @return The address of the previous account.
function getPrev(HeapArray storage _heap, address _id) internal view returns (address) {
    uint256 rank = _heap.ranks[_id];
    if (rank > 1) return getAccount(_heap, rank - 1).id;
    else return address(0);
}

Recommendation:

We advise this to be better illustrated by setting a contract-level constant declaration (i.e. HEAP_ROOT) in which the value of 1 is stored greatly improving the legibility of the codebase. To note, in the first linked line only the first instance of 1 should be replaced as the latter instance is a binary offset.

HOG-07C: Removal Upward Shift Optimization

HOG-07C: Removal Upward Shift Optimization

Type Severity Location
Gas Optimization HeapOrdering.sol:L258

Description:

The shiftUp operation is performed unconditionally whilst a child and parent's value can indeed match, thus executing an upwards shift inefficiently if the element removed was next-to-last in the tree and its children had an equal value to it.

Example:

// If the swapped account is in the heap, restore the invariant: its value can be smaller or larger than the removed value.
if (rank <= _heap.size) {
    if (_removedValue > getAccount(_heap, rank).value) shiftDown(_heap, rank);
    else shiftUp(_heap, rank);
}

Recommendation:

We advise the else statement to be adjusted to an else if (_removedValue < getAccount(_heap, rank).value) one, with the getAccount(_heap, rank).value value being cached to a local variable within the upper-most if clause to avoid the duplicate read gas cost between the if and else if conditional evaluations.

Possible optimization in Heap and HeapOrdering storage

When doing the _shiftUp and _shiftDown operations, the path from the account to shift up to the root is potentially completely changed. This requires many storage writes which is gas intensive. To improve the efficiency of those operations, we could introduce an intermediate array of pointers, each one of them from an index to another. When doing any operation, this array would be updated without needing to update the actual data.

Notes:

  • this solution is efficient if the array of pointer has a small overhead. One idea is to simply store it as a single word, where each byte represents an offset as given by the corresponding u8 value. This allows to store the heap part of the data-structure up to 128 elements. The _shiftUp and _shiftDown operations would only need to update this word one time.
  • this idea is not compatible with the current way the size is used in the HeapOrdering, because dividing the size by 2 could lead to creating some invalid pointers in the array. To implement this idea in the HeapOrdering, it could wait for the implementation of #63
  • operations that do not change the data-structure may need to do one more storage access by going through the array of pointers. Tricks could be used to avoid that additional cost, such as packing the array with a storage slot that is always read (like the size or even the slot of accounts if we are not afraid of that)

HOG-01C: Downward Traversal Optimization

HOG-01C: Downward Traversal Optimization

Type Severity Location
Gas Optimization HeapOrdering.sol:L151, L154

Description:

The shiftDown downward traversal mechanism of the heap structure is sub-optimal as it computes the result of getAccount operations duplicate times in all cases.

Example:

while (childRank <= size) {
    // Compute the rank of the child with largest value.
    if (
        childRank < size &&
        getAccount(_heap, childRank + 1).value > getAccount(_heap, childRank).value
    ) childRank++;

    childAccount = getAccount(_heap, childRank);

    if (childAccount.value > initialValue) {
        setAccount(_heap, _rank, childAccount);
        _rank = childRank;
        childRank <<= 1;
    } else break;
}

Recommendation:

We advise the result of the first getAccount invocation within the while loop to be cached and overwritten in the if block by the next child's account lookup if the condition is met, thus ensuring getAccount is at most invoked a single time per child.

HOG-01S: Ineffectual Operations

HOG-01S: Ineffectual Operations

Type Severity Location
Gas Optimization HeapOrdering.sol:L223-L224

Description:

The linked operations are ineffectual as they are performed on a memory reference that remains unutilized in the function body.

Example:

/// @notice Increases the amount of an account in the `_heap`.
/// @dev Only call this function when `_id` is in the `_heap` with a smaller value than `_newValue`.
/// @param _heap The heap to modify.
/// @param _id The address of the account to increase the amount.
/// @param _newValue The new value of the account.
/// @param _maxSortedUsers The maximum size of the heap.
function increase(
    HeapArray storage _heap,
    address _id,
    uint256 _newValue,
    uint256 _maxSortedUsers
) private {
    uint256 rank = _heap.ranks[_id];
    Account memory account = getAccount(_heap, rank);
    account.value = _newValue;
    setAccountValue(_heap, rank, _newValue);
    uint256 nextSize = _heap.size + 1;

    if (rank < nextSize) shiftUp(_heap, rank);
    else {
        swap(_heap, nextSize, rank);
        shiftUp(_heap, nextSize);
        _heap.size = computeSize(nextSize, _maxSortedUsers);
    }
}

Recommendation:

We advise them to be safely omitted reducing the gas cost of the function.

HOG-05C: Potential Significant Utility Optimization

HOG-05C: Potential Significant Utility Optimization

Type Severity Location
Gas Optimization HeapOrdering.sol:L4

Description:

The heap data structure implemented by the library is usually utilized to implement priority queues and other similar mechanisms where a common use case is to "pop" the upper-most parent of the "queue" and insert a new item that should be queued for the next iteration. If this is an envisioned use case for the heap, a significant utility and gas optimization can be achieved by implementing a dedicated "replace" function that replaces the root of the tree and then ascertains order by performing a single shiftDown operation.

Example:

library HeapOrdering {

Recommendation:

We advise this use-case to be evaluated and the function described to be implemented in case it is envisioned as frequent given that it will lead to significant gas optimizations.

HOG-08C: Upward Traversal Optimization

HOG-08C: Upward Traversal Optimization

Type Severity Location
Gas Optimization HeapOrdering.sol:L128, L129

Description:

The shiftUp upward traversal mechanism of the heap structure is sub-optimal as it computes the result of getAccount(_heap, _rank >> 1) twice redundantly.

Example:

while (_rank > 1 && initialValue > getAccount(_heap, _rank >> 1).value) {
    setAccount(_heap, _rank, getAccount(_heap, _rank >> 1));
    _rank >>= 1;
}

Recommendation:

We advise the result to be cached to a local variable that is consequently utilized for the while loop condition and setAccount execution thus optimizing the gas cost of the function significantly.

HOG-06C: Removal Swap Optimization

HOG-06C: Removal Swap Optimization

Type Severity Location
Gas Optimization HeapOrdering.sol:L250

Description:

The swap operation performed during an removal operation can be optimized by performing it conditionally only when rank != accountsLength as otherwise the heap will replace an element with itself.

Example:

/// @notice Removes an account in the `_heap`.
/// @dev Only call when this function `_id` is in the `_heap` with value `_removedValue`.
/// @param _heap The heap to modify.
/// @param _id The address of the account to remove.
/// @param _removedValue The value of the account to remove.
function remove(
    HeapArray storage _heap,
    address _id,
    uint256 _removedValue
) private {
    uint256 rank = _heap.ranks[_id];
    uint256 accountsLength = _heap.accounts.length;

    // Swap the last account and the account to remove, then pop it.
    swap(_heap, rank, accountsLength);
    if (_heap.size == accountsLength) _heap.size--;
    _heap.accounts.pop();
    delete _heap.ranks[_id];

    // If the swapped account is in the heap, restore the invariant: its value can be smaller or larger than the removed value.
    if (rank <= _heap.size) {
        if (_removedValue > getAccount(_heap, rank).value) shiftDown(_heap, rank);
        else shiftUp(_heap, rank);
    }
}

Recommendation:

We advise this conditional to be introduced optimizing the best-case gas cost of the function with minimal impact on the worst-case scenario.

HOG-03C: Insertion Swap Optimization

HOG-03C: Insertion Swap Optimization

Type Severity Location
Gas Optimization HeapOrdering.sol:L188

Description:

The swap operation performed during an insertion operation can be optimized by performing it conditionally only when newSize != accountsLength as otherwise the heap will replace an element with itself.

Example:

/// @notice Inserts an account in the `_heap`.
/// @dev Only call this function when `_id` is not in the `_heap`.
/// @dev Reverts with AddressIsZero if `_value` is 0.
/// @param _heap The heap to modify.
/// @param _id The address of the account to insert.
/// @param _value The value of the account to insert.
/// @param _maxSortedUsers The maximum size of the heap.
function insert(
    HeapArray storage _heap,
    address _id,
    uint256 _value,
    uint256 _maxSortedUsers
) private {
    // `_heap` cannot contain the 0 address.
    if (_id == address(0)) revert AddressIsZero();

    // Put the account at the end of accounts.
    _heap.accounts.push(Account(_id, _value));
    uint256 accountsLength = _heap.accounts.length;
    _heap.ranks[_id] = accountsLength;

    // Move the account at the end of the heap and restore the invariant.
    uint256 newSize = _heap.size + 1;
    swap(_heap, newSize, accountsLength);
    shiftUp(_heap, newSize);
    _heap.size = computeSize(newSize, _maxSortedUsers);
}

Recommendation:

We advise this conditional to be introduced optimizing the best-case gas cost of the function with minimal impact on the worst-case scenario.

HOG-01M: Potentially Ambiguous Functionality

HOG-01M: Potentially Ambiguous Functionality

Type Severity Location
Standard Conformity HeapOrdering.sol:L289-L295

Description:

The getTail function yields the tail of the overall account array rather than the tail of the sorted heap array which can cause mis-use by the library's users.

Example:

/// @notice Returns the address at the tail of the `_heap`.
/// @param _heap The heap to get the tail.
/// @return The address of the tail.
function getTail(HeapArray storage _heap) internal view returns (address) {
    if (_heap.accounts.length > 0) return getAccount(_heap, _heap.accounts.length).id;
    else return address(0);
}

Recommendation:

We advise a separate, dedicated getHeapTail function to be introduced that yields the ordered tree portion's tail member to its caller, increasing the utility of the library.

HOG-04C: Potential Significant Optimization

HOG-04C: Potential Significant Optimization

Type Severity Location
Gas Optimization HeapOrdering.sol:L7

Description:

Depending on the needs of the application that will ultimately utilize the library, the Account data structure can be significantly optimized by reducing its value member's accuracy from 256-bits to 96-bits, thereby permitting the address variable to be tight-packed with the uint96 into a single 32-byte slot. This will reduce all storage reads and writes for Account entries to one SLOAD / SSTORE operation regardless of whether both members are written to or read from in the same action.

Example:

struct Account {
    address id; // The address of the account.
    uint256 value; // The value of the account.
}

Recommendation:

We advise this to be done so if the 96-bit accuracy is sufficient for the end-user application.

HOG-02M: Inexistent Association of Former Value

HOG-02M: Inexistent Association of Former Value

Type Severity Location
Input Sanitization HeapOrdering.sol:L30-L36

Description:

The _formerValue argument is meant to indicate the current value of the particular _id in the heap structure, however, this is not validated and is instead assumed to be trusted input.

Impact:

An improper _formerValue combined with an improper _newValue will cause an incorrect operation to be performed on the heap that does not match the actual data the _id entry previously held.

Example:

function update(
    HeapArray storage _heap,
    address _id,
    uint256 _formerValue,
    uint256 _newValue,
    uint256 _maxSortedUsers
) internal {
    uint256 size = _heap.size;
    uint256 newSize = computeSize(size, _maxSortedUsers);
    if (size != newSize) _heap.size = newSize;

    if (_formerValue != _newValue) {
        if (_newValue == 0) remove(_heap, _id, _formerValue);
        else if (_formerValue == 0) insert(_heap, _id, _newValue, _maxSortedUsers);
        else if (_formerValue < _newValue) increase(_heap, _id, _newValue, _maxSortedUsers);
        else decrease(_heap, _id, _newValue);
    }
}

Recommendation:

While the contract is a library and thus input sanitization can be performed at the application level, from a security standpoint this reliance is ill-advised. Instead, we advise the _formerValue to be queried directly on the heap structure to guarantee all operations are performed safely regardless of end-user usage. If gas optimization is a concern here (as the data entry is already read by the caller), we advise a delta system to be used instead (i.e. specifying a decrease as -5) which would optimize the if conditionals and allow the entry to be read once at the heap level.

Testing suite improvement

cf #122

  • The testing suite lacks fuzzing & invariants
  • Mock libraries should fit more to the actual libraries (in order to improve gas reports)

HOG-02C: Increase Swap Optimization

HOG-02C: Increase Swap Optimization

Type Severity Location
Gas Optimization HeapOrdering.sol:L230

Description:

The swap operation performed during an increase operation can be optimized by performing it conditionally only when nextSize != rank as otherwise the heap will replace an element with itself.

Example:

/// @notice Increases the amount of an account in the `_heap`.
/// @dev Only call this function when `_id` is in the `_heap` with a smaller value than `_newValue`.
/// @param _heap The heap to modify.
/// @param _id The address of the account to increase the amount.
/// @param _newValue The new value of the account.
/// @param _maxSortedUsers The maximum size of the heap.
function increase(
    HeapArray storage _heap,
    address _id,
    uint256 _newValue,
    uint256 _maxSortedUsers
) private {
    uint256 rank = _heap.ranks[_id];
    Account memory account = getAccount(_heap, rank);
    account.value = _newValue;
    setAccountValue(_heap, rank, _newValue);
    uint256 nextSize = _heap.size + 1;

    if (rank < nextSize) shiftUp(_heap, rank);
    else {
        swap(_heap, nextSize, rank);
        shiftUp(_heap, nextSize);
        _heap.size = computeSize(nextSize, _maxSortedUsers);
    }
}

Recommendation:

We advise this conditional to be introduced optimizing the best-case gas cost of the function with minimal impact on the worst-case scenario.

Possible optimization in heap size

Found by @MathisGD.

Currently the heap size is changing at each insert, remove, and decrease operation. When it reaches _maxSortedUsers, it is divided by 2. The informal goal is to distribute high liquidity evenly on the branches of the heap, in order to maximize the total liquidity that is in the heap.

One other idea would be distribute the incoming liquidity "randomly" on the last nodes before _maxSortedUsers. This means that the size of the heap would only change up to _maxSortedUsers and then stay the same, potentially saving one SSTORE for each operation. Another potential benefit would be that the introduced randomness would mostly prevent worst case scenarios for the incoming liquidity.
Note that there is no real need for the randomness here to be truly random, and we could use a pseudo-random function instead.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.