cyberbrain's Issues

Allow dumping tracing result

If the program gets stuck and the user presses Ctrl+C to interrupt execution, we should be able to dump everything in memory to a file.
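A minimal sketch of the idea, assuming the collected events are JSON-serializable (or have a usable repr). `run_with_dump`, `events` and `stuck_program` are hypothetical names for illustration and not part of Cyberbrain's API.

```python
import json
from pathlib import Path


def run_with_dump(func, events, dump_path):
    """Run func; if it is interrupted with Ctrl+C, write events to dump_path."""
    try:
        return func()
    except KeyboardInterrupt:
        # `events` stands in for whatever the tracer holds in memory.
        # Values that json cannot encode fall back to their repr.
        Path(dump_path).write_text(json.dumps(events, default=repr))
        raise  # Re-raise so the interrupt still terminates the program.


def stuck_program():
    # Simulates the user pressing Ctrl+C during a hung run.
    raise KeyboardInterrupt
```

Re-raising after the dump keeps the normal Ctrl+C behavior, so the file is a side effect rather than a change in control flow.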

Does it not support recursive functions?

When I run the following code to calculate the Fibonacci sequence recursively, I got an AssertionError:

from cyberbrain import trace

@trace
def fibo(n):
    if n <= 1:
        return n
    else:
        return fibo(n - 1) + fibo(n - 2)

print(fibo(3))

I have also tested some other recursive functions, and they all have this problem. This may be a bug that needs to be fixed.

Implement the rest of instructions planned to be supported in V1

docs.google.com/spreadsheets/d/12jHOV9TFrdPySdKVWacAFcL20U-VRPXHiCUzcJZRO2M

What's left:

  • with-related instructions
  • Closure related: LOAD_CLOSURE, LOAD_DEREF, LOAD_CLASSDEREF, STORE_DEREF, DELETE_DEREF
  • Call related: CALL_FUNCTION, CALL_FUNCTION_KW, CALL_FUNCTION_EX, CALL_METHOD, BUILD_TUPLE_UNPACK_WITH_CALL, BUILD_MAP_UNPACK_WITH_CALL
  • Others: LOAD_BUILD_CLASS, ROT_FOUR

Postponed:

SET_ADD, LIST_APPEND, MAP_ADD (Used for list/set/dict comprehensions, which we cannot test because call tracing is not enabled)

Refactor visualization code

We should refactor visualize.js to:

  • Create a separate class TraceData to:
    • Manage raw events and loops, including visible events
    • Rename getInitialState to initialize
    • Add an updateVisibleEvents method (modified from Loop.generateNodeUpdate), which returns the current visible events
    • Remove the visible events calculation logic from initialize.
  • Modify the TraceGraph class to:
    • Call TraceData.initialize upon receiving data from the backend
    • Call TraceData.updateVisibleEvents after a loop counter is set

The most notable change is that we don't replace nodes anymore, but render all nodes again if a loop is updated.
Reasons:

  • Each method should have only one responsibility (previously getInitialState did multiple things)
  • More robust, because calculating nodesToHide and nodesToShow is tricky
  • Helps split the TraceData and TraceGraph classes
  • No performance sacrifice, because the number of visible nodes is small.

Only allow triggering tracing once

For now, if users call the start() method multiple times, every invocation after the first should do nothing.

To achieve this, we should record whether the method has been called.
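A minimal sketch of the flag-based guard described above. The `Tracer` class here is a hypothetical stand-in, and `start_count` exists only to make the behavior observable.

```python
class Tracer:
    """Hypothetical stand-in for Cyberbrain's tracer."""

    def __init__(self):
        self._started = False  # Records whether start() has been called.
        self.start_count = 0   # How many times tracing was actually set up.

    def start(self):
        if self._started:
            return  # Later calls are no-ops.
        self._started = True
        self.start_count += 1


tracer = Tracer()
tracer.start()
tracer.start()  # Ignored: tracing was already triggered.
```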

Improve tooltip

  • Tooltip text should be truncated.
  • Tooltip should not overlap with nodes.
    We probably need to show the tooltip at the same height as each node, starting from the right or left edge depending on the position of the node. To do that, we at least need to know the width of a node.

Improve Cyberbrain API

Provide a decorator API

Sometimes a decorator is more convenient than .start() and .end(), especially when the function can return from different places.

Also since at this point we don't support multi-frame tracing, a decorator API should be more user-friendly.
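As a rough illustration of why a decorator helps with multiple return points, here is a sketch that wraps a function between `start()` and `end()` using `try`/`finally`. The `Tracer` class and the `trace_with` helper are hypothetical and only record calls; they are not Cyberbrain's actual implementation.

```python
import functools


class Tracer:
    """Hypothetical tracer that only records when it is started and ended."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def end(self):
        self.calls.append("end")


def trace_with(tracer):
    """Build a decorator that traces the wrapped function with `tracer`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            tracer.start()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs no matter which return statement (or exception) exits func.
                tracer.end()

        return wrapper

    return decorator
```

With this shape, a function that returns from several places needs no `end()` call before each `return`.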

Provide a default tracer object

Provide a way to disable tracing

This would allow users to control whether to enable tracing via a flag, so that they don't need to modify their code.

Specifics TBD.
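One possible shape, purely as a sketch since the specifics are undecided: read a flag from the environment and return the function untouched when tracing is off. The variable name `CYBERBRAIN_DISABLE` is an assumption made up for this example.

```python
import os


def tracing_enabled(environ=None):
    """Return False when the (hypothetical) CYBERBRAIN_DISABLE flag is set."""
    environ = os.environ if environ is None else environ
    return environ.get("CYBERBRAIN_DISABLE", "").lower() not in ("1", "true", "yes")


def trace(func):
    """Hypothetical decorator: a no-op when tracing is disabled."""
    if not tracing_enabled():
        return func  # No wrapper, so no tracing overhead.

    def wrapper(*args, **kwargs):
        # Real tracing would start and stop around this call.
        return func(*args, **kwargs)

    return wrapper
```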

Confusing graph when there are multiple variables in one line

Example code:

from cyberbrain import trace


@trace
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a+b
    return b


if __name__ == '__main__':
    fib(3)

Result:
image

Maybe different names within one line can be separated, as in this workaround:

from cyberbrain import trace


@trace
def fib(n):
    (a,
     b) = 0, 1
    for _ in range(n):
        (a,
         b) = b, a+b
    return b


if __name__ == '__main__':
    fib(3)

Result:
image

Show links for replaced nodes in loops

image

When nodes in a loop are replaced, the lines involving replaced nodes disappear. I believe even if these edges are not accurate after modifying loop counters, they still provide useful information and thus should be kept.

We can use dotted lines for this, but we need to add a caption or tooltip to show what it means.

Received message larger than max

gRPC has a 4MB payload size limit on the client side.

RESOURCE_EXHAUSTED: Received message larger than max (7904348 vs. 4194304)
	at Object.callErrorFromStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/call.js:31)
	at Object.onReceiveStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/client.js:176)
	at Object.onReceiveStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:342)
	at Object.onReceiveStatus (/Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:305)
	at /Users/laike9m/Dev/Python/Cyberbrain/cyberbrain-vsc/node_modules/@grpc/grpc-js/build/src/call-stream.js:124
	at processTicksAndRejections (/Applications/Visual%20Studio%20Code.app/Contents/Resources/app/out/vs/code/electron-browser/workbench/internal/process/task_queues.js:76)
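A sketch of one way to raise the limit, written in Python for consistency with the rest of the examples. The channel-argument keys below are standard gRPC channel arguments; the failing client in the traceback is @grpc/grpc-js, which, to my knowledge, accepts the same keys as channel options. The 16 MB value is an arbitrary choice for illustration.

```python
DEFAULT_LIMIT = 4 * 1024 * 1024  # 4194304 bytes, the limit shown in the error.


def channel_options(max_message_bytes):
    """Build gRPC channel options that raise the message size limits."""
    return [
        ("grpc.max_receive_message_length", max_message_bytes),
        ("grpc.max_send_message_length", max_message_bytes),
    ]


options = channel_options(16 * 1024 * 1024)
# With grpcio installed, the options would be passed when creating the channel:
#   channel = grpc.insecure_channel("localhost:50051", options=options)
```

Raising the limit only postpones the problem for very large traces; splitting the payload or streaming it would be a more scalable alternative.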

Allow mutual interaction between source code and the trace graph

Some ideas:

  • When clicking/hovering on a trace graph node, highlight the corresponding text in the source code panel
  • Vice versa

The difficult part is that if an identifier appears multiple times in the same line, we may not be able to tie them to the correct nodes accurately. But at least we could just highlight the whole line or all nodes in the same line, which is already useful.

Store JSON-serialized objects directly and get rid of diffs

Right now we store diffs for mutations and restore the value at each snapshot only when returning the tracing result to the frontend. A problem is that we need to deepcopy the values (which relies on pickling internally), but many objects are unpicklable.

This problem can be avoided, because eventually we'll pass JSON to the frontend, so there's no need to store the original values: we can serialize them to JSON early and only store the JSON.

The problem of extra memory usage still exists. But since we're storing JSON, it can be dumped to disk and loaded back easily. The optimization is out of the scope of this issue, but it will be a lot simpler and more robust than using deepdiff.
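A small sketch of the "serialize early" idea, using the standard `json` module as a stand-in for whichever serializer ends up being used. `snapshot` is a hypothetical helper. Each snapshot is an independent string, so later mutations cannot affect it and no deepcopy is needed.

```python
import json
import threading


def snapshot(value):
    """Serialize value to a JSON string right away, falling back to repr."""
    return json.dumps(value, default=repr, sort_keys=True)


history = []
data = {"items": [1, 2]}
history.append(snapshot(data))  # State before the mutation.
data["items"].append(3)
history.append(snapshot(data))  # State after the mutation.

# A lock cannot be pickled, but it can still be recorded through its repr.
lock_snapshot = snapshot(threading.Lock())
```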

Handling dependent loop counters

Loop counter nodes need to be hidden or displayed as other loop counters change. A loop counter's max value can also be affected by the current values of other loop counters (see password.py).

e.g.

for i in range(10):
  if i == 2:
    for j in range(2):  # The visibility of this loop counter node should depend on i.
      print(j)

Solve edge overlap issue

Sometimes edges seriously overlap with each other, like:

image

Related:
visjs/vis-network#84

One possible approach:
image
Inspect all edges; if there's any vertical edge (from.x = to.x), adjust from.x = from.x + 10.

If two ends are on adjacent levels, do nothing.
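The rule above can be sketched as follows. The edge and node layout (dicts with `x` and `level` keys) is an assumption for illustration and does not reflect the actual vis-network data model.

```python
def nudge_vertical_edges(edges, offset=10):
    """Shift the start of vertical edges that span more than one level.

    Each edge is a dict with 'from' and 'to' entries, and each of those is a
    dict with an 'x' coordinate and a 'level' (hypothetical structure).
    """
    for edge in edges:
        src, dst = edge["from"], edge["to"]
        adjacent = abs(src["level"] - dst["level"]) <= 1
        if src["x"] == dst["x"] and not adjacent:
            src["x"] += offset  # from.x = from.x + 10 by default.
    return edges


edges = [
    {"from": {"x": 100, "level": 0}, "to": {"x": 100, "level": 3}},  # Nudged.
    {"from": {"x": 50, "level": 1}, "to": {"x": 50, "level": 2}},    # Adjacent.
    {"from": {"x": 10, "level": 0}, "to": {"x": 30, "level": 4}},    # Not vertical.
]
nudge_vertical_edges(edges)
```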

This may require #21 to be implemented first, as the redraw of graph may lead to overlap of event nodes and lineno nodes.

Support generator functions

Running the code below raises AttributeError:

from cyberbrain import trace


def main():
    @trace
    def fib_gen(count):
        a, b = 1, 1
        while count := count - 1:
            yield a
            a, b = b, a + b

    for fib_num in fib_gen(10):
        print(fib_num)


if __name__ == "__main__":
    main()

stdout/stderr:

Starting grpc server on 50051...
fib_gen <cyberbrain.frame.Frame object at 0x7f7908a26d30>
jumped: False
1
fib_gen <cyberbrain.frame.Frame object at 0x7f7902c9c850>
jumped: False
Traceback (most recent call last):
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/value_stack.py", line 130, in emit_event_and_update_stack
    handler = getattr(self, f"_{instr.opname}_handler")
AttributeError: 'Py38ValueStack' object has no attribute '_YIELD_VALUE_handler'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/vscode-extensions/[email protected]/extension/pythonFiles/ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "/tmp/vscode-extensions/[email protected]/extension/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 432, in main
    run()
  File "/tmp/vscode-extensions/[email protected]/extension/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 316, in run_file
    runpy.run_path(target, run_name='__main__')
  File "/home/gitpod/.pyenv/versions/3.8.6/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/gitpod/.pyenv/versions/3.8.6/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/gitpod/.pyenv/versions/3.8.6/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/workspace/Cyberbrain/examples/gen.py", line 17, in <module>
    main()
  File "/workspace/Cyberbrain/examples/gen.py", line 12, in main
    for fib_num in fib_gen(10):
  File "/workspace/Cyberbrain/examples/gen.py", line 9, in fib_gen
    yield a
  File "/workspace/Cyberbrain/examples/gen.py", line 9, in fib_gen
    yield a
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/tracer.py", line 235, in _local_tracer
    self.frame_logger.update(raw_frame)
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/logger.py", line 131, in update
    self.frame.log_events(frame, instr, jumped)
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/frame.py", line 133, in log_events
    event_info = self.value_stack.emit_event_and_update_stack(
  File "/workspace/.pip-modules/lib/python3.8/site-packages/cyberbrain/value_stack.py", line 132, in emit_event_and_update_stack
    raise AttributeError(
AttributeError: Please add
def _YIELD_VALUE_handler(self, instr):

How to view traces after the first one?

I tried out the demo on Gitpod and followed the instructions. After viewing the first example I ran a different example but didn't know how to view the graph. Eventually I figured out that killing the first process with Ctrl+C made it possible to reinitialize Cyberbrain with a new example. Is this the best way? If so, consider adding it to the instructions.

Gitpod support

#24

  • Publish the extension to openvsx.
  • Claiming namespace: EclipseFdn/open-vsx.org#170
  • Reinstall online example's extension from openvsx, and verify it can work.

Refs:

Most things already work except Devtools.

Issues:

Make Cyberbrain more intuitive to use

Background

In #49 and cool-RR's email, an issue was mentioned:
It is not intuitive how to run a different program and open the trace graph with new data.

Quote:

I tried out the demo on Gitpod and followed the instructions. After viewing the first example I ran a different example but didn't know how to view the graph. Eventually I figured out that killing the first process with Ctrl+C made it possible to reinitialize Cyberbrain with a new example.

I changed the value passed to the function and reran, expecting to see the graph updated with the new values when I hovered with the mouse. It wasn't. I also tried the command "Initialize CyberBrain" again and yet the value wasn't updated.

Clearly we need to do better at this.

How It Works Now

Currently, the Cyberbrain Python lib (abbr. cb-py) launches a server. When users run "Initialize Cyberbrain" in VS Code, the Cyberbrain VS Code extension (abbr. cb-vsc) talks to the server and fetches data, then visualizes it. The server listens on a fixed port, so there can't be multiple running servers.

Proposed Solution

Note: the solution below takes into consideration a feature which has not been implemented yet: multi-frame tracing. The original design of this feature is described here, which may differ from the solution below. But the core idea remains unchanged: let users pick the frame to visualize.

Overview

cb-vsc automatically starts a long-running server, let's call it coordination server (abbr. cs). The workflow looks like this:

  1. When cb-py finishes tracing a Python program:
    • 2.1. If there's only one frame
      cb-py sends this frame to cs/cb-vsc, cb-vsc generates a new trace graph.
    • 2.2. If there are multiple frames
      3. cb-py sends the locations of these frames (aka FrameLocaterList) to cs/cb-vsc
      4. The user picks one frame (details TBD), cs/cb-vsc sends the location of this frame (FrameLocater) back to cb-py
      5. cb-py sends the selected frame to cs/cb-vsc, cb-vsc generates a new trace graph.

Note that in case 2.2, steps 3-5 could repeat multiple times to allow visualizing different frames in the same execution. This requires server-side streaming RPC, so cs/cb-vsc can send multiple FrameLocaters to cb-py. Also, cb-vsc should persist the locations of all available frames.
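The branching in the workflow can be sketched in plain Python, leaving out the RPC layer entirely. The fields of `FrameLocater` and the `on_tracing_finished` function are assumptions for illustration; the proposal does not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameLocater:
    """Hypothetical fields identifying where a traced frame lives."""
    frame_id: str
    filename: str
    lineno: int


def on_tracing_finished(frames, pick):
    """Return the frame data that should be visualized.

    frames: dict mapping FrameLocater -> frame data (stands in for cb-py).
    pick:   callable that receives the list of locaters and returns one
            (stands in for the user choosing a frame in cb-vsc).
    """
    locaters = list(frames)
    if len(locaters) == 1:
        return frames[locaters[0]]  # Case 2.1: send the only frame directly.
    chosen = pick(locaters)         # Steps 3-4: send locations, user picks one.
    return frames[chosen]           # Step 5: send the selected frame.
```

Repeating steps 3-5 would amount to calling `pick` again with the same persisted list of locaters.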

The proposed solution has a few benefits compared to the existing implementation:

  • Users only need to run the Python program, no need to run "Initialize Cyberbrain".
  • The experience stays the same when running multiple programs and/or running a program multiple times.

The Coordination Server

Requirements:

  • cs should remain active as long as VS Code is open, presumably with a periodic status check.

  • The listening port should be configurable. Potentially, we can use a config file ~/.cyberbrain_config and let cb-py and cb-vsc read it.
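A sketch of how both sides might read the port from a shared config file. The file format is not specified in the proposal; JSON with a `port` key is an assumption here, and the default of 50051 is the port that appears in the server's log output.

```python
import json
from pathlib import Path

DEFAULT_PORT = 50051  # Port seen in "Starting grpc server on 50051...".


def read_port(config_path=None):
    """Read the listening port from a JSON config file, or use the default."""
    if config_path is None:
        config_path = Path.home() / ".cyberbrain_config"
    try:
        config = json.loads(Path(config_path).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return DEFAULT_PORT  # Missing or malformed file: fall back.
    if not isinstance(config, dict):
        return DEFAULT_PORT
    return int(config.get("port", DEFAULT_PORT))
```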

Open questions:

  • Will VS Code launch multiple coordination servers when there are multiple open windows? If yes, we need to handle it gracefully.

Things to Take into Consideration

Stateless

The infrastructure should be as stateless as possible, otherwise it would be very complicated to maintain and extend.

Future Proof

The design should work well with features added in the future, including (but not limited to) multi-frame tracing, though this is hard since things may change or new features may be planned.

Needs to work well with codelens #34

The report sent from cb-py could potentially carry information to tell cb-vsc where to show codelens, so that users are aware of the click-to-enable-trace-graph feature. This (and support for multi-frame tracing) also means that the Python program needs to stay alive until it is manually terminated.

Remote Debugging Friendly

The design should be able to support remote debugging.

AI: Learn how remote debugging works.

Third-party Friendly

We should not rely on VS Code specific things, and if users want to build their own cs and frontend, they should be able to do so.

When dragging a node, all related nodes are moved as a whole

"Related nodes" means all nodes that show a value when hovering over the dragged node.

As a result, it is nearly impossible to arrange the order of nodes in loops, since all values are bound together.

Example code:

from cyberbrain import trace


@trace
def fib(n):
    a = 0
    b = 1
    for _ in range(n):
        t = b
        b = a + b
        a = t
    return b


if __name__ == '__main__':
    fib(10)

Support Cyberbrain in more editors and IDEs

There are countless editors and IDEs out there. For convenience, I'll call them environments. I'd like to see Cyberbrain integrated with all of them, but this simply is not possible given the limited time I have. Considering the technologies Cyberbrain is using, here's what I'm gonna do.

  • Cyberbrain will officially support major vscode-compatible environments
  • Support for non-vscode-compatible environments will rely on the community
    I'm committed to providing as much help as I can, including but not limited to:
    • Answering questions
    • Audio/video 1:1
    • Making necessary code changes
    • Pair programming

The reason is simple. The only environment we now support is VS Code (local), thus it's much easier to support vscode-compatible environments than others.

Based on this strategy, the environments we will officially support include:

Please let me know if there's more.

The environments that we will rely on the community to support include:

  • All non-web IDEs (PyCharm/Eclipse/Visual Studio/etc)
  • Vim/Emacs/Sublime/Atom/etc
  • Jupyter notebook
  • Command line

I will create a formal specification of the internal API to help people build third-party tools.

There is no preset timeline for when each environment will be supported, or in which version. I want to keep it flexible, and most likely, the environments that more people request will be supported first. Once a new environment is supported, we'll release a new minor version.

To request support for another vscode-compatible environment, please open a separate issue.

If you want to migrate Cyberbrain to a non-vscode-compatible environment, please contact me directly on Twitter or Discord. I'm happy to discuss it anytime.

Improve object inspection

  • Show absolute line number
  • Show class name for instances
  • Differentiate between tuple, list, set (needs to record the original type)
  • Deal with '{"py/type": "test_cellvar.test_closure.<locals>.Foo"}', which represents a Python class.
  • Numpy objects
  • Pandas objects
  • Show string with quotes
  • Fix: re.Match object is null in Js.
    (As it turns out, jsonpickle only serializes an object's __dict__. A re.Match object has no __dict__, only attributes defined by descriptors, so it was serialized to null.)
  • Pass truncated repr to FE to use as the tooltip text.
  • Repr truncated on Linux (alexmojaki/cheap_repr#13, alexmojaki/cheap_repr#15)
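The re.Match item in the list above can be checked with the standard library alone. The `describe_match` helper is a hypothetical workaround that reads the descriptor-backed attributes explicitly. It is a sketch and not what Cyberbrain does.

```python
import re

match = re.match(r"(\d+)", "42 apples")

# A re.Match object exposes its data through descriptors and has no __dict__,
# so a serializer that only walks __dict__ finds nothing to encode.
has_dict = hasattr(match, "__dict__")


def describe_match(m):
    """Collect the useful parts of a match into a JSON-friendly dict."""
    return {
        "repr": repr(m),
        "span": list(m.span()),
        "group": m.group(0),
        "groups": list(m.groups()),
    }


described = describe_match(match)
```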

Context and Solutions

Right now we use

jsonpickle.encode(python_object, unpicklable=False)

to convert a Python object to JSON. Much information is lost in the conversion. If we use unpicklable=True, theoretically it's lossless, but we need to handle the extra information (like methods, scopes, etc).

See:

Falling back to repr is not a bad choice.
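A sketch of what a type-preserving conversion with a repr fallback might look like, using only the standard library. The output shape (`type`, `items`, `repr` keys) is made up for this example; it shows how tuple, list and set can stay distinguishable on the frontend and how instances can carry their class name.

```python
import json


def to_json_value(obj):
    """Convert obj to JSON-compatible data while recording the original type."""
    if obj is None or isinstance(obj, (bool, int, float, str)):
        return obj
    if isinstance(obj, (list, tuple, set, frozenset)):
        # Sets are unordered, so sort by repr to get a stable output.
        items = sorted(obj, key=repr) if isinstance(obj, (set, frozenset)) else obj
        return {"type": type(obj).__name__, "items": [to_json_value(i) for i in items]}
    if isinstance(obj, dict):
        return {"type": "dict", "items": {str(k): to_json_value(v) for k, v in obj.items()}}
    # Anything else: keep the class name and fall back to repr.
    return {"type": type(obj).__name__, "repr": repr(obj)}


encoded = json.dumps(to_json_value({"point": (1, 2), "tags": {"a"}, "obj": object()}))
```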

On the Js side, we could use eval to generate a more user-friendly output. So instead of a plain Js object, we could attach class information by
image

We can also hide the __proto__ attribute when logging:
https://stackoverflow.com/questions/11818091/hiding-the-proto-property-in-chromes-console
