
operand / agency


A fast and minimal framework for building agent-integrated systems

Home Page: https://createwith.agency

License: MIT License

Python 97.71% HTML 0.48% Ruby 0.38% SCSS 0.21% Jinja 0.95% Shell 0.28%
actor actor-model agent agents ai artificial-intelligence machine-learning python agi artificial-general-intelligence

agency's People

Contributors

dependabot[bot], operand, wwj718



agency's Issues

OpenAI Errors

Hi @operand,

I've been facing issues with context window limits and rate limiting.

I'd recommend using something like reliableGPT, which handles retries, model switching, etc.

import openai
from reliablegpt import reliableGPT

# wrap the completion call so failures fall back to retries / model switching
openai.ChatCompletion.create = reliableGPT(openai.ChatCompletion.create, ...)

Source: https://github.com/BerriAI/reliableGPT

Decentralized `Space` type

A decentralized space would enable another level of scalability beyond AMQP. And it creates the possibility of peer-to-peer agent networks... very exciting if you ask me. :)

I think one of the best candidates for doing this would be the ZeroMQ library. Its messaging API is very similar to AMQP, so it should fit easily.
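
For illustration, here's a rough sketch (assuming pyzmq; none of this exists in the library yet) of how a ZeroMQ-based space might fan out messages using PUB/SUB sockets:

# rough sketch only: a hypothetical ZeroMQ-based space fanning out messages
import json
import time

import zmq

context = zmq.Context()

# the space publishes every message...
pub = context.socket(zmq.PUB)
pub.bind("tcp://*:5555")

# ...and each agent subscribes, filtering on its own id as the topic
sub = context.socket(zmq.SUB)
sub.connect("tcp://localhost:5555")
sub.setsockopt_string(zmq.SUBSCRIBE, "Chatty")

time.sleep(0.2)  # give the subscriber a moment to connect (PUB/SUB "slow joiner")

pub.send_multipart([b"Chatty", json.dumps({"action": "say"}).encode()])
topic, payload = sub.recv_multipart()
print(topic.decode(), json.loads(payload))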

I personally won't be able to focus on this for some time since the UI work is underway and I'm committed to ensuring that the AMQP and native implementations are further developed.

API Improvements

This issue is about making a number of API and internal improvements. Rather than create separate issues for each, I'm listing them together here. Most of these are relatively small and could be fixed together. Some of them will be breaking API changes, so it makes sense to bundle them before the next version increment.

If you have thoughts on these or other API improvements you'd like to see please share.

  • Remove "id queue" declarations in AMQPSpace
    Instead, use acks/timeouts to return a "not found" error. The current declarations waste resources on the server and aren't foolproof anyway.
  • Move thread event declarations out of Agent class
    This would keep all the threading details in the Space class, making the Agent class more portable.
  • Remove colorama dependency
  • Wider Python support including MicroPython
  • Broadcasts should be self-received
    This was a hasty choice made early on and I just haven't gotten around to changing it. The current pattern is not the standard way that a broadcast is handled. The API shouldn't be surprising like this.
  • ...

How to send and receive messages interactively

In previous versions, an agent was instantiated and then added to the space, and the agent could be referenced from outside the space. We could "hold" an agent in an environment like Jupyter and interact with other agents through it. In the latest version, the space is responsible for instantiating the agent, so we can't seem to get a reference to one. How can we interactively communicate with other agents in the space?

Another benefit of referencing the agent is that it can be programmed dynamically, which seems to be helpful for early exploration.

Isolating the agent in a space is very consistent with actor-style programming. Actors can only communicate with each other through messages and cannot call functions from each other.

In Erlang/Elixir, I remember that users can still send messages to specific actors from the REPL environment.

/help doesn't work

According to the documentation:

/help
This will broadcast a help action to all other agents

/help doesn't work (from the web UI).

/help to:"" works well.

Create example applications directory

The agents/ and spaces/ subdirectories of the library currently contain example implementations of the core classes. I think it would be better to move these examples into a separate examples/ folder outside of the root agency/ folder. This way library downloads don't contain files that they don't need.

And what I'm thinking we could do next is to develop one or more full applications under the examples/ directory that showcase different implementations, starting with the demo I detailed in the main readme. Example applications could come complete with dockerfiles etc.

Add message broker/networking support (AMQP)

The idea here is to have agents use AMQP (RabbitMQ) as their queue provider, allowing multiple instances of an agency environment to integrate seamlessly over a network.

This feature could also enable a more efficient multiprocessing solution.

Performance tests

This issue is about adding some form of automated performance testing to protect against performance regressions. It would be nice to also publish the metrics somehow.

Model training example

It should be possible to use agency in interesting ways for purposes other than a typical application. One idea is to use it as a basis for doing model training.

This use case is not a primary focus of the library so it may not be developed for a while, but this issue is to track any thoughts or progress.

gorilla integration example

gorilla

This one's a bit interesting because of the "just output the code" approach of gorilla, which is really nice.

To integrate with this library we could simply wrap the output from gorilla and pass it to an agent responsible for executing the code. This would provide the benefit of allowing access control and review before executing any dangerous code.
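
As a loose illustration of that flow (plain Python, no agency wiring; the review/execute split and all names here are assumptions, not an actual integration):

# illustration only: pass model-generated code through a review step before
# running it in a subprocess, rather than executing it blindly
import subprocess
import sys

def review(code: str) -> bool:
    print("Proposed code:\n", code)
    return input("Execute? [y/N] ").strip().lower() == "y"

def execute(code: str) -> str:
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True)
    return result.stdout or result.stderr

generated = 'print("hello from generated code")'  # stand-in for gorilla's output
if review(generated):
    print(execute(generated))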

Multimedia support

I'm using this issue for now to track work on multimedia support, but this is likely to be broken down into several issues (features) as I move forward. For now I'm just capturing some context.

Relevant links:

  • discussion - #72
  • gradio app - #82
  • react app (replacing gradio) - #136

Broader python support

Right now Agency supports Python 3.9 and above. I'd like to make sure that Agency is compatible across more Python versions. It probably already is, but at a minimum it needs testing.

It would be great to set up some automated tests across different versions when this is done.
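
As a sketch of what that could look like (assuming nox, which is not currently part of the project):

# noxfile.py — hypothetical sketch for running the test suite across Python versions
import nox

@nox.session(python=["3.8", "3.9", "3.10", "3.11"])
def tests(session):
    session.install("-e", ".", "pytest")
    session.run("pytest")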

Gradio Application

Summary

This issue is to track development on a new "starter" Gradio application that will replace the Flask/React WebApp class.

For now I'll break out the following tasks and use this issue to track them.

  • Research and choose technology for application stack
  • Create initial starter app that matches current demo features
  • Rename WebApp and put aside as a Flask/React example
  • Update documentation as needed

After the above tasks are done, we can begin adding multimodal features and other improvements. Those can be broken into separate issues.

[edit] I'm not sure I'm going to change the name like I originally thought. The demo just needs a new UI and is otherwise fine for now.

Notes

I'm calling it a "starter" application, because I want to reflect a slightly different role for it. The current demo was a proof of concept. It works, but it isn't great for modification or extension. Now that more features will require front end changes, a better foundation is needed.

In addition to replacing the demo, I hope to maintain the app as a foundation for building custom applications of your own, meaning it should be easy to copy the source and customize it to your needs.


Regarding the tech stack, I have not yet decided what to use. I started the discussion here, so please share your ideas if you have any.

I can say the following to start:

  • Using web technologies for the UI is a given due to their broader reach. For building locally-installed apps, frameworks like Electron or Flutter could be used. I'll keep compatibility with those frameworks in mind to keep those options open.

  • A UI library will be chosen. Streamlit looks like a great possibility. Whatever is chosen, I want to ensure that:

    • It's well supported including good documentation etc.
    • It supports a broad range of multimedia and interactive UI features

I like to work fast and I don't want to shift focus from the core library for long (it needs a lot of work), but decisions like this shouldn't be rushed, so I likely won't ship any big features for the next couple of weeks while I experiment. I'm still here for fixes or anything else that comes up.

I plan to get an initial cut of a new application added as soon as possible to gather feedback. Stay tuned to this issue if you'd like to follow along. And I really appreciate any feedback you can offer.

missed heartbeats from client, timeout: 60s


When I run the multi-container AMQPSpace application (./demo down amqp), I encountered the error: missed heartbeats from client, timeout: 60s

After that, messages can no longer be replied to.


I checked the RabbitMQ documentation on heartbeats; my understanding is that if the network is healthy, the client and RabbitMQ communicate periodically to confirm that each other is alive.

Following the instructions in the document (Configuring Connectivity), I increased the heartbeat value to 3600, and the error did not occur within an hour. The flow of messages through the system was also normal. This seems to indicate that the TCP connection had not been disconnected.
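
For reference, the longer heartbeat can be set either server-side (heartbeat = 3600 in rabbitmq.conf) or from the client; here is a client-side sketch with pika, where the host and other parameters are placeholders:

# placeholder host/credentials; only the heartbeat value matters here
import pika

params = pika.ConnectionParameters(
    host="localhost",
    heartbeat=3600,                  # seconds between heartbeat frames
    blocked_connection_timeout=300,
)
connection = pika.BlockingConnection(params)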

I have done many tests, and the above situation is reproducible.

So my guess is that the client and server are missing each other's heartbeat frames.

Have you ever encountered this problem?

My system environment is:

  • macOS version 13.4.1
  • Docker version 20.10.24

Multimodal model example

This issue would entail adding an example using a multimodal model of some kind. Once multimedia messaging support is further along, adding an example like this should hopefully be straightforward.

Improve multiprocessing and concurrency

This issue is to discuss and track work on concurrency related improvements. v1.4.0 just recently added multiprocessing support, but there are more improvements that could be made.


Another multiprocessing improvement to explore would be to allow parallel processing of messages by individual agents. Currently they must complete processing of a message one at a time. See #95 for an example implementation from @wwj718.

Browseable API site

It would be useful to autogenerate API documentation and host it. This should help reduce documentation maintenance since a lot of documentation can be maintained with the code.

Optional args

Hello, I noticed that in version 1.3, the help method converts Python objects into a JSON structure. However, the current implementation prevents the use of UnionType in action definitions, which means args cannot be made optional.

I looked at the code, and the main reason for this is:

# type
sig_annotation = signature.parameters[arg_name].annotation
if sig_annotation is not None and sig_annotation.__name__ != "_empty":

When the argument type is a UnionType, sig_annotation does not have the __name__ attribute.

Is this a design decision?
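
For what it's worth, here is a sketch of a more tolerant check that relies only on the standard typing helpers (this isn't the library's code, just an illustration):

# illustration only: deriving a readable name without assuming __name__ exists
import inspect
import typing
from typing import Optional

def annotation_name(annotation) -> Optional[str]:
    """Best-effort readable name for a parameter annotation."""
    if annotation is inspect.Parameter.empty or annotation is None:
        return None
    # plain classes (str, dict, custom types) expose __name__ directly
    name = getattr(annotation, "__name__", None)
    if name is not None:
        return name
    # Optional/Union (and other parameterized types) may not have __name__;
    # describe them through their component types instead
    args = typing.get_args(annotation)
    if args:
        return " | ".join(getattr(a, "__name__", str(a)) for a in args)
    return str(annotation)

print(annotation_name(Optional[str]))  # a readable name even for Optional/Union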

Potential issues introduced by eventlet

Hello, thank you for creating such a fantastic project. During my usage I encountered the issue mentioned in eventlet/eventlet#371. Following the comments in that issue, I resolved the problem by moving the import of 'agency' to the top of the file. If eventlet is required, it would be helpful to document this.
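
Concretely, the workaround amounts to ordering the imports in the entry-point module like this (a minimal sketch of my own file, not a required pattern):

import agency  # noqa: F401  -- keep this as the very first import

import eventlet  # eventlet (direct or indirect) comes after
# ... remaining application imports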

Logging, debugging, and observation

This issue is about adding debugging and logging features to support developers. I'm not clear yet on what specifically should be added but to start I'm listing some thoughts.

  • Instrument the codebase with debug messages that can be turned on with an environment variable
  • Add a Space.on_route callback that can be used to print or observe messages. I think for AMQP, the on_route method would be called whether the message was sent locally or not. This would allow observing all messages sent in a space.
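
For illustration, a rough sketch of what such a callback might look like; neither the hook nor the wiring below exists yet, and the message fields shown are assumptions:

# hypothetical sketch only: an on_route observer logging every routed message
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("agency.routing")

def on_route(message: dict) -> None:
    # would be invoked for every message routed through a space,
    # whether it was delivered locally or over AMQP
    log.debug("route %s -> %s : %s",
              message.get("from"), message.get("to"), message.get("action"))

# e.g. a space could expose it as: space.on_route = on_route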

Should we add an optional action parameter description?

Hello, I found that when using the function-call agent in the demo, it can be unclear whether a parameter should be "objective" or "make a todo list" when faced with "your objective is xxx, your current task is make a todo list". In cases like this, it would be helpful to have descriptions of the parameters. I saw a comment mentioning that this functionality may be added in the future. I have done some exploration and would like to share my thoughts:
I imagine the parameter descriptions should be optional, since non-function-call agents may not care about them. Here are two possible approaches:
1. Simply add parameter descriptions in the action docstring, then extract them with regular expressions. This method requires agreeing on a format for documenting the parameters.
2. Implement a decorator args_schema that takes a subclass of pydantic BaseModel.

from functools import wraps

from pydantic import BaseModel, Field, ValidationError

def args_schema(cls: BaseModel):
    def decorator(func):
        @wraps(func)
        def wrapper(self, say_input):
            # validate/parse the incoming args against the pydantic model
            if not isinstance(say_input, cls):
                try:
                    say_input = cls.parse_raw(say_input)
                except ValidationError as e:
                    raise ValueError("Invalid input for {}: {}".format(cls.__name__, e))
            return func(self, say_input)

        wrapper.input_type = cls
        return wrapper
    return decorator

class SayInput(BaseModel):
    """Inputs for say"""

    what: str = Field(description="what to say")

@args_schema(SayInput)
@access_policy(ACCESS_PERMITTED)  # access_policy / ACCESS_PERMITTED come from agency
def _action__say(self, say_input: SayInput):
    pass

In this way, the help method can obtain the parameter schema through the attached model's .schema(), but this requires modifying help to optionally include it. Additionally, the validation in the decorator seems redundant with the MessageSchema validation inside _route. The advantage is that the format is more controllable, and it is more convenient to obtain the parameter schema for a function call, avoiding manual conversion.
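
For illustration, with a decorator like the one above, help could read the schema off the attached model (MyAgent here is a hypothetical agent class defining _action__say as shown):

# illustration only; MyAgent is assumed to define _action__say as above
schema = MyAgent._action__say.input_type.schema()   # pydantic v1 BaseModel.schema()
print(schema["properties"]["what"]["description"])  # -> "what to say"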

What do you think about this? Is there a better approach? Looking forward to your insights.

Code execution (interpreter) support

It's easy enough today to implement an "eval" action on an agent and use it to run arbitrary code in Python or the shell, but given the power of code execution in general, I want to explore how it can be supported as a first-class feature.

Following the React app updates, I plan to explore how this could be more directly supported.

In the meantime, this is all it takes to create a simple code-interpreter-like agent if you'd like to play around with this concept.

@action
def eval(self, code: str):
    """Evaluate python code within this agent's context"""
    # note: eval() handles expressions; use exec() if you need statements
    return eval(code)

Naming Things

Hi Operand! I commented on your post on Reddit. I read the readme and wanted to offer a couple of thoughts early on, and since they're more architecture-oriented I thought I'd open an issue instead of posting on Reddit. I also want to offer some different ideas that could be taken as critical, and I don't want to pile on any more negativity or "Umm, Actually" vibes on reddit. So I'll do it here :)

I also thought it would be useful to share these ideas sooner as opposed to later when the decisions might be harder to change.

I love that you're adopting an Actor model for this, and I think that makes perfect sense. I think the idea of sending an LLM a message is a little backwards though. What struck me as weird was a couple of things:

"say" as a command that is /sent/ to a model - It seems to me that this is turned around - In actor model, messages are sent, and actors can handle messages in channels they are listening to.

In this way, I'd expect "say" to be something more like "handle_message". You can see a good example of how Elixir takes a page from Erlang in LiveView with the "handle_event" callback: https://hexdocs.pm/phoenix_live_view/Phoenix.LiveView.html#module-bindings

"actions" seem more like "messages"; that is, an actor might take some kind of action in the real or virtual world based on a message. As an example, imagine people in a meeting room. When they speak, they send messages to each other, and sometimes, they take actions (such as writing something down). Also, each person in the room (e.g. space e.g. channel) can "hear" all the traffic in that room, but responses are generally dictated by who a message was directed to.

In erlang at least, these are sometimes called "messages" and "mailboxes".

Regarding the qualified names, "Chatty.Demospace", I imagined that the space and the target agent would be separate parameters, something more like:

{
  to: "Chatty",       // process id
  space: "Demospace", // channel
  message: "What would you say it is you do here?",
  args: {             // i wish this was called something else.. meta? params? i guess args is fine
    ...
  }
}

Curious to hear your thoughts, and please don't take any of this as critical; I think that you've chosen the right direction for this!

Messaging and connection reliability

This issue is about adding options to support more reliable messaging guarantees.

A few things this might entail:

  • Enabling AMQP durable messaging for the AMQPSpace class
  • Adding reconnection logic
  • Adding support for message acknowledgements
  • More error and disconnection handling in general
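
As a rough sketch of the first two items (assuming the pika client; this isn't meant to reflect AMQPSpace's actual internals, and the queue name and retry policy are illustrative only):

# illustrative only: durable queue + persistent messages + naive reconnect loop
import time

import pika
import pika.exceptions

def connect():
    while True:
        try:
            return pika.BlockingConnection(pika.ConnectionParameters("localhost"))
        except pika.exceptions.AMQPConnectionError:
            time.sleep(2)  # back off and retry

connection = connect()
channel = connection.channel()
channel.queue_declare(queue="agent.Chatty", durable=True)   # survives broker restart
channel.basic_publish(
    exchange="",
    routing_key="agent.Chatty",
    body=b'{"action": "say"}',
    properties=pika.BasicProperties(delivery_mode=2),        # persist the message
)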

How to extend action messages?

I want to add a field to actions: message_id, used to match sent and received messages. Other potential extension requirements include a timestamp...

Currently I seem to need to modify the internal implementation of Agent (MessageSchema or ActionSchema). Is there a better way?

Hello, I wanted to reach out

I am developing a multimodal model that will be trained on a dataset designed to let it understand how to pilot a computer intelligently. I was reading through this repository because I was involved with a similar project recently (OSNAP) and was surprised by how much the effort you're pushing for has in common with mine. We're presently working on developing models with long-term memory on my Discord. I'd like to have a quick chat with you, if you've got time, to see what we can do to contribute from the resources we have on hand.

Restore foreground agent support

In release 1.4.0 the API was changed to where agent instances are instantiated and run in a sub-process or thread, out of direct access from the main thread.

This unfortunately made certain functionality difficult, namely anything that requires an agent to communicate within the main process. For example, receiving input from the process's stdin cannot be done directly. This also led to difficulties with the demo Gradio application, and testing was made more difficult as well.

So I see this as a bit of a regression, and there are a number of reasons to support having at least one agent in the "foreground" while other agents may be distributed elsewhere.

I think a solution should be small enough 🤞, and I think the refactoring involved will be worthwhile so I'll be focusing on this before getting back to the JS/React work.

React and Javascript support

Now that I've spent some time with Gradio, it's clear that it's too inflexible to support the kinds of features that are needed. In addition, my hope is that Agency can serve as a foundation for custom applications, and Gradio unfortunately doesn't fit that use case well enough.

So I'll be dusting off the older React app implementation, and updating it to enable work on multimedia and other features.

As part of this, I'll be developing a proper JS client that will become the communication basis for the front end. The JS client will likely mirror the API in python, and allow communication across the two languages.

Agent in the real world

I created an experiment using MQTT (RabbitMQ MQTT Plugin) for message delivery.

video: MicroPython Demo

The basic idea is to implement an MQTTSpace, so that we can connect most MQTT-compatible devices to Agency. Since MQTT is the de facto standard in the Internet of Things (IoT), that means most open source hardware could be connected!
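
For illustration, a minimal sketch of an agent-side connection against RabbitMQ's MQTT plugin (default port 1883), using paho-mqtt; the topic layout and everything else here are assumptions, not part of any MQTTSpace implementation:

# rough sketch only (paho-mqtt 1.x style constructor)
import json
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    print("received on", msg.topic, json.loads(msg.payload))

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)
client.subscribe("agency/Chatty")        # e.g. one topic per agent id
client.publish("agency/Chatty", json.dumps(
    {"action": "say", "args": {"content": "hello from MQTT"}}))
client.loop_forever()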

Since Web MQTT is available out of the box in RabbitMQ, many web environments can serve as the environment where an agent lives. The following is an example:

video: Snap! Demo

Snap! is a broadly inviting programming language for kids and adults that's also a platform for serious study of computer science.

Snap! is a live programming environment with a built-in Web MQTT library, which makes it well suited to building agents interactively and very helpful for early experiments.


It seems valuable to enable the MQTT plugin in the Agency's RabbitMQ service. If you're interested in this, I would like to work on it.

More details: https://github.com/wwj718/agency/blob/mqtt/examples/mqtt_demo/README.md

Assumptions in __process

__process assumes that the result of self.__commit(message) is the response.

Imagine the following use case (this is my real use case): a manager agent receives a message from a customer agent, the manager delegates the task to worker agents (by forwarding the message), and the final response is sent from a worker agent to the customer agent. This assumption in __process makes flows like this difficult.
