Comments (7)
Since TaskMatrix recently updated for GroundingDINO and segment-anything support (2 weeks ago), it could definitely be of use for generating segmentation results.
Some ideas I have:
- Use TaskMatrix's templates to define execution flows for complex tasks (or common ones).
- Compatibility: we can modify the resolution/scale of screenshots to make recordings more compatible with other devices.
- Can implement a form of pop-up detection, since sometimes unexpected events can occur (like a pop-up from another application / OS), which would usually break automation processes until a human intervenes.
from openadapt.
To add to @0dm, we could also implement:
- Regression Testing: We can utilize TaskMatrix to automate testing after every change (or every set of changes) to ensure that the changed code is bug-free & functional
- Automating User Input: We could use TaskMatrix to simulate user input (e.g. typing commands). In addition, we can use verify the response. Saving time for the coder!
from openadapt.
Use TaskMatrix's templates to define execution flows for complex tasks (or common ones).
@0dm what would this look like?
from openadapt.
forgive me, it's almost 6am and my brain is going to implode (i haven't slept). here's a rough idea with one of their example templates.
they define a template:
class ObjectSegmenting:
template_model = True # Add this line to show this is a template model.
def __init__(self, Text2Box:Text2Box, Segmenting:Segmenting):
# self.llm = OpenAI(temperature=0)
self.grounding = Text2Box
self.sam = Segmenting
# methods below...
now, this template can be applied for a variety of situations, such as the prompt, "segment the given object:" or even "segment the cat" (🐈⬛ 😨).
then, they collect all the templates for a list of possible functions:
# Load Template Foundation Models
for class_name, module in globals().items():
if getattr(module, 'template_model', False):
template_required_names = {k for k in inspect.signature(module.__init__).parameters.keys() if k!='self'}
# ...
In puterbot, we could define templates for "open application", "send email", etc. and have these functions activate on certain phrases like "open <application> => open chrome" . This would make the codebase very organized and make combining tasks easier, since replays could invoke numerous functions in sequence (open chrome => goto url gmail => compose new email => ...)
from openadapt.
Thank you @0dm ! Let's talk through this on a call. Please grab some time on my calendar here:
https://www.getclockwise.com/c/richard-abrich/quick-meeting (today after 3pm or tomorrow, otherwise next week).
from openadapt.
@abrichr , scheduled for 4:15 tomorrow!
from openadapt.
https://aclanthology.org/2024.naacl-demo.2/
https://github.com/chenfei-wu/TaskMatrix/tree/main/LowCodeLLM
https://www.youtube.com/watch?v=jb2C1vaeO3E
from openadapt.
Related Issues (20)
- Store application logs HOT 1
- [Bug]: not bound to a Session; lazy load operation of attribute 'action_event' cannot proceed HOT 5
- Loading / Splash screen
- [Bug]: hide fixture recordings in dashboard HOT 2
- [Bug]: database contains recordings from previous versions HOT 2
- [Bug]: Failed to execute script 'entrypoint' due to unhandled exception: 'NoneType' has no attribute 'write' HOT 4
- [Bug]: `WindowEvent` race condition HOT 3
- [Bug]: mouse positions on Windows are not accurate HOT 2
- Handle similar segments HOT 2
- [Bug]: searching for deleted file while replaying
- Dashboard: delete parent events HOT 5
- [Bug]: Nonetype object has no attribute write HOT 3
- Support for Efficient Multi-Window Accessibility Tree Capture HOT 1
- [Bug]: VanillaReplayStategy returns unexpected keyword argument instructions
- [Bug]: VanillaReplayStategy returns KeyError
- [Bug]: handle OpenAI API image errors
- Wait for action effect to complete before attempting to get next action
- [Bug]: Vanilla replay strategy fails with below error HOT 2
- [Bug]: ValueError in `openadapt\video.py:151` - Specified pixel format rgb24 is invalid or not supported HOT 4
- [Bug]: `QWindowsContext: OleInitialize() failed` error on running: `python -m openadapt.app.visualize` HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from openadapt.