
Comments (4)

JodyZ0203 commented on May 22, 2024

try again

from sweep.

sweep-nightly commented on May 22, 2024

Hey @wwzeng1,

I've started working on this issue. The plan is to modify the on_ticket function in src/handlers/on_ticket.py: I'll identify each point where a failure can occur and add code there to remove the "sweep" label from the issue.

I'll use PyGithub's remove_from_labels method to remove the label, after first fetching the "sweep" label object with get_label. If the "sweep" label does not exist, I'll catch the exception raised by get_label and log a message.
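
The plan above could be sketched roughly as follows. This is a minimal illustration, not the code that ended up in the PR: `remove_sweep_label` is a hypothetical helper, and it only assumes that `repo` and `issue` behave like PyGithub's `Repository` and `Issue` objects (`get_label` raises when the label is missing, `remove_from_labels` accepts the label it returned).

```python
import logging

logger = logging.getLogger(__name__)

LABEL_NAME = "sweep"  # label name taken from the plan above


def remove_sweep_label(repo, issue, label_name: str = LABEL_NAME) -> bool:
    """Remove the label from the issue; return True if it was removed.

    Hypothetical helper: `repo` and `issue` are assumed to behave like
    PyGithub Repository and Issue objects.
    """
    try:
        label = repo.get_label(label_name)
    except Exception as exc:  # assumed: the lookup raises if the label is missing
        logger.warning("Label %r not found, nothing to remove: %s", label_name, exc)
        return False
    issue.remove_from_labels(label)
    return True
```

Calling such a helper from each failure branch of on_ticket (before re-raising) would keep the label state consistent with the outcome of the run.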

Give me a minute!

Best,
Sweep bot

Some code snippets I looked at (click to expand). If some file is missing from here, you can mention the path in the ticket description.

sweep/src/api.py

Lines 1 to 57 in f130c22

import time
from loguru import logger
import modal
from pydantic import ValidationError
from src.handlers.create_pr import create_pr # type: ignore
from src.handlers.on_ticket import on_ticket
from src.handlers.on_comment import on_comment
from src.utils.constants import API_NAME, BOT_TOKEN_NAME, LABEL_COLOR, LABEL_DESCRIPTION, LABEL_NAME, SWEEP_LOGIN
from src.events import (
CommentCreatedRequest,
InstallationCreatedRequest,
IssueCommentRequest,
IssueRequest,
PRRequest,
ReposAddedRequest,
)
from src.utils.event_logger import posthog
from src.utils.github_utils import get_github_client, index_full_repository
from fastapi import HTTPException, Request
stub = modal.Stub(API_NAME)
image = (
modal.Image.debian_slim()
.apt_install("git")
.pip_install(
"openai",
"anthropic",
"PyGithub",
"loguru",
"docarray",
"backoff",
"tiktoken",
"highlight-io",
"GitPython",
"posthog",
"tqdm",
"pyyaml"
)
)
secrets = [
modal.Secret.from_name(BOT_TOKEN_NAME),
modal.Secret.from_name("openai-secret"),
modal.Secret.from_name("anthropic"),
modal.Secret.from_name("posthog"),
modal.Secret.from_name("highlight"),
]
FUNCTION_SETTINGS = {
"image": image,
"secrets": secrets,
"timeout": 15 * 60,
}
handle_ticket = stub.function(**FUNCTION_SETTINGS)(on_ticket)
handle_comment = stub.function(**FUNCTION_SETTINGS)(on_comment)

reply = sweep_bot.chat(final_review_prompt, message_key="final_review")
review_coment = PullRequestComment.from_string(reply)
</snippet>
</relevant_snippets_in_repo>
<relevant_paths_in_repo>
src/core/vector_db.py
src/handlers/on_ticket.py
tests/test_chatgpt.py
src/handlers/on_comment.py
tests/test_pr_diffs.py
</relevant_paths_in_repo>
<repo_tree>
.flake8
.github/...
.gitignore
.pre-commit-config.yaml
.vscode/...
Dockerfile
LICENSE
README.md
deploy.sh
poetry.lock
pyproject.toml
src/
|- __init__.py
|- api.py
|- core/
| |- __init__.py
| |- models.py
| |- prompts.py
| |- react.py
| |- sweep_bot.py
| |- vector_db.py
|- events.py
|- handlers/
| |- __init__.py
| |- on_comment.py
| |- on_review.py
| |- on_ticket.py
|- utils/...
tests/
|- chunking_experiments.ipynb
|- create_sample_issue.py
|- example_code/...
|- recursive_chunking_experiments.ipynb
|- test_cached_embedding.py
|- test_chatgpt.py
|- test_chunking.py
|- test_deeplake.py
|- test_dfs.py
|- test_diffs.py
|- test_gpt_functions.py
|- test_models.py
|- test_new_ticket.py
|- test_pr_diffs.py
|- test_prompt_constructor.py
|- test_review_comments.py
|- test_tiktoken.py
|- test_tools.py
|- test_tree.py
|- test_vector_db.py
</repo_tree>
Repo: sweep: Sweep AI solves Github tickets by writing PRs
Issue Url: https://github.com/sweepai/sweep/issues/1
Username: wwzeng1
Issue Title: Write a simple reply to the user
Issue Description: None
<body file_name="on_comment.py">
1: """
2: On Github ticket, get ChatGPT to deal with it

repo_description=repo_description if repo_description else "",
diffs=diffs,
issue_url=pr.html_url,
username=username,
title=pr_title,
tree=tree,
summary=pr_body,
snippets=snippets,
pr_file_path=pr_file_path, # may be None
pr_line=pr_line, # may be None
)
logger.info(f"Human prompt{human_message.construct_prompt()}")
sweep_bot = SweepBot.from_system_message_content(
# human_message=human_message, model="claude-v1.3-100k", repo=repo
human_message=human_message, repo=repo,
)
except Exception as e:
posthog.capture(username, "failed", properties={
"error": str(e),
"reason": "Failed to get files",
**metadata
})
raise e
try:
logger.info("Fetching files to modify/create...")
file_change_requests = sweep_bot.get_files_to_change()
logger.info("Making Code Changes...")
sweep_bot.change_files_in_github(file_change_requests, branch_name)
logger.info("Done!")
except Exception as e:
posthog.capture(username, "failed", properties={
"error": str(e),
"reason": "Failed to make changes",
**metadata
})
raise e
posthog.capture(username, "success", properties={**metadata})
logger.info("on_comment success")
return {"success": True}

repo_description=repo_description if repo_description else "",
diffs=diffs,
issue_url=pr.html_url,
username=username,
title=pr_title,
tree=tree,
summary=pr_body,
snippets=snippets,
pr_file_path=pr_file_path, # may be None
pr_line=pr_line, # may be None
)
logger.info(f"Human prompt{human_message.construct_prompt()}")
sweep_bot = SweepBot.from_system_message_content(
# human_message=human_message, model="claude-v1.3-100k", repo=repo
human_message=human_message, repo=repo
)
except Exception as e:
posthog.capture(username, "failed", properties={
"error": str(e),
"reason": "Failed to get files",
**metadata
})
raise e
try:
logger.info("Fetching files to modify/create...")
file_change_requests = sweep_bot.get_files_to_change()
logger.info("Making Code Changes...")
sweep_bot.change_files_in_github(file_change_requests, branch_name)
logger.info("Done!")
except Exception as e:
posthog.capture(username, "failed", properties={
"error": str(e),
"reason": "Failed to make changes",
**metadata
})
raise e
posthog.capture(username, "success", properties={**metadata})
logger.info("on_comment success")

"""
On Github ticket, get ChatGPT to deal with it
"""
# TODO: Add file validation
import os
import openai
from loguru import logger
import modal
from src.core.entities import FileChangeRequest, Snippet
from src.core.prompts import (
reply_prompt,
)
from src.core.sweep_bot import SweepBot
from src.core.prompts import issue_comment_prompt
from src.handlers.create_pr import create_pr
from src.handlers.on_comment import on_comment
from src.handlers.on_review import review_pr
from src.utils.event_logger import posthog
from src.utils.github_utils import get_github_client, search_snippets
from src.utils.prompt_constructor import HumanMessagePrompt
from src.utils.constants import DB_NAME, PREFIX, UTILS_NAME
github_access_token = os.environ.get("GITHUB_TOKEN")
openai.api_key = os.environ.get("OPENAI_API_KEY")
update_index = modal.Function.lookup(DB_NAME, "update_index")
bot_suffix = "I'm a bot that handles simple bugs and feature requests \
but I might make mistakes. Please be kind!"
collapsible_template = """
<details>
<summary>{summary}</summary>
{body}
</details>
"""
chunker = modal.Function.lookup(UTILS_NAME, "Chunking.chunk")
num_of_snippets_to_query = 10
max_num_of_snippets = 5
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str,
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None
):
# Check if the title starts with "sweep" or "sweep: " and remove it
if title.lower().startswith("sweep: "):
title = title[7:]
elif title.lower().startswith("sweep "):
title = title[6:]
# Flow:
# 1. Get relevant files
# 2: Get human message
# 3. Get files to change
# 4. Get file changes
# 5. Create PR
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"issue_number": issue_number,
"repo_full_name": repo_full_name,
"organization": organization,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"installation_id": installation_id,
"function": "on_ticket",
"mode": PREFIX,
}
posthog.capture(username, "started", properties=metadata)
g = get_github_client(installation_id)
if comment_id:
logger.info(f"Replying to comment {comment_id}...")
logger.info(f"Getting repo {repo_full_name}")
repo = g.get_repo(repo_full_name)
current_issue = repo.get_issue(number=issue_number)
if current_issue.state == 'closed':
posthog.capture(username, "issue_closed", properties=metadata)
return {"success": False, "reason": "Issue is closed"}
item_to_react_to = current_issue.get_comment(comment_id) if comment_id else current_issue
eyes_reaction = item_to_react_to.create_reaction("eyes")
def comment_reply(message: str):
current_issue.create_comment(message + "\n\n---\n" + bot_suffix)
comments = current_issue.get_comments()
replies_text = ""
if comment_id:
replies_text = "\nComments:\n" + "\n".join(
[
issue_comment_prompt.format(
username=comment.user.login,
reply=comment.body,
) for comment in comments
]
)
def fetch_file_contents_with_retry():
retries = 3
error = None
for i in range(retries):
try:
logger.info(f"Fetching relevant files for the {i}th time...")
return search_snippets(
repo,
f"{title}\n{summary}\n{replies_text}",
num_files=num_of_snippets_to_query,
branch=None,
installation_id=installation_id,
)
except Exception as e:
error = e
continue
posthog.capture(
username, "fetching_failed", properties={"error": error, **metadata}
)
raise error
# update_index.call(
# repo_full_name,
# installation_id=installation_id,
# )
logger.info("Fetching relevant files...")
try:
snippets, tree = fetch_file_contents_with_retry()
assert len(snippets) > 0
except Exception as e:
logger.error(e)
comment_reply(
"It looks like an issue has occurred while fetching the files. Perhaps the repo has not been initialized: try removing this repo and adding it back. I'll try again in a minute. If this error persists, contact [email protected]."
)
raise e
# reversing to put most relevant at the bottom
snippets: list[Snippet] = snippets[::-1]
num_full_files = 2
num_extended_snippets = 2
most_relevant_snippets = snippets[-num_full_files:]
snippets = snippets[:-num_full_files]
logger.info("Expanding snippets...")
for snippet in most_relevant_snippets:
current_snippet = snippet
_chunks, metadatas, _ids = chunker.call(
current_snippet.content,
current_snippet.file_path
)
segmented_snippets = [
Snippet(
content=current_snippet.content,
start=metadata["start"],
end=metadata["end"],
file_path=metadata["file_path"],
) for metadata in metadatas
]
index = 0
while index < len(segmented_snippets) and segmented_snippets[index].start <= current_snippet.start:
index += 1
index -= 1
for i in range(index + 1, min(index + num_extended_snippets + 1, len(segmented_snippets))):
current_snippet += segmented_snippets[i]
for i in range(index - 1, max(index - num_extended_snippets - 1, 0), -1):
current_snippet = segmented_snippets[i] + current_snippet
snippets.append(current_snippet)
# snippet fusing
i = 0
while i < len(snippets):
j = i + 1
while j < len(snippets):
if snippets[i] ^ snippets[j]: # this checks for overlap
snippets[i] = snippets[i] | snippets[j] # merging
snippets.pop(j)
else:
j += 1
i += 1
snippets = snippets[:min(len(snippets), max_num_of_snippets)]
human_message = HumanMessagePrompt(
repo_name=repo_name,
issue_url=issue_url,
username=username,
repo_description=repo_description,
title=title,
summary=summary + replies_text,
snippets=snippets,
tree=tree, # TODO: Anything in repo tree that has something going through is expanded
)
sweep_bot = SweepBot.from_system_message_content(
human_message=human_message, repo=repo, is_reply=bool(comments)
)
sweepbot_retries = 3
try:
for i in range(sweepbot_retries):
logger.info("CoT retrieval...")
if sweep_bot.model == "gpt-4-32k-0613":
sweep_bot.cot_retrieval()
logger.info("Fetching files to modify/create...")
file_change_requests = sweep_bot.get_files_to_change()
for file_change_request in file_change_requests:
try:
contents = repo.get_contents(file_change_request.filename)
if contents:
file_change_request.change_type = "modify"
else:
file_change_request.change_type = "create"
except:
file_change_request.change_type = "create"
logger.info("Getting response from ChatGPT...")
reply = sweep_bot.chat(reply_prompt, message_key="reply")
sweep_bot.delete_messages_from_chat("reply")
logger.info("Sending response...")
new_line = '\n'
comment_reply(
reply
+ "\n\n"
+ collapsible_template.format(
summary="Some code snippets I looked at (click to expand). If some file is missing from here, you can mention the path in the ticket description.",
body="\n".join(
[
f"https://github.com/{organization}/{repo_name}/blob/{repo.get_commits()[0].sha}/{snippet.file_path}#L{max(snippet.start, 1)}-L{min(snippet.end, snippet.content.count(new_line))}\n"
for snippet in snippets[::-1]
]
),
)
)
logger.info("Generating PR...")
pull_request = sweep_bot.generate_pull_request()
logger.info("Making PR...")
response = create_pr(file_change_requests, pull_request, sweep_bot, username, installation_id, issue_number)
if not response or not response["success"]: raise Exception("Failed to create PR")
pr = response["pull_request"]
current_issue.create_reaction("rocket")
try:
eyes_reaction.delete()
except:
pass
try:
changes_required, review_comment = review_pr(repo=repo, pr=pr, issue_url=issue_url, username=username,
repo_description=repo_description, title=title,
summary=summary, replies_text=replies_text, tree=tree)
logger.info(f"Addressing review comment {review_comment}")
if changes_required:
on_comment(repo_full_name=repo_full_name,
repo_description=repo_description,
comment=review_comment,
username=username,
installation_id=installation_id,
pr_path=None,
pr_line_position=None,
pr_number=pr.number)
except Exception as e:
logger.error(e)
break
except openai.error.InvalidRequestError as e:
logger.error(e)
comment_reply(
"I'm sorry, but it looks like our model has run out of context length. We're trying to make this happen less often, but one way to mitigate this is to keep files smaller. If this error persists, contact [email protected]."
)
posthog.capture(
username,
"failed",
properties={
"error": str(e),
"reason": "Invalid request error / context length",
**metadata,
},
)
raise e
except Exception as e:
logger.error(e)
comment_reply(
"I'm sorry, but it looks like an error has occurred. Try removing and re-adding the sweep label. If this error persists, contact [email protected]."
)
posthog.capture(
username,
"failed",
properties={"error": str(e), "reason": "Generic error", **metadata},
)
raise e
else:
try:
eyes_reaction.delete()
except:
pass
item_to_react_to.create_reaction("rocket")
posthog.capture(username, "success", properties={**metadata})
logger.info("on_ticket success")
return {"success": True}
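
The snippet-fusing loop in the handler above relies on `^` to test two snippets for overlap and `|` to merge them. The Snippet class itself is not shown in this thread, so the following is only a sketch of how such operators might behave: `Span` is a hypothetical stand-in, and its overlap and merge rules are assumptions rather than the project's actual definitions.

```python
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical stand-in for Snippet: a line range within one file."""

    file_path: str
    start: int
    end: int

    def __xor__(self, other: "Span") -> bool:
        # Overlap check (assumed rule): same file and intersecting line ranges.
        return (
            self.file_path == other.file_path
            and self.start <= other.end
            and other.start <= self.end
        )

    def __or__(self, other: "Span") -> "Span":
        # Merge (assumed rule): the smallest range that covers both spans.
        return Span(
            self.file_path,
            min(self.start, other.start),
            max(self.end, other.end),
        )


def fuse(spans: list[Span]) -> list[Span]:
    """Merge overlapping spans with the same loop shape as the handler."""
    spans = list(spans)  # work on a copy so the caller's list is untouched
    i = 0
    while i < len(spans):
        j = i + 1
        while j < len(spans):
            if spans[i] ^ spans[j]:
                spans[i] = spans[i] | spans[j]
                spans.pop(j)
            else:
                j += 1
        i += 1
    return spans
```

With these assumed rules, the single pass does not revisit entries it has already skipped: if a span grows after a merge, an earlier non-overlapping neighbour may overlap the enlarged span and still stay separate.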


I'm a bot that handles simple bugs and feature requests but I might make mistakes. Please be kind!


sweep-nightly commented on May 22, 2024

I'm sorry, but it looks like an error has occurred. Try removing and re-adding the sweep label. If this error persists, contact [email protected].


I'm a bot that handles simple bugs and feature requests but I might make mistakes. Please be kind!


sweep-nightly commented on May 22, 2024

Here's the PR! #396


Step 1: 🔍 Code Search

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I looked at (click to expand). If some file is missing from here, you can mention the path in the ticket description.

'''
On Github ticket, get ChatGPT to deal with it
'''
# TODO: Add file validation
import os
import openai
from loguru import logger
import modal
from tabulate import tabulate
from sweepai.core.entities import FileChangeRequest, Snippet
from sweepai.core.prompts import (
reply_prompt,
)
from sweepai.core.sweep_bot import SweepBot
from sweepai.core.prompts import issue_comment_prompt
from sweepai.handlers.create_pr import create_pr, create_config_pr, safe_delete_sweep_branch
from sweepai.handlers.on_comment import on_comment
from sweepai.handlers.on_review import review_pr
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import get_github_client, search_snippets
from sweepai.utils.prompt_constructor import HumanMessagePrompt
from sweepai.utils.constants import DB_NAME, PREFIX, UTILS_NAME, SWEEP_LOGIN
from sweepai.utils.chat_logger import ChatLogger, discord_log_error
from sweepai.utils.config import SweepConfig
import traceback
github_access_token = os.environ.get("GITHUB_TOKEN")
openai.api_key = os.environ.get("OPENAI_API_KEY")
update_index = modal.Function.lookup(DB_NAME, "update_index")
sep = "\n---\n"
bot_suffix_starring = "⭐ If you are enjoying Sweep, please [star our repo](https://github.com/sweepai/sweep) so more people can hear about us!"
bot_suffix = f"\n{sep}I'm a bot that handles simple bugs and feature requests but I might make mistakes. Please be kind!\n<sup>[Join Our Discord](https://discord.com/invite/sweep-ai)"
stars_suffix = "⭐ In the meantime, consider [starring our repo](https://github.com/sweepai/sweep) so more people can hear about us!"
collapsible_template = '''
<details>
<summary>{summary}</summary>
{body}
</details>
'''
chunker = modal.Function.lookup(UTILS_NAME, "Chunking.chunk")
num_of_snippets_to_query = 30
# max_num_of_snippets = 5
total_number_of_snippet_tokens = 15_000
num_full_files = 2
num_extended_snippets = 2
def post_process_snippets(snippets: list[Snippet]):
    # Expand the most relevant snippets in place; rebinding the loop
    # variable alone would leave the list unchanged.
    for index in range(min(num_full_files, len(snippets))):
        snippets[index] = snippets[index].expand()
    # snippet fusing
    i = 0
    while i < len(snippets):
        j = i + 1
        while j < len(snippets):
            if snippets[i] ^ snippets[j]:  # this checks for overlap
                snippets[i] = snippets[i] | snippets[j]  # merging
                snippets.pop(j)
            else:
                j += 1
        i += 1
    # truncating snippets based on character length
    result_snippets = []
    total_length = 0
    for snippet in snippets:
        total_length += len(snippet.get_snippet())
        if total_length > total_number_of_snippet_tokens * 5:
            break
        result_snippets.append(snippet)
    return result_snippets
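
The truncation step at the end of post_process_snippets approximates a token budget by character count (budget times 5). A standalone sketch of that idea, using plain strings instead of Snippet objects, might look like this; `truncate_to_budget` and the `chars_per_token` parameter are hypothetical names introduced only for illustration.

```python
def truncate_to_budget(
    texts: list[str], token_budget: int, chars_per_token: int = 5
) -> list[str]:
    """Keep leading texts until their combined length exceeds the budget.

    The characters-per-token ratio is a rough heuristic mirroring the
    `* 5` factor in the snippet above, not a real tokenizer.
    """
    kept: list[str] = []
    total_chars = 0
    for text in texts:
        total_chars += len(text)
        if total_chars > token_budget * chars_per_token:
            break  # stop at the first text that pushes past the budget
        kept.append(text)
    return kept
```

Because the loop stops at the first item that exceeds the budget, a short item appearing later in the list is dropped even if it would have fit on its own.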
def on_ticket(
title: str,
summary: str,
issue_number: int,
issue_url: str,
username: str,
repo_full_name: str,
repo_description: str,
installation_id: int,
comment_id: int = None
):
# Check if the title starts with "sweep" or "sweep: " and remove it
if title.lower().startswith("sweep: "):
title = title[7:]
elif title.lower().startswith("sweep "):
title = title[6:]
# Flow:
# 1. Get relevant files
# 2: Get human message
# 3. Get files to change
# 4. Get file changes
# 5. Create PR
organization, repo_name = repo_full_name.split("/")
metadata = {
"issue_url": issue_url,
"repo_name": repo_name,
"repo_description": repo_description,
"username": username,
"installation_id": installation_id,
"function": "on_ticket",

"""
List of common prompts used across the codebase.
"""
# Following two should be fused
system_message_prompt = "Your name is Sweep bot. You are a brilliant and thorough engineer assigned to the following Github ticket. You will be helpful and friendly, but informal and concise: get to the point. When you write code to solve tickets, the code works on the first try and is formatted perfectly. You have the utmost care for the user that you write for, so you do not make mistakes."
system_message_issue_comment_prompt = "Your name is Sweep bot. You are a brilliant and thorough engineer assigned to the following Github ticket, and a user has just responded with feedback. You will be helpful and friendly, but informal and concise: get to the point. When you write code to solve tickets, the code works on the first try and is formatted perfectly. You have the utmost care for the user that you write for, so you do not make mistakes."
human_message_prompt = [
{'role': 'assistant', 'content': 'Examining repo...'},
{'role': 'user', 'content': """<relevant_snippets_in_repo>
{relevant_snippets}
</relevant_snippets_in_repo>"""},
{'role': 'user', 'content': """<relevant_paths_in_repo>
{relevant_directories}
</relevant_paths_in_repo>"""},
{'role': 'user', 'content': """<repo_tree>
{tree}
</repo_tree>"""},
{'role': 'user', 'content':
"""# Repo & Issue Metadata
Repo: {repo_name}: {repo_description}
Issue Url: {issue_url}
Username: {username}
Issue Title: {title}
Issue Description: {description}"""}]
human_message_review_prompt = [
{'role': 'assistant', 'content': 'Reviewing my pull request...'},
{'role': 'user', 'content': """<relevant_snippets_in_repo>
{relevant_snippets}
</relevant_snippets_in_repo>"""},
{'role': 'user', 'content': """<relevant_paths_in_repo>
{relevant_directories}
</relevant_paths_in_repo>"""},
{'role': 'user', 'content': """<repo_tree>
{tree}
</repo_tree>"""},
{'role': 'user', 'content':
"""These are the file changes.
We have the file_path, the previous_file_content, the new_file_content, and the diffs.
The file_path is the name of the file.
The previous_file_content is the content of the file before the changes.
The new_file_content is the content of the file after the changes.
The diffs are the lines changed in the file. <added_lines> indicates those lines were added, <deleted_lines> indicates they were deleted.
Keep in mind that we may see a diff for a deletion and replacement, so don't point those out as issues.
{diffs}"""}]
diff_section_prompt = """
<file_path>
{diff_file_path}
</file_path>
<previous_file_content>
{previous_file_content}
</previous_file_content>
<new_file_content>
{new_file_content}
</new_file_content>
<file_diffs>
{diffs}
</file_diffs>
"""
review_prompt = """\
I need you to carefully review the code diffs in this pull request.
The code was written by an inexperienced programmer and may contain accidental deletions, logic errors or other issues.
Think step-by-step logically and thoroughly analyze to summarize the diffs per file in the format:
Step-by-step thoughts:
* Lines x1-x2: Summary of the changes (added, deleted, modified, errors, issues)
* Lines y1-y2: Summary of the changes (added, deleted, modified, errors, issues)
...
<file_summarization>
* file_1 - changes in file_1

import json
from loguru import logger
import github
from github.Repository import Repository
from github.ContentFile import ContentFile
from github.GithubException import GithubException
import modal
from pydantic import BaseModel
from sweepai.core.code_repair import CodeRepairer
from sweepai.utils.chat_logger import ChatLogger
import re
from sweepai.core.entities import (
FileChange,
FileChangeRequest,
FilesToChange,
PullRequest,
RegexMatchError,
Function,
Snippet
)
from sweepai.core.chat import ChatGPT
from sweepai.core.prompts import (
files_to_change_prompt,
pull_request_prompt,
create_file_prompt,
modify_file_prompt_2,
modify_file_plan_prompt,
)
from sweepai.utils.config import SweepConfig
from sweepai.utils.constants import DB_NAME, SECONDARY_MODEL
from sweepai.utils.diff import format_contents, generate_new_file, is_markdown
class CodeGenBot(ChatGPT):
def get_files_to_change(self):
file_change_requests: list[FileChangeRequest] = []
for count in range(5):
try:
logger.info(f"Generating for the {count}th time...")
files_to_change_response = self.chat(files_to_change_prompt, message_key="files_to_change") # Dedup files to change here
files_to_change = FilesToChange.from_string(files_to_change_response)
files_to_create: list[str] = files_to_change.files_to_create.split("\n*")
files_to_modify: list[str] = files_to_change.files_to_modify.split("\n*")
for file_change_request, change_type in zip(
files_to_create + files_to_modify,
["create"] * len(files_to_create)
+ ["modify"] * len(files_to_modify),
):
file_change_request = file_change_request.strip()
if not file_change_request or file_change_request == "* None":
continue
logger.debug(file_change_request)
logger.debug(change_type)
file_change_requests.append(
FileChangeRequest.from_string(
file_change_request, change_type=change_type
)
)
# Create a dictionary to hold file names and their corresponding instructions
file_instructions_dict = {}
for file_change_request in file_change_requests:
# If the file name is already in the dictionary, append the new instructions
if file_change_request.filename in file_instructions_dict:
instructions, change_type = file_instructions_dict[file_change_request.filename]
file_instructions_dict[file_change_request.filename] = (instructions + " " + file_change_request.instructions, change_type)
else:
file_instructions_dict[file_change_request.filename] = (file_change_request.instructions, file_change_request.change_type)
file_change_requests = [FileChangeRequest(filename=file_name, instructions=instructions, change_type=change_type) for file_name, (instructions, change_type) in file_instructions_dict.items()]
if file_change_requests:
return file_change_requests
except RegexMatchError:
logger.warning("Failed to parse! Retrying...")
self.delete_messages_from_chat("files_to_change")
continue
raise Exception("Could not generate files to change")
def generate_pull_request(self) -> PullRequest:
for count in range(5):
try:
logger.info(f"Generating for the {count}th time...")
pr_text_response = self.chat(pull_request_prompt, message_key="pull_request", model=SECONDARY_MODEL)
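
The deduplication step in get_files_to_change above combines the instructions of requests that target the same file, keeping the change type of the first request seen. A self-contained sketch of that merge follows; `ChangeRequest` is a hypothetical stand-in for FileChangeRequest, which is not defined in this thread.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    """Hypothetical stand-in for FileChangeRequest."""

    filename: str
    instructions: str
    change_type: str


def merge_by_filename(requests: list[ChangeRequest]) -> list[ChangeRequest]:
    """Combine requests for the same file into one request per file."""
    merged: dict[str, tuple[str, str]] = {}
    for request in requests:
        if request.filename in merged:
            # Append the new instructions; keep the first change type.
            instructions, change_type = merged[request.filename]
            merged[request.filename] = (
                instructions + " " + request.instructions,
                change_type,
            )
        else:
            merged[request.filename] = (request.instructions, request.change_type)
    # Dicts preserve insertion order, so files keep their first-seen order.
    return [
        ChangeRequest(filename=name, instructions=instructions, change_type=change_type)
        for name, (instructions, change_type) in merged.items()
    ]
```

One consequence of keeping the first change type is that a file listed under both "create" and "modify" ends up with whichever label appeared first.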

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/sweepai/handlers/on_comment.py#L1-L180

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/sweepai/slack.py#L266-L0

sweep/sweepai/api.py

Lines 1 to 104 in 31bec7a

import time
from loguru import logger
import modal
from pydantic import ValidationError
from sweepai.handlers.create_pr import create_pr # type: ignore
from sweepai.handlers.on_ticket import on_ticket
from sweepai.handlers.on_comment import on_comment
from sweepai.utils.constants import API_NAME, BOT_TOKEN_NAME, DB_NAME, LABEL_COLOR, LABEL_DESCRIPTION, LABEL_NAME, SWEEP_LOGIN
from sweepai.events import (
CommentCreatedRequest,
InstallationCreatedRequest,
IssueCommentRequest,
IssueRequest,
PRRequest,
ReposAddedRequest,
)
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import get_github_client, index_full_repository
from fastapi import HTTPException, Request
from pymongo import MongoClient
stub = modal.Stub(API_NAME)
image = (
modal.Image.debian_slim()
.apt_install("git")
.pip_install(
"openai",
"anthropic",
"PyGithub",
"loguru",
"docarray",
"backoff",
"tiktoken",
"highlight-io",
"GitPython",
"posthog",
"tqdm",
"pyyaml",
"pymongo",
"tabulate"
)
)
secrets = [
modal.Secret.from_name(BOT_TOKEN_NAME),
modal.Secret.from_name("openai-secret"),
modal.Secret.from_name("anthropic"),
modal.Secret.from_name("posthog"),
modal.Secret.from_name("highlight"),
modal.Secret.from_name("mongodb"),
modal.Secret.from_name("discord")
]
FUNCTION_SETTINGS = {
"image": image,
"secrets": secrets,
"timeout": 30 * 60,
}
handle_ticket = stub.function(**FUNCTION_SETTINGS)(on_ticket)
handle_comment = stub.function(**FUNCTION_SETTINGS)(on_comment)
handle_pr = stub.function(**FUNCTION_SETTINGS)(create_pr)
update_index = modal.Function.lookup(DB_NAME, "update_index")
@stub.function(**FUNCTION_SETTINGS)
@modal.web_endpoint(method="POST")
async def webhook(raw_request: Request):
"""Handle a webhook request from GitHub."""
try:
request_dict = await raw_request.json()
logger.info(f"Received request: {request_dict.keys()}")
event = raw_request.headers.get("X-GitHub-Event")
assert event is not None
match event, request_dict.get("action", None):
case "issues", "opened":
request = IssueRequest(**request_dict)
issue_title_lower = request.issue.title.lower()
if issue_title_lower.startswith("sweep") or "sweep:" in issue_title_lower:
g = get_github_client(request.installation.id)
repo = g.get_repo(request.repository.full_name)
labels = repo.get_labels()
label_names = [label.name for label in labels]
if LABEL_NAME not in label_names:
repo.create_label(
name=LABEL_NAME,
color=LABEL_COLOR,
description=LABEL_DESCRIPTION,
)
# TODO(sweep): figure out why this is breaking
# else:
# label = repo.get_label(LABEL_NAME)
# label.edit(
# name=LABEL_NAME,
# color=LABEL_COLOR,
# description=LABEL_DESCRIPTION
# )
current_issue = repo.get_issue(number=request.issue.number)
current_issue.add_to_labels(LABEL_NAME)
case "issues", "labeled":
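
The webhook above decides whether to label a new issue from its title, and the on_ticket handler shown earlier strips the "sweep" prefix before using the title. The two checks can be sketched as small pure functions; `is_sweep_title` and `strip_sweep_prefix` are hypothetical names, and the logic is only a restatement of the conditions visible in the snippets.

```python
def is_sweep_title(title: str) -> bool:
    """Mirror the webhook condition: starts with "sweep" or contains "sweep:"."""
    lowered = title.lower()
    return lowered.startswith("sweep") or "sweep:" in lowered


def strip_sweep_prefix(title: str) -> str:
    """Mirror the handler: drop a leading "sweep: " or "sweep " (any case)."""
    lowered = title.lower()
    for prefix in ("sweep: ", "sweep "):
        if lowered.startswith(prefix):
            return title[len(prefix):]
    return title
```

Note that the starts-with test is broad: under this reading, a title such as "Sweeping change" would also be treated as a Sweep request.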

"""
Take a PR and provide an AI generated review of the PR.
"""
from loguru import logger
from sweepai.core.entities import DiffSummarization, PullRequestComment
from sweepai.core.prompts import review_prompt
from sweepai.core.sweep_bot import SweepBot
from sweepai.utils.github_utils import get_file_contents
from sweepai.utils.prompt_constructor import HumanMessageFinalPRComment, HumanMessagePromptReview, HumanMessageReviewFollowup
from sweepai.utils.chat_logger import ChatLogger
# Plan:
# 1. Get PR
# 2. Get files changed
# 3. Come up with some comments for the PR
# 4. Take comments and add them to the PR
def get_pr_diffs(repo, pr):
    base_sha = pr.base.sha
    head_sha = pr.head.sha
    comparison = repo.compare(base_sha, head_sha)
    file_diffs = comparison.files
    pr_diffs = []
    for file in file_diffs:
        print(file.status)
        diff = file.patch
        if file.status == "added":
            pr_diffs.append((file.filename, get_file_contents(repo, file_path=file.filename, ref=head_sha), "", diff))
        elif file.status == "modified":
            pr_diffs.append((file.filename, get_file_contents(repo, file_path=file.filename, ref=head_sha), get_file_contents(repo, file_path=file.filename, ref=base_sha), diff))
        elif file.status == "removed":
            pr_diffs.append((file.filename, "", get_file_contents(repo, file_path=file.filename, ref=base_sha), diff))
        else:
            logger.info(f"File status {file.status} not recognized")  # TODO(sweep): We don't handle renamed files
    return pr_diffs
def review_pr(repo, pr, issue_url, username, repo_description, title, summary, replies_text, tree):
repo_name = repo.full_name
logger.info("Getting PR diffs...")
diffs = get_pr_diffs(repo, pr)
human_message = HumanMessagePromptReview(
repo_name=repo_name,
issue_url=issue_url,
username=username,
repo_description=repo_description,
title=title,
summary=summary + replies_text,
snippets=[],
tree=tree,
diffs=[diffs[0] if len(diffs) > 0 else ""],
pr_title=pr.title,
pr_message=pr.body or "",
)
summarization_replies = []
chat_logger = ChatLogger({
'repo_name': repo_name,
'title': '(Review) ' + title,
'summary': summary + replies_text,
"issue_url": issue_url,
"username": username,
"repo_description": repo_description,

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_gpt_diff_function.py#L646-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_code_repair.py#L78-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_gpt_diff_function.py#L311-L0

import shutil
import modal
import os
import time
import re
import github
from github import Github
from github.Repository import Repository
from loguru import logger
from jwt import encode
import requests
from tqdm import tqdm
from sweepai.core.entities import Snippet
from sweepai.utils.config import SweepConfig
from sweepai.utils.constants import APP_ID, DB_NAME
from sweepai.utils.event_logger import posthog
def make_valid_string(string: str):
    pattern = r"[^\w./-]+"
    return re.sub(pattern, "_", string)


def get_jwt():
    # Sign a short-lived (10 minute) JWT with the GitHub App's private key.
    signing_key = os.environ["GITHUB_APP_PEM"]
    app_id = APP_ID
    payload = {"iat": int(time.time()), "exp": int(time.time()) + 600, "iss": app_id}
    return encode(payload, signing_key, algorithm="RS256")


def get_token(installation_id: int):
    # Exchange the app JWT for an installation access token.
    jwt = get_jwt()
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": "Bearer " + jwt,
        "X-GitHub-Api-Version": "2022-11-28",
    }
    response = requests.post(
        f"https://api.github.com/app/installations/{int(installation_id)}/access_tokens",
        headers=headers,
    )
    return response.json()["token"]


def get_github_client(installation_id: int):
    token = get_token(installation_id)
    return Github(token)


def get_installation_id(username: str):
    jwt = get_jwt()
    response = requests.get(
        f"https://api.github.com/users/{username}/installation",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": "Bearer " + jwt,
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )
    obj = response.json()
    try:
        return obj["id"]
    except Exception as e:
        # The response has no "id" when the app is not installed for this user.
        raise Exception("Could not get installation id, probably not installed") from e


def display_directory_tree(
    root_path,
    includes: list[str] = [],
    excludes: list[str] = [".git"],
):
    def display_directory_tree_helper(
        current_dir,
        indent="",
    ) -> str:
        files = os.listdir(current_dir)
        files.sort()
        tree = ""
        for item_name in files:
            full_path = os.path.join(current_dir, item_name)[len(root_path) + 1 :]
            if item_name in excludes:
                continue
            file_path = os.path.join(current_dir, item_name)
            if os.path.isdir(file_path):
                if full_path in includes:
                    tree += f"{indent}|- {item_name}/\n"
                    tree += display_directory_tree_helper(
                        file_path, indent + "| "
                    )
                else:
                    tree += f"{indent}|- {item_name}/...\n"
            else:
                tree += f"{indent}|- {item_name}\n"
        return tree

    tree = display_directory_tree_helper(root_path)
    lines = tree.splitlines()
    return "\n".join([line[3:] for line in lines])


def get_file_list(root_directory: str) -> str:

"""
This file should be split into environment and config files
"""

PREFIX = "prod"
DB_NAME = PREFIX + "-db"
API_NAME = PREFIX + "-api"
UTILS_NAME = PREFIX + "-utils"
SLACK_NAME = PREFIX + "-slack"

BOT_TOKEN_NAME = PREFIX + "-bot-token"
if PREFIX == "prod":
    BOT_TOKEN_NAME = "bot-token"

SWEEP_LOGIN = "sweep-ai[bot]"

if PREFIX == "prod":
    APP_ID = 307814
    ENV = PREFIX
elif PREFIX == "dev2":
    APP_ID = 327588
    ENV = PREFIX
    SWEEP_LOGIN = "sweep-canary[bot]"
elif PREFIX == "dev":
    APP_ID = 324098
    ENV = PREFIX
    SWEEP_LOGIN = "sweep-nightly[bot]"

LABEL_NAME = "sweep"
LABEL_COLOR = "9400D3"
LABEL_DESCRIPTION = "Sweep your software chores"

SWEEP_CONFIG_BRANCH = "sweep/add-sweep-config"
DEFAULT_CONFIG = """# Reference: https://github.com/sweepai/sweep/blob/main/sweep.yaml
branch: dev"""

"""
Creates PR given description.
"""

import os
import openai
from loguru import logger
import modal
from github.Repository import Repository
from sweepai.core.entities import FileChangeRequest, PullRequest
from sweepai.core.sweep_bot import SweepBot
from sweepai.handlers.on_review import review_pr
from sweepai.utils.config import SweepConfig
from sweepai.utils.event_logger import posthog
from sweepai.utils.github_utils import get_github_client
from sweepai.utils.constants import DB_NAME, PREFIX, DEFAULT_CONFIG, SWEEP_CONFIG_BRANCH, SWEEP_LOGIN

github_access_token = os.environ.get("GITHUB_TOKEN")
openai.api_key = os.environ.get("OPENAI_API_KEY")
update_index = modal.Function.lookup(DB_NAME, "update_index")

num_of_snippets_to_query = 10
max_num_of_snippets = 5


def create_pr(
    file_change_requests: list[FileChangeRequest],
    pull_request: PullRequest,
    sweep_bot: SweepBot,
    username: str,
    installation_id: int,
    issue_number: int | None = None
):
    # Flow:
    # 1. Get relevant files
    # 2: Get human message
    # 3. Get files to change
    # 4. Get file changes
    # 5. Create PR
    organization, repo_name = sweep_bot.repo.full_name.split("/")
    metadata = {
        "repo_full_name": sweep_bot.repo.full_name,
        "organization": organization,
        "repo_name": repo_name,
        "repo_description": sweep_bot.repo.description,
        "username": username,
        "installation_id": installation_id,
        "function": "on_ticket",
        "mode": PREFIX,
    }
    posthog.capture(username, "started", properties=metadata)

    try:
        logger.info("Making PR...")
        pull_request.branch_name = sweep_bot.create_branch(pull_request.branch_name)
        completed_count, fcr_count = sweep_bot.change_files_in_github(file_change_requests, pull_request.branch_name)
        if completed_count == 0 and fcr_count != 0:
            logger.info("No changes made")
            posthog.capture(
                username,
                "failed",
                properties={
                    "error": "No changes made",
                    "reason": "No changes made",
                    **metadata,
                },
            )
            return {"success": False, "error": "No changes made"}
        # Include issue number in PR description
        if issue_number:
            # If the #issue changes, then change on_ticket (f'Fixes #{issue_number}.\n' in pr.body:)
            pr_description = f"{pull_request.content}\n\nFixes #{issue_number}.\n\nTo checkout this PR branch, run the following command in your terminal:\n```zsh\ngit checkout {pull_request.branch_name}\n```"

from typing import Any
import webbrowser
import httpx
from pydantic import BaseModel
import requests
import json
from loguru import logger
from sweepai.app.config import SweepChatConfig
from sweepai.core.entities import Function, PullRequest, Snippet
from sweepai.utils.constants import PREFIX

create_pr_function = Function(
    name="create_pr",
    description="Creates a PR.",
    parameters={
        "properties": {
            "plan": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "file_path": {
                            "type": "string",
                            "description": "The file path to change."
                        },
                        "instructions": {
                            "type": "string",
                            "description": "Concise NATURAL LANGUAGE summary of what to change in each file. There should be absolutely NO code, only English.",
                            "example": [
                                "Refactor the algorithm by moving the main function to the top of the file.",
                                "Change the implementation to recursion"
                            ]
                        },
                    },
                    "required": ["file_path", "instructions"]
                },
                "description": "A list of files to modify or create and corresponding instructions."
            },
            "title": {
                "type": "string",
                "description": "Title of PR",
            },
            "summary": {
                "type": "string",
                "description": "Detailed summary of PR",
            },
            "branch": {
                "type": "string",
                "description": "Name of branch to create PR in.",
            },
        },
        "required": ["plan", "title", "summary", "branch"]
    }
)

create_pr_function_call = {"name": "create_pr"}


def break_json(raw_json: str):
    # turns something like {"function_call": {"arguments": " \""}}{"function_call": {"arguments": "summary"}} into two objects
    try:
        yield json.loads(raw_json)

from pydantic import BaseModel
from sweepai.core.prompts import (
    human_message_prompt,
    human_message_prompt_comment,
    human_message_review_prompt,
    diff_section_prompt,
    review_follow_up_prompt,
    final_review_prompt,
    comment_line_prompt
)
from loguru import logger


class HumanMessagePrompt(BaseModel):
    repo_name: str
    issue_url: str
    username: str
    title: str
    summary: str
    snippets: list
    tree: str
    repo_description: str = ""

    def delete_file(self, file_path):
        # super unoptimized
        if file_path in [snippet.file_path for snippet in self.snippets]:
            for snippet in self.snippets:
                if snippet.file_path == file_path:
                    self.snippets.remove(snippet)

    def get_relevant_directories(self):
        deduped_paths = []
        for snippet in self.snippets:
            if snippet.file_path not in deduped_paths:
                deduped_paths.append(snippet.file_path)
        return "\n".join(deduped_paths)

    def render_snippets(self):
        return "\n".join([snippet.xml for snippet in self.snippets])

    def construct_prompt(self):
        human_messages = [{'role': msg['role'], 'content': msg['content'].format(
            repo_name=self.repo_name,
            issue_url=self.issue_url,
            username=self.username,
            repo_description=self.repo_description,
            tree=self.tree,
            title=self.title,
            description=self.summary if self.summary else "No description provided.",
            relevant_snippets=self.render_snippets(),
            relevant_directories=self.get_relevant_directories(),
        )} for msg in human_message_prompt]
        return human_messages


class HumanMessagePromptReview(HumanMessagePrompt):
    pr_title: str
    pr_message: str = ""
    diffs: list

    def format_diffs(self):
        formatted_diffs = []
        for file_name, new_file_contents, old_file_contents, file_patch in self.diffs:
            format_diff = diff_section_prompt.format(
                diff_file_path=file_name,
                new_file_content=new_file_contents.rstrip("\n"),
                previous_file_content=old_file_contents.rstrip("\n"),
                diffs=file_patch
            )
            formatted_diffs.append(format_diff)
        return "\n".join(formatted_diffs)

    def construct_prompt(self):
        human_messages = [{'role': msg['role'], 'content': msg['content'].format(
            repo_name=self.repo_name,
            issue_url=self.issue_url,
            username=self.username,
            repo_description=self.repo_description,
            tree=self.tree,
            title=self.title,
            description=self.summary,
            relevant_snippets=self.render_snippets(),
            relevant_directories=self.get_relevant_directories(),
            diffs=self.format_diffs(),
            pr_title=self.pr_title,
            pr_message=self.pr_message,
        )} for msg in human_message_review_prompt]
        return human_messages


class HumanMessageReviewFollowup(BaseModel):
    diff: tuple

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/sweepai/slack.py#L232-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_diffs.py#L506-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_code_repair.py#L1-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_review_comments.py#L1-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_pr_diffs.py#L11-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_review_comments.py#L11-L0

import typer
from sweepai.app.config import SweepChatConfig

epilog = "Sweep is a AI junior developer. Docs at https://docs.sweep.dev, install at https://github.com/apps/sweep-ai and support at https://discord.gg/sweep-ai."
typer_app = typer.Typer(epilog=epilog)


# @app.callback()
@typer_app.command()
def start():
    """
    Launch Sweep Chat in the browser
    """
    SweepChatConfig.load()
    from sweepai.app.ui import demo
    print("\033[93m⭐ Remember to star our repo at https://github.com/sweepai/sweep! \033[0m")
    demo.queue()
    demo.launch(inbrowser=True)


@typer_app.command()
def auth():
    """
    Reauthenticate with Github API for Sweep to work (for token expiry)
    """
    SweepChatConfig.load(recreate=True)
    print("Setup completed successfully!")
    print("\033[93m⭐ Remember to star our repo at https://github.com/sweepai/sweep! \033[0m")


def app():
    # hacky solution based on https://github.com/tiangolo/typer/issues/18#issuecomment-1577788949
    import sys
    commands = {'start', 'auth'}
    sys.argv.append('start') if sys.argv[-1] not in commands else None
    typer_app()


if __name__ == "__main__":

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_generate_new_files.py#L1-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_pr_diffs.py#L1-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_generate_new_files.py#L98-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/sweepai/slack.py#L267-L0

https://github.com/sweepai/sweep/blob/31bec7a73a4df3574d68219f9ef5f21fb06dade9/tests/test_chatgpt.py#L1-L0


Step 2: 🧐 Snippet Analysis

From looking through the relevant snippets, I decided to make the following modifications:

File Path | Proposed Changes
sweepai/handlers/on_ticket.py | Add code to remove the "sweep" label at the points where a failure can occur. Handle the case where the "sweep" label does not exist.

Step 3: 📝 Planning

I have created a plan for writing the pull request. I am now working on executing my plan and coding the required changes to address this issue. Here is the planned pull request:

Remove "sweep" label on failure
sweep/remove-sweep-label-on-failure

This PR addresses the issue of the "sweep" label not being removed when a failure occurs.

The changes made in this PR are in the on_ticket function in the sweepai/handlers/on_ticket.py file. Code has been added to remove the "sweep" label at the points where a failure can occur. This is done by fetching the "sweep" label object using the get_label method and then calling the remove_from_labels method.

The case where the "sweep" label does not exist is also handled: the exception raised by the get_label method is caught and a message is logged.

This change will ensure that the "sweep" label is removed from the issue when a failure occurs, providing a more accurate representation of the issue's status.
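
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the actual pull request: the helper names remove_sweep_label and run_with_label_cleanup are hypothetical, and repo and issue are assumed to behave like PyGithub Repository and Issue objects (which provide get_label and remove_from_labels). The sketch catches a broad Exception to stay dependency-free; with PyGithub, a missing label surfaces as an UnknownObjectException.

```python
import logging

logger = logging.getLogger(__name__)

LABEL_NAME = "sweep"  # same value as LABEL_NAME in the constants snippet


def remove_sweep_label(repo, issue, label_name: str = LABEL_NAME) -> bool:
    """Remove label_name from issue; return True if it was removed.

    Hypothetical helper: repo and issue are assumed to expose
    get_label(name) and remove_from_labels(*labels), as PyGithub does.
    """
    try:
        label = repo.get_label(label_name)
    except Exception as exc:  # assumption: lookup fails when the label is missing
        logger.warning("Label %r not found, nothing to remove: %s", label_name, exc)
        return False
    issue.remove_from_labels(label)
    return True


def run_with_label_cleanup(repo, issue, step):
    """Hypothetical wrapper: run step(); on failure remove the label, then re-raise."""
    try:
        return step()
    except Exception:
        remove_sweep_label(repo, issue)
        raise
```

Wrapping each failure-prone step this way keeps the label removal in one place, so on_ticket would not need a separate cleanup call at every early return.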


Step 4: ⌨️ Coding

I have finished coding the changes for this issue. I am now reviewing them for completeness.


Step 5: 🔁 Code Review

Success! 🚀


I'm a bot that handles simple bugs and feature requests but I might make mistakes. Please be kind!