jeffwidman / bitbucket-issue-migration
A small script for migrating repo issues from Bitbucket to GitHub
License: GNU General Public License v3.0
python3 migrate.py abadger abadger/c2p abadger1406 NetFM/c2p
Please enter your Bitbucket password.
Please enter your GitHub password.
Note: If your account has two-factor authentication enabled, you must use a personal access token from https://github.com/settings/tokens in place of a password for this script.
Imported Issue: https://api.github.com/repos/NetFM/c2p/issues/12
Traceback (most recent call last):
File "migrate.py", line 442, in <module>
sys.exit(main(options))
File "migrate.py", line 148, in main
assert gh_issue_id == issue['local_id']
AssertionError
When I look at github, it has just the first issue copied. This issue will get repeated if I run again, but no new issues beyond this are copied.
Given that this is a single, self-contained migration script that's only run a couple of times by an end user, I don't think it's worth the effort to support both python 2 and python 3.
Anyone who's likely to use this script will almost certainly have the technical proficiency to install python 3 in order to run a simple script.
So I'd rather update to Python 3 and be done with it.
This is more an annoyance than anything else: for public BB repos we don't actually need the BB password.
Similar to how issues currently have comments, it'd be nice to include a link to the original Bitbucket comment, just in case formatting gets messed up or something.
Running with -n works fine, but when I remove the option I get the following after entering the GitHub password:
Traceback (most recent call last):
File "migrate.py", line 442, in <module>
sys.exit(main(options))
File "migrate.py", line 131, in main
gh_issue, gh_comments, options.github_repo, gh_auth, headers
File "migrate.py", line 385, in push_github_issue
respo = requests.post(url, json=issue_data, auth=auth, headers=headers)
File "/usr/lib/python3/dist-packages/requests/api.py", line 88, in post
return request('post', url, data=data, **kwargs)
File "/usr/lib/python3/dist-packages/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
TypeError: request() got an unexpected keyword argument 'json'
I'm using username/reponame for both, github and bitbucket, but I get a 404:
Traceback (most recent call last):
File "migrate.py", line 143, in <module>
response = urllib2.urlopen(url)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 410, in open
response = meth(req, response)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 523, in http_response
'http', request, response, code, msg, hdrs)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 448, in error
return self._call_chain(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 531, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: NOT FOUND
note to self when I update readme to make sure Vitaly Babiy is credited as the original author
Currently Creole braces ({{{...}}}) are only converted on issue bodies. These should also be converted on comment bodies.
Here's an example where this crops up:
https://bitbucket.org/zzzeek/sqlalchemy/issues/1579/consider-existing-bindparam-s-in-an-insert#comment-9009854
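A minimal sketch of running one brace conversion over both issue bodies and comment bodies. The helper names (convert_creole_braces, convert_issue_and_comments) and the choice of Markdown backticks and fences as the output are assumptions for illustration, not the script's actual code.

```python
import re

FENCE = "`" * 3  # Markdown code fence, built here to keep the example readable

# Hypothetical pattern: non-greedy match of everything between {{{ and }}}.
_CREOLE_BLOCK = re.compile(r"\{\{\{(.*?)\}\}\}", re.DOTALL)


def convert_creole_braces(text):
    """Turn Creole {{{...}}} spans into Markdown code."""
    def repl(match):
        code = match.group(1)
        if "\n" in code:
            # Multi-line content becomes a fenced block.
            return FENCE + "\n" + code.strip("\n") + "\n" + FENCE
        # Single-line content becomes inline code.
        return "`" + code + "`"
    return _CREOLE_BLOCK.sub(repl, text)


def convert_issue_and_comments(body, comments):
    """Apply the same conversion to the issue body and to every comment body."""
    return convert_creole_braces(body), [convert_creole_braces(c) for c in comments]
```

Routing both kinds of text through one function would keep the two code paths from drifting apart again.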
Add https://pypi.python.org/pypi/keyring as an optional dependency that will try to retrieve GitHub/Bitbucket creds from the keyring before asking for a password.
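A rough sketch of what an optional keyring lookup could look like. The function name get_password and its backend/prompt parameters are assumptions made so the example can be exercised without a real keyring. The only keyring call used is keyring.get_password(service, username).

```python
import getpass

try:
    import keyring  # optional dependency
except ImportError:
    keyring = None


def get_password(service, username, backend=keyring, prompt=getpass.getpass):
    """Return a stored password if the keyring has one, else prompt for it."""
    if backend is not None:
        try:
            stored = backend.get_password(service, username)
        except Exception:
            # Some backends raise when no usable keyring exists;
            # treat that the same as "nothing stored".
            stored = None
        if stored:
            return stored
    return prompt("Please enter your {} password: ".format(service))
```

If keyring is not installed, or has nothing stored for the user, this degrades to the current prompting behaviour.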
Cherrypick this commit: 9355196
Shows up in both BB v1 (count) and v2 API (size).
It's a metadata field that reports the total issue/comment count. Can be used to see if we've retrieved all the issues. Can't be used for looping through issues because issues can be deleted, so no guarantee of contiguous IDs.
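A small sketch of the "did we retrieve everything" check. The field names count and size come from the note above. The function names and the shape of the payload dict are assumptions.

```python
def reported_total(payload):
    """Read the total from a v1-style ('count') or v2-style ('size') payload."""
    for key in ("count", "size"):
        if key in payload:
            return payload[key]
    raise KeyError("no 'count' or 'size' field in payload")


def all_retrieved(payload, issues):
    """True when the number of fetched issues matches the reported total."""
    return len(issues) == reported_total(payload)
```

This only compares totals. It deliberately does not use the total as a loop bound over IDs, since deleted issues leave gaps.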
Consider hgtools 30. That ticket, migrated from bitbucket, indicates it was originally reported by https://github.com/laffoyb, but that URL returns a 404.
The migration script shouldn't be declaring a Github user based on naive inference. It should omit the Github link unless the script is provided with a username mapping. Otherwise, it creates broken or possibly erroneous links.
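A sketch of the suggested behaviour, assuming a hypothetical user_map dict of Bitbucket usernames to GitHub usernames supplied by whoever runs the script. The function name format_reporter is also made up for illustration.

```python
def format_reporter(bb_username, user_map=None):
    """Link to a GitHub profile only when an explicit mapping exists."""
    gh_username = (user_map or {}).get(bb_username)
    if gh_username:
        return "[{0}](https://github.com/{0})".format(gh_username)
    # No mapping supplied: show the Bitbucket name as plain text, no link.
    return bb_username
```

With no mapping, format_reporter("laffoyb") yields just the name, so nothing points at a GitHub profile that may not exist.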
It's supported by both the BB API and GH API:
issue['metadata']['milestone']
issue['milestone']
GH requires that the milestone ID already exist when the issue is imported, otherwise the import will fail.
Fully implementing this probably looks like:
convert_issue()
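One piece of this, gathering the milestone names that would need to exist on GitHub before any issue is imported, might look like the sketch below. The function name collect_milestones is hypothetical. It assumes each Bitbucket issue is a dict carrying issue['metadata']['milestone'] as noted above, and creating the milestones on GitHub is not shown.

```python
def collect_milestones(bb_issues):
    """Return the distinct, non-empty milestone names in first-seen order."""
    seen = []
    for issue in bb_issues:
        name = issue.get("metadata", {}).get("milestone")
        if name and name not in seen:
            seen.append(name)
    return seen
```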
Reported here: #20 (comment)
The problem with -f comes from Bitbucket's API, which is broken: neither "start" nor "limit" works as the docs describe. With 17 issues, if I pass -f 15 it copies the first 2 (= 17 - 15) issues. Similarly, limit does not limit anything; it just takes the last X issues.
@serviceman do you know if this is still an issue?
I saw bitbucket rolled out v2 of their cloud apis--would switching to those fix this?
Test issue.
Traceback (most recent call last):
File "migrate.py", line 304, in <module>
push_issue(gh_username, gh_repository, issue, body, comments)
File "migrate.py", line 220, in push_issue
gh_repository
File "/usr/local/lib/python2.7/site-packages/pygithub3/services/issues/__init__.py", line 94, in create
return self._post(request)
File "/usr/local/lib/python2.7/site-packages/pygithub3/services/base.py", line 139, in _post
response = self._client.post(request, data=input_data, **kwargs)
File "/usr/local/lib/python2.7/site-packages/pygithub3/core/client.py", line 89, in post
response = self.request('post', request, **kwargs)
File "/usr/local/lib/python2.7/site-packages/pygithub3/core/client.py", line 71, in wrapper
return func(self, verb, request, **kwargs)
File "/usr/local/lib/python2.7/site-packages/pygithub3/core/client.py", line 80, in request
GithubError(response).process()
File "/usr/local/lib/python2.7/site-packages/pygithub3/core/errors.py", line 36, in process
self.response.raise_for_status()
File "/usr/local/lib/python2.7/site-packages/requests/models.py", line 834, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden
We have some custom regexes to parse the created_on field; we should just drop those and switch to using utc_last_updated (BB API v1) or created_on (BB API v2).
Need to double-check that the ISO formats there are what GitHub's import API expects.
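A sketch of normalizing timestamps without custom regexes, using only the standard library (Python 3.7+ for datetime.fromisoformat). The sample input formats and the assumption that GitHub wants UTC with a trailing Z are guesses that would need checking against real API responses. The function name is hypothetical.

```python
from datetime import datetime, timezone


def to_github_timestamp(bb_timestamp):
    """Convert an ISO 8601 string to UTC in YYYY-MM-DDTHH:MM:SSZ form."""
    parsed = datetime.fromisoformat(bb_timestamp)
    if parsed.tzinfo is None:
        # Assumption: a timestamp with no offset is already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Fractional seconds are dropped and any offset is folded into UTC, so both API versions would produce the same shape of string.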
I believe Github has recently disabled SSL v3 and I am getting the following errors:
Traceback (most recent call last):
File ".\migrate.py", line 177, in <module>
ni = github.issues.create(issue_data, options.github_repo.split('/')[0], options.github_repo.split('/')[1] )
File "E:\Python27\lib\site-packages\pygithub3\services\issues\__init__.py", line 101, in create
return self._post(request)
File "E:\Python27\lib\site-packages\pygithub3\services\base.py", line 138, in _post
response = self._client.post(request, data=input_data, **kwargs)
File "E:\Python27\lib\site-packages\pygithub3\core\client.py", line 88, in post
response = self.request('post', request, **kwargs)
File "E:\Python27\lib\site-packages\pygithub3\core\client.py", line 70, in wrapper
return func(self, verb, request, **kwargs)
File "E:\Python27\lib\site-packages\pygithub3\core\client.py", line 76, in request
response = self.requester.request(verb, request, **kwargs)
File "E:\Python27\lib\site-packages\requests\sessions.py", line 252, in request
r.send(prefetch=prefetch)
File "E:\Python27\lib\site-packages\requests\models.py", line 632, in send
raise SSLError(e)
requests.exceptions.SSLError: [Errno 8] _ssl.c:507: EOF occurred in violation of protocol
Following the instructions at https://lukasa.co.uk/2013/01/Choosing_SSL_Version_In_Requests/ and putting the 3rd code snippet at the top of migrate.py gets rid of that error (as long as you upgrade the requests package to >=1.0.0), but then there is a dependency mismatch with the requests package and I get the following error:
Traceback (most recent call last):
File ".\migrate.py", line 72, in <module>
github = Github(login=options.github_username, password=github_password)
File "E:\Python27\lib\site-packages\pygithub3\github.py", line 25, in __init__
self._users = User(**config)
File "E:\Python27\lib\site-packages\pygithub3\services\users\__init__.py", line 14, in __init__
self.keys = Keys(**config)
File "E:\Python27\lib\site-packages\pygithub3\services\base.py", line 40, in __init__
self._client = Client(**config)
File "E:\Python27\lib\site-packages\pygithub3\core\client.py", line 28, in __init__
self.__set_params(self.config)
File "E:\Python27\lib\site-packages\pygithub3\core\client.py", line 56, in __set_params
self.requester.params.append(per_page)
AttributeError: 'dict' object has no attribute 'append'
> py -2 .\migrate.py kang python-keyring-lib jaraco jaraco/keyring
Please enter your GitHub password
Traceback (most recent call last):
File ".\migrate.py", line 304, in <module>
push_issue(gh_username, gh_repository, issue, body, comments)
File ".\migrate.py", line 266, in push_issue
format_comment(comment),
File ".\migrate.py", line 122, in format_comment
comment['user'].encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 153: ordinal not in range(128)
things to refactor:
Should use to track how far we are in issue migration.
Ideally this is an optional import so if the user doesn't have it, we gracefully degrade to not counting progress.
I accidentally broke it here: #35 (comment)
I'm currently attempting to migrate https://bitbucket.org/jaraco/hgtools. I'm using the jaraco/bitbucket-issue-migration#refactor-and-python3 to get the keyring support, but the technique for importing the issues into github should be the same.
When I run the migration, I get different results for different runs. In a recent test, I ran:
python migrate.py jaraco hgtools jaraco jaraco/hgtoolstest
Which yielded lines including:
Created bitbucket issue 27 (27/31): hgtools is a weird name [1 comments]
...
Created bitbucket issue 31 (31/31): Automatic versioning breaks if mercurial repository is synced with a GIT upstream through hg-git extension [0 comments]
Created 31 issues
However, despite the script having created the issues in order, that issue 27 appears as issue 28. Looks like issue 31 was processed as 25.
But the behavior is not deterministic. In another run, there were more and different discrepancies.
It seems there may be a race condition in the Github issue importer such that it can get some issues out of order. I'm not yet sure what can or should be done.
When I tried to import issues from a Bitbucket repo to an organization repo, I got this error:
pygithub3.exceptions.NotFound: 404 - Not Found
Hi,
I tried to migrate a rather large project from bitbucket to github with your script but eventually had to give up because the script kept failing with a 403 FORBIDDEN.
I found out that the problem was not related to access rights but to the fact that the comments attached to some of the issues I tried to migrate were quite big. (I always assign change sets to issues.) For these comments the corresponding call in push_issue just fails, leaving the issue created but with only a portion of the comments added.
Maybe you can add some support for that but right now the script just can not migrate large issues which is a blocker and forced me to give up on migrating to github.
Cheers,
Sebastian
Note: if you need to migrate to a GitHub organizational repository, use your personal username, but the appropriate API token for the repository.
AFAIK, API token is gone from Github and thus I'm not sure how to upload to my organizational repo.
For example, issue 3 was deleted.
Now the 3rd issue created in Github is BB issue 4, and the IDs get out of sync.
Non-synced IDs break the fix_links() conversion to relative GH URLs.
Test it on the SQLAlchemy repo, as IDs and issue total count do not match, so guaranteed to hit this issue.
I'm not sure how to solve, as Github doesn't let us input the issue ID. Perhaps when it happens, create a placeholder issue in github just to keep the IDs synced. Be sure to auto-close this placeholder issue.
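The placeholder idea could be planned up front, roughly as below. This is a sketch under two assumptions: the target GitHub repo is empty so numbering starts at 1, and Bitbucket IDs are unique. The function name plan_with_placeholders is hypothetical.

```python
def plan_with_placeholders(bb_issue_ids):
    """Return (github_number, bb_id) pairs; bb_id is None for a placeholder."""
    plan = []
    expected = 1
    for bb_id in sorted(bb_issue_ids):
        while expected < bb_id:
            # Gap left by a deleted Bitbucket issue: create and auto-close
            # a placeholder here so later numbers stay aligned.
            plan.append((expected, None))
            expected += 1
        plan.append((expected, bb_id))
        expected += 1
    return plan
```

For IDs 1, 2 and 4 (issue 3 deleted), the plan has a placeholder at number 3, so BB issue 4 still lands on GitHub issue 4.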
Default for v1 is 15, but can specify as high as 50:
https://confluence.atlassian.com/bitbucket/issues-resource-296095191.html#issuesResource-GETalistofissuesinarepository'stracker
If we ever switch to v2, can specify as high as 100 (#30)
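A sketch of paging with a larger page size. fetch_page is a hypothetical callable standing in for whatever performs the HTTP request. It takes start and limit, and returns a list of issues (empty when there are no more). Whether Bitbucket honours these parameters as documented is a separate question.

```python
def iter_issues(fetch_page, limit=50):
    """Yield issues page by page until an empty page comes back."""
    start = 0
    while True:
        page = fetch_page(start=start, limit=limit)
        if not page:
            break
        for issue in page:
            yield issue
        # Advance by what was actually returned, not by the requested limit.
        start += len(page)
```

Raising limit from 15 to 50 cuts the number of requests for a 491-issue tracker from 33 to 10.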
According to the script's help command, the script's example usage here on GitHub is outdated.
Script help command output:
usage: migrate.py [-h] [-n] [-f START]
bitbucket_username bitbucket_repo github_username
github_repo
A tool to migrate issues from Bitbucket to GitHub. note: the Bitbucket
repository and issue tracker have to bepublic
positional arguments:
bitbucket_username Your Bitbucket username
bitbucket_repo Bitbucket repository to pull data from.
github_username Your GitHub username
github_repo GitHub to add issues to. Format: <username>/<repo
name>
optional arguments:
-h, --help show this help message and exit
-n, --dry-run Perform a dry run and print eveything.
-f START, --start_id START
Bitbucket issue id from which to start import
If you have two-factor auth turned on in your GitHub account, the uploader fails and throws a 401 error. It would be useful to catch this error and suggest that the person disable 2FA if possible, or that they try their password again.
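A sketch of turning a bare 401 into an actionable hint. The function name auth_hint is hypothetical, and the message points at personal access tokens rather than disabling 2FA, which is a substitution on my part.

```python
def auth_hint(status_code):
    """Return a human-readable hint for an auth failure, or None otherwise."""
    if status_code == 401:
        return (
            "GitHub rejected the credentials (401). If two-factor "
            "authentication is enabled, use a personal access token from "
            "https://github.com/settings/tokens in place of a password; "
            "otherwise, try your password again."
        )
    return None
```

The caller would print the hint and exit instead of surfacing a raw traceback.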
Hi,
Since the issue tracker API in GitHub does not provide any transactional semantics, the script must make sure that nothing can fail once it has created a new issue in GitHub. Otherwise it will recreate the same issue over and over again when retrying. I ran into encoding problems and experienced exactly this problem.
Suggested change:
def push_issue(gh_username, gh_repository, issue, body, comments):
    # Perform all formatting before the issue is created, so a formatting
    # error cannot leave a half-migrated issue behind.
    formatted_comments = []
    for comment in comments:
        try:
            formatted_comments.append(format_comment(comment))
        except UnicodeDecodeError:
            # Show which comment failed, then re-raise before anything is created.
            print("failed comment: %r %s %s" % (
                comment, type(comment['user']), type(comment['body'])))
            raise
    ...
    # Finally, use the preformatted comments.
    for comment in formatted_comments:
        github.issues.comments.create(
            new_issue.number,
            comment,
            gh_username,
            gh_repository
        )
Cheers,
Sebastian
Now that we're using requests, not sure why we need urllib2 in the script
You have triggered an abuse detection mechanism and have been temporarily blocked from content creation. Please retry your request again later.
Anyone know how sensitive GitHub's thing is?
Bitbucket rolled out API v2, need to check whether we should switch or not:
https://blog.bitbucket.org/2013/11/12/api-2-0-new-function-and-enhanced-usability/
In general, it's obviously better to be using what BB is most excited about, but it's not clear that BB plans to expand their v2 API to the point of eventually killing their v1 API.
A brief exploration of the API suggests there are some endpoints that could fix #16, #32, #8.
I'm running a migration of issues from bb://pypa/setuptools to gh://pypa/setuptools, but twice now I've hit an error:
Completed 80 of 491 issues
Traceback (most recent call last):
File "migrate.py", line 444, in <module>
sys.exit(main(options))
File "migrate.py", line 144, in main
status_url, gh_auth, headers
File "migrate.py", line 420, in verify_github_issue_import_finished
.format(status_url, respo.status_code)
RuntimeError: Failed to check GitHub issue import status url: https://api.github.com/repos/pypa/setuptools/import/issues/441874 due to unexpected HTTP status code: 404
The first time it happened at issue 288, so it's apparently intermittent, perhaps a race condition.
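If the 404 really is transient (which is only a guess at this point), one possible mitigation is to retry the status check a few times with a growing delay before giving up. A sketch, where get_status is a hypothetical callable returning the HTTP status code of the status URL:

```python
import time


def poll_status(get_status, retries=5, delay=1.0, sleep=time.sleep):
    """Call get_status() until it stops returning 404 or retries run out."""
    for attempt in range(retries):
        code = get_status()
        if code != 404:
            return code
        if attempt < retries - 1:
            # Exponential backoff between attempts: delay, 2*delay, 4*delay, ...
            sleep(delay * (2 ** attempt))
    return 404
```

A persistent 404 still comes back as 404, so the caller can raise the existing RuntimeError in that case.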
Need to switch from start_id to start, since it's not the issue ID, it's the current list index of the issue.
Also need to clarify that if issue IDs were deleted in the past from BB (see #42), then list index will decrease, but issue ID won't change.
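To make the distinction concrete, here is a tiny sketch of mapping an issue ID to the list index that start would need. The function name is hypothetical, and it assumes the API lists issues in ascending ID order.

```python
def start_index_for_issue_id(issue_ids, issue_id):
    """Return the list index at which the given issue ID currently sits."""
    return sorted(issue_ids).index(issue_id)
```

With IDs 1, 2, 4 and 5 (issue 3 deleted), issue 4 sits at index 2, so start has to be 2 to begin from issue 4.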
The readme says "The script will also throttle the amount of requests it makes per minute to avoid the 60 request per minute limit that github enforces." But in trying this out just now, I'm getting 401 errors from github.
Traceback (most recent call last):
File "migrate.py", line 153, in <module>
title=issue.get('title').encode('utf-8'),
File "/usr/local/lib/python2.7/dist-packages/github2/issues.py", line 53, in open
filter="issue", datatype=Issue)
File "/usr/local/lib/python2.7/dist-packages/github2/core.py", line 50, in get_value
value = self.make_request(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/github2/core.py", line 41, in make_request
**post_data)
File "/usr/local/lib/python2.7/dist-packages/github2/request.py", line 70, in post
method="POST")
File "/usr/local/lib/python2.7/dist-packages/github2/request.py", line 84, in make_request
result = self.raw_request(url, extra_post_data, method=method)
File "/usr/local/lib/python2.7/dist-packages/github2/request.py", line 115, in raw_request
response.status, response_text))
RuntimeError: unexpected response from github.com 401: ' '
Not sure it makes any difference, but I'm trying to sync to an org account repo.
Hi Jeff,
Probably doing something very wrong, but how do I save my
username/password so that it works with the utility ?
have tried putting in ~/.hgrc - see below.
Thanks Dave
ps Use to Climb in UK - Chair Ladder was my favourite area, Cornwall
granite sea cliff climbing.
(py3) dave$ python3 migrate.py [email protected]
https://bitbucket.org/abadger/c2p abadger1406
https://github.com/NetFM/c2p
Traceback (most recent call last):
File "migrate.py", line 442, in <module>
sys.exit(main(options))
File "migrate.py", line 103, in main
kr_pass_bb = keyring.get_password('Bitbucket', options.bitbucket_username)
File "/home/dave/py3/lib/python3.4/site-packages/keyring/core.py",
line 42, in get_password
return _keyring_backend.get_password(service_name, username)
File "/home/dave/py3/lib/python3.4/site-packages/keyring/backends/fail.py",
line 18, in get_password
raise RuntimeError("No recommended password was available")
RuntimeError: No recommended password was available
(py3) dave$ more ~/.hgrc-
[auth]
x.prefix = https://bitbucket.org/abadger/c2p
x.username = [email protected]
x.password = xxx-password-xxx
(py3) dave$
#25 added support for bitbucket labels/metadata/component and translates them to github labels:
https://github.com/jeffwidman/bitbucket_issue_migration/blob/master/migrate.py#L295
@nicoddemus do you remember what bitbucket metadata & component types refer to?
Example:
https://bitbucket.org/leafstorm/flask-uploads/issues/11
https://github.com/jeffwidman/testing/issues/116
Easiest solution is probably just do: No description provided.
Just a comment, not really an issue - this only works on issue trackers that are public in bitbucket, otherwise you'll get an HTTP error and the migration will fail.
{"error": {"message": "Resource not found", "detail": "There is no API hosted at this URL.\n\nFor information about our API's, please refer to the documentation at: https://confluence.atlassian.com/x/IYBGDQ"}}
A couple of improvements:
[changeset:2320](changeset:2320)
In #19, I encountered an error, but only when I tried to run the migration for real. When I ran it using --dry-run, the error was not encountered. Dry run should probably catch errors in formatting.
During a dry run, Github credentials are not needed, but they're prompted for anyway. This was not an issue in the refactor-and-python3 submission, which only prompted for credentials in the SubmitHandler.
There are two locations in migrate.py where the name gh_auth is being used instead of options.gh_auth, which causes the following traceback:
Traceback (most recent call last):
File "migrate.py", line 481, in <module>
sys.exit(main(options))
File "migrate.py", line 172, in main
gh_issue, gh_comments, options.github_repo, gh_auth, headers
NameError: name 'gh_auth' is not defined
It's only in two places. PR is incoming.
like #9 but for BB.
Right now it's not an issue since we don't auth with BB, but will need addressing for #22
Unfortunately, Bitbucket doesn't support application specific passwords yet:
https://bitbucket.org/site/master/issues/11774/application-specific-passwords-or-tokens
So until then:
After fixing #56 I got the following error
$ python3 migrate.py panaxit panax-cli panaxit panax-cli
Please enter your GitHub password.
Note: If your account has two-factor authentication enabled, you must use a personal access token from https://github.com/settings/tokens in place of a password for this script.
Traceback (most recent call last):
File "migrate.py", line 435, in <module>
sys.exit(main(options))
File "migrate.py", line 109, in main
issues = get_issues(bb_url, options.start)
File "migrate.py", line 286, in get_issues
.format(url=bb_url)
RuntimeError: Could not find the Bitbucket repository: https://api.bitbucket.org/1.0/repositories/panax-cli/issues
Hint: the Bitbucket repository name is case-sensitive.