singer-io / tap-jira

A Singer.io tap for extracting data from the JIRA API

License: GNU Affero General Public License v3.0

Python 99.74% Makefile 0.26%

tap-jira's Introduction

tap-jira

This is a Singer tap that produces JSON-formatted data following the Singer spec.

This tap:

Pulls raw data from the JIRA Cloud REST API
Extracts the following resources:
Outputs the schema for each resource
Incrementally pulls data based on the input state

Quick Start

Install

pip install tap-jira

Create the config file

Create a JSON file called config.json. Its contents should look like (for Basic Auth):

 {
     "start_date": "2010-01-01",
     "username": "your-jira-username",
     "password": "your-jira-password",
     "base_url": "https://your-jira-domain",
     "user_agent": "<user-agent>",
     "request_timeout": 300,
     "groups": "jira-administrators, site-admins, jira-software-users"
 }

or (for OAuth):

{
  "oauth_client_secret": "<oauth-client-secret>",
  "user_agent": "<user-agent>",
  "oauth_client_id": "<oauth-client-id>",
  "access_token": "<access-token>",
  "cloud_id": "<cloud-id>",
  "refresh_token": "<refresh-token>",
  "start_date": "<e.g. 2017-12-04T19:19:32Z>",
  "request_timeout": 300,
  "groups": "jira-administrators, site-admins, jira-software-users"
}

The start_date specifies the date at which the tap will begin pulling data (for those resources that support this).

For Basic Auth, the base_url is the URL where your Jira installation can be found. For example, it might look like: https://mycompany.atlassian.net.

The groups setting specifies which groups are used for the users stream. It is an optional parameter. The default value is ["jira-administrators", "jira-software-users", "jira-core-users", "jira-users", "users"].
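
As a rough illustration of how a comma-separated groups value like the one in the sample configs could be turned into a list, with the documented default as a fallback, here is a minimal sketch. The helper name `parse_groups` is hypothetical, and the tap's actual parsing may differ.

```python
# Default group list as documented above.
DEFAULT_GROUPS = [
    "jira-administrators",
    "jira-software-users",
    "jira-core-users",
    "jira-users",
    "users",
]


def parse_groups(config):
    """Return the group names from a config dict (hypothetical helper)."""
    raw = config.get("groups")
    if not raw:
        # Optional parameter: fall back to the documented default.
        return list(DEFAULT_GROUPS)
    # Split on commas and drop surrounding whitespace and empty entries.
    return [name.strip() for name in raw.split(",") if name.strip()]
```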

Run the Tap in Discovery Mode
```
tap-jira -c config.json -d
```
See the Singer docs on discovery mode here.

Run the Tap in Sync Mode

tap-jira -c config.json -p catalog-file.json

tap-jira's Issues

Additional properties are not allowed

Hi everyone,

When I include the Projects schema in a catalog file generated by the discovery option, I get the following error message:

CRITICAL Record does not pass schema validation: Additional properties are not allowed ('style', 'isPrivate' were unexpected)

Has someone else seen this before?
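
One way to picture the failure: the record carries keys that the schema does not declare. The sketch below drops undeclared keys before validation. It is only an illustration of the mismatch, not the tap's code; `prune_to_schema` is a hypothetical helper, and the schema and record are made-up, trimmed examples. Adding the missing properties to the schema is the other obvious route.

```python
def prune_to_schema(record, schema):
    """Drop keys that the schema's "properties" does not declare."""
    allowed = schema.get("properties", {})
    return {key: value for key, value in record.items() if key in allowed}


# Trimmed, made-up schema and record for illustration only.
projects_schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {"id": {"type": "string"}, "key": {"type": "string"}},
}
record = {"id": "10000", "key": "DEMO", "style": "classic", "isPrivate": False}

print(prune_to_schema(record, projects_schema))
```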

Changelogs cannot be linked to Issues

There is no data in the changelogs endpoint that I can use to join data from changelogs to the parent issue.

It looks like the pertinent "issueIdOrKey" is passed in when requesting changelog data.

Can we output the issue id in the changelog object as well? Doing so would help to connect this data to parent issue(s).
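
A minimal sketch of what the request amounts to, assuming the changelog entries are read from an issue payload fetched with `expand=changelog` (where they sit under `changelog.histories`). The function name and the `issueId` output field are hypothetical.

```python
def changelogs_with_issue_id(issue):
    """Yield each changelog history of an issue with the issue id attached."""
    histories = issue.get("changelog", {}).get("histories", [])
    for history in histories:
        # Copy so the original payload is left untouched.
        row = dict(history)
        row["issueId"] = issue["id"]  # hypothetical output field name
        yield row


# Made-up payload shaped like an issue fetched with expand=changelog.
issue = {
    "id": "10001",
    "key": "DEMO-1",
    "changelog": {"histories": [{"id": "500", "items": []}]},
}
rows = list(changelogs_with_issue_id(issue))
```

With the id on every row, changelog records can be joined back to their parent issue downstream.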

Error connecting to Jira server with self-signed certificate - SSLCertVerificationError - certificate verify failed: unable to get local issuer certificate

We have an on-premise Jira installation which uses a self-signed certificate. I integrated the self-signed certificate into the OS infrastructure (I can connect with curl).

The tap-jira extractor fails to handle self-signed certificates. Using REQUESTS_CA_BUNDLE or SSL_CERT_FILE had no effect. Both environment variables worked for me when I created a simple HTTP client using urllib3 (which is used by the plugin). tap-jira uses certifi to validate SSL certificates, ignoring the environment variables.

Is there a way to provide the self-signed root certificate to the extractor?

Here is a setup to reproduce the error - precondition:

Jira server with self-signed certificate
Extract the certificate and integrate it into your client's root certificate database (for example, with update-ca-certificates on Debian)

➜ meltano add extractor tap-jira --variant singer-io
Added extractor 'tap-jira' to your project
Variant:	singer-io
Repository:	https://github.com/singer-io/tap-jira
Documentation:	https://hub.meltano.com/extractors/tap-jira--singer-io

2024-06-29T07:54:23.010831Z [info     ] Installing extractor 'tap-jira'
2024-06-29T07:54:33.398365Z [info     ] Installed extractor 'tap-jira'

To learn more about extractor 'tap-jira', visit https://hub.meltano.com/extractors/tap-jira--singer-io

# Configure the plugin - replaced our domain name with example.com
➜ meltano config tap-jira set base_url "https://jira.example.com"
➜ meltano config tap-jira set start_date 2024-06-28
➜ echo TAP_JIRA_USERNAME=your_jira_user >> .env
➜ echo TAP_JIRA_PASSWORD=your_jira_password >> .env

➜ meltano config tap-jira test
2024-06-29T07:54:48.547509Z [info     ] The default environment 'dev' will be ignored for `meltano config`. To configure a specific environment, please use the option `--environment=<environment name>`.
Need help fixing this problem? Visit http://melta.no/ for troubleshooting steps, or to
join our friendly Slack community.

Plugin configuration is invalid
Catalog discovery failed: command ['/home/max/dev/qxvp/mx-dev-analytics/.meltano/extractors/tap-jira/venv/bin/tap-jira', '--config', '/home/max/dev/qxvp/mx-dev-analytics/.meltano/run/tap-jira/tap.74b43b5c-493a-4b2a-9dcb-bf01818f8959.config.json', '--discover'] returned 1 with stderr:
 INFO Using Basic Auth API authentication
INFO Backing off send(...) for 1.0s (requests.exceptions.SSLError: HTTPSConnectionPool(host='jira.example.com', port=443): Max retries exceeded with url: /rest/api/2/serverInfo (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)'))))
...

I found a hacky workaround by injecting pip-system-certs into the extractor's virtual environment:

➜ . .meltano/extractors/tap-jira/venv/bin/activate
(venv) ➜ python -m pip install pip-system-certs
(venv) ➜ deactivate
➜ meltano config tap-jira test
2024-06-29T08:12:21.188098Z [info     ] The default environment 'dev' will be ignored for `meltano config`. To configure a specific environment, please use the option `--environment=<environment name>`.
Plugin configuration is valid
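
For comparison, a less invasive approach might be to resolve the CA bundle path explicitly and hand it to the HTTP client. The sketch below only covers the lookup; `resolve_ca_bundle` is a hypothetical helper, and whether the tap offers any place to plug the result in is not established here.

```python
import os


def resolve_ca_bundle(environ=None):
    """Return the first CA bundle path set in the environment, or None."""
    environ = os.environ if environ is None else environ
    # Same variables the report tried, checked in this order.
    for name in ("REQUESTS_CA_BUNDLE", "SSL_CERT_FILE"):
        path = environ.get(name)
        if path:
            return path
    return None
```

With the requests library, such a path can be assigned to a session's `verify` attribute so that certificate verification uses that bundle.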

The group named 'jira-software-users' does not exist

Hi,

Our Jira integration in Stitch stopped working last week with the error message in the subject line.

Could this be related to this commit:
e6c54d3

Thx

Basic auth will be deprecated

Per https://developer.atlassian.com/cloud/jira/platform/deprecation-notice-basic-auth-and-cookie-based-auth/, basic auth with a password will no longer be supported. I suggest replacing the basic auth feature here with API token authorization. The README should be updated as well, with an example config.
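
For reference, Jira Cloud API tokens are sent over HTTP Basic auth, with the account email as the username and the token in place of the password. A minimal sketch of building that header follows; the function name is hypothetical and this is not the tap's code.

```python
import base64


def api_token_auth_header(email, api_token):
    """Build a Basic Authorization header from an email and an API token."""
    credentials = "{}:{}".format(email, api_token).encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + encoded}
```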

primary key of Users stream needs to be updated

The current primary key of the users stream is key. However, that field has been deprecated and replaced by accountId.

Using version 1.0.5 of tap-jira, I'm getting this error: INFO Cannot find ['key'] primary key(s) in record: {'self': 'https://transferwise.atlassian.net/rest/api/2/user?accountId=blabla ....etc}
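
A minimal sketch of the change being asked for, assuming user records carry `accountId` and only older payloads still have `key`. The helper name is hypothetical.

```python
def user_key_properties(record):
    """Pick the primary key field for a user record (hypothetical helper)."""
    if "accountId" in record:
        return ["accountId"]
    # Older payloads may still carry the deprecated "key" field.
    return ["key"]
```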

Refactor class `Everything` back into class `Stream`

Take class Everything's sync() and add it to class Stream's definition
- Confirm all subclasses of Stream implement a sync() anyway
Delete class Everything
In class Stream's __init__(), add path=None
Change the initialization of the 3 Everything objects to initialize Stream objects
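
The steps above could end up looking roughly like the sketch below. The constructor arguments, the `fetch` callable standing in for the HTTP client, and the example stream are assumptions for illustration, not the repository's actual code.

```python
class Stream:
    """Sketch of a base stream after folding in Everything's sync()."""

    def __init__(self, tap_stream_id, pk_fields, path=None):
        self.tap_stream_id = tap_stream_id
        self.pk_fields = pk_fields
        self.path = path  # only set for simple "fetch everything" streams

    def sync(self, fetch):
        """Default sync: fetch every record from self.path.

        `fetch` is a hypothetical callable standing in for the HTTP client.
        """
        if self.path is None:
            raise NotImplementedError("streams without a path must override sync()")
        return list(fetch(self.path))


# Formerly an Everything(...) object, now a plain Stream (illustrative values).
project_types = Stream("project_types", ["key"], path="/rest/api/2/project/type")
```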

Optionally Refresh OAuth Credentials

Hi,

Our organization refreshes OAuth credentials before we pull data using tap-jira. It turns out that tap-jira also refreshes the tokens internally. As a result, the previously generated tokens are invalidated. We need the previously generated tokens to remain valid.

It would be great if refreshing the OAuth token could be made optional.
Thanks

Not getting data | JSONDecodeError | {"currently_syncing": null}

Hi, I am using tap-jira and target-csv to export data from a locally hosted Jira to CSV.
I have set up config.json for Jira and tried both discovery mode and sync mode:
tap-jira -c config-jira.json -d > catalog.json
tap-jira -c config-jira.json -p catalog.json

There are a few issues to note here:

Whether I run tap-jira in discovery mode with every value in the config left empty
{ "start_date": "", "username": "", "password": "", "base_url": "", "user_agent": "" }
or with the actual correct values, I get the same catalog.json file in both cases, and no exceptions.
When I run in sync mode on the catalog file (generated with correct values in the config), I get a JSONDecodeError exception:

Traceback (most recent call last):
  File "C:\Singer\tap-jira\Scripts\tap-jira-script.py", line 33, in <module>
    sys.exit(load_entry_point('tap-jira==2.0.0', 'console_scripts', 'tap-jira')())
  File "c:\singer\tap-jira\lib\site-packages\tap_jira\__init__.py", line 135, in main
    raise exc
  File "c:\singer\tap-jira\lib\site-packages\tap_jira\__init__.py", line 132, in main
    main_impl()
  File "c:\singer\tap-jira\lib\site-packages\tap_jira\__init__.py", line 109, in main_impl
    args = get_args()
  File "c:\singer\tap-jira\lib\site-packages\tap_jira\__init__.py", line 28, in get_args
    unchecked_args = utils.parse_args([])
  File "c:\singer\tap-jira\lib\site-packages\singer\utils.py", line 174, in parse_args
    args.properties = load_json(args.properties)
  File "c:\singer\tap-jira\lib\site-packages\singer\utils.py", line 109, in load_json
    return json.load(fil)
  File "C:\Python39\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Python39\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Python39\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python39\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I also see {"currently_syncing": null} when I run the command that sends data to target-csv:

INFO Using Basic Auth API authentication
{"currently_syncing": null}

Q1. Is Singer compatible with PowerShell and Windows?
Q2. Is there something that I am missing in the execution?
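
One possible cause, which the report does not confirm: in Windows PowerShell 5.x the `>` redirection writes UTF-16 by default, so a catalog.json produced that way may fail to parse with exactly this "Expecting value: line 1 column 1" error. The sketch below loads such a file regardless of encoding and re-saves it as UTF-8. Both function names are hypothetical.

```python
import json


def load_json_any_encoding(path):
    """Load JSON from a file that may be UTF-8 (with or without BOM) or UTF-16."""
    with open(path, "rb") as handle:
        raw = handle.read()
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        text = raw.decode("utf-16")  # the BOM tells the codec the byte order
    else:
        text = raw.decode("utf-8-sig")  # strips a UTF-8 BOM if present
    return json.loads(text)


def rewrite_as_utf8(path):
    """Re-save the file as plain UTF-8 so other tools can read it."""
    data = load_json_any_encoding(path)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=2)
    return data
```

Running `rewrite_as_utf8("catalog.json")` once after discovery and then retrying sync mode would show whether encoding was the problem.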

Wrong API resource to query Components

When the Components stream is selected, the tap tries to use the
/rest/api/2/project/{projectIdOrKey}/component
resource and throws an exception, because the correct one is:
/rest/api/2/project/{projectIdOrKey}/components (with an 's')

See doc: https://developer.atlassian.com/cloud/jira/platform/rest/v2/api-group-project-components/#api-rest-api-2-project-projectidorkey-components-get

tap-jira/tap_jira/streams.py

Line 156 in 4bd9abd

path = "/rest/api/2/project/{}/component".format(project["id"])

Tested with on-prem Jira, but most likely cloud version has the same issue.
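
A minimal sketch of the corrected path construction, using the plural resource name from the linked documentation. The wrapper function is hypothetical; only the path string is the point.

```python
def components_path(project):
    """Return the project components path, using the plural resource name."""
    return "/rest/api/2/project/{}/components".format(project["id"])
```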

Support for personal access token

Hi,

Our team is using Jira on-prem. As of now, tap-jira doesn't support personal access token authentication. For more information, please refer to https://confluence.atlassian.com/enterprise/using-personal-access-tokens-1026032365.html.

It would be great to have support for personal access tokens.
Thanks
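
For context, personal access tokens on Jira Server and Data Center are sent as a bearer token in the Authorization header. A minimal sketch of that header follows; the function name is hypothetical and nothing like it is confirmed to exist in the tap.

```python
def pat_auth_header(personal_access_token):
    """Build a Bearer Authorization header from a personal access token."""
    return {"Authorization": "Bearer " + personal_access_token}
```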

Getting PRs merged

Dear singer team,

I think there are quite a few open PRs that could be reviewed and merged into the main branch. Is there any way to support you on this? If not, is there any other chance to move forward? :-)

Best,
Daniel

Incremental sync flaw

I'm running this tap through meltano. I've been running since March 2023 and have noticed that it occasionally misses records. If I do a full re-sync, the records show up, but something about the state is not working properly. Have you seen this before?

Broken links in README.md

All resource links in the README.md redirect to https://developer.atlassian.com/cloud/jira/platform/rest/v3/intro/

Status names mapping to status subcategories

Any plans to include a table with all the status names and their corresponding subcategories?

We can get both in the issues table, but that only has the statuses that are currently in use. It's possible that an issue changed from or to a status that is no longer in use (we can see this in changelogs__items). In that case we only have the status name, so we don't know whether it's a To Do, In Progress or Done status.
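
The Jira REST API has a status resource (`/rest/api/2/status`) whose entries include a `statusCategory` object. Assuming a payload of that shape, a lookup table could be built as sketched below. The function name is hypothetical and the payload is a made-up, trimmed example.

```python
def status_category_by_name(statuses):
    """Map each status name to the name of its status category."""
    return {
        status["name"]: status["statusCategory"]["name"]
        for status in statuses
    }


# Made-up, trimmed payload shaped like the status resource's response.
statuses = [
    {"name": "Backlog", "statusCategory": {"key": "new", "name": "To Do"}},
    {"name": "In Review", "statusCategory": {"key": "indeterminate", "name": "In Progress"}},
    {"name": "Closed", "statusCategory": {"key": "done", "name": "Done"}},
]
lookup = status_category_by_name(statuses)
```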

Readme missing info - How does one run tap-jira with a state and how does one correctly output the state of running import?

The tap outputs its state incrementally, which results in an invalid JSON file if we stream the state into it.
Any guidance and a readme entry for running this tap with a state would be helpful.
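
Until the README covers this, one workable pattern is to keep only the most recent state rather than appending every state line to one file. The sketch below scans Singer messages and returns the value of the last STATE message. The function name is hypothetical, and the sample lines are made up.

```python
import json


def last_state(lines):
    """Return the value of the last STATE message in a stream of Singer lines."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except ValueError:
            continue  # skip anything that is not a JSON message
        if isinstance(message, dict) and message.get("type") == "STATE":
            state = message.get("value")
    return state


# Made-up sample of tap output.
sample = [
    '{"type": "STATE", "value": {"currently_syncing": "issues"}}',
    '{"type": "RECORD", "stream": "issues", "record": {"id": "1"}}',
    '{"type": "STATE", "value": {"currently_syncing": null}}',
]
```

The returned value can be written to a file such as state.json with `json.dump` and passed back on the next run through the `--state` option that singer-python's argument parser accepts. When a target sits at the end of the pipe, it prints bare state values instead, so keeping only its last output line serves the same purpose.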

Add an accountType property on user/author schema
Add tmpFromAccountId and tmpToAccountId properties to changelogs schema
Add entityId, uuid, properties, isPrivate and style properties to projects schema

Related PR: #39

This update of schema resolves issue #35

Quiet the output

Hi. Is there a parameter to quiet the output a little? The number of warnings and metrics is flooding my CI output, for example:

WARNING Removed paths list: ['isPrivate', 'properties', 'style']
WARNING Removed 3 paths during transforms:
	isPrivate
	properties
	style

and:

INFO METRIC: {"type": "counter", "metric": "record_count", "value": 2, "tags": {"endpoint": "users"}}
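
In the absence of a known setting for this, a stopgap is to filter the log lines after the fact. The sketch below drops the two kinds of lines quoted above, including the indented path names that follow the "Removed N paths" warning. The function name and the prefixes it matches are assumptions based only on the sample output.

```python
def quiet(lines):
    """Drop metric lines and 'Removed paths' warnings from tap log lines."""
    kept = []
    skipping_paths = False
    for line in lines:
        if line.startswith("INFO METRIC:") or line.startswith("WARNING Removed"):
            # A warning ending in ':' is followed by indented path names.
            skipping_paths = (
                line.startswith("WARNING Removed") and line.rstrip().endswith(":")
            )
            continue
        if skipping_paths and line.startswith(("\t", " ")):
            continue
        skipping_paths = False
        kept.append(line)
    return kept
```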