aws-samples / codecommit-crawler-innersource

A custom crawler written for AWS CodeCommit that generates the repos.json that can be used by the SAP InnerSource Portal.

License: MIT No Attribution

Python 100.00%
innersource inner-source opensource codecommit discovery-platform

codecommit-crawler-innersource's Introduction

AWS CodeCommit Crawler for InnerSource Portal

Organizations setting up an InnerSource ecosystem in their intranet should be able to use any source code control system. This project helps set up a crawler for InnerSource code repositories hosted on AWS CodeCommit, so that they can be used by the SAP InnerSource Portal. The crawler can be made to fetch these details automatically at regular intervals using a cron job.

This crawler was created as part of an AWS DevOps Blog post titled "Building an InnerSource ecosystem using AWS DevOps tools", which describes building a model InnerSource ecosystem that leverages multiple AWS services, such as CodeBuild, CodeCommit, CodePipeline, CodeArtifact, and CodeGuru, along with other AWS services and open source tools.

The project creates a repos.json file to be consumed by the SAP InnerSource Portal to display available InnerSource projects. The solution assumes that you have the CodeCommit repositories already set up and that the crawler is able to connect to them using AWS credentials (namely, aws_access_key_id and aws_secret_access_key).

The crawler implements custom logic for assigning the activity score and omits the fields that are not available or relevant for CodeCommit (e.g. Fork or Star).

Installation

pip install -r requirements.txt

Usage

  1. (Optional) Add a tag to your InnerSource repos with key as type and value as innersource
  2. (Optional) Add an innersource.json file in each repo (a sample file is included in this repo), with the details about the project.
  3. Run python3 ./crawler.py, which will create a repos.json file containing the relevant metadata for the AWS CodeCommit repos
  4. Copy repos.json to your instance of the SAP InnerSource Portal and launch the portal as outlined in their installation instructions.
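Step 1 above can also be scripted. The following is a minimal sketch, not part of the crawler itself: the helper name tag_as_innersource is hypothetical, and cc_client is assumed to be a boto3 CodeCommit client (for example boto3.client("codecommit")) whose credentials are allowed to call get_repository and tag_resource.

```python
def tag_as_innersource(cc_client, repo_name):
    """Tag one CodeCommit repository with type=innersource (hypothetical helper)."""
    # tag_resource needs the repository ARN, which get_repository returns
    # under repositoryMetadata["Arn"].
    metadata = cc_client.get_repository(repositoryName=repo_name)["repositoryMetadata"]
    cc_client.tag_resource(
        resourceArn=metadata["Arn"],
        tags={"type": "innersource"},
    )
    return metadata["Arn"]
```

Passing the client in as a parameter keeps the helper independent of how credentials are configured and makes it easy to exercise with a stub client.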

Customization

While the entire code can be customized for your use case, one customization in particular may be needed if your AWS CodeCommit account contains repositories other than the InnerSource repos. In that case you may want to select only the InnerSource ones using tags, such as type = innersource. Example code for this filter:

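# Assumes this runs inside the loop over repositories,
# with repo_metadata holding the current repository's metadata.
tag_data = cc_client.list_tags_for_resource(
	resourceArn=repo_metadata["Arn"]
)
repo_tags = tag_data.get("tags", {})
# .get() avoids a KeyError for repositories that have no "type" tag.
repo_type = repo_tags.get("type")
if repo_type != "innersource":
	# Skip this repository but keep processing the remaining ones.
	continue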
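For reference, the same tag filter can be written as a standalone function. This is a sketch under assumptions rather than the crawler's actual code: the function name list_innersource_repos and its tag_key and tag_value parameters are hypothetical, and cc_client is assumed to be a boto3 CodeCommit client. It follows nextToken so that accounts with many repositories are fully covered.

```python
def list_innersource_repos(cc_client, tag_key="type", tag_value="innersource"):
    """Return the names of repositories whose tag_key tag equals tag_value."""
    matches = []
    kwargs = {}
    while True:
        page = cc_client.list_repositories(**kwargs)
        for repo in page["repositories"]:
            name = repo["repositoryName"]
            # The ARN is needed to look up the repository's tags.
            arn = cc_client.get_repository(repositoryName=name)["repositoryMetadata"]["Arn"]
            tags = cc_client.list_tags_for_resource(resourceArn=arn).get("tags", {})
            if tags.get(tag_key) != tag_value:
                continue  # not an InnerSource repo, move on to the next one
            matches.append(name)
        token = page.get("nextToken")
        if not token:
            break  # no more pages
        kwargs = {"nextToken": token}
    return matches
```

Repositories without the tag are skipped rather than ending the scan, so the order in which CodeCommit returns repositories does not affect the result.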


Similarly, you may choose to add an innersource.json file to each of your InnerSource repos (a sample file is included in this repo) with details about the project. This helps populate the portal fields whose values cannot be fetched from CodeCommit.
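One way such a file could be folded into a repository's entry is sketched below. The helper name merge_innersource_metadata is hypothetical, and storing the parsed file under the _InnerSourceMetadata key is an assumption about what the portal expects, so check it against the portal's repos.json format. The raw bytes could come, for instance, from the fileContent field of a boto3 CodeCommit get_file response.

```python
import json


def merge_innersource_metadata(repo_entry, raw_json):
    """Return a copy of repo_entry with the parsed innersource.json attached.

    raw_json is the file content as bytes or str, or None if the
    repository has no innersource.json.
    """
    merged = dict(repo_entry)  # leave the caller's dict untouched
    if raw_json:
        try:
            # Assumed key name; adjust to match the portal's schema.
            merged["_InnerSourceMetadata"] = json.loads(raw_json)
        except ValueError:
            # Malformed JSON: keep the entry without the extra metadata.
            pass
    return merged
```

Treating a missing or malformed file as "no extra metadata" keeps one bad repository from stopping the whole crawl.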

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

Contributors

amazon-auto, arlou, dchucks, spier
