gsa / code-gov-api

API powering the code.gov source code harvester

Home Page: http://code.gov

License: Other

JavaScript 92.43% CSS 0.79% HTML 6.49% Dockerfile 0.30%
gov elasticsearch collaboration open-data code-gov us-government open-source

code-gov-api's Introduction

Code.gov API - Unlocking the potential of the Federal Government’s software

This is our backend API: an Express.js application backed by Elasticsearch. Its primary function is to index America's source code and make it discoverable and searchable.

Introduction

What is Code.gov?

Code.gov is a website promoting good practices in code development, collaboration, and reuse across the U.S. Federal Government. Code.gov will provide tools and guidance to help agencies implement the Federal Source Code Policy. It will include an inventory of the government's custom code to promote reuse between agencies. Code.gov will also provide tools to help government and the public collaborate on open source projects.

Looking for more general information about Code.gov and all of its projects? We have a repo for that! code-gov is the main place to find out more general information about Code.gov as a platform and program.

If you have any general feedback or do not know where to file a particular issue, please feel free to use code-gov to create new issues.

Installation

Please install the following dependencies before running this project:

Once node is installed, install the local npm dependencies

cd code-gov-api && npm install

Running

Environment Variables

Before running any of the commands included in the package.json file there are some environment variables that need to be set:

  • NODE_ENV: The node environment the project is running under. Valid environments are:

    • production or prod
    • staging or stag
    • development or dev
  • LOGGER_LEVEL: The output level of all the logs produced by the application. This extends to the Elasticsearch library. Defaults to info.

    We use Bunyan for our logging. More info on logger levels can be found at https://github.com/trentm/node-bunyan#levels

  • NEW_RELIC_KEY (optional): Your New Relic key. You will need a New Relic account to get one. For more information, visit the New Relic docs.

  • USE_HSTS: Enables HTTP Strict Transport Security. The default depends on NODE_ENV: true in production, false otherwise.

  • HSTS_MAX_AGE: A required HSTS directive. For more information on what it is used for, see https://tools.ietf.org/html/rfc6797#section-6.1.1. Defaults to 31536000 seconds (one year).

  • HSTS_PRELOAD: Whether or not to use the HSTS pre-loaded lists. Defaults to false. More information on HSTS pre-loaded lists can be found at https://tools.ietf.org/html/rfc6797#section-12.3.

  • PORT: Port to be used by the API. Defaults to 3000.

  • ES_HOST: URL for the Elasticsearch host to be used by the API and harvesting process. This URL should also contain the user and password needed to use the Elasticsearch service. Defaults to http://elastic:changeme@localhost:9200

    Elasticsearch has a built-in REST API with its own internal security features. The user elastic with the password changeme is the default superuser. Do not use these default credentials in a production environment.

    For more information about how to configure the authentication for Elasticsearch click here.

  • GITHUB_AUTH_TYPE: The type of authentication mechanism to use with the Github API. Defaults to token.

    There are a couple of different ways you can interact with the Github API. The more common ones are:

    • basic authentication
    • OAuth2 token based authentication
    • OAuth2 key/secret based authentication

    For more information please click here

  • GITHUB_TOKEN: The token to use for Github API access. This variable has no default value and needs to be provided by you. This token can be obtained in your Github profile settings. For more info please click here.
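
As a rough illustration of how the variables and defaults listed above fit together, here is a minimal sketch of a config loader. It is not the project's actual code: the loadConfig function, the property names, and the fallback to development when NODE_ENV is unset are all assumptions made for this example.

```javascript
// Hypothetical helper: the names below are illustrative, not taken from the project.
const ENV_ALIASES = new Map([
  ['production', 'production'], ['prod', 'production'],
  ['staging', 'staging'], ['stag', 'staging'],
  ['development', 'development'], ['dev', 'development']
]);

function loadConfig(env = process.env) {
  // Assumption: fall back to development when NODE_ENV is unset or unrecognized.
  const nodeEnv = ENV_ALIASES.get(env.NODE_ENV) || 'development';
  const isProd = nodeEnv === 'production';

  return {
    nodeEnv,
    loggerLevel: env.LOGGER_LEVEL || 'info',
    // An explicit USE_HSTS wins; otherwise the default follows NODE_ENV.
    useHsts: env.USE_HSTS !== undefined ? env.USE_HSTS === 'true' : isProd,
    hstsMaxAge: Number(env.HSTS_MAX_AGE || 31536000),
    hstsPreload: env.HSTS_PRELOAD === 'true',
    port: Number(env.PORT || 3000),
    esHost: env.ES_HOST || 'http://elastic:changeme@localhost:9200',
    githubAuthType: env.GITHUB_AUTH_TYPE || 'token',
    githubToken: env.GITHUB_TOKEN // no default: must be provided
  };
}

console.log(loadConfig({ NODE_ENV: 'prod', PORT: '8080' }));
```

Passing the environment in as an argument, rather than reading process.env directly, keeps the function easy to exercise with different settings.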

Data Harvesting

This project uses Elasticsearch to store code repository metadata. As such, it is necessary to run an indexing process which will populate the necessary indexes in Elasticsearch.

Make sure that Elasticsearch is running and is accessible.

To install Elasticsearch on your machine please follow the instructions found here.

We have found that using Elasticsearch within a Docker container is one of the simplest ways to get up and running. We have included a Docker compose file in this project that can help you get on your way.

Please take a look at the Getting Started and Set up Elasticsearch sections in the Elastic documentation.

Once you have verified that Elasticsearch is up, execute:

npm run index

This will start the harvesting and indexing process. Once this process is finished all data should be available for the API.
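
One way to verify that Elasticsearch is reachable before running the indexer is a small Node script like the sketch below. It uses only Node's built-in http and https modules. The parseEsHost and checkElasticsearch helpers are hypothetical names for this example and are not part of the project; the script assumes ES_HOST has the user:password@host form described above.

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: split ES_HOST into a credential-free URL and an auth string.
function parseEsHost(esHost) {
  const url = new URL(esHost);
  const auth = url.username
    ? `${decodeURIComponent(url.username)}:${decodeURIComponent(url.password)}`
    : undefined;
  url.username = '';
  url.password = '';
  return { url, auth };
}

// Resolves with the HTTP status code of a GET against the cluster root.
function checkElasticsearch(esHost) {
  const { url, auth } = parseEsHost(esHost);
  const client = url.protocol === 'https:' ? https : http;
  return new Promise((resolve, reject) => {
    const req = client.get(url, { auth }, (res) => {
      res.resume(); // discard the body; only the status matters here
      resolve(res.statusCode);
    });
    req.on('error', reject);
  });
}

// Only run the check when ES_HOST is set in the environment.
if (process.env.ES_HOST) {
  checkElasticsearch(process.env.ES_HOST)
    .then((status) => console.log(`Elasticsearch responded with HTTP ${status}`))
    .catch((err) => console.error(`Elasticsearch is not reachable: ${err.message}`));
}
```

A connection error here usually means Elasticsearch is not running or ES_HOST points at the wrong host or port.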

Starting the API

After the indexing process runs, you can fire up the server by running:

npm start

The API should now be accessible via the browser (or curl) at http://localhost:3000/api/.

curl http://localhost:3000/api/

Docker

For more detailed documentation on Docker and its components please visit their documentation site.

Build

To run a container you first have to build an image. To do so you can execute

docker build -t <name_and_tag_for_your_image> .

For us, Code.gov, the command would be:

docker build -t codegov/code-gov-api .

To verify that the image was created you can execute

docker images

Look for the name_and_tag_for_your_image that you used to build the image.

Run a container

To create and run a container execute:

docker run -p 3000:3000 codegov/code-gov-api

If you want the container to run in the background (detached) pass the -d flag to the docker run command.

For example:

docker run -d -p 3000:3000 codegov/code-gov-api

To mount the project's source directory as a volume in the container, execute:

docker run -d -p 3000:3000 -v <path_to_project>:/usr/src/app codegov/code-gov-api

For example:

docker run -d -p 3000:3000 -v /home/user/code-gov-api:/usr/src/app codegov/code-gov-api

For more information on how to use Docker volumes take a look at:

Container Env

The code-gov-api container accepts a number of environment variables. They are the same variables described under Environment Variables above.

docker run -p 3000:3000 \
  -e NODE_ENV=dev \
  -e ES_HOST="http://elastic:changeme@localhost:9200" \
  codegov/code-gov-api

Docker compose

Docker compose lets you recreate a complete environment for the code.gov API. The docker-compose.yml file lets us define how these services are stood up, how they relate to each other, and manages other low level things. For more detailed information on Docker Compose take a look at https://docs.docker.com/compose/.

To stand up a Code.gov API environment execute from the root of the project:

docker-compose up

This command will build a new code-gov-api image, download an Elasticsearch image, and will run all containers in the correct order. You will see the output of each container in your terminal.

Once everything is up and running you can access the API in your browser at: http://localhost:3001/api. If you only want to build the code-gov-api image you can execute docker-compose build.

Contributing

Here’s how you can help contribute to code.gov API:

Questions

If you have questions, please feel free to open an issue here or send us an email at [email protected].

Public domain

As stated in our contributing document:

This project is in the worldwide public domain (in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication).

All contributions to this project will be released under the CC0-1.0 dedication. By submitting a pull request, you are agreeing to comply with this waiver of copyright interest.

Contact Info

Join our #opensource-public Slack channel: https://chat.18f.gov/

code-gov-api's People

Contributors

bjbhatt, bjbhattgsa, danieljdufour, dependabot[bot], jcastle-zz, lbeaufort, mattbailey0, michael-balint, normanbrobinson, ricardoareyes, saracope, seanstar12, snyk-bot, stim371, valedaemon, yozlet, zachary-kuhn

code-gov-api's Issues

Agencies Endpoint

Looks like the API already has a JSON file of Agencies - would love for these to be available via an endpoint.

It would also be helpful to pass params that only return Agencies that have repos.
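
To make the request concrete, the filtering part could look something like the sketch below. The data shape (acronym, name, repoCount), the sample entries, and the hasRepos parameter are all hypothetical; the real agencies JSON file may use different field names.

```javascript
// Hypothetical data shape and made-up sample entries, for illustration only.
const agencies = [
  { acronym: 'AAA', name: 'Example Agency A', repoCount: 12 },
  { acronym: 'BBB', name: 'Example Agency B', repoCount: 0 }
];

// Returns every agency, or only those with at least one repo when hasRepos is true.
function listAgencies(all, { hasRepos = false } = {}) {
  return hasRepos ? all.filter((agency) => agency.repoCount > 0) : all;
}

console.log(listAgencies(agencies, { hasRepos: true }).map((a) => a.acronym));
```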

null license produces warning on dashboard

Schema 1.0.1 specifies that the license field can be null or a string. If it is null, the dashboard outputs a warning:

{
  "keyword": "type",
  "dataPath": ".license",
  "schemaPath": "#/properties/license/type",
  "params": {
    "type": "string"
  },
  "message": "should be string"
}

The dashboard should not output a warning if the license field is null, since the schema allows it.
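
In JSON Schema terms this corresponds to a type of ["string", "null"] rather than "string". The expected behavior can be sketched as a small check like the one below. This is not the dashboard's validator: licenseWarnings is a hypothetical function, and treating a missing license field as invalid is an assumption of this example.

```javascript
// Hypothetical check mirroring a schema that allows `license` to be a string or null.
function licenseWarnings(project) {
  const { license } = project;
  if (license === null || typeof license === 'string') {
    return []; // both values are allowed, so there is nothing to warn about
  }
  // Assumption: any other value, including a missing field, is reported.
  return [{
    keyword: 'type',
    dataPath: '.license',
    message: 'should be string or null'
  }];
}

console.log(licenseWarnings({ license: null }).length);
console.log(licenseWarnings({ license: 42 }).length);
```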

Are code.json HTTP redirects supported?

Does the code.gov indexer support HTTP 301 redirects at agencies' code.json endpoints? We are looking at hosting code.json at a location other than our agency's main website root.

add a repo description

It's not a big deal but when you have a second, it's worth adding a repo description for developer friendliness.

API is crashing on production & staging in cloud.gov

I've just tried pushing both the production version and the staging version of code-api (and code-api-staging), and gotten the following error messages:

[Screenshot: 2017-07-18, 2:08 pm]

[Screenshot: 2017-07-18, 2:16 pm]

It looks like 'request-promise' module is being called but it isn't found. But I don't seem to have a problem running the API when I load it locally.

@valedaemon Any ideas?

Clean up stale branches

There are some old branches that can be cleaned up / deleted. They are either too old or are not applicable anymore.

Dashboard incorrectly flags "organization" as required property

Version 1.0.1 of the schema defines organization as an optional property within projects objects. However, the validator is currently throwing warnings when organization is omitted and referring to it as required.

Options:
1 - squelch this warning entirely
2 - continue to warn but soften warning language

I'm of the mind that we don't need to press agencies to include an organization where it isn't relevant and we should go with option 1.

Add repoHostname Field

Add a repoHostname field to determine the host of the repo (i.e. GitHub, Bitbucket). This will be useful for the front-end as it will enable custom iconography to be introduced from FontAwesome or other sources to display the logo of the repo host wherever the repo is displayed.
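
A possible starting point, assuming each repo record carries an absolute repository URL, is to derive the field with Node's built-in URL class. The repoHostname function below is a hypothetical sketch, not existing project code.

```javascript
// Hypothetical helper: derive a host name from a repository URL.
function repoHostname(repositoryUrl) {
  try {
    return new URL(repositoryUrl).hostname;
  } catch (err) {
    return null; // not a parseable absolute URL
  }
}

console.log(repoHostname('https://github.com/GSA/code-gov-api'));
```

The front-end could then map the returned host name to an icon, with a generic fallback when the value is null or unrecognized.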

Add id Field in Conjunction with the repoID

Add a numeric id field that we can guarantee is unique. The front-end is currently using the repoID to resolve the repo's route. To make the switch over to a different field without having to create redirects in the front-end, I propose we create an id field which matches the repoID unless that id already exists in the API - in which case a new, randomly generated unique number be created and saved as the id of the repo for the API's purposes.
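
The proposal could be sketched roughly as follows. Everything here is hypothetical: usedIds stands in for whatever lookup the API would use to detect existing ids, and the random generator is only one possible way to pick a replacement.

```javascript
// Hypothetical sketch: keep repoID as the id unless it is already taken.
function assignId(
  repoID,
  usedIds,
  randomInt = () => Math.floor(Math.random() * Number.MAX_SAFE_INTEGER)
) {
  let id = repoID;
  // On a collision, draw random numbers until an unused one turns up.
  while (usedIds.has(id)) {
    id = randomInt();
  }
  usedIds.add(id);
  return id;
}

const used = new Set();
console.log(assignId(42, used));        // first repo keeps its repoID
console.log(assignId(42, used) !== 42); // second repo gets a different id
```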

Add a proper manifest.yml file

Why

The manifest.yml file is used with continuous deployments. It'll be important for us to ensure proper configuration of the app.

Add mapping support for Code.json schema v2.0

As I see it there's a couple of things that need to be done:

  • API has to recognize v1.0.0
  • API has to recognize v1.0.1
  • API has to recognize v2.0.0
  • Create mappings for v2.0.0
  • Rename current mappings to reflect they are v1.0.1
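
One way the version handling could be structured is a lookup from schema version to mapping, as in the sketch below. The mapping names are placeholders, and reading the version from a top-level version property of code.json is an assumption of this example.

```javascript
// Hypothetical registry: the mapping names are placeholders, not project files.
const mappingsByVersion = new Map([
  ['1.0.0', { name: 'mappings-v1.0.0' }],
  ['1.0.1', { name: 'mappings-v1.0.1' }],
  ['2.0.0', { name: 'mappings-v2.0.0' }]
]);

// Assumption: the schema version is available as a top-level `version` property.
function mappingFor(codeJson) {
  const { version } = codeJson;
  if (!mappingsByVersion.has(version)) {
    throw new Error(`Unsupported code.json schema version: ${version}`);
  }
  return mappingsByVersion.get(version);
}

console.log(mappingFor({ version: '2.0.0' }).name);
```

Failing loudly on an unknown version makes it obvious when an agency publishes a schema version the API does not yet recognize.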
