kubernetes-sigs / depstat

depstat is a dependency analyzer for Go modules enabled projects. It runs as part of the Kubernetes CI pipeline to help evaluate dependency updates to Kubernetes.

License: Apache License 2.0

Go 99.42% Makefile 0.58%
cli kubernetes golang dependency k8s-sig-architecture kubernetes-ci-pipeline dependency-analyzer cobra

depstat's Introduction

depstat

depstat is a command-line tool for analyzing dependencies of Go modules enabled projects.

Installation

To install depstat you can run

go install github.com/kubernetes-sigs/depstat@latest

Usage

  • depstat can be used as a standalone command-line application. You can use depstat to produce metrics about the dependencies of your Go modules enabled project.

  • Another common way to run depstat is in the CI pipeline of your project. This would help you analyze the dependency changes which come with PRs. You can look at how this is done for the kubernetes/kubernetes repo using prow here.

Commands

To see the list of commands depstat offers you can run depstat help. depstat currently supports the following commands:

cycles

depstat cycles shows all the cycles present in the dependencies of the project.

An example of a cycle in project dependencies is: golang.org/x/net -> golang.org/x/crypto -> golang.org/x/net

--json prints the output of the cycles command in JSON format. For the above example the JSON output would look like this:

{
  "cycles": [
    [
      "golang.org/x/net",
      "golang.org/x/crypto",
      "golang.org/x/net"
    ]
  ]
}
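For consumers such as CI scripts, the JSON shape shown above can be decoded with Go's encoding/json package. This is a minimal sketch: the cyclesOutput type and parseCycles function are illustrative names, not part of depstat.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// cyclesOutput mirrors the JSON shape shown above.
// The type name is hypothetical; only the "cycles" key comes from the example.
type cyclesOutput struct {
	Cycles [][]string `json:"cycles"`
}

// parseCycles decodes JSON text in the format printed by `depstat cycles --json`.
func parseCycles(data []byte) (cyclesOutput, error) {
	var out cyclesOutput
	err := json.Unmarshal(data, &out)
	return out, err
}

func main() {
	data := []byte(`{"cycles": [["golang.org/x/net", "golang.org/x/crypto", "golang.org/x/net"]]}`)
	out, err := parseCycles(data)
	if err != nil {
		panic(err)
	}
	// Print each cycle in the arrow notation used by the plain-text output.
	for _, c := range out.Cycles {
		fmt.Println(strings.Join(c, " -> "))
	}
}
```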

graph

depstat graph generates a graph.dot file which can be rendered with Graphviz layout commands such as dot or twopi to visualize the dependencies of a project.

For example, after running depstat graph, an SVG can be created using: twopi -Tsvg -o dag.svg graph.dot

By default, the graph is created around the main module (the first module in the go mod graph output), but you can choose to create a graph around a particular dependency using the --dep flag.
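To illustrate what a .dot file of this kind contains, here is a minimal sketch that renders a list of dependency edges as a Graphviz digraph. The toDOT function and the sample module names are assumptions for illustration; the file depstat actually writes may differ in layout and attributes.

```go
package main

import (
	"fmt"
	"strings"
)

// toDOT renders dependency edges as a Graphviz digraph.
// Node names are quoted because module paths contain '/', '.' and '@'.
// %q yields Go-style quoting, which matches DOT quoting for plain ASCII paths.
func toDOT(edges [][2]string) string {
	var b strings.Builder
	b.WriteString("strict digraph {\n")
	for _, e := range edges {
		fmt.Fprintf(&b, "\t%q -> %q\n", e[0], e[1])
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	// Hypothetical edges in the "from to" form printed by `go mod graph`.
	edges := [][2]string{
		{"example.com/app", "golang.org/x/net@v0.1.0"},
		{"golang.org/x/net@v0.1.0", "golang.org/x/crypto@v0.1.0"},
	}
	fmt.Print(toDOT(edges))
}
```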

list

depstat list shows a sorted list of all project dependencies. These include both direct and transitive dependencies.

  1. Direct dependencies: Dependencies that are used directly in the code of the project. These do not include standard Go packages such as fmt. They appear on the right side of the main module in the go mod graph output.

  2. Transitive dependencies: Dependencies that get imported because they are needed by some direct dependency of the project. They appear on the right side of a dependency that isn't the main module in the go mod graph output.

stats

depstat stats will provide the following metrics about the dependencies of the project:

  1. Direct Dependencies: Total number of dependencies required by the main module(s) directly.

  2. Transitive Dependencies: Total number of transitive dependencies (dependencies which are further needed by direct dependencies of the project).

  3. Total Dependencies: Total number of dependencies of the main module(s).

  4. Max Depth of Dependencies: Length of the longest chain starting from the first main module; defaults to length from the first module encountered in "go mod graph" output.

  • The --json flag gives this output in a JSON format.
  • --verbose mode will help provide you with the list of all the dependencies and will also print the longest dependency chain.

main module

By default, the first module encountered in "go mod graph" output is treated as the main module by depstat. depstat uses this main module to determine the direct and transitive dependencies. This behavior can be changed by specifying the main module manually using the --mainModules flag with the stats command. The flag takes a comma-separated list of module names, for example:

depstat stats --mainModules="k8s.io/kubernetes,k8s.io/kubectl"

Project Goals

depstat is being developed under the code organization sub-project under SIG Architecture. The goal is to make it easy to evaluate dependency updates to Kubernetes. This is done by running depstat as part of the Kubernetes CI pipeline.

Community Contact Information

You can reach the maintainers of this project at:

#k8s-code-organization on the Kubernetes slack.

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.

depstat's Issues

transitiveDependencies stat is misleading

noted in kubernetes/kubernetes#103123 (comment)

There is a spectrum of dependencies:

  • transitive-only, not actually needed to build (wouldn't show up in vendor)
  • transitive-only, needed to build (would show up in vendor)
  • directly referenced dependencies

The following might be more useful than the current transitiveDependencies stat (which I think means "transitive-only", so it actually increments when we eliminate direct references to a dependency but it is still in the tree transitively):

  • dependencies required to build (things in our tree we actually include code from when vendoring/building)
  • direct dependencies (things one of our k8s.io/* components refers to directly)

Then, any number decreasing would be moving in the right direction, and any number increasing would be moving in the wrong direction

cc @dims

add tests for JSON output

depstat stats --json and depstat cycles --json currently support producing JSON output. We should have tests which check these outputs.

add tests

Write tests to cover the methods in utils.go.

add tests for getDepInfo function

Currently in all the tests we generate the dependency graph manually. We need to have some tests which check the getDepInfo function so that we are able to catch errors like we faced here.

EDIT: fixed links

command to show all NEW dependencies in a PR

A command which will show all the new dependencies which are present in the PR, that is, all dependencies minus the ones already present.

@alenkacz we could maybe take this a step further and show not only the additions but also any removals in dependencies that happen. This would be a good way of knowing if some PR actually has a positive impact on the project dependencies :)
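The idea of reporting both additions and removals amounts to a set difference in each direction. The sketch below assumes two dependency lists are already available (for example, collected from the base branch and from the PR branch); diffDeps and toSet are hypothetical helpers, not an existing depstat command.

```go
package main

import (
	"fmt"
	"sort"
)

// toSet converts a list of module names into a set.
func toSet(deps []string) map[string]bool {
	set := make(map[string]bool, len(deps))
	for _, d := range deps {
		set[d] = true
	}
	return set
}

// diffDeps returns the dependencies present only in after (added) and
// only in before (removed), each sorted for stable output.
func diffDeps(before, after []string) (added, removed []string) {
	inBefore, inAfter := toSet(before), toSet(after)
	for dep := range inAfter {
		if !inBefore[dep] {
			added = append(added, dep)
		}
	}
	for dep := range inBefore {
		if !inAfter[dep] {
			removed = append(removed, dep)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	// Made-up dependency lists for the base branch and the PR branch.
	base := []string{"golang.org/x/net", "golang.org/x/crypto"}
	pr := []string{"golang.org/x/net", "golang.org/x/text"}
	added, removed := diffDeps(base, pr)
	fmt.Println("added:", added)
	fmt.Println("removed:", removed)
}
```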

Create a SECURITY_CONTACTS file.

As per the email sent to kubernetes-dev[1], please create a SECURITY_CONTACTS
file.

The template for the file can be found in the kubernetes-template repository[2].
A description for the file is in the steering-committee docs[3], you might need
to search that page for "Security Contacts".

Please feel free to ping me on the PR when you make it, otherwise I will see when
you close this issue. :)

Thanks so much, let me know if you have any questions.

(This issue was generated from a tool, apologies for any weirdness.)

[1] https://groups.google.com/forum/#!topic/kubernetes-dev/codeiIoQ6QE
[2] https://github.com/kubernetes/kubernetes-template-project/blob/master/SECURITY_CONTACTS
[3] https://github.com/kubernetes/community/blob/master/committee-steering/governance/sig-governance-template-short.md

add command to support generating graphs

The module dependencies can be visualised with the help of a directed graph which can be generated using Graphviz's dot command.

We would need to generate a .dot file which has all edges of the graph in a format understood by Graphviz.
This README explains the process in detail.

sort list of dependencies

List of dependencies is printed when using depstat list and depstat stats --verbose. It'll be nice to have the list in sorted order.

support --json flag for cycles

For example:

If the output of depstat cycles is this:

All cycles in dependencies are: 

golang.org/x/net -> golang.org/x/crypto -> golang.org/x/net

golang.org/x/crypto -> golang.org/x/net -> golang.org/x/crypto

then the JSON would look like:

{
    "cycles": [
        [
            "golang.org/x/net",
            "golang.org/x/crypto",
            "golang.org/x/net"
        ],
        [
            "golang.org/x/crypto",
            "golang.org/x/net",
            "golang.org/x/crypto"
        ]
    ]
}

@palnabarun I saw you mentioned that the JSON could be cycles: [{}, {}], what would that look like? 🤔

Shell script to test against repos

Before we can start using depstat in the k8s CI pipeline we need to check if it is at least working (not giving any errors) with all repositories listed here.

Shell script would:

  1. Iterate through a map of repos.
  2. Clone them.
  3. Run depstat stats and see if there are any errors.
  4. Keep track of the time it took (this would help identify whether it is stuck in an infinite loop). We would expect all repos to take less time than the main k8s repo (about 2 mins).

allow users to set the mainModule

Currently we set the first package in the go mod graph output as the mainModule. It would be more useful if users can specify what packages they want to be treated as the mainModule by passing them as a flag to the stats command.

investigate long time taken for some projects

For some projects like minikube and test-infra depstat takes too much time (20 mins+) and memory to produce the stats.

This bit of code is probably responsible for this.

We are currently not sure whether it will eventually produce the output, but after trying various things out, it seems likely that at some point it will. It takes so much time because the dependency graphs of those repos have a large number of edges.

Number of edges = go mod graph | wc -l

Link to the relevant discussion on Slack.

store all cycles

Currently cycles are stored in a map[int][]string where we only store one cycle corresponding to a particular length. In order to solve #24 we would need access to all the cycles.
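The limitation can be seen in a small sketch. With a map[int][]string keyed by length, a second cycle of the same length replaces the first, whereas a map[int][][]string (or a plain [][]string) keeps both. The function names below are illustrative and do not come from the depstat code.

```go
package main

import "fmt"

// keepOnePerLength mimics the storage described above: a later cycle of the
// same length overwrites an earlier one.
func keepOnePerLength(cycles [][]string) map[int][]string {
	m := map[int][]string{}
	for _, c := range cycles {
		m[len(c)] = c
	}
	return m
}

// groupByLength keeps every cycle, grouped by its length.
func groupByLength(cycles [][]string) map[int][][]string {
	m := map[int][][]string{}
	for _, c := range cycles {
		m[len(c)] = append(m[len(c)], c)
	}
	return m
}

func main() {
	cycles := [][]string{
		{"golang.org/x/net", "golang.org/x/crypto", "golang.org/x/net"},
		{"golang.org/x/crypto", "golang.org/x/net", "golang.org/x/crypto"},
	}
	fmt.Println("one per length:", len(keepOnePerLength(cycles)), "cycle kept")
	fmt.Println("grouped:", len(groupByLength(cycles)[3]), "cycles kept")
}
```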

fix max depth of dependencies logic

Printing the number of max dependencies and the length of the longest dependency chain currently has some problems:

  • Running the command consecutively 4-5 times leads to different results
  • The length of the longest dependency chain which gets printed in verbose mode doesn't match the length which is present as maxDepthOfDependencies in the analysis.json file.

The current code for calculating the longest dependency chain is based on storing the dependency graph and then finding the longest path in this graph.

The logic for the length of the longest dependency chain iterates through each unvisited dependency, performs a depth-first search from it, and updates the value referenced at https://github.com/RinkiyaKeDad/dependency-analyzer-poc/blob/feat_1/cmd/utils.go#L17 each time. To avoid infinite recursion, dependencies which have already been visited are tracked. Verbose mode also uses a visited array when printing the longest dependency chain.
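A depth-first search for the longest chain can be sketched as below. This is not the code from the linked repository: longestChain is a hypothetical function that tracks only the modules on the current path (rather than a global visited set) and visits neighbours in sorted order so that repeated runs give the same result. It explores every simple path, so it is only practical for small graphs.

```go
package main

import (
	"fmt"
	"sort"
)

// longestChain returns the longest simple path that begins at start.
// onPath holds the modules on the current DFS path so cycles are not followed.
func longestChain(graph map[string][]string, start string, onPath map[string]bool) []string {
	onPath[start] = true
	defer delete(onPath, start) // backtrack when leaving this module

	// Copy and sort neighbours so ties are resolved the same way on every run.
	next := append([]string(nil), graph[start]...)
	sort.Strings(next)

	var best []string
	for _, dep := range next {
		if onPath[dep] {
			continue // following this edge would close a cycle
		}
		if chain := longestChain(graph, dep, onPath); len(chain) > len(best) {
			best = chain
		}
	}
	return append([]string{start}, best...)
}

func main() {
	graph := map[string][]string{
		"A": {"B", "C"},
		"B": {"C"},
		"C": {"A"}, // cycle back to A
	}
	chain := longestChain(graph, "A", map[string]bool{})
	// The same chain can be measured in edges or in modules.
	fmt.Println(chain, "edges:", len(chain)-1, "modules:", len(chain))
}
```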

fix longest chain definition and output

Currently, max depth of dependencies calculates the length of the longest chain by counting the number of edges, so for A -> B -> C it will show 2. But since the tool deals with dependencies, it would be better to count the number of dependencies in the longest chain instead.
