josh-project / josh

Just One Single History

Home Page: https://josh-project.github.io/josh/

License: MIT License

Rust 70.70% Dockerfile 1.18% Shell 2.49% Makefile 0.07% Python 0.45% HTML 0.26% CSS 0.01% Nix 0.16% SCSS 1.45% TypeScript 8.15% Go 15.08%
git monorepo workflow scm

josh's Introduction

Combine the advantages of a monorepo with those of multirepo setups by leveraging a fast, incremental, and reversible implementation of git history filtering.

josh-proxy can be integrated with any git host:

$ docker run \
    -p 8000:8000 \
    -e JOSH_REMOTE=https://github.com \
    -v josh-vol:/data/git \
    joshproject/josh-proxy:latest

See Container options for the full list of environment variables.

Use cases

Partial cloning

Reduce scope and size of clones by treating subdirectories of the monorepo as individual repositories.

$ git clone https://josh-project.dev/josh.git:/docs.git

The partial repo acts as a normal git repository but contains only the files found in the subdirectory and only the commits affecting those files. The partial repo supports both fetch and push operations.

This not only improves performance on the client, because the tree contains fewer files, but also enables collaboration on parts of the monorepo with other parties using git's normal distributed development features. For example, it makes it easy to mirror selected parts of your repo to public GitHub repositories or to specific customers.
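As a minimal sketch, the clone URL shown above can be assembled by appending a subdirectory filter to the repository URL. The helper name `partial_clone_url` is hypothetical and not part of Josh; it only mirrors the URL shape `<repo>.git:/<subdir>.git` used in the example.

```python
def partial_clone_url(repo_url: str, subdir: str) -> str:
    """Return a josh-proxy URL that exposes only `subdir` of `repo_url`.

    Hypothetical helper: it reproduces the URL shape from the example
    above (`<repo>.git:/<subdir>.git`) and nothing more.
    """
    if not repo_url.endswith(".git"):
        raise ValueError("expected a repository URL ending in .git")
    # Normalise "/docs/" and "docs" to the same subdirectory name.
    subdir = subdir.strip("/")
    if not subdir:
        raise ValueError("subdirectory must not be empty")
    return f"{repo_url}:/{subdir}.git"


if __name__ == "__main__":
    print(partial_clone_url("https://josh-project.dev/josh.git", "docs"))
```

The resulting string can be passed to `git clone` like any other remote URL.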

Project composition / Workspaces

Simplify code sharing and dependency management. Beyond just subdirectories, Josh supports filtering, re-mapping and composition of arbitrary virtual repositories from the content found in the monorepo.

The mapping itself is also stored in the repository and therefore versioned alongside the code.

Central monorepo (central.git): [figure: folders and files in central.git]

Project workspace project1.git: [figure: folders and files in project1.git]
workspace.josh file:
dependencies = :/modules:[
    ::tools/
    ::library1/
]

Project workspace project2.git: [figure: folders and files in project2.git]
workspace.josh file:
libs/library1 = :/modules/library1
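Because the mapping is plain text kept in the repository, it can be generated by ordinary tooling. The sketch below renders the simple one-line `path = filter` form seen in the project2 example. The function name `render_workspace` is hypothetical, and the nested `:[ ... ]` form used by project1 is deliberately not handled.

```python
def render_workspace(mappings: dict[str, str]) -> str:
    """Render `path = filter` lines in the style of the example above.

    Only the simple one-line form (`libs/library1 = :/modules/library1`)
    is produced; this is not a full workspace.josh writer.
    """
    lines = []
    for target, source_filter in sorted(mappings.items()):
        # In the examples above every filter begins with ':'.
        if not source_filter.startswith(":"):
            raise ValueError(f"filter for {target!r} should start with ':'")
        lines.append(f"{target} = {source_filter}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_workspace({"libs/library1": ":/modules/library1"}), end="")
```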

Workspaces act as normal git repos:

$ git clone http://josh/central.git:workspace=workspaces/project1.git

Simplified CI/CD

With everything stored in one repo, CI/CD systems only need to look into one source for each particular deliverable. However, in traditional monorepo environments, dependency management is handled by the build system. Build systems are usually tailored to specific languages and need their input already checked out on the filesystem. So the question:

"What deliverables are affected by a given commit and need to be rebuilt?"

cannot be answered without cloning the entire repository and understanding how the languages used handle dependencies.

In particular, when using C family languages, hidden dependencies on header files are easy to miss. For this reason, limiting the visibility of files to the compiler by sandboxing is pretty much a requirement for reproducible builds.

With Josh, each deliverable gets its own virtual git repository with dependencies declared in the workspace.josh file. This means answering the above question becomes as simple as comparing commit ids. Furthermore, due to the tree filtering, each build is guaranteed to be perfectly sandboxed and only sees those parts of the monorepo that have actually been mapped.

This also means the deliverables to be rebuilt can be determined without cloning any repos, as is typically necessary with normal build tools.

GraphQL API

It is often desirable to access content stored in git without requiring a clone of the repository. This is useful for CI/CD systems or web frontends, such as dashboards.

Josh exposes a GraphQL API for that purpose. For example, it can be used to find all workspaces currently present in the tree:

query {
  rev(at:"refs/heads/master", filter:"::**/workspace.josh") {
    files { path }
  }
}
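A client could send the query above as an ordinary GraphQL POST request. The sketch below only builds the request and reads the paths out of a response. The endpoint URL is left to the caller because its exact path is not stated here, and the response shape is assumed to follow the usual GraphQL `{"data": ...}` envelope mirroring the query.

```python
import json
import urllib.request

# The query from the example above, unchanged.
WORKSPACES_QUERY = """
query {
  rev(at:"refs/heads/master", filter:"::**/workspace.josh") {
    files { path }
  }
}
"""


def build_graphql_request(endpoint: str,
                          query: str = WORKSPACES_QUERY) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying `query`.

    `endpoint` is an assumption supplied by the caller.
    """
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def workspace_paths(response: dict) -> list[str]:
    """Collect file paths from a decoded response to the query above."""
    return [entry["path"] for entry in response["data"]["rev"]["files"]]
```

Passing the request to `urllib.request.urlopen` and decoding the body with `json.load` would yield the dictionary that `workspace_paths` expects.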

Caching proxy

Even without using the more advanced features like partial cloning or workspaces, josh-proxy can act as a cache to reduce traffic between locations or keep your CI from performing many requests to the main git host.

FAQ

See here

josh's People

Contributors

arthurbragaa, campeis, christian-schilling, dependabot-preview[bot], dependabot[bot], dhermes, flokli, jazzdan, josh-contributor, kloolk, lmg, marcmo, marklodato, misilelab, mnemnion, nicoretti, phimuemue, ralfjung, robw-23, seonggwonyoon, tazjin, theo-pnv, tshepang, vlad-ivanov-name

josh's Issues

Cloning moved subdirectories gives latest directory content

When you clone a subdirectory which was moved, Josh gives you the last state the directory had in the old location. This behavior already gave us some painful bugs because you lose the ability to "break stuff".

I'd expect all files to be deleted, leaving an empty repository.

Fix octopus merge support

Currently, the history traversal panics if it encounters an octopus merge. There is no real reason for that.

Migrate to tokio-rs/tracing

Looks like a better solution that would also unify trace and log output.

However, this would first require implementing an export from tokio/tracing to the chrome-tracing format.

Improve filter documentation

  • Add glob pattern
  • Add complicated examples (recursive mapping of lock and toml files from rust would be a good example)
  • Explain how josh rewrites workspace.josh files (you can write ugly ones and they get nicer)

git-sync sometimes doesn't work on windows

On Windows, in bash, the --porcelain option does not seem to have the desired effect.

alexander.XXX@XXX MINGW64 /e/processes (master)
$ c:/tools/git-sync -o create origin master:refs/for/master
* refs/heads/master -> refs/for/master
fatal: unable to access 'https://example.com/c/the/repo/+/75889 Add processes workspace                /': URL using bad/illegal format or missing URL
POST git-receive-pack (526 bytes)
remote: josh-proxy
remote: response from upstream:
remote: remote:
remote: remote: Processing changes: updated: 1 (\)
remote: remote: Processing changes: refs: 1, updated: 1 (\)
remote: remote: Processing changes: refs: 1, updated: 1 (\)
remote: remote: Processing changes: refs: 1, updated: 1, done
remote: remote: commit fbd4ca9: -
remote: remote:
remote: remote: SUCCESS
remote: remote:
remote: remote:   XXX Add processes workspace
remote: remote:
remote: To XXXX
remote:  * [new branch]                JOSH_PUSH -> refs/for/master
remote: REWRITE(c19fe237bbecf379e670f47ff768c3f3cfc8ef45 -> 39cf03eee252687db3e0da4b31ed63b44b0d04ea)
remote:
remote:
Pushing to XXXX

View for workspace index

Provide a view that helps discovering all the workspaces available and mapping from monorepo revisions to workspace revisions.

Support token based authentication

GitHub announced that it will deprecate credential-based auth in August.
To keep josh usable with GitHub, we need to implement a supported alternative.

Please add tag support

  • tag in gerrit repo shall be available via Josh
  • tag in Josh shall be available in gerrit

Take care about collisions:

  • last is best or
  • prefix or
  • ?

Thank you!

Change behaviour on workspace empty mappings

Maybe it'd make sense to actually throw an error (and not a warning) on an empty mapping in a workspace.

Then we could add an option to allow people to clone a non-existent filter (-o ignore-unknown or something)

Unwanted merges showing up on subfolder export

Hi all,

When a subfolder is extracted from a history that contains merges without effect on it (for example, using --no-ff), git log will hide the merges in the subfolder, but josh will reproduce them in the export.

This behaviour is demonstrated in #432

It would seem that this is a different behaviour than before, but that is to be confirmed.

MfG,
Louis-Marie

authentication failed

Hi, I'm trying to run the example in the README, and I saw that when I tried to clone, git asked me for authentication. I looked at the source code and saw that passwords are set via an environment variable, so I tried running the server again with an explicit user/pass:

JOSH_USERNAME="josh" JOSH_PASSWORD="josh" ./target/release/josh-proxy --local ./tmp1/ --remote=https://github.com --port=8000

but when I tried:

git clone http://localhost:8000/git/git.git:/Documentation.git

I still got:

Cloning into 'Documentation'...
Username for 'http://localhost:8000': josh
Password for 'http://josh@localhost:8000': 
fatal: Authentication failed for 'http://localhost:8000/git/git.git:/Documentation.git/'

What is the right way to authenticate?

JOSH returns 500 on `git ls-remote` when there is no master branch

Hey!

We're successfully using JOSH to virtualise our main trunk repo so that we can consume libraries in other downstream projects. This works great for the most part. We have, however, run into an issue when using JOSH together with git-subrepo and a renamed master branch in the source repo.

As a workaround, we have been adding the branch name to the filter section of the remote URL, effectively turning it into the master branch, but this seems to cause issues with pushing changes back upstream.

Cloning a virtualised repo directly works just fine, even when there is no master branch. The correct default branch is identified anyway. Likewise, when adding the full source repo using git-subrepo, everything works and the correct default branch is identified.

The problem seems to arise when using all three things at once, and it seems to come down to JOSH returning a 500 error on the ls-remote request initiated by git-subrepo.

Going through the logs it seems like the following fetch call is the culprit:

git fetch --no-tags https://[email protected]/<repo/name>.git \'+refs/heads/*:refs/josh/upstream/<name>.git/refs/heads/*\' \'+refs/tags/*:refs/josh/upstream/<name>.git/refs/tags/*\' \'+refs/heads/master:refs/josh/upstream/<name>.git/refs/heads/master\'" "/data/git/" 

Since it fails with the following:

fatal: Couldn't find remote ref refs/heads/master

Adding @main to the remote URL for the virtual repo sets the correct HEAD ref and makes the whole thing work as expected. However, even though cloning and pulling new changes from the remote works as expected, my colleagues have reported that they have been unable to push new changes back upstream.

If this is expected to work with a branch name specified like that, we could investigate the cause of the error pushing upstream further.

typo in filter ignores it

Because of
filters.rs:1031 return SubdirView::new(&Path::new(name));
the filter is ignored and the subdirectory is returned as is when there is a typo in the filter name.

An error should be returned instead.

Document example workflow on how to add modules/paths to a workspace

Describe the entire process of adding a subtree/module to a workspace. (maybe with a small example)

Also add information about the fact that when a module/subtree is added, josh creates an incoming merge commit,
which needs to be taken into account when rebasing locally (if a rebase workflow is used).

In that case, the following rebase command produces the desired result:

git rebase origin/master --rebase-merges

whereas

git rebase origin/master

will add the merged history as new local changes within the tree, which conflicts because these exact commits are already published in the monorepo.

Please support pushing of tags

Pushing tags to gerrit via josh fails with remote: error: hook declined to update refs/tags/tag1

> git push origin -o base=master --tags
Enumerating objects: 2, done.
Counting objects: 100% (2/2), done.
Delta compression using up to 32 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 255 bytes | 255.00 KiB/s, done.
Total 2 (delta 1), reused 0 (delta 0)
remote: josh-proxy
remote: response from upstream:
remote: Branch "refs/tags/tag1" does not exist on remote.
remote: If you want to create it, pass "-o base=<branchname>"
remote: to specify a base branch.
remote: 
remote: 
remote: 
remote: error: hook declined to update refs/tags/tag1
remote: response from upstream:
remote: Branch "refs/tags/tag2" does not exist on remote.
remote: If you want to create it, pass "-o base=<branchname>"
remote: to specify a base branch.
remote: 
remote: 
remote: 
remote: error: hook declined to update refs/tags/tag2

Pushing with -o base=<branch> or not makes no difference.

Maybe I am also missing something? Thanks for the help!

Changing a mapping path adds the old files/folders

Starting from a workspace.josh looking like

a = :/a

and editing in one commit to

b = :/a

will duplicate the files from a in the workspace directory instead of just updating the mapping.
#250 demonstrates this behaviour in a test. Note the duplicated file2.

Add endpoint to make it possible to deploy without downtime

In order to deploy josh without downtime, it would be nice to have one endpoint to gracefully shut down josh and another to check whether the instance is up and running again.
This would allow us to make josh highly available.

error 500 on malformed filter in exclude

I was trying to clone a workspace without the contents of its folder, and ran the following (erroneous) command:

$ git clone https://myserver.com/my/repo.git:exclude[:workspaces/my-workspace:exclude[:workspaces/my-workspace/workspace.josh]]:workspace=workspaces/my-workspace.git my-workspace-without-its-stuff

Of course the correct command (which seems to work) is:

$ git clone https://myserver.com/my/repo.git:exclude[::workspaces/my-workspace/:exclude[::workspaces/my-workspace/workspace.josh]]:workspace=workspaces/my-workspace.git my-workspace-without-its-stuff

But the first command gives me error 500. I need to make a smaller test case to reproduce it, but I can already say that jaeger complains that remove_dir_all "/data/git/refs/namespaces/request_0a300259-76e5-4970-8e83-6ffad75d853a" failed, error:Os { code: 2, kind: NotFound, message: "No such file or directory" }
