
initialcommit-io commented on June 14, 2024

Hi @yarikoptic,

Thanks for your interest in git-sim! It sounds like this is a new feature request. Can you describe in more detail what you're looking for? I don't have experience with git-annex so if you can provide more details on that and how git-sim would relate to it / interact with it in your proposed feature I can get a better understanding of what you mean.

from git-sim.

yarikoptic commented on June 14, 2024

Here is a bash script which implements the "above" example (note: you need git-annex installed):
#!/bin/bash
# https://github.com/datalad/datalad/issues/7371
#

export PS4='> '
set -x

set -eu

umask 022
cd "$(mktemp -d /tmp/dl-XXXXXXX)"

mkdir remote
(
cd remote
git init
git annex init
)

# Slow down from here on: sleep 1 second before each traced command
export PS4='> $(sleep 1)'

mkdir origin
(
cd origin;
git init  # creates main branch, no commit
git annex init  # creates git-annex branch with a commit, also modifies .git/config
echo big-data > file.dat
git annex add file.dat  # creates commit in git-annex branches
git commit -m "Adding file.dat"   # creates commit in main
git annex addurl --file file.dat http://www.oneukrainian.com/tmp/file.dat  # only updates git-annex branch if no content change
# For now no remote
git remote add --fetch remote ../remote  # no commits
git annex sync  # if remote had its own history for git-annex branch -- it would get merged
git annex copy --to=remote file.dat  # commit in git-annex branch updating availability information
)


(
cd origin
#git sim log --all
git sim --animate log --all
)

pwd

I also shared a sample of that repo at http://www.oneukrainian.com/tmp/dl-Hq2CrkR.tgz (after removing dl-Hq2CrkR/origin/git-sim_media/).

Overall, what seems to be missing:

  • The ability to make the video follow commits in chronological order, i.e. by commit time (I deliberately produced them 1 second apart).

  • Some way to "annotate" each step in the video with the command which resulted in that change (not just the "update" commit of the git-annex branch) would be awesome. git-sim implements a number of git commands to provide such demonstrations, in more detail about what affects the tree, staging area, etc. It would be cool if all commands could be annotated "natively".
    A wild, uncooked idea on how to do it, at least in POSIX shell (or bash specifically): in the bash script above I used export PS4='> $(sleep 1)' to slow things down. Maybe we could abuse it, or PROMPT_COMMAND, to record somewhere (git notes?) the timestamp and the command about to be executed. Then, while creating the animation, read those times back and match them to commit times to discover which command triggered each action, thus providing the text of the annotation. Anyone could then script their desired interactions "natively", and only do something like PROMPT_COMMAND="git sim record $PROMPT_COMMAND" to make such records.


initialcommit-io commented on June 14, 2024

Thanks for the details. If I understand correctly, it sounds like git-annex creates additional branches that are used for helping to track large files?

Anyway, one thing you can try is using the --all global flag which will display all local branches in the git-sim output:

git-sim --all log
git-sim --all branch new_branch
git-sim --all -n 10 merge dev
etc...

Note this might show more branches than you want, and currently there isn't a way to specify exactly which branches to show.

If you need it, we can do an enhancement to create a new option to allow specific branch names to be specified, if showing all branches is too much.


initialcommit-io commented on June 14, 2024

@yarikoptic Hi again just wanted to check if you saw my last comment and tested out my suggestion for this.


yarikoptic commented on June 14, 2024

> Thanks for the details. If I understand correctly, it sounds like git-annex creates additional branches that are used for helping to track large files?

Just one branch, always named git-annex, which users should never need to check out or merge manually. Since I see git-sim as primarily for use on demo repos, I have no immediate need to filter out branches; --all is good enough.

> @yarikoptic Hi again just wanted to check if you saw my last comment and tested out my suggestion for this.

Well, as I showed in the snippet (folded by default) in my comment above, I did try git-sim --all log and it worked, and I tried git sim --animate log --all, which gave me a very lovely animation. But as I pointed out in that comment, events did not happen in the chronological order of the commits, which limits the "demonstration utility". Should I file a separate issue for the chronological aspect?

The other limitation, for which I tried to come up with a solution, is annotating changes in the animation with the actual commands which led to each change, without having to redo the script with git-sim alternatives, since there are none for git-annex commands.


initialcommit-io commented on June 14, 2024

Oh ok I think I understand a little better now. So that's a good point about the chronological order of commits - currently git-sim parses commits by tracing the parent/child relationships, so the order is not necessarily chronological depending on merges from other branches and other scenarios.

It's a good point because I believe Git's actual log command orders commits by time by default (newest first)... so let me look into adding that. It would be nice if you could create a new issue for that.
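To make the difference concrete, here is a minimal Python sketch (not git-sim's actual code) that orders commits by committer time instead of walking parent links. The `Commit` class and its fields are hypothetical stand-ins for whatever commit objects git-sim works with.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical stand-in for a parsed commit (not a git-sim class)."""
    sha: str
    committed_at: int  # committer time, seconds since the epoch
    branch: str


def chronological(commits):
    """Return commits oldest-first by committer time.

    sorted() is stable, so commits sharing a timestamp keep their input order.
    """
    return sorted(commits, key=lambda c: c.committed_at)


# Two branches whose commits interleave in time, as in the git-annex demo.
history = [
    Commit("a1", 100, "git-annex"),
    Commit("a2", 102, "git-annex"),
    Commit("m1", 101, "main"),
]
print([c.sha for c in chronological(history)])
```

An animation driven by this ordering would draw the commit on main between the two git-annex commits, rather than finishing one branch before starting the other.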

I'm still not 100% clear on exactly what you're looking for with the annotations. It sounds like you're looking for a list of commands to be included in the images/videos, to help users understand how the commit structure came about.

It might be possible to use Git's reflog for that, at least for as long as Git keeps the reflog entries around. We could also use Git notes or something like you mentioned to link specific annotations to each commit, which Git-Sim could pick up on and use in the display. But of course this would rely on the user to add a note to every commit.

The other question is where to display the annotations in the output. Should it replace the commit message? Or somehow make more room for the message?


yarikoptic commented on June 14, 2024

> It's a good point because I believe by default Git's actual log commands does produce chronological commit order... so let me look into adding that. It would be nice if you could create a new issue for that.

done: #96

> I'm still not 100% clear on exactly what you're looking for with the annotations. It sounds like you're looking for a list of commands to be included in the images/videos, to help users understand how the commit structure came about.

yes

> It might be possible to use Git's reflog for that, at least for as long as Git keeps the reflog entries around.

It might indeed work for commands which change the current HEAD, but it would not work for my use case, where a command changes not (only) HEAD but some other reference (the git-annex branch).

> We could also use Git notes or something like you mentioned to link specific annotations to each commit, which Git-Sim could pick up on and use in the display. But of course this would rely on the user to add a note to every commit.

And that is where I was hypothesizing about how to automate it in that comment. I did a prototype, this tiny helper:

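$> cat ./record-commands4sim.sh
#!/bin/bash
# https://github.com/datalad/datalad/issues/7371
#
# A helper: run the given script under xtrace and log each command with a timestamp

# Prefix every traced command with the current time (nanosecond resolution)
export PS4='> $(date "+%Y-%m-%d %H:%M:%S.%N"): '
log="$(mktemp /tmp/sim-XXXXXXX)"
# Send all stderr (including the xtrace output) to the log file
exec 2> "$log"

set -x

bash -x "$@"

echo "Commands with time stamps collected in $log"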

and run it on a sample script with git and other needed commands, like the one I gave above (now with the git-sim invocation etc. removed):

$> cat ./try-git-sim-short.sh
#!/bin/bash

set -eu

umask 022
cd "$(mktemp -d /tmp/dl-XXXXXXX)"

mkdir remote
(
cd remote
git init
git annex init
)


mkdir origin
(
cd origin;
git init  # creates main branch, no commit
git annex init  # creates git-annex branch with a commit, also modifies .git/config
echo big-data > file.dat
git annex add file.dat  # creates commit in git-annex branches
git commit -m "Adding file.dat"   # creates commit in main
git annex addurl --file file.dat http://www.oneukrainian.com/tmp/file.dat  # only updates git-annex branch if no content change
# For now no remote
git remote add --fetch remote ../remote  # no commits
git annex sync  # if remote had its own history for git-annex branch -- it would get merged
git annex copy --to=remote file.dat  # commit in git-annex branch updating availability information
)

and then running like

*$> ./record-commands4sim.sh ./try-git-sim-short.sh 
Initialized empty Git repository in /tmp/dl-a82JX34/remote/.git/
init  ok
(recording state in git...)
Initialized empty Git repository in /tmp/dl-a82JX34/origin/.git/
init  ok
(recording state in git...)
add file.dat 
ok                                
(recording state in git...)
[master (root-commit) 5abb9a8] Adding file.dat
 1 file changed, 1 insertion(+)
 create mode 120000 file.dat
addurl http://www.oneukrainian.com/tmp/file.dat ok
(recording state in git...)
Updating remote
(merging remote/git-annex into git-annex...)
(recording state in git...)
commit 
On branch master
nothing to commit, working tree clean
ok
pull remote 
ok
push remote 
ok
copy file.dat (to remote...) 
ok                                
(recording state in git...)
Commands with time stamps collected in /tmp/sim-QxuVG8E

NB: all stderr is swallowed into our log file here.

we get:

$> grep '^>' /tmp/sim-QxuVG8E
> 2023-07-10 18:04:19.016554968: bash -x ./try-git-sim-short.sh
> 2023-07-10 18:04:19.020911120: set -eu
> 2023-07-10 18:04:19.022840602: umask 022
>> 2023-07-10 18:04:19.025140046: mktemp -d /tmp/dl-XXXXXXX
> 2023-07-10 18:04:19.027615139: cd /tmp/dl-a82JX34
> 2023-07-10 18:04:19.028610488: mkdir remote
> 2023-07-10 18:04:19.030532827: cd remote
> 2023-07-10 18:04:19.031196319: git init
> 2023-07-10 18:04:19.033727446: git annex init
> 2023-07-10 18:04:19.088426848: mkdir origin
> 2023-07-10 18:04:19.090178266: cd origin
> 2023-07-10 18:04:19.090890911: git init
> 2023-07-10 18:04:19.093361474: git annex init
> 2023-07-10 18:04:19.147746324: echo big-data
> 2023-07-10 18:04:19.148348199: git annex add file.dat
> 2023-07-10 18:04:19.192806713: git commit -m 'Adding file.dat'
> 2023-07-10 18:04:19.224348891: git annex addurl --file file.dat http://www.oneukrainian.com/tmp/file.dat
> 2023-07-10 18:04:21.782459476: git remote add --fetch remote ../remote
> 2023-07-10 18:04:21.796545250: git annex sync
> 2023-07-10 18:04:22.003341497: git annex copy --to=remote file.dat
> 2023-07-10 18:04:22.091335309: echo 'Commands with time stamps collected in /tmp/sim-QxuVG8E'

and thus we have time-stamped invocations for all commands, so that some --animate could figure out which commit was produced by which (preceding in time) git command.
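A log in this format is easy to read back. The following is a minimal Python sketch, assuming the exact PS4 layout used by the helper above (one or more `>` characters, a timestamp with fractional seconds, a colon, then the command); `parse_trace` is a hypothetical name, not part of git-sim.

```python
import re
from datetime import datetime

# Matches lines such as
#   > 2023-07-10 18:04:19.031196319: git init
# Repeated '>' marks nested (e.g. command substitution) commands in xtrace output.
TRACE_LINE = re.compile(
    r"^(?P<depth>>+) "
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.(?P<nanos>\d+): "
    r"(?P<command>.*)$"
)


def parse_trace(lines):
    """Yield (depth, datetime, command) for each traced command line."""
    for line in lines:
        match = TRACE_LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # ordinary stderr output, not an xtrace line
        # datetime only holds microseconds, so keep the first six digits.
        micros = int(match["nanos"][:6].ljust(6, "0"))
        when = datetime.strptime(match["stamp"], "%Y-%m-%d %H:%M:%S")
        yield len(match["depth"]), when.replace(microsecond=micros), match["command"]
```

Lines that do not match (ordinary stderr output of the traced commands) are skipped, so the whole log file can be passed in without the `grep '^>'` step.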

> The other question is where to display the annotations in the output. Should it replace the commit message? Or somehow make more room for the message?

I was thinking of something like "closed captions" at the bottom, which appear for the duration of the animation of the commit(s) being created.


yarikoptic commented on June 14, 2024

A "gotcha" is that git does not record subsecond resolution for any of those "dates", so any "demo" script would still need to abuse PS4 to do some sleeping. Well, record-commands4sim.sh could do that, I guess; it would delay execution, but I think that is OK.
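The effect of whole-second commit times can be sketched in Python. This is only an illustration under two assumptions: each command runs from its start time until the next command starts, and the real commit moment lies somewhere within the second git recorded. `candidate_commands` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def candidate_commands(commit_time, commands):
    """Return commands that may have produced a commit stamped `commit_time`.

    `commit_time` is a datetime truncated to whole seconds (as git stores it);
    `commands` is a list of (start_datetime, command) sorted by start time.
    """
    window_end = commit_time + timedelta(seconds=1)
    found = []
    for index, (start, command) in enumerate(commands):
        is_last = index + 1 == len(commands)
        end = None if is_last else commands[index + 1][0]
        # The real commit moment lies in [commit_time, window_end).
        if start < window_end and (end is None or end > commit_time):
            found.append(command)
    return found


# Start times taken from the log above (truncated to microseconds).
commands = [
    (datetime(2023, 7, 10, 18, 4, 19, 148348), "git annex add file.dat"),
    (datetime(2023, 7, 10, 18, 4, 19, 192806), "git commit -m 'Adding file.dat'"),
    (datetime(2023, 7, 10, 18, 4, 19, 224348), "git annex addurl"),
    (datetime(2023, 7, 10, 18, 4, 21, 782459), "git remote add --fetch remote ../remote"),
]
print(candidate_commands(datetime(2023, 7, 10, 18, 4, 19), commands))
```

Without delays, three of the four commands are candidates for a commit stamped 18:04:19. Spacing commands at least one second apart shortens the candidate list, but a commit can still fall within the same second as a command boundary, so timestamp matching alone may leave more than one candidate.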


initialcommit-io commented on June 14, 2024

Ok thanks for spelling it out for me - I am not that great with shell. But now I see what you have in mind.

My thoughts are that this deviates a little bit from the purpose of git-sim, which is to allow the simulation of Git subcommands on an existing repo (aside from clone and init which obviously don't start with an existing local copy).

It seems your use case is more to simulate a flow of git commands of a whole workflow - in this case with git-annex - as opposed to showing the result of a specific Git subcommand.

But this got me thinking - as a part of git-sim, I wrote a simple Python dependency called git-dummy. In fact this is already installed by default when git-sim is installed. The purpose of git-dummy is to generate dummy Git repos (basically sample repos with fake data) that are in a desired state for executing the desired git-sim subcommand. For example you can choose the number of commits, branches, merges, divergence-points, merge-locations, etc...

I wonder if adding a new feature to git-dummy would make more sense than modifying git-sim itself for this. For example, we could add a feature in git-dummy that would override the default repo generation and allow specification of a custom set of commands to run on the repo instead - similar to the set of commands you provided:

git init  # creates main branch, no commit
git annex init  # creates git-annex branch with a commit, also modifies .git/config
echo big-data > file.dat
git annex add file.dat  # creates commit in git-annex branches
git commit -m "Adding file.dat"   # creates commit in main
git annex addurl --file file.dat http://www.oneukrainian.com/tmp/file.dat  # only updates git-annex branch if no content change
# For now no remote
git remote add --fetch remote ../remote  # no commits
git annex sync  # if remote had its own history for git-annex branch -- it would get merged
git annex copy --to=remote file.dat  # commit in git-annex branch updating availability information

Maybe this could be specified in a file that gets passed in as a command-line arg, like:

git-dummy --input-file=commands.txt

Since git-dummy is actually generating the git repo, we can add a flag to create a Git note with each commit, corresponding somehow to the command specified in the file. Although it might get a bit hairy with commands that don't explicitly lead to a new commit (which we could possibly handle with a manual symbol added in the input file), and would rely on specified commands to be available locally.
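As a rough illustration of the input side of this proposal, here is a minimal Python sketch of reading such a commands file. The file format (one command per line, `#` comments allowed) and the name `read_commands` are assumptions made for illustration; no such feature exists in git-dummy as far as this thread shows.

```python
import shlex


def read_commands(lines):
    """Turn lines of a commands file into argv lists.

    Blank lines and comment-only lines are skipped; trailing '# ...'
    comments are dropped by shlex.
    """
    commands = []
    for line in lines:
        argv = shlex.split(line, comments=True)
        if argv:
            commands.append(argv)
    return commands
```

Note that shell features such as the redirection in `echo big-data > file.dat` come through as plain tokens here, so a real implementation would need to run such lines through a shell or handle them specially.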

Anyway, at this point the Git notes would be populated with the "closed caption" data, so it should be simple to ingest those from git-sim and add them to the animations (while using chronological order based on the other ticket you raised).

Maybe I'm missing something that makes this method completely not work, but please let me know your thoughts.


initialcommit-io commented on June 14, 2024

Also I would probably try and avoid stuff like correlating timestamps and intentional program delay which could get messy and slow down the performance. Wouldn't using Git notes remove the need for this since each "closed caption" note would be associated with its corresponding commit upon creation, so no need to correlate timestamps or introduce delays?


yarikoptic commented on June 14, 2024

> Maybe I'm missing something that makes this method completely not work, but please let me know your thoughts.

No, avoiding bash hackery sounds like a good way forward! Indeed it could be a helper which runs "one line at a time" and adds the corresponding command to git notes. The "problem" could be that git notes, IIRC, are associated with a particular commit, whereas git annex commands operate on some other branch and may generate multiple commits (IIRC that should be possible). So maybe git notes would not be the best storage. I guess ideally the helper would produce, for each executed line, a record in a list stored in YAML or JSON, with fields

  • datetime
  • refs (list, all branches and tags)
  • command (str, to be executed)

with the last record at the end without any command value. Then it would be possible to rebuild the entire history: what refs appeared, what commits were made between different states of refs, etc.
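A minimal Python sketch of such records and of how they could be replayed follows. It assumes `refs` is stored as a mapping from ref name to commit SHA (the list above does not say how refs are represented), and `Record` and `changed_refs` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """One snapshot taken just before `command` runs (hypothetical layout)."""
    datetime: str                  # e.g. ISO 8601 text
    refs: dict                     # ref name -> commit SHA at that moment
    command: Optional[str] = None  # None only for the final snapshot


def changed_refs(records):
    """Pair each command with the refs it created or moved.

    Returns a list of (command, {ref: (old_sha, new_sha)}) where old_sha is
    None for a ref that did not exist before the command ran.
    """
    result = []
    for before, after in zip(records, records[1:]):
        moved = {
            ref: (before.refs.get(ref), sha)
            for ref, sha in after.refs.items()
            if before.refs.get(ref) != sha
        }
        result.append((before.command, moved))
    return result
```

Because every record carries the full ref state, a command that moves several branches at once shows up as a single entry with several changed refs. Deleted refs are not reported by this sketch.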


initialcommit-io commented on June 14, 2024

> the "problem" could be that git notes IIRC would associate with a particular commit but git annex commands would operate on some other branch and may be generate multiple commits (IIRC it should be possible). so may be git notes would not be the best storage.

Currently everything in git-sim is done based on commits, so I was assuming that annotations would be linked to individual commits to show what command resulted in that commit. That is why it seemed OK to use something like Git notes so that the code could just grab the associated annotation (if exists) and display it while drawing the commit. But maybe I'm missing something with how git-annex (or other technically non-git commands) should be represented.

> I guess ideally that helper for each line execution could be a list of records stored in yaml or json with fields
>
>   • datetime
>   • refs (list, all branches and tags)
>   • command (str, to be executed)
>
> with the last record at the end without any command value. Then it would be possible to rebuild entire history -- what refs appeared, what commits where done between different states of refs etc.

Again, I would assume that even if storing the annotation info in a data file, you could just store the commit ID along with each annotation. Why correlate timestamps if we can just store the commit ID? But it seems you're thinking of a different representation?


yarikoptic commented on June 14, 2024

> you could just store commit ID along with each annotation

I am thinking of use cases where a git command could result in multiple commits across multiple branches. Let me give an example:

❯ grep . .git/refs/heads/{git-annex,master}
.git/refs/heads/git-annex:e9fd7bbc0e61e34f83bf38b3fd4b29717fc83a96
.git/refs/heads/master:500c80db9cc457629a2af5e90ad1b3f32a57ca25
❯ rm file.dat; echo buga > file.dat
❯ git commit -m 'new version of file.dat' -a
(recording state in git...)
[master f6cc3d3] new version of file.dat
 1 file changed, 1 insertion(+), 1 deletion(-)
 mode change 120000 => 100644 file.dat
❯ grep . .git/refs/heads/{git-annex,master}
.git/refs/heads/git-annex:3efcf509bbd04a833edee5e45599fc3cfbfe90ac
.git/refs/heads/master:f6cc3d314a25029523640a0df5acc6e28fd6598c

So indeed, in principle, you can add git notes to both of those commits, f6cc3d314a25029523640a0df5acc6e28fd6598c and 3efcf509bbd04a833edee5e45599fc3cfbfe90ac (I didn't check, but there could actually be multiple commits on the git-annex branch), with the same command (git commit -m '...' -a). But then you would need to reconstruct the fact that a single command produced both commits, right?

NB: my greedy wish here is to eventually use not only git commands but datalad commands too, and those could easily create more than one commit across different branches. E.g. datalad create alone is, in a nutshell, git init; git annex init; git add .datalad/config; git commit; and datalad addurls could add a gazillion files/URLs, create git submodules, etc.

FWIW, if going the git notes way, it might be worth making those notes "machine readable" (we do something similar in datalad run; see e.g. any of those commit messages https://github.com/search?q=datalad+runcmd&type=commits) and have them look like

=== git-sim:yaml Do not change lines below ===
git_sim_id: 2d22620e-1484-4555-be70-68e115e6f284:3  # as the 3rd command executed within that unique run
start_datetime: 2023-07-11 10:54:25.184822800
end_datetime: 2023-07-11 10:54:26.184822800
command: git commit -m "buga duga" -a
^^^ git-sim:yaml Do not change lines above ^^^

Then listing the git notes would indeed allow multiple commits to be uniquely and unambiguously associated with the same git_sim_id, with no interpolation or anything. But the helper should add git notes for all commits which were generated by running a command.
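Reading such notes back could look like the following minimal Python sketch. It assumes the marker lines and field names shown above, and it uses a deliberately tiny `key: value` reader rather than a real YAML parser; `parse_note` and `group_by_command` are hypothetical names.

```python
from collections import defaultdict

BEGIN = "=== git-sim:yaml Do not change lines below ==="
END = "^^^ git-sim:yaml Do not change lines above ^^^"


def parse_note(note_text):
    """Extract key/value pairs found between the BEGIN and END marker lines.

    Splits each line on the first ': ' and drops a trailing '  # comment'.
    """
    fields = {}
    inside = False
    for line in note_text.splitlines():
        stripped = line.strip()
        if stripped == BEGIN:
            inside = True
        elif stripped == END:
            break
        elif inside and ": " in stripped:
            key, value = stripped.split(": ", 1)
            fields[key] = value.split("  #", 1)[0].strip()
    return fields


def group_by_command(notes_by_commit):
    """Map each git_sim_id to the list of commits whose note carries it."""
    groups = defaultdict(list)
    for commit, note_text in notes_by_commit.items():
        sim_id = parse_note(note_text).get("git_sim_id")
        if sim_id is not None:
            groups[sim_id].append(commit)
    return dict(groups)
```

Given a mapping of commit SHA to note text (for example collected with `git notes list` and `git notes show`), `group_by_command` returns one entry per executed command with every commit it produced, whichever branch the commit landed on.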

NB: adding multiple notes to the same commit could be tricky. It would require reading the old note and adding to it, so that multiple records for different "things" could co-exist, if so desired:

❯ git notes add -m 'new message'
error: Cannot add notes. Found existing notes for object bb686d3bc696f22f30f51955b32aaf768566aec8. Use '-f' to overwrite existing notes
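One way around this error is `git notes append`, which adds to an existing note or creates one if none exists. A minimal Python sketch of a wrapper follows; the function names are hypothetical, and the sketch assumes it is run inside a Git repository with the default notes ref.

```python
import subprocess


def notes_append_argv(commit, message):
    """Build the argv for appending `message` to the note on `commit`."""
    return ["git", "notes", "append", "-m", message, commit]


def append_note(commit, message, repo_dir="."):
    """Run `git notes append`; git creates the note if none exists yet."""
    subprocess.run(notes_append_argv(commit, message), cwd=repo_dir, check=True)
```

With this, several machine-readable blocks can accumulate in one note, each delimited by its own marker lines.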


initialcommit-io commented on June 14, 2024

Closing without implementation as the effort to implement such a thing likely outweighs the demand from a user perspective. However, if more folks request this I will reconsider.
