Currently the tool is taking the last commit on the branch it is run on and then analy

Support arbitrary input instead of commit message history about python-semantic-release HOT 6 CLOSED

rantoniuk commented on June 11, 2024

Support arbitrary input instead of commit message history

from python-semantic-release.

Comments (6)

codejedi365 commented on June 11, 2024

I haven't looked deeply into the code to determine a way to do this, but my initial thought is that this will be much harder to implement than anticipated, whether in PSR or in a custom parser.

What you can do, however, is use squash commits to turn your PR into a single new commit (the GitHub default is to use the PR title as the commit subject line). PSR will then only evaluate the commits between the last tag and that new single commit to determine the version.

The problem I foresee with a custom parser is that the parser will be called with all of the git history anyway because PSR handles the walking of the commit tree and hands individual commits back to the parser. It will be called many times. We don't have a way to pass it raw text without an overhaul. We then use the return values to build a history object for changelog generation which adds a different complexity for design.

Maybe I'm not understanding the hypothesized design fully, but I do understand the desire to simplify configurations. It also depends on the changelog you want as a result. I steer away from squashing too much, because features often require multiple commit types that you want to end up in the changelog (like dependencies, docs, fixes), and those are lost once the PR is squashed. There are lots of opinions here, but moving to semantic commits requires rigor, because the result is more granular changes that are documented appropriately.

Hopefully this was helpful.

codejedi365 commented on June 11, 2024

@rantoniuk, Thank you for your complete response as it helps to provide more context as to what you are trying to accomplish and why. I will try to address each part as I see fit.

> let's say we have a standard desktop software, where the development is trunk-based
>
> this means that:
>
> from the branch naming strategy, we can automatically infer the type of change:
>
> - feat/xxxx for features, where the branch originates from main
> - fix/xxx or hotfix/xxx where the branch originates from a tag

I don't want to take too much away from the overall topic, but it's important to make sure we are using the same taxonomy, so my first comment is that, unfortunately, your first and second bullets are contradictory. Trunk-based development has no development branches (unless we are talking about Release Flow, which uses "release" branches for tracing releases). From the description of the rest of your workflow, I believe you are describing either GitHub Flow (aka feature branching) or Git Flow (depending on whether you have a development branch, but usually where the term hotfix resides).

> With that approach, I don't need to parse all the history of commit messages, it's enough that I get the last version or tag name and increment semver based on the branch name. Do you agree?

Yes, if you use a standardized branch prefix, you could use that to determine versions, as it does the same thing the commit prefixes do. PSR is not programmed to support this, though, and a custom parser provides back a ParsedCommit object describing what that commit's type is. You would also have to build some way of determining whether the current commit is part of the current branch, which is likely more difficult than desired because of how git works.

> the developers do not need to add any prefixes

Commit prefixes (or the commit type) are the primary concept of the Conventional Commits standard and a part of SciPy's prefix variation of that standard, which are two of the commit parsers we support. Much of this tool was designed around automating the information derived from standardized commit messages.

> the PR titles as described above, based on the default PR title setting

Of note, the PR title on GitHub actually changes (to the commit message) if there is only one commit on a branch that is part of the PR. Any more than one and the branch name is the default.

> However, looking at commands and configuration it seems that there is no way to set the patch/minor/major option based on branch naming.

Yes, because that is not how we intend the branch configuration to be used (see Multibranch Releases). That is used for release branching to determine prerelease tokens and prerelease status. This helps the Release Flow community and other communities that want to release alpha, rc, and beta variants to a user base.

> Even if I don't want to use branch name, it's enough for me to parse the first commit message, which will result in 1.3. Subsequent commits do not make any difference to the version and can be just dropped.

I will confirm in the code again, but during next_version determination we only evaluate the commits between the last tag and the current head of the branch. This evaluation is performed from youngest to oldest (more of a git restriction). I bring this up because we don't parse everything, but we still must review each commit in case of breaking changes, as a breaking change is the maximum version bump. The interpretation that we are parsing the entire history is correct for changelog creation but not for version determination.

I don't disagree that there are some performance gains we need to address; I'm actually planning many of them. Based on your thoughts here, I should consider aborting the loop as soon as a major version bump is detected, but unfortunately I don't anticipate this saving much processing time.
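The evaluation described above can be sketched roughly as follows. This is a minimal illustration and not PSR's actual code: `Bump`, `TYPE_TO_BUMP`, `bump_for_commit`, and `next_bump` are hypothetical names, and the type-to-bump mapping is an assumed default.

```python
import re
from enum import IntEnum


class Bump(IntEnum):
    NONE = 0
    PATCH = 1
    MINOR = 2
    MAJOR = 3


# Assumed mapping from commit type to bump level (hypothetical defaults).
TYPE_TO_BUMP = {"feat": Bump.MINOR, "fix": Bump.PATCH, "perf": Bump.PATCH}

# Matches headers such as "feat: x", "fix(scope): y", or "feat!: z".
HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<breaking>!)?: ")


def bump_for_commit(message: str) -> Bump:
    """Return the bump level implied by one commit message."""
    match = HEADER.match(message)
    if match is None:
        return Bump.NONE
    if match["breaking"] or "BREAKING CHANGE:" in message:
        return Bump.MAJOR
    return TYPE_TO_BUMP.get(match["type"], Bump.NONE)


def next_bump(messages_newest_first: list[str]) -> Bump:
    """Take the highest bump among the commits since the last tag."""
    level = Bump.NONE
    for message in messages_newest_first:
        level = max(level, bump_for_commit(message))
        if level is Bump.MAJOR:
            break  # nothing can exceed a major bump, so stop early
    return level
```

For a history of `feat`, `fix`, `chore`, `feat` this yields `Bump.MINOR`. The early `break` only helps when a breaking change sits near the head of the branch, which is why the expected saving is small.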

> The only scenario I can think of when this would not work is if we would have a BREAKING commit, that would result in 2.x line - but that should not be on a feat/ branch per my branching scheme.

The branch name implementation you describe I don't believe will work for the Release Flow, Trunk-based development, and/or possibly the Git Flow user bases we support which is why the algorithm focuses on commit messages rather than branch names because it works across all of them. It's also the implementation of the Conventional Commit standard; I am unaware of a branch prefix standard out there if there is one.

My thoughts on simplicity:

- I have plans to provide a simpler default configuration generator, with more of an interactive CLI to initialize your configuration, rather than the advanced configuration that is currently dumped out.
- I highly recommend reviewing the tool commitizen-py and integrating it into git using a git prepare-commit-msg hook and the --write-message-to-file option. You can also add a git hook called commit-msg to run the cz check command to validate commit messages before they are made. This significantly lowers the developer burden of writing proper commit messages that can then be interpreted by python-semantic-release. The cz check command (or an equivalent) is also recommended as a CI job, which catches commit message errors before they become version determination problems.
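To illustrate what a commit-msg check does, here is a minimal Python sketch of such a hook. It is a simplified stand-in, not how `cz check` is implemented: `ALLOWED_TYPES` is an assumed list, and the pattern only inspects the subject line. Git passes the path of the proposed message file as the hook's first argument, and a non-zero exit status aborts the commit.

```python
#!/usr/bin/env python3
import re
import sys

# Assumed set of accepted commit types (adjust to your own convention).
ALLOWED_TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# type, optional (scope), optional "!", then ": " and a non-empty summary.
SUBJECT = re.compile(
    r"^(?:" + "|".join(ALLOWED_TYPES) + r")(?:\([^)]+\))?!?: \S"
)


def is_conventional(message: str) -> bool:
    """Check only the first line of the commit message."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    return SUBJECT.match(subject) is not None


def main(argv: list[str]) -> int:
    """Return the exit status for the hook: 0 accepts the commit."""
    if len(argv) != 2:
        print("usage: commit-msg <path-to-message-file>", file=sys.stderr)
        return 2
    with open(argv[1], encoding="utf-8") as handle:
        message = handle.read()
    if is_conventional(message):
        return 0
    print("Commit subject must look like 'type(scope): summary'", file=sys.stderr)
    return 1
```

To use it as a hook, save it as an executable `.git/hooks/commit-msg` and end the file with `raise SystemExit(main(sys.argv))`. In practice the cz check command covers far more cases than this sketch.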

Overall, I think your version determination strategy would work when adhering to your branch naming strategy. PSR just doesn't do it currently and it would take a deviation of the current program control flow to make it work. My apologies but I'm not sure I have the bandwidth to implement & maintain this type of version determination variant. Lastly, there are tools as described above to create & validate commit messages in line with defined standards, whereas, I'm not sure there are the same for branch naming.

rantoniuk commented on June 11, 2024

I must say I was brainstorming this in my head for a while and I'm not sure if it actually makes sense to use PSR.

Let me explain:

let's say we have a standard desktop software, where the development is trunk-based
this means that:
- from the branch naming strategy, we can automatically infer the type of change:
  - feat/xxxx for features, where the branch originates from main
  - fix/xxx or hotfix/xxx where the branch originates from a tag

With that approach, I don't need to parse all the history of commit messages, it's enough that I get the last version or tag name and increment semver based on the branch name. Do you agree?

That simplifies a lot:

- the developers do not need to add any prefixes
- the PR titles as described above, based on the default PR title setting

For this to work, it would be enough to set the TOML config to this:

[semantic_release.branches.minor]
match = "feat/.*"
minor = true

[semantic_release.branches.patch]
match = "fix/.*"
patch = true

However, looking at commands and configuration it seems that there is no way to set the patch/minor/major option based on branch naming.
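To make the idea concrete, here is a minimal sketch of branch-name-based bumping done outside PSR. It is not PSR functionality: `BRANCH_RULES`, `bump_from_branch`, and `increment` are hypothetical names, the patterns simply mirror the config proposed above, and the version is assumed to be a plain three-part `MAJOR.MINOR.PATCH` string.

```python
import re

# Hypothetical prefix-to-bump table mirroring the proposed config above.
BRANCH_RULES = [
    (re.compile(r"feat/.*"), "minor"),
    (re.compile(r"(fix|hotfix)/.*"), "patch"),
]


def bump_from_branch(branch: str):
    """Return 'minor', 'patch', or None if no rule matches the branch."""
    for pattern, bump in BRANCH_RULES:
        if pattern.fullmatch(branch):
            return bump
    return None


def increment(version: str, bump: str) -> str:
    """Apply a semver bump to a plain 'MAJOR.MINOR.PATCH' string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump level: {bump!r}")
```

With the last release at `1.2.0`, `increment("1.2.0", bump_from_branch("feat/FEAT"))` gives `"1.3.0"`. Branches that match no rule return `None`, so the caller has to decide whether that means "no release" or an error.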

Why would I want to do this instead of commit parsing (apart from the reasons above)?
Let's take a normal feature branch feat/FEAT with the following history and main where the last released version was 1.2:


feat: xxx
fix: xxx
chore: xxx
feat: update of above

I see the branch name, I can parse this, and I know this is already going to be a 1.3 release.
Even if I don't want to use branch name, it's enough for me to parse the first commit message, which will result in 1.3. Subsequent commits do not make any difference to the version and can be just dropped.

The only scenario I can think of when this would not work is if we would have a BREAKING commit, that would result in 2.x line - but that should not be on a feat/ branch per my branching scheme.

Sorry for the lengthy post, but just trying to KISS as usual for the developers.

rantoniuk commented on June 11, 2024

Many thanks for a very detailed response, I really appreciate it. I fully understand you cannot support all scenarios. I just thought initially that it might be easy to have a custom parser, but if that's not the case, I think I will create a GitHub Action that is similar to the pr-validator I mentioned above.

> The branch name implementation you describe I don't believe will work for the Release Flow, Trunk-based development, and/or possibly the Git Flow user bases we support which is why the algorithm focuses on commit messages rather than branch names because it works across all of them. It's also the implementation of the Conventional Commit standard; I am unaware of a branch prefix standard out there if there is one.

This is a 'standardized' git-flow branching scheme, while my version of this is simplified and somehow a mix of feature branch flow.

The core assumption is:

- every Pull Request to main is a complete and tested feature -> develop is not needed
- based on the above (and also as a requirement), main should always be shippable, i.e. if needed we could even automatically auto-run a release on every PR merge/push to main

All in all, I really appreciate the time you spent on the discussion and the great work you're doing maintaining this tool.

For now, we can close this request and if I do my own PR-based implementation of this, I'll surely drop a comment here later in case anyone would look for the same approach.

codejedi365 commented on June 11, 2024

Not a problem, happy to help.

One additional consideration, if you do build a variant, is how to build the changelog. This is PSR's second operation beyond version determination. I may be reading into your comment about "leaving the git history to the devs", but Conventional Commits enables building a consumer-relevant changelog. We insert the commit messages in a more user-friendly format with headings and such. If you only use the branch name, that is not saved beyond a merge (unless you parse a standardized merge commit message). There is also the question of what the git history contains, because it will end up in the changelog your consumers see. This is why I offered the use of squash and merge to handle your use case.
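As a rough illustration of the changelog step, commit subjects can be grouped under headings by type. This is a simplified sketch rather than PSR's changelog implementation: `HEADINGS` and `render_changelog` are hypothetical names, and the heading titles are assumptions.

```python
import re
from collections import defaultdict

# Assumed mapping from commit type to changelog heading.
HEADINGS = {"feat": "Features", "fix": "Bug Fixes", "docs": "Documentation"}

# Captures the commit type and the summary after "type(scope)!: ".
HEADER = re.compile(r"^(?P<type>\w+)(?:\([^)]*\))?!?: (?P<summary>.+)$")


def render_changelog(version: str, subjects: list[str]) -> str:
    """Group commit subjects by type and render a small markdown section."""
    sections = defaultdict(list)
    for subject in subjects:
        match = HEADER.match(subject)
        if match and match["type"] in HEADINGS:
            sections[HEADINGS[match["type"]]].append(match["summary"])

    lines = [f"## {version}"]
    for heading in HEADINGS.values():
        if sections.get(heading):
            lines.append("")
            lines.append(f"### {heading}")
            lines.extend(f"- {item}" for item in sections[heading])
    return "\n".join(lines)
```

Types without a heading, such as `chore`, are dropped here. A branch-name-only scheme has no per-commit type to group on, so it would need the PR title or a standardized merge commit message as input instead.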

Either way, thanks for suggesting improvement of the project. Cheers!

rantoniuk commented on June 11, 2024

About the changelog - commits on the branch history are mostly irrelevant actually - I base this statement on my experience while working with many projects and teams. Branch history is usually a mess, you need to do re-basing, squashing, blah blah, to keep the history branch tidy and useful for changelog generation.

With PR approach it's a lot simpler - I just use the PR title as a functional message to be included in the release notes - and this part is already handled by https://github.com/marketplace/actions/conventional-commit-in-pull-requests
