Which is the best to implement a filtering criteria on issues before rendering them? S

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Here the doc. <a href="https://github.com/walkmod/pmd/wiki/PMD-desig

[Core] Suggestion to filter issues as a renderers implementation,about pmd/pmd

Comments (17)

jsotuyod commented on May 21, 2024

Hi @rpau!

With the current codebase, you are probably right. The best way to implement some global filter on reported issues would be by implementing a custom Renderer using the decorator pattern, and filtering which reports actually make it to the inner renderer. That way, the filtering is kept isolated from the report format being used, and can be used independently without repeating code.

Semantically this may not be the best option (a renderer that does not render, but has some business logic), but is not so bad (this could be even improved by providing an AbstractFilteringRenderer with the boilerplate, and letting different implementations just implement a boolean filter(RuleViolation violation) method or something of the sort...

I'm not sure how we would to expose this feature.... CLI / Ant task could be an option, but having it as a global property on the ruleset xml (just like today we can provide suppression regexp / xpath per rule) could also be an option....

I'm curious though, what is the use case you are trying to address?

from pmd.

rpau commented on May 21, 2024

Hi,

I am trying to resolve which are the new issues created during the last changes of developer.

The problem of implementing a Renderer as a filter is that the user cannot define anymore the format of the output (XML, txt, whatever), right?

Ok, I can use always the same.. but I think that probably there should be a different interface for it (PostFilter) but, in that case, it implies a lot of changes in the core of PMD. I can work on it, but are you interested on accepting such changes?

Another question, if a renderer requires an extra dependency (e.g JGIT), do you accept it?

from pmd.

jsotuyod commented on May 21, 2024

@rpau by using the decorator pattern you can avoid the output format issue. Your filtering renderer could decorate an XMLRenderer, a TextRenderer or whatever. To do so, I wouldn't expose filters as renderers per-se (even if they implement Renderer), but differently (as I mentioned in my earlier comment). The idea would be to tell PMD "use the xml renderer, and use this filter"

As for your particular use case, I'm not sure we can build such a feature in a one-size fits all... How do we determine what is part of the "last changes" by the developer? Depending on how you use git in your project / team / organization it could mean:

all staged / unstaged changes
all changes in commits since the merge-base with branch X (master, development, support-v1.3, whatever)
all changes in commits in the current branch not already pushed to remote X (origin, upstream, whatever)

The only project I know to do what you intend is arcanist, which applies this on top of several linting tools, and does so by imposing a specific workflow on top of Git / SVN / Mercurial. I personally wrote bridges for Arcanist + Maven + PMD, Arcanist + Gradle + PMD and Arcanist + CPD CLI among other linters (can be found here).

On the other hand, there may be a different approach to solving this use case (not without trade-offs):

We could apply a diff against a baseline report, provided both the new report and the baseline are using the same format. This of course would require the developer to keep around a baseline report.
We could leverage the analysis cache being built for PMD 5.6.0. Right now it doesn't store reported issues, but it's in our roadmap. This however wouldn't play well with switching branches, since the cache is based on file checksums; and won't discern between changed / unchanged lines within a modified file.
We could wrap PMD in a bash script / Gradle plugin / ant task / whatever you are using to ask PMD to analyze only the files git (or any other VCS) report as modified. This wouldn't take into account which lines were actually modified however.

As for supporting this within PMD, we would be interested if we can make sure it will be useful to our users.

Finally, regarding jgit, adding a dependency of over 2MB is significant, but considering Scala alone is currently adding ~25MB of dependencies, and Apex another ~28MB I don't think it would matter. We are however thinking of ways to reduce this

from pmd.

rpau commented on May 21, 2024

Hi,

Thanks for your quick feedback, it is really appreciated 👍 . Here, my comments:

Decorator pattern: I though to use that pattern, but what worries me is that renderer is created by reflexion API in a way that inner objects cannot be recursively populated. Then, if the Git filter is a renderer that contains another renderer, who instantiates by reflection this second one by using the user properties in the command line? Am I wrong with these assumptions?
Last changes: By default, I would check about the status between the last pull and the current status. Moreover, I need to maintain the last analysis in a remote server for an specific branch. Using that information, I can decide if an issue is new or not but It is a very "customized" implementation for me (like Codacy or Intellij). So, I do not know if you would like to receive my contribution in such a case. However, the only external API I need is JGIT.
Phabricator / Archanist: My intention is not force a way to collaborate (e.g branches, code-review, etc), but simply understand the new changes (may be the user needs to configure some parameters). I have checked the code of your links and the documentation, but I have not seen anything about the synchronization. But, yes you get my idea :). In fact, I was doing a similar approach parsing the output of PMD, but after a while I though I can avoid that integrating this filtering on PMD, and then avoid generate my own tool for that. But then, I realized I need to modify the API to support my renderer :S.

I can create library that extends a PMD configuration to force my renderer, but I cross my fingers to integrate it here :) and avoid creating new maven, gradle, maven plugins..

About your suggestions: I think that may be it is a good idea to generalize which is the repository that returns existing issues and which date has this repository. This repository could be your local cache or could be a remote server. What do you think?

Thanks a lot again for your feedback.

from pmd.

jsotuyod commented on May 21, 2024

Decorator pattern: I though to use that pattern, but what worries me is that renderer is created by reflexion API in a way that inner objects cannot be recursively populated. Then, if the Git filter is a renderer that contains another renderer, who instantiates by reflection this second one by using the user properties in the command line? Am I wrong with these assumptions?

Actually, exposed renderers are manually mapped here. We could create rendererer implementations and not expose them. The only way a user may select a non-mapped renderer is providing it's fully qualified name. Non the less, we could avoid this by having a marker interface (something such as RendererFilter extends Renderer) and simply making sure here that the selected renderer is a Renderer but also is not a RendererFilter. This way we should be able to keep the user specifying the output format (ie: pmd -f xml), and have filters configured elsewhere.

Last changes: By default, I would check about the status between the last pull and the current status. Moreover, I need to maintain the last analysis in a remote server for an specific branch.

Not sure I understand what you refer by "status between the last pull and the current status". Is it the commit range between the first common ancestor between remotes/XXX/branchY and local branchY, and HEAD? If so, doing a pull merging local changes to remote or doing a push would reset the "local changes" to empty, right?

I don't understand what you need the analysis in a remote server for. If this is really needed within the PMD source for your idea, we will probably not accept such contribution, since we don't offer hosting for such files, it would be ad-hoc, and raises issues about user permissions and security. However, I don't feel you should need it if you already have the git history at your disposal.

About your suggestions: I think that may be it is a good idea to generalize which is the repository that returns existing issues and which date has this repository. This repository could be your local cache or could be a remote server. What do you think?

Sounds interesting, would need to iterate it. However, I'd probably not work on this for the first iteration of this feature to keep the workload manageable / auditable, but keep it in mind.

from pmd.

rpau commented on May 21, 2024

Hi, my comments below:

Actually, exposed renderers are manually mapped here. We could create renderer implementations and not expose them. The only way a user may select a non-mapped renderer is providing it's fully qualified name.

Ok, cool.

Non the less, we could avoid this by having a marker interface (something such as RendererFilter extends Renderer) and simply making sure here that the selected renderer is a Renderer but also is not a RendererFilter. This way we should be able to keep the user specifying the output format (ie: pmd -f xml), and have filters configured elsewhere.

Exactly, filters needs to be configured with an extra parameter. I was thinking, and I have a new suggestion: is that Renderers should have an optional property which is Filter, and thus separate the condition from the rendering part. In that way, from the CLI you can distinguish this argument and instantiate the Filter.
In that way, I only need to implement a Filter interface that can be integrated in any Renderer.

Not sure I understand what you refer by "status between the last pull and the current status". Is it the commit range between the first common ancestor between remotes/XXX/branchY and local branchY, and HEAD?
Yes

If so, doing a pull merging local changes to remote or doing a push would reset the "local changes" to empty, right?

Yes and no. If you do a pull, you are not overriding your status despite of having pending changes to commit. Then, you would perform the analysis on new or modified files.

When you perform a push, the date of the last analysis is obviously previous to the last commit, and if you run again PMD, you will always compare with it.

since we don't offer hosting for such files

I am not downloading PMD files, I am just performing a REST call (using a JSON response), which returns me the existing issues por an specific file. I need this remote call to resolve if an issue like "unused method" references the line of a method that has not been modified but it is not referenced anymore because the user has removed the call. Notice that I am resolving the NEW issues produced by the specific modifications of the user, not the complete files. When the issue belongs to a new line, it is fine. However, when it is not the case, I need to compare with the previous analysis for such file (applying the corresponding line numbers' resolution).

What do you think now? I think that we can finally have generic approach afterwards that helps to to decide to use the remote server or the local repository you want to build.

How do you want to proceed?

from pmd.

jsotuyod commented on May 21, 2024

I'm still not certain we are approaching this the right way. I guess the best way forward is to produce a design document, go into detail on how this feature will be developed and exposed, and move on from that point.

from pmd.

rpau commented on May 21, 2024

I have forked this repo and I have created a very preliminar implementation with the corresponding unit tests. I think that it should be enough to start our discussion.

https://github.com/walkmod/pmd

It contains a single commit (the last one). Do you want my pull request to start discussing it? There is a formatter issue on the pom.xml, but we can discuss the rest of the code (which also passes the checkstyle rules ;) )

from pmd.

jsotuyod commented on May 21, 2024

@rpau thanks, but I was actually expecting a design document rather than code. The code seems to contradict some of the things we discussed (ie: the renderer is exposed, and not a generic decorator independent of format).

I'd appreciate a design doc so we can actually assess:

how renderer and filters are configured through Ant and CLI
how the filter detects what is new code to be analyzed. I'd specially like seeing this expressed as one or more standard git commands (git merge-base XX YY or git rev-parse ...)
how the filtering process works, specially the use of the json API. I still don't get why it's needed at all and this looks like a deal breaker for PMD.
clear identification of trade-offs being made. I'm particularly concerned on how extensible this could be outside Git (SVN, Perforce, Mercurial, etc.)

A simple markdown concise description should work. I'd happily work with you on it shall you need any help.

from pmd.

rpau commented on May 21, 2024

Here the doc.

https://github.com/walkmod/pmd/wiki/PMD-design-to-integrate-our-solution

Sorry to avoid remark that my repo is a simple implementation to validate the logic behind the incremental problem rather than an architecture to reuse it from any component of PMD.

from pmd.

jsotuyod commented on May 21, 2024

Thanks! That was really enlightening!

I now understand why you want the remote file. However, I fear we will not merge this back if we can't come up with an alternate solution.

I've some other comments, but mostly minor stuff not really worth discussing if we can't get rid of a need for a remote server. They mostly concern alternate scenarios, and are not problematic if restrained to your particular case.

We are available to support you if you need us, and hope we can still collaborate regardless of the outcome of this particular suggestion.

from pmd.

rpau commented on May 21, 2024

@jsotuyod Thanks for your quick feedback.

Sorry if this "question" is taking too much time, but I would appreciate to understand why, if all my logic belongs to my own Renderer implementation, it cannot be added as part of PMD. Is this because I am using an extra dependency (JGit)?

Otherwise, I cannot use PMD with this extension from Maven or Gradle, isn't it? They do not allow to extend the plugins classpath to support extra renderers that belong to third party libraries.

It has been a pleasure to discuss with you :)

from pmd.

jsotuyod commented on May 21, 2024

@rpau my main concern is that the JSON API you are proposing is outside the scope of PMD. Any user willing to use this new feature from PMD would require to develop, publish and support a web server providing that specific API. This would be hard and misleading on new users trying to make use of the feature.

You could add jars with extensions for PMD into both, Maven and Gradle.

For Maven, you could do:

  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-pmd-plugin</artifactId>
    <version>XXX</version>
    <dependencies>
      <dependency>
        <groupId>your-dep</groupId>
        <artifactId>your-dep</artifactId>
        <version>your-version</version>
      </dependency>
    </dependencies>
  </plugin>

As for Gradle, you could do:

dependencies {
  pmd "your-dep:your-dep:your-version"
}

from pmd.

rpau commented on May 21, 2024

Yes, the REST API is designed only for my Renderer at this stage, right. But, this incremental feature is only for my own renderer. But I understand your concern about having an "implicit" dependency to use that component.

Thanks a lot for this snippets! I will take a look. Do you know if these plugins also support renderer parameters (in my case for the remote host address)?

The official documentation does not include them.

https://maven.apache.org/plugins/maven-pmd-plugin/pmd-mojo.html

https://docs.gradle.org/current/dsl/org.gradle.api.plugins.quality.PmdExtension.html
Should I need to ask for it?

from pmd.

jsotuyod commented on May 21, 2024

Both are for sure using the pmd ant task, so i don't think so, no. You'll probably have to hardwire those, or depend on environment variables...

from pmd.

rpau commented on May 21, 2024

Ok, thanks.

from pmd.

rsoesemann commented on May 21, 2024

@rpau and @jsotuyod what is the status of this valuable feature request in 2021? Did rpau contribute anything for Incremental analysis? Did PMD improve its analysisCache to contain issues?

What is "our" current understanding on how to best solve that?

FYI @mmoyaferrer

from pmd.

[Core] Suggestion to filter issues as a renderers implementation about pmd HOT 17 CLOSED

Comments (17)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent