A software product is compiled on different systems with different compilers. E.g. thi

Support cobertura format as input (lcov, CLOSED)

dilyanpalauzov commented on August 16, 2024

Support cobertura format as input

from lcov.

Comments (10)

henry2cox commented on August 16, 2024

I took a very quick look at the coverage-*.dtd doc (seems to be no official spec?), then took a quick look at the existing py2lcov translator (part of the lcov package - see .../scripts/py2lcov and also py2lcov --help.)
On first glance (and totally untested), it appears that this might be close to what you want.

Some things to note:

The py2lcov XML path is a deprecated feature (because we don't use it any more - and I don't know of anyone who does.) That might need to be changed, if this turns out to be useful.
This translator is intended for python, and makes assumptions about input language - specifically, indentation - and might need to be enhanced a bit. This is going to affect function coverpoints - as the script will try to deduce the function end line based on Python indentation expectations.
The XML format generated by coverage.py (which may or may not be the same as the XML format generated by Cobertura) supports 'line', 'function', and 'branch' coverage (in lcov parlance). It does not contain enough information to support branch expressions, so the report you get is less useful than for Perl or Verilog (say).

Presuming that this is close to what you are looking for, some refactoring is likely necessary to make it work in a reasonable fashion.

henry2cox commented on August 16, 2024

Implemented.
Will push the fix/enhancement after some testing. Perhaps next week some time.

You may or may not need it, but you can generate a differential report to see the code exercised by A and not B, B but not A, and neither. We sometimes turn up bugs when something is hit that should not be, or vice versa.
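As a rough illustration of the categories mentioned above (hit by A and not B, B but not A, and neither), here is a minimal Python sketch. It is not how lcov or genhtml compute a differential report; the function name `differential` and the line-number sets are hypothetical and only show the idea.

```python
def differential(all_lines, hit_a, hit_b):
    """Classify instrumented lines by which of two runs (A, B) exercised them.

    Hypothetical helper for illustration only; not part of lcov.
    """
    all_lines = set(all_lines)
    a = set(hit_a) & all_lines
    b = set(hit_b) & all_lines
    return {
        "a_only": sorted(a - b),               # exercised by A and not B
        "b_only": sorted(b - a),               # exercised by B and not A
        "both": sorted(a & b),                 # exercised by both runs
        "neither": sorted(all_lines - a - b),  # exercised by neither run
    }


# Made-up data: lines 1..6 are instrumented, e.g. A is a gcc build, B another compiler.
report = differential(range(1, 7), hit_a={1, 2, 3}, hit_b={3, 4})
for category, lines in report.items():
    print(category, lines)
```

A line that shows up in "a_only" or "b_only" when both builds were expected to behave the same is the kind of surprise worth investigating.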

henry2cox commented on August 16, 2024

Clarifying the above comment:

the XML coverage data format does not contain enough information to deduce exactly which branch expressions have been taken or not taken.
It reports the total number of branch expressions associated with a particular line, and the number of those which have been taken. There is no way to know (except, possibly by inspection of surrounding code and/or some understanding of your implementation) exactly which ones.
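To make the limitation concrete, here is a small Python sketch that reads a Cobertura-style XML fragment. The sample data and the helper name `branch_counts` are made up for illustration; the only assumption about the format is that branch lines carry a `condition-coverage` attribute of the form `50% (4/8)`. All that can be recovered per line is a pair (taken, total), not which branches were taken.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample in the style of Cobertura / coverage.py XML output.
SAMPLE = """<coverage>
  <packages><package name="demo"><classes>
    <class name="demo.py" filename="demo.py">
      <lines>
        <line number="10" hits="3"/>
        <line number="12" hits="3" branch="true" condition-coverage="50% (4/8)"/>
      </lines>
    </class>
  </classes></package></packages>
</coverage>"""


def branch_counts(xml_text):
    """Return {(filename, line): (taken, total)} for lines that carry branch data."""
    counts = {}
    root = ET.fromstring(xml_text)
    for cls in root.iter("class"):
        filename = cls.get("filename")
        for line in cls.iter("line"):
            cond = line.get("condition-coverage")
            if cond is None:
                continue  # line has no branch information
            match = re.search(r"\((\d+)/(\d+)\)", cond)
            if match:
                key = (filename, int(line.get("number")))
                counts[key] = (int(match.group(1)), int(match.group(2)))
    return counts


print(branch_counts(SAMPLE))
```

Nothing in the parsed result identifies the individual branches, which is the gap described above.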

This is a problem in at least 2 ways

It is not straightforward to use the result to improve your regression suite because you don't really know what was exercised/not exercised.
Coverage data merge is problematic.
For example: you have two testcase XML files, each of which hit 4 of 8 branches on some line. Does that mean you hit 4 of them (both tests exercised the same code), all 8 (tests exercised disjoint subsets), or some number between?
This implementation assumes that the first M branches are the ones which are hit and the remaining N-M were not hit, in each testcase. Thus, the combined result in the above example would claim 4 of 8 branches hit. (This definition turns out to be a lower bound.)
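The assumption above can be sketched in a few lines of Python. This is only an illustration of the "first M of N" rule as described, not the actual lcov code; `expand` and `merge` are hypothetical names.

```python
def expand(taken, total):
    """Model 'taken of total' as a hit vector, assuming the first `taken` branches were hit."""
    if not 0 <= taken <= total:
        raise ValueError("taken must be between 0 and total")
    return [i < taken for i in range(total)]


def merge(a, b):
    """Merge two hit vectors for the same line: a branch is hit if either test hit it."""
    if len(a) != len(b):
        raise ValueError("branch counts differ between testcases")
    return [x or y for x, y in zip(a, b)]


# Two testcases, each reporting 4 of 8 branches hit on the same line.
merged = merge(expand(4, 8), expand(4, 8))
print(sum(merged), "of", len(merged))
```

Under this rule the merged count is simply the larger of the two per-test counts, which is why it is a lower bound: the real number of distinct branches hit could be anywhere from 4 to 8 in this example.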

The above issue and interpretation is applied to any coverage data that arrives in lcov via XML import.
Since py2lcov uses XML import internally, it applies to Python code.
Neither gcov, llvm-profdata, nor Perl Devel::Cover import has this issue (though they may have other issues).

dilyanpalauzov commented on August 16, 2024

the XML coverage data format does not contain enough information to deduce exactly which branch expressions have been taken or not taken.

cl.exe can generate three XML coverage formats (per the above hyperlink); it is not clear which one you mean.

Does the Cobertura format contain this information?

dilyanpalauzov commented on August 16, 2024

The above issue and interpretation is applied to any coverage data that arrives in lcov via XML import.

I do not understand this text about any XML format. Either the needed information is in the input files, or it is not there. How does the choice of XML format create problems with missing data, even if the data is available in the input?

henry2cox commented on August 16, 2024

cl.exe can generate three XML coverage formats (per the above hyperlink); it is not clear which one you mean.

I don't know about the cl.exe data formats - feel free to post or email me a set of representative examples, and I can check.

The format I tested with was found at https://gist.github.com/apetro/fcfffb8c4cdab2c1061d (~10Mb) - and claims to be the XML spec version used by Cobertura.
This is very similar to the XML format generated by Coverage.py - but the data in the above link has some additional fields.
Neither that Cobertura format nor the Coverage.py format contains enough information to resolve branch expressions.

Does the Cobertura format contain this information?

No.

henry2cox commented on August 16, 2024

I do not understand this text about any XML format. Either the needed information is in the input files, or it is not there. How does the choice of XML format create problems with missing data, even if the data is available in the input?

From my (limited) reading: it seems that there is some ambiguity or some discussion about exactly what the XML format for coverage data looks like. Certainly, the data produced by the Python tool is different than what appears to be produced by Cobertura (but note that the Cobertura data I looked at was for Java code...I don't know what it would have shown for Python code - nor do I know how it would have generated that Python data, if not through Coverage.py - so I tend to doubt that we would see anything new).

Thus the upshot is: no. None of the XML coverage data that I have seen contains sufficient information to identify and distinguish between branch expressions.
There may be yet another XML flavor which does - but I have not seen such data and do not know of such a tool.
I do know of multiple tools for other languages, which can (and do) contain such information.

henry2cox commented on August 16, 2024

this should be addressed in commit f18d34d
Please give it a try, and see if it works as you expected.
If so, please go ahead and close this issue.
If not: please describe the problems you see - and ideally, include a testcase which illustrates the bugs.

dilyanpalauzov commented on August 16, 2024

I was told, that Cobertura output, created by Microsoft’s cl.exe/instrumentation utilities, does create C++ mangled names, and unmangling is not handled anywhere.

Moreover the same output contains method names, without C++ class names, so taking input from two compilers (gcc and MS/Cobertura) and mapping them one over other, does not match the function names.

I personally have no access to Microsoft software that generates coverage information. At the same time my focus moved away from test coverage, so I’m closing this.

henry2cox commented on August 16, 2024

I was told, that Cobertura output, created by Microsoft’s cl.exe/instrumentation utilities, does create C++ mangled names, and unmangling is not handled anywhere.

Both lcov and genhtml support demangling - see the --demangle section in the man pages.
However, the xml2lcov translator does not support demangling.
It should be possible to read the xml2lcov output (possibly containing mangled names) into lcov (or genhtml), demangle, and then write out translated .info data (or a demangled HTML report).

Moreover the same output contains method names, without C++ class names, so taking input from two compilers (gcc and MS/Cobertura) and mapping them one over other, does not match the function names.

If mangling is different between different tools and you want a unified report, then you would either have to demangle the individual tool data separately and then aggregate, or write your own demangle tool which handles the different formats.
The latter would be possible only if your wrapper could distinguish gcc vs MS names from context.
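One way such a wrapper might tell the two apart is by prefix: Itanium-ABI names (as produced by gcc and clang) start with `_Z`, while MSVC-decorated C++ names start with `?`. The Python sketch below only classifies symbols so each group could be handed to a suitable demangler; it does not demangle anything itself, and `mangling_scheme` and `group_by_scheme` are hypothetical names, not part of lcov.

```python
from collections import defaultdict


def mangling_scheme(symbol):
    """Guess the mangling scheme of a symbol from its prefix (heuristic only)."""
    if symbol.startswith(("_Z", "__Z")):  # "__Z" appears on platforms that add a leading underscore
        return "itanium"
    if symbol.startswith("?"):
        return "msvc"
    return "plain"  # C function or already-demangled name


def group_by_scheme(symbols):
    """Group symbols by guessed scheme so each group can go to the right demangler."""
    groups = defaultdict(list)
    for sym in symbols:
        groups[mangling_scheme(sym)].append(sym)
    return dict(groups)


names = [
    "_ZN3Foo3barEv",      # Itanium mangling of Foo::bar()
    "?bar@Foo@@QEAAXXZ",  # MSVC-style decorated name
    "main",               # unmangled
]
print(group_by_scheme(names))
```

Note that this does not address the other problem raised above: if one tool reports bare method names without class names, the demangled names still will not line up without extra context.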

I personally have no access to Microsoft software that generates coverage information. At the same time my focus moved away from test coverage, so I’m closing this.

Sounds good to me.
Presumably, if anyone else is interested in XML/Cobertura and finds lcov bugs, they will file a new issue.

Henry
