Comments (7)
Unless I'm doing something wrong, it looks like Code.AI is suggesting bad code. I wasn't sure where to report this issue [...]
While I am just a user of HAP and not associated with the HAP project or its authors/maintainers, I would like to point out that the issue tracker for HAP is not the right place to report issues with zzzcode.ai. Instead, I suggest you report the problem you encountered in the issue tracker for zzzcode.ai, whose project site (including its issue tracker) is also on GitHub: https://github.com/zzzprojects/zzzcode.ai
Regardless of your misplaced issue report and whatever zzzcode.ai produced in response to your question, note that XPath was not invented for querying HTML, but for querying XML documents. As such, XPath has no concept of "classes" as in HTML. Consequently, you have to write the XPath expression so that it explicitly tests class attributes for the occurrence of certain (sub)strings -- the class names -- using the `contains` function, similar to what this SO Q&A details: https://stackoverflow.com/questions/1604471/how-can-i-find-an-element-by-css-class-with-xpath. To test for the occurrence of multiple strings in an attribute value, combine multiple such `contains` tests with either the `and` or the `or` operator.
Also keep in mind that HAP relies on .NET's own `System.Xml.XPath` infrastructure, which is and will be limited to XPath 1.0 expressions.
from html-agility-pack.
@JonathanMagnan
Note that currently, after the fix, the AI code generator produces example code like:
HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[contains(@class, 'class1') or contains(@class, 'class2')]");
This is still not entirely correct. Someone who, due to a lack of knowledge, has to ask an AI assistant such a question might intuitively expect that the given code example selects only div nodes having the CSS class "class1" or "class2".
But the answer/code example provided by the bot also selects div nodes with the classes "class13", "class21", "subclass10", etc., which arguably is not what the asking person is looking for.
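The over-matching can be reproduced with a minimal sketch like the following. It assumes the HtmlAgilityPack NuGet package is referenced; the sample markup and the `id` values are hypothetical and chosen only to include the look-alike class names mentioned above.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical sample markup (not from the original report).
var doc = new HtmlDocument();
doc.LoadHtml(@"
<div id='a' class='class1'></div>
<div id='b' class='class13'></div>
<div id='c' class='class21'></div>
<div id='d' class='subclass10'></div>
<div id='e' class='other'></div>");

// The generated expression: a plain substring test on the @class attribute.
var nodes = doc.DocumentNode.SelectNodes(
    "//div[contains(@class, 'class1') or contains(@class, 'class2')]");

// SelectNodes returns null when nothing matches, so fall back to an empty array.
var matchedIds = nodes?.Select(n => n.GetAttributeValue("id", "")).ToArray()
                 ?? Array.Empty<string>();

// Prints "a, b, c, d": every div whose class attribute merely contains the substring.
Console.WriteLine(string.Join(", ", matchedIds));
```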
The StackOverflow Q&A I linked to in my previous post demonstrates an accurate XPath expression for selecting by CSS class. It should work flawlessly under any circumstances without making assumptions about the use case, and should therefore be suggested by the bot instead:
//div[contains(concat(' ', normalize-space(@class), ' '), ' Test ')]
It's not pretty (because it involves padding the `@class` attribute value as well as the class name with spaces), but that's what is needed to get a robust and accurately working XPath expression for this task.
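As a rough sketch of how the padded expression could be applied to the "class1 or class2" case with HAP: the `HasClass` helper below is hypothetical (it is not part of HAP), it assumes class names without quote characters, and the sample markup is made up for illustration.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper: builds the space-padded token test for one class name.
// Assumes className contains no quote characters.
static string HasClass(string className) =>
    "contains(concat(' ', normalize-space(@class), ' '), ' " + className + " ')";

// Hypothetical sample markup, including irregular whitespace in one class attribute.
var doc = new HtmlDocument();
doc.LoadHtml(@"
<div id='a' class='class1'></div>
<div id='b' class='class13'></div>
<div id='c' class='  foo   class2 '></div>
<div id='d' class='subclass10'></div>");

// Combine two token tests with the XPath 1.0 'or' operator.
var xpath = "//div[" + HasClass("class1") + " or " + HasClass("class2") + "]";
var nodes = doc.DocumentNode.SelectNodes(xpath);

// SelectNodes returns null when nothing matches, so fall back to an empty array.
var matchedIds = nodes?.Select(n => n.GetAttributeValue("id", "")).ToArray()
                 ?? Array.Empty<string>();

// Prints "a, c": only divs that carry class1 or class2 as a whole class name.
Console.WriteLine(string.Join(", ", matchedIds));
```

The expression uses only `contains`, `concat` and `normalize-space`, all of which are XPath 1.0 functions.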
from html-agility-pack.
Nice 👍👍👍👍
from html-agility-pack.
Hello @elgonzo ,
We don't choose what ChatGPT generates. It might either help you or lead you astray, but the more time passes, the better it becomes.
Eventually, it will be easier to train a custom model for a specific subject, and then we will be able to provide it with a ton of examples of what the best practices should be, regardless of what it already knows. OpenAI is growing quickly with new features every month, so I believe that by the end of 2024 they will provide an easier way to train a custom model dedicated to Html Agility Pack (it's already possible with some third-party software at this moment).
from html-agility-pack.
Oh, I didn't know. Thanks for letting me know. I assumed you had changed something in how zzzcode.ai utilizes ChatGPT because, after noticing you closed the issue, I did a quick check of what it generates now and got a result different from what the OP originally got.
Cheers, and Happy New Year!
from html-agility-pack.
Happy New Year @elgonzo ;)
from html-agility-pack.
Whether the AI is improving or not, just like other early adopters of GPT, you're going to get bad results, and those bad results taint the believability of GPT. I'm not sure I'll fully trust the AI for accurate answers from any source or on any subject, as it's just a text prediction algorithm, not artificial intelligence. You may say it is a good thing not to trust it fully; however, when your product is so light on documentation and relies so heavily on GPT, you're generating support calls and confusion.
IMO, you may want to pop it back in the oven until it's done.
from html-agility-pack.