phpgt / cssxpath

Translate CSS selectors to XPath queries

Home Page: https://www.php.gt/cssxpath

License: MIT License

PHP 100.00%
xpath css css-selector xpath-queries css-selector-parser phpgt dom translate-css-selectors

cssxpath's People

Contributors

alkarex, chrishow, daviddeutsch, dependabot-preview[bot], dependabot-support, dependabot[bot], g105b, peter279k, tfedor

cssxpath's Issues

:not() pseudo selector

The :not() pseudo-selector is currently unsupported because of how the regex parsing is set up. For completeness, it would be nice to have in v2.
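For reference, here is a minimal sketch of the XPath that a selector such as div:not(.hidden) could translate to. It uses PHP's built-in DOMDocument and DOMXPath rather than this package, and the markup and class names are made up for illustration. It is not a proposal for how the translator should generate it.

```php
<?php
// Illustrative markup only; the "card" and "hidden" class names are hypothetical.
$doc = new DOMDocument();
$doc->loadXML(
    '<root>'
    . '<div class="card">A</div>'
    . '<div class="card hidden">B</div>'
    . '<div>C</div>'
    . '</root>'
);
$xpath = new DOMXPath($doc);

// CSS: div:not(.hidden)
// The class test pads the attribute with spaces so that only whole class
// names match; not() then inverts that test.
$query = "//div[not(contains(concat(' ', normalize-space(@class), ' '), ' hidden '))]";

$matched = [];
foreach ($xpath->query($query) as $node) {
    $matched[] = $node->textContent;
}

echo implode(",", $matched), "\n"; // A,C
```

A div without any class attribute is kept, because the padded empty string cannot contain " hidden ".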

Attribute names are currently matched in a case-sensitive manner. This is correct for XML, but not for HTML

I have noticed that HTML attribute names are matched in a case-sensitive manner, for example:

Given a document with the contents <div data-FOO='bar'>baz</div>, the selector "[data-foo='bar']" does not match.

This is the correct behaviour for XML documents, which are case-sensitive everywhere, but not for HTML documents, where tag names and attribute names are case-insensitive.

I've submitted a draft PR with a trivial fix for this, but this will break matching in XML documents.

Perhaps there should be an optional argument to the Translator constructor which specifies the document's DOMDocumentType? Or a HtmlTranslator subclass with the different behaviour?
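To make the trade-off concrete, here is a sketch of how a case-insensitive attribute-name match can be written in XPath 1.0, using translate() to lowercase the attribute name. This is only an assumption about what an HTML mode might emit; it is not the approach taken in the draft PR. It uses PHP's built-in DOMXPath, and the document is loaded as XML so that the mixed-case attribute name is preserved.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(
    "<root><div data-FOO='bar'>baz</div><div data-foo='other'>qux</div></root>"
);
$xpath = new DOMXPath($doc);

$upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
$lower = "abcdefghijklmnopqrstuvwxyz";

// Case-sensitive form: correct for XML, misses data-FOO.
$strict = $xpath->query("//*[@data-foo='bar']");

// Case-insensitive form: pick any attribute whose lowercased name is
// "data-foo", then compare its value.
$relaxed = $xpath->query(
    "//*[@*[translate(name(), '$upper', '$lower') = 'data-foo'] = 'bar']"
);

echo $strict->length, "\n";                  // 0
echo $relaxed->length, "\n";                 // 1
echo $relaxed->item(0)->textContent, "\n";   // baz
```

The relaxed form is noticeably more verbose and only folds ASCII letters, which is one argument for keeping it behind an option or a subclass rather than making it the default.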

I'll be happy to contribute the code, but wanted to get opinions from the community.

I should like to add that I'm very grateful for this excellent package!

Child with named attribute fails to select

form [name] should select all elements with a name attribute that are descendants of a form. Instead, the selection breaks:

TypeError : Argument 1 passed to Gt\Dom\HTMLCollection::__construct() must be an instance of DOMNodeList, bool given, called in /home/g105b/Code/PhpGt/DomTemplate/vendor/phpgt/dom/src/ParentNode.php on line 72
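The "bool given" part of the error is consistent with PHP's DOMXPath::query() returning false for a malformed expression. The sketch below shows that behaviour with the built-in DOM classes. The broken query string is a guess at what the translator might have produced for this selector, not something confirmed here.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<form><input name="a"/><button name="b"/><p/></form>');
$xpath = new DOMXPath($doc);

// A step that has a predicate but no node test is not valid XPath.
// The @ operator only suppresses the warning PHP emits for it.
$broken = @$xpath->query('//form//[@name]');

// Adding the * node test makes the step valid: any descendant element
// of a form that has a name attribute.
$fixed = $xpath->query('//form//*[@name]');

var_dump($broken);          // bool(false)
echo $fixed->length, "\n";  // 2
```

Passing that false on to a constructor that expects a DOMNodeList would produce a TypeError like the one above.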

:last-child, :last-of-type

These are currently not implemented. The specific case I've hit is wanting to add the selected attribute to the last option of a select element.
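As a reference for the translation, both pseudo-classes can be expressed with last() in XPath 1.0. The sketch below uses PHP's built-in DOMXPath on a made-up select element and covers the use case described above. It only illustrates the target XPath, not how the translator would build it.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML(
    '<select>'
    . '<option>One</option><option>Two</option><option>Three</option>'
    . '</select>'
);
$xpath = new DOMXPath($doc);

// CSS: select > option:last-of-type
// last() is evaluated among the option children only.
$lastOfType = $xpath->query('//select/option[last()]');

// CSS: select > option:last-child
// Take the last child element of any name, then require it to be an option.
$lastChild = $xpath->query('//select/*[last()][self::option]');

// Mark the last option as selected.
$lastOfType->item(0)->setAttribute('selected', 'selected');

echo $lastOfType->item(0)->textContent, "\n"; // Three
echo $lastChild->item(0)->textContent, "\n";  // Three
```

The two queries differ when the last child is a different element: option:last-of-type still matches the final option, while option:last-child matches nothing.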

Attribute selector without tag name doesn't work

HTML:

<form>
  <button name="do" value="save">Save!</button>
</form>

PHP:

$saveButton = $document->querySelector("form [name='do'][value='save']");

Exception raised: Gt\Dom\Exception\XPathQueryException - Query is malformed: //form//[@value="save"]

As a workaround, changing the selector to form button[name='do'][value='save'], which names the button element explicitly, makes the problem go away.

I think the issue is within the "attribute" part of the regex on line 15. It should optionally match an element before it, outside of its named matching group.
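To illustrate the idea of an optional element in front of the attribute blocks, here is a small sketch. compoundToXPathStep() is a hypothetical helper written for this example and is not the package's regex or API. It falls back to * when no tag name is given, so every attribute predicate has a node test to attach to.

```php
<?php
// Hypothetical helper, not part of phpgt/cssxpath: turns one compound
// selector such as "button[name='do']" or "[name]" into an XPath step.
function compoundToXPathStep(string $compound): string {
    // Group 1: optional tag name. Group 2: zero or more [...] blocks.
    if (!preg_match('/^([a-zA-Z][\w-]*)?((?:\[[^\]]+\])*)$/', $compound, $m)) {
        throw new InvalidArgumentException("Unsupported selector: $compound");
    }

    // No tag name means "any element".
    $step = ($m[1] ?? '') !== '' ? $m[1] : '*';

    // Each block is either [attr] or [attr='value'] / [attr="value"].
    // Values that contain a double quote or "]" are not handled here.
    preg_match_all('/\[([\w-]+)(?:=([\'"])(.*?)\2)?\]/', $m[2] ?? '', $attrs, PREG_SET_ORDER);
    foreach ($attrs as $a) {
        $step .= isset($a[3]) ? "[@{$a[1]}=\"{$a[3]}\"]" : "[@{$a[1]}]";
    }
    return $step;
}

$doc = new DOMDocument();
$doc->loadXML('<form><button name="do" value="save">Save!</button></form>');
$xpath = new DOMXPath($doc);

// CSS: form [name='do'][value='save']
$query = '//form//' . compoundToXPathStep("[name='do'][value='save']");
$nodes = $xpath->query($query);

echo $query, "\n";                        // //form//*[@name="do"][@value="save"]
echo $nodes->item(0)->textContent, "\n";  // Save!
```

Note that the resulting query also keeps the name predicate, which is missing from the malformed query in the exception message.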

querySelector and attribute's value

When an attribute value is specified in a querySelector() call, as in tag[attr='value'], the method does not work as expected and returns null.

Example to reproduce the issue:

<?php

require "vendor/autoload.php";

$html = file_get_contents("https://github.com");
$document = new \Gt\Dom\HTMLDocument($html);

// Prints "Enterprise".
echo $document->querySelector("nav > ul > li > a[data-ga-click]")->innerText . "\n";

// Raises "PHP Notice: Trying to get property 'innerText' of non-object",
// because querySelector() returns null here.
echo $document->querySelector("nav > ul > li > a[data-ga-click='(Logged out) Header, go to Enterprise']")->innerText . "\n";

The expected behaviour is that the last line should also print "Enterprise".

If I try to run the exact querySelector call in my browser, it works correctly and both lines print "Enterprise":

document.querySelector("nav > ul > li > a[data-ga-click]").innerText

document.querySelector("nav > ul > li > a[data-ga-click='(Logged out) Header, go to Enterprise']").innerText
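For comparison, here is the XPath I would expect the second selector to translate to, run against PHP's built-in DOMXPath. The markup is a made-up stand-in for the relevant part of the page, not GitHub's real HTML, and the sketch does not identify the cause of the bug. It only shows that a quoted value containing spaces, parentheses and a comma is matched as one opaque string once it reaches XPath.

```php
<?php
// Stand-in markup; the second link exists only to show that the value
// is compared in full.
$doc = new DOMDocument();
$doc->loadXML(
    '<nav><ul>'
    . '<li><a data-ga-click="(Logged out) Header, go to Enterprise">Enterprise</a></li>'
    . '<li><a data-ga-click="(Logged out) Header, go to Pricing">Pricing</a></li>'
    . '</ul></nav>'
);
$xpath = new DOMXPath($doc);

// CSS: nav > ul > li > a[data-ga-click='(Logged out) Header, go to Enterprise']
// Each ">" becomes a single "/" (child axis); the quoted value is passed
// through unchanged.
$query = '//nav/ul/li/a[@data-ga-click="(Logged out) Header, go to Enterprise"]';
$result = $xpath->query($query);

echo $result->length, " ", $result->item(0)->textContent, "\n"; // 1 Enterprise
```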
