
Comments (6)

scottrigby commented on April 25, 2024

Regarding <p>&nbsp;</p>, we've tried several XPath selector variations, to no avail. Here is the general ignore rule setup we're testing:

  $config = array(
    'class' => 'IgnoreRule',
    'selector' => $selector,
  );
  $rule = IgnoreRule::createFrom($config);
  $transformer->addRule($rule);

where $selector has been //*[text()="&nbsp;"], //*[text()="\u0160"], //*[text()=codepoints-to-string((160))] (which breaks), and other variations using normalize-space(). Is anyone able to find a way to make this work, or would we need to introduce changes in the SDK to allow it?
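
For reference, outside the SDK a plain DOMXPath query can match such a paragraph if the non-breaking space appears in the expression as a literal character. PHP's DOM extension implements XPath 1.0, which has no string escapes such as \u..., does not decode entities such as &nbsp; inside an expression, and lacks the XPath 2.0 function codepoints-to-string(). The following is a minimal sketch under those assumptions; the variable names are illustrative, and whether the SDK hands a selector like this to DOMXPath unchanged is not confirmed here:

```php
<?php
// Standalone sketch using only PHP's DOM extension (not the SDK).
$doc = new DOMDocument();
$doc->loadHTML('<html><body><p>&nbsp;</p><p>Some text</p><p> </p></body></html>');

$xpath = new DOMXPath($doc);

// "\xC2\xA0" is the UTF-8 encoding of U+00A0 (NBSP). PHP expands the escape
// before XPath sees the string, so the expression contains the real character.
$nbsp = "\xC2\xA0";

// translate() maps NBSP to a regular space, then normalize-space() strips and
// collapses whitespace; an empty result means the paragraph is blank.
// not(*) excludes paragraphs that contain child elements.
$selector = '//p[not(*) and normalize-space(translate(., "' . $nbsp . '", " ")) = ""]';

$emptyParagraphs = $xpath->query($selector);
echo $emptyParagraphs->length, "\n";
```

The same string could then be tried as the 'selector' value in the rule configuration above.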

from facebook-instant-articles-sdk-php.

everton-rosario commented on April 25, 2024

Hey @m4olivei, thanks for digging into this!

It's cool that this solved your cases, but I want to warn you to be careful with this usage.
The Transformer Rules were not designed to make a second pass over the same node. Once a node is matched by a rule's context and selector, that rule consumes it and it won't be left for another rule.

This means the empty rule needs a selector that selects only empty #text content.

I'm working on a solution right now and will have a pull request soon.
The PR will do the following:

  • All Elements will have an isValid() method.
  • toDOMElement() will check if (!$this->isValid()) and, when the element is invalid, won't output it.
  • The Transformer will call isValid() on every element and, if one is not valid, add a warning to the transformation.

This way we will:

  • Not output invalid/empty elements
  • Not fail silently (because we will add warnings to the Transformer process)
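
The plan above can be sketched roughly as follows. This is only an illustration of the idea, not the code from the pull request: SketchElement, SketchParagraph, and collectWarnings() are hypothetical stand-ins, not SDK classes or functions.

```php
<?php
// Hypothetical stand-ins for illustration; these are not the SDK's classes.
abstract class SketchElement
{
    abstract public function isValid(): bool;

    abstract protected function render(DOMDocument $document): DOMNode;

    // Invalid elements output nothing instead of an empty tag.
    public function toDOMElement(DOMDocument $document): ?DOMNode
    {
        return $this->isValid() ? $this->render($document) : null;
    }
}

class SketchParagraph extends SketchElement
{
    private $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }

    public function isValid(): bool
    {
        // Valid only if something remains after stripping whitespace and NBSP bytes.
        return trim($this->text, " \t\n\r\0\x0B\xC2\xA0") !== '';
    }

    protected function render(DOMDocument $document): DOMNode
    {
        $p = $document->createElement('p');
        $p->appendChild($document->createTextNode($this->text));
        return $p;
    }
}

// Transformer-side check: record a warning instead of skipping silently.
function collectWarnings(array $elements): array
{
    $warnings = [];
    foreach ($elements as $index => $element) {
        if (!$element->isValid()) {
            $warnings[] = sprintf(
                'Element #%d (%s) is empty or invalid and was skipped.',
                $index,
                get_class($element)
            );
        }
    }
    return $warnings;
}

$document = new DOMDocument();
$elements = [new SketchParagraph('Hello'), new SketchParagraph("\xC2\xA0")];
$warnings = collectWarnings($elements);
```

In this sketch the NBSP-only paragraph produces no output and leaves exactly one warning behind, which covers both goals in the list.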

m4olivei commented on April 25, 2024

OK, so I think I've worked out a nice custom Rule class that I'm happy with that works for this purpose:

<?php

/**
 * @file
 * Contains \EmptyRule
 */

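use Facebook\InstantArticles\Elements\Footer;
use Facebook\InstantArticles\Elements\Header;
use Facebook\InstantArticles\Elements\InstantArticle;
use Facebook\InstantArticles\Elements\TextContainer;
use Facebook\InstantArticles\Transformer\Rules\ConfigurationSelectorRule;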

/**
 * Matches empty nodes given a selector.  Here empty nodes are defined as nodes
 * in a DOMDocument that have no children, or have only text children containing
 * whitespace characters.
 */
class EmptyRule extends ConfigurationSelectorRule {

  public function __construct() {
  }

  public static function create() {
    return new EmptyRule();
  }

  public static function createFrom($configuration) {
    return self::create()->withSelector($configuration['selector']);
  }

  public function getContextClass() {
    return array(
      InstantArticle::getClassName(),
      Header::getClassName(),
      Footer::getClassName(),
      TextContainer::getClassName(),
    );
  }

  public function matchesContext($context) {
    return TRUE;
  }

  /**
   * @param \DOMNode $node
   * @return mixed
   */
  public function matchesNode($node) {
    // We're only interested in elements here, testing if they are empty.
    if ($node->nodeType !== XML_ELEMENT_NODE) {
      return FALSE;
    }

    // Limit by the selector passed in the configuration.
    if (!parent::matchesNode($node)) {
      return FALSE;
    }

    // Match iff the node has no children and/or all children are empty text
    // nodes.
    if ($node->hasChildNodes()) {
      /* @var \DOMNode $child */
      foreach ($node->childNodes as $child) {
        if ($child->nodeName !== '#text') {
          return FALSE;
        }
        else {
          // @see https://stackoverflow.com/a/27990195/142145
          $trimmed = trim($child->nodeValue, " \t\n\r\0\x0B\xC2\xA0");
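          // Compare against '' because empty('0') is TRUE in PHP, which
          // would wrongly treat <p>0</p> as empty.
          if ($trimmed !== '') {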
            return FALSE;
          }
        }
      }
    }

    return TRUE;
  }

  public function apply($transformer, $context, $element) {
    return $context;
  }
}

Then, assuming you've got a \Facebook\InstantArticles\Transformer\Transformer instance called $transformer kicking around, add a config rule like so:

$transformer->addRule(
  EmptyRule::createFrom(array(
    'class' => 'EmptyRule',
    'selector' => '//p|//div|//span',
  ))
);

The method EmptyRule::matchesNode is the key. It'll match any element node, matched by the selector given, that either has no child nodes or has only text child nodes containing only whitespace. The really annoying part is the need for this trim statement: trim($child->nodeValue, " \t\n\r\0\x0B\xC2\xA0"), which strips the usual whitespace suspects but also NBSP (\xC2\xA0 is the UTF-8 byte sequence for U+00A0).
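
To try the emptiness test on its own, without the SDK, here is a minimal standalone sketch using only PHP's DOM extension. isBlankElement() is a hypothetical helper name, not part of the SDK or of the class above, and it compares against '' rather than using empty(), since PHP's empty('0') is TRUE:

```php
<?php
// Standalone sketch of the emptiness test; isBlankElement() is a made-up name.
function isBlankElement(DOMNode $node): bool
{
    // Only elements can be "empty" in this sense.
    if ($node->nodeType !== XML_ELEMENT_NODE) {
        return false;
    }

    foreach ($node->childNodes as $child) {
        // Any non-text child (element, comment, ...) means the node has content.
        if ($child->nodeType !== XML_TEXT_NODE) {
            return false;
        }
        // Strip ASCII whitespace plus the two bytes of a UTF-8 NBSP.
        if (trim($child->nodeValue, " \t\n\r\0\x0B\xC2\xA0") !== '') {
            return false;
        }
    }

    // No children at all, or only blank text children.
    return true;
}

$doc = new DOMDocument();
$doc->loadHTML('<html><body><p>&nbsp;</p><p>0</p><p><br></p><p></p></body></html>');

$results = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    $results[] = isBlankElement($p);
}
```

Here the NBSP-only and childless paragraphs count as blank, while the paragraphs containing "0" or a <br> do not.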

Working for me so far. If you guys are into this, I'd be up for re-jigging this into a proper pull request.

m4olivei commented on April 25, 2024

@everton-rosario yeah, you're right. It's only working for me b/c I have it as the last rule added to the Transformer, which means it's the first rule that gets checked against all nodes.

Sounds like a neat solution, look forward to testing it. Thanks for looking into this as well!

Faiyaz commented on April 25, 2024

Any update on this, @everton-rosario?

everton-rosario commented on April 25, 2024

Yes @Faiyaz.
Please follow and also help review this PR: #71
