This project forked from nayzo/nzograbberbundle

Symfony2/3 Bundle used to Crawl and to Grab all types of links and Tags (img, js, css) from any website

License: MIT License

NzoGrabberBundle

NzoGrabberBundle is a Symfony bundle that crawls a website and grabs all types of links and URLs, as well as the files referenced by its img, js and css tags.

Features include:

  • Compatible with Symfony 3 and 4
  • URL grabber/crawler for HTTP and HTTPS
  • URL grabber/crawler for HREF, SRC and IMG types
  • Exclude any type of file by extension
  • Exclude specified URLs from grabbing
  • Compatible with PHP 5 and 7

Installation

Through Composer:

Install the bundle:

$ composer require nzo/grabber-bundle

Register the bundle in app/AppKernel.php (Symfony 3):

// app/AppKernel.php

public function registerBundles()
{
    return array(
        // ...
        new Nzo\GrabberBundle\NzoGrabberBundle(),
    );
}
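
The feature list also mentions Symfony 4, where bundles are conventionally registered in config/bundles.php instead of AppKernel.php. The fragment below is a sketch of that entry. It assumes the bundle class name is the same as above, and the file path follows the general Symfony 4 convention rather than anything stated in this README. Symfony Flex may add the entry for you when the package is installed.

```php
<?php
// config/bundles.php (Symfony 4 convention; an assumption, not taken from this README)

return [
    // ...
    Nzo\GrabberBundle\NzoGrabberBundle::class => ['all' => true],
];
```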

Usage

In a controller, fetch the Grabber service and pass it the options you need:

Get all URLs:

    public function indexAction($url)
    {
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrls($url);

        // ...
    }

OR .. get all URLs without recursion:

    public function indexAction($url)
    {
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrlsNoRecursive($url);

        // ...
    }
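
To make the difference between the two calls concrete, here is a minimal sketch of direct versus recursive link collection over an in-memory link map. It is an illustration only and not the bundle's code: the function names `collectDirectLinks` and `collectAllLinks`, the `$links` array and the sample URLs are all hypothetical, and it assumes that "recursive" means following every discovered link in turn. It requires PHP 7 or later.

```php
<?php
// Hypothetical illustration only: this is NOT the bundle's code.
// $links maps each page URL to the URLs found on that page.

/** Collect only the links found directly on $start. */
function collectDirectLinks(array $links, string $start): array
{
    return array_values(array_unique($links[$start] ?? []));
}

/** Follow links page by page until no new URL is discovered. */
function collectAllLinks(array $links, string $start): array
{
    $seen = [];
    $queue = [$start];
    while ($queue !== []) {
        $page = array_shift($queue);
        foreach ($links[$page] ?? [] as $url) {
            // Skip the start page itself and anything already recorded.
            if ($url !== $start && !isset($seen[$url])) {
                $seen[$url] = true;
                $queue[] = $url;
            }
        }
    }
    return array_keys($seen);
}

$links = [
    'http://www.exemple.com/' => ['http://www.exemple.com/about', 'http://www.exemple.com/blog'],
    'http://www.exemple.com/blog' => ['http://www.exemple.com/blog/post-1'],
];

$direct = collectDirectLinks($links, 'http://www.exemple.com/');
$all = collectAllLinks($links, 'http://www.exemple.com/');
```

In this sketch `$direct` holds the two links on the start page, while `$all` also contains the post that is only reachable through the blog page. The `$seen` set is what keeps a cyclic site from looping forever.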

OR .. get all URLs that do not appear in the exclude array:

    public function indexAction($url)
    {
        // URLs listed here are not scanned.
        $notScannedUrlsTab = ['http://www.exemple.com/about'];
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrls($url, $notScannedUrlsTab);

        // ...
    }

OR .. exclude URLs that contain a specified text and, at the same time, select by file extension:

    public function indexAction($url)
    {
        // URLs containing this text are excluded.
        $exclude = 'someText_to_exclude';
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrls($url, null, $exclude, array('png', 'pdf'));

        // ...
    }

OR .. get all URLs selected by file extension:

    public function indexAction($url)
    {
        $tableOfUrls = $this->get('nzo_grabber.grabber')->grabUrls($url, null, null, array('png', 'pdf'));

        // ...
    }
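
As a rough picture of what selecting by file extension involves, the sketch below filters a list of URLs by the extension of their path, ignoring any query string and letter case. The helpers `urlExtension` and `selectByExtension` and the sample URLs are hypothetical and not part of NzoGrabberBundle; how the bundle itself matches extensions is not documented here. It requires PHP 7 or later.

```php
<?php
// Hypothetical helpers, not part of NzoGrabberBundle.

/** Return the lower-cased file extension of a URL's path, or '' if none. */
function urlExtension(string $url): string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        return ''; // no path component, or a malformed URL
    }
    return strtolower(pathinfo($path, PATHINFO_EXTENSION));
}

/** Keep only the URLs whose extension is in $extensions. */
function selectByExtension(array $urls, array $extensions): array
{
    $wanted = array_map('strtolower', $extensions);
    return array_values(array_filter($urls, function ($url) use ($wanted) {
        return in_array(urlExtension($url), $wanted, true);
    }));
}

$urls = [
    'http://www.exemple.com/logo.PNG?v=2',
    'http://www.exemple.com/about',
    'http://www.exemple.com/files/report.pdf',
    'http://www.exemple.com/app.js',
];
$selected = selectByExtension($urls, ['png', 'pdf']);
```

Parsing the path first, rather than matching on the raw string, keeps a query string such as `?v=2` from hiding the extension.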

OR .. get all image files from the specified URL:

    public function indexAction($url)
    {
        $img = $this->get('nzo_grabber.grabber')->grabImg($url);

        // ...
    }

OR .. get all JS files from the specified URL:

    public function indexAction($url)
    {
        $js = $this->get('nzo_grabber.grabber')->grabJs($url);

        // ...
    }

OR .. get all CSS files from the specified URL:

    public function indexAction($url)
    {
        $css = $this->get('nzo_grabber.grabber')->grabCss($url);

        // ...
    }

OR .. get all CSS, image and JS files from the specified URL:

    public function indexAction($url)
    {
        $extrat = $this->get('nzo_grabber.grabber')->grabExtrat($url);

        // ...
    }
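
For readers curious about what grabbing img, js and css references from a page involves, here is a minimal sketch built on PHP's DOM extension. It is not the bundle's implementation: the function `extractAssets`, the shape of its return value and the sample HTML are assumptions made for illustration, and the actual return format of `grabExtrat` is not described in this README. It requires PHP 7 or later with ext-dom.

```php
<?php
// Hypothetical sketch using PHP's DOM extension; NOT the bundle's implementation.

/** Collect img, js and css references from an HTML string. */
function extractAssets(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $assets = ['img' => [], 'js' => [], 'css' => []];
    foreach ($doc->getElementsByTagName('img') as $node) {
        if ($node->hasAttribute('src')) {
            $assets['img'][] = $node->getAttribute('src');
        }
    }
    foreach ($doc->getElementsByTagName('script') as $node) {
        // Inline scripts have no src attribute and are skipped.
        if ($node->hasAttribute('src')) {
            $assets['js'][] = $node->getAttribute('src');
        }
    }
    foreach ($doc->getElementsByTagName('link') as $node) {
        if (strtolower($node->getAttribute('rel')) === 'stylesheet' && $node->hasAttribute('href')) {
            $assets['css'][] = $node->getAttribute('href');
        }
    }
    return $assets;
}

$html = '<html><head><link rel="stylesheet" href="/css/site.css"><script src="/js/app.js"></script></head>'
      . '<body><img src="/img/logo.png"><script>var inline = 1;</script></body></html>';
$assets = extractAssets($html);
```

The sketch returns the references exactly as written in the markup, so relative paths such as `/css/site.css` would still need to be resolved against the page URL before fetching.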

License

This bundle is released under the MIT license. The complete license text is in the bundle at Resources/doc/LICENSE.
