etcetera-demo

A small example project that serves as a showcase and tutorial for etcetera.

Requirements

  • PHP 7.0 (packages php7.0, php7.0-xml, php7.0-mbstring, php7.0-zip)
  • Composer

Using the demo

  1. Clone the project.
  2. Run composer update.
  3. Run php runXlsx.php and/or php runXml.php from the project root.
  4. See what happens :-)

What does the demo do?

If you look at /data/people.xlsx, you'll find a sheet of dummy data generated by sheet-faker, containing 1000 tuples of firstname, lastname and email.

We have configured several extractions in /config/people/extractor.yml and /config/people/extractor.json to show most of the ways etcetera can be used.

The demo shows how different kinds of entities and/or relations can be extracted in a single run of etcetera.

You should have a look at /runXlsx.php, since this is where things are stitched together.

Deep dive (see /runXlsx.php)

Since instantiating an extractor manually can take a lot of code, etcetera offers a factory that creates the extractor from a configuration. First, the configuration is read from YAML or JSON using one of the corresponding readers provided by etcetera.

After that, an instance of StandardExtractorFactory is created. If you wish to use filters, validators or converters for properties, or filters and decorators for entities, you need to add a corresponding factory to the StandardExtractorFactory, configured with your own aliases and mappings for them.

Once everything is configured, you can create an extractor instance by calling StandardExtractorFactory::create with your configuration.

Since readers and writers can be very complex and vary strongly depending on your goals and the complexity of your source, we only offer simple readers for CSV, XLSX, XLS and XML. In this example, the Excel2007Reader (XLSX) is used for our dummy data file.

To use a file reader, you only need to instantiate it and set the source file (so it can be reused for multi-file processing).
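The "instantiate once, set the source per file" pattern might look roughly like the sketch below. SketchCsvReader, setSource() and read() are hypothetical names chosen for illustration and are not etcetera's reader API; plain CSV is used only because it needs no extra packages, and the rows are made up.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in, NOT etcetera's reader API. It only illustrates the
// "instantiate once, set the source per file" pattern using plain CSV.
final class SketchCsvReader
{
    /** @var string|null path of the file to read next */
    private $source;

    public function setSource(string $path): self
    {
        $this->source = $path;
        return $this;
    }

    /** Returns a generator of rows keyed by the header line of the current source. */
    public function read(): \Generator
    {
        if ($this->source === null) {
            throw new \LogicException('Call setSource() before read().');
        }
        // Pass the path along so a later setSource() cannot affect this run.
        return $this->rows($this->source);
    }

    private function rows(string $path): \Generator
    {
        $handle = fopen($path, 'r');
        if ($handle === false) {
            throw new \RuntimeException("Cannot open $path");
        }
        try {
            $header = fgetcsv($handle, 0, ',', '"', '\\');
            while (is_array($header) && ($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
                if (count($row) === count($header)) { // skips blank or ragged lines
                    yield array_combine($header, $row);
                }
            }
        } finally {
            fclose($handle);
        }
    }
}

// Usage: two throwaway files with made-up rows, read by the same instance.
$files = [];
foreach (["firstname,lastname\nAda,Lovelace\n", "firstname,lastname\nAlan,Turing\nGrace,Hopper\n"] as $csv) {
    $file = tempnam(sys_get_temp_dir(), 'ppl');
    file_put_contents($file, $csv);
    $files[] = $file;
}

$reader = new SketchCsvReader();
$rows = [];
foreach ($files as $file) {
    foreach ($reader->setSource($file)->read() as $row) {
        $rows[] = $row;
    }
    unlink($file);
}
```

Keeping the source out of the constructor is what allows one reader instance to be pointed at several files in turn.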

Writers can also be very complex, as you could write the extract to any target you want (Neo4J, MongoDB, MQ...). For now, etcetera only offers the interfaces for implementing your own.

In this example, have a look at ConsoleWriter and CsvWriter, which are implementations of the interface for this demo.

At this point we have an instance for reader, extractor and writer and that's all we need for the processor.

Instantiate the Processor with reader, extractor and writer, and call $processor->process() to let etcetera do its job.
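To make the division of labour concrete, here is a minimal, self-contained sketch of a reader, extractor and writer wired into a processor. Every interface and class name in it (SketchReader, PeopleExtractor, MemoryWriter, SketchProcessor and so on) is a hypothetical stand-in, not etcetera's actual API, and the two input rows are made up; see /runXlsx.php for the real wiring.

```php
<?php
declare(strict_types=1);

// All names below are hypothetical stand-ins, NOT etcetera's classes.
interface SketchReader    { public function read(): \Generator; }
interface SketchExtractor { public function extract(array $row): array; }
interface SketchWriter    { public function write(string $type, array $record); }

/** Yields rows from an in-memory array instead of a spreadsheet. */
final class ArrayReader implements SketchReader
{
    private $rows;

    public function __construct(array $rows) { $this->rows = $rows; }

    public function read(): \Generator
    {
        foreach ($this->rows as $row) {
            yield $row;
        }
    }
}

/** Derives one "person" entity and one "has_email" relation from each row. */
final class PeopleExtractor implements SketchExtractor
{
    public function extract(array $row): array
    {
        return [
            ['person', ['firstname' => $row['firstname'], 'lastname' => $row['lastname']]],
            ['has_email', ['lastname' => $row['lastname'], 'email' => $row['email']]],
        ];
    }
}

/** Collects records per type in memory; a real writer would target files or a database. */
final class MemoryWriter implements SketchWriter
{
    public $records = [];

    public function write(string $type, array $record)
    {
        $this->records[$type][] = $record;
    }
}

/** Pulls rows from the reader, runs the extractor, hands each result to the writer. */
final class SketchProcessor
{
    private $reader;
    private $extractor;
    private $writer;

    public function __construct(SketchReader $reader, SketchExtractor $extractor, SketchWriter $writer)
    {
        $this->reader = $reader;
        $this->extractor = $extractor;
        $this->writer = $writer;
    }

    public function process()
    {
        foreach ($this->reader->read() as $row) {
            foreach ($this->extractor->extract($row) as list($type, $record)) {
                $this->writer->write($type, $record);
            }
        }
    }
}

// Usage with two made-up rows.
$writer = new MemoryWriter();
$processor = new SketchProcessor(
    new ArrayReader([
        ['firstname' => 'Ada', 'lastname' => 'Lovelace', 'email' => 'ada@example.com'],
        ['firstname' => 'Alan', 'lastname' => 'Turing', 'email' => 'alan@example.com'],
    ]),
    new PeopleExtractor(),
    $writer
);
$processor->process();
```

Because the processor in this sketch only depends on the three interfaces, swapping the in-memory writer for one that prints to the console or writes CSV files would not change the processing loop.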

After it has completed its work, you will see either plenty of output on the console (ConsoleWriter) or a bunch of CSV files in the /out folder, each representing a set of entities or relations (one file each).
