massive-parser's Introduction

Live Parser

Social networks content parser.

Project has the following architecture:

App usage:

git clone https://github.com/lenchevskii/beautonomy-parser.git

cd beautonomy-parser

npm ci

YouTube-DL usage:

youtube-dl --id --skip-download --write-description \
--write-info-json --write-annotations --write-all-thumbnails \
--write-sub --write-auto-sub <URL>
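
The same command can be driven from Node. The following is only a minimal sketch, assuming `youtube-dl` is on the `PATH`; `buildArgs` and `fetchMetadata` are hypothetical helper names, not part of this project:

```javascript
const { execFile } = require('child_process')

// Flags copied verbatim from the command above.
const YOUTUBE_DL_FLAGS = [
  '--id', '--skip-download', '--write-description',
  '--write-info-json', '--write-annotations', '--write-all-thumbnails',
  '--write-sub', '--write-auto-sub'
]

// Hypothetical helper: builds the argument list for a single URL.
const buildArgs = (url) => [...YOUTUBE_DL_FLAGS, url]

// Hypothetical helper: runs youtube-dl and resolves with its stdout.
// The metadata files themselves are written to the current directory.
const fetchMetadata = (url) =>
  new Promise((resolve, reject) =>
    execFile('youtube-dl', buildArgs(url), (error, stdout, stderr) =>
      error ? reject(new Error(stderr || error.message)) : resolve(stdout)))
```

Passing the arguments as an array to `execFile` avoids shell quoting problems with URLs that contain `&` or `?`.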

Comments:

  1. The YouTube-DL library has to be installed with the command:

    pip3 install --upgrade youtube-dl

    Do not forget (if the sklearn installation error occurred):

    python -m pip install --upgrade pip
  2. AutoSub project.

    Do not forget:

    mkdir audio output
  3. Alias Register was used for the general utilities.

    E.g.: debug tracer:

    require('module-alias/register')
    
    const H = require('@general-helper')
    
    ...
    
    H.trace('smth', ['optional', 'comments', ...])     // add the tracing function wherever you want to show the result
                                                       // inside a call
  4. DB extended charset.

    Notice!

    If you use an extended charset, you have to modify the DB configuration:

    sudo nano /etc/mysql/my.cnf
    
    [client]
    default-character-set = utf8mb4
    
    [mysql]
    default-character-set = utf8mb4
    
    [mysqld]
    character-set-client-handshake = FALSE
    character-set-server = utf8mb4
    collation-server = utf8mb4_unicode_ci

    Restart the system.

    Expected output:

    mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%';
    +--------------------------+--------------------+
    | Variable_name            | Value              |
    +--------------------------+--------------------+
    | character_set_client     | utf8mb4            |
    | character_set_connection | utf8mb4            |
    | character_set_database   | utf8mb4            |
    | character_set_filesystem | binary             |
    | character_set_results    | utf8mb4            |
    | character_set_server     | utf8mb4            |
    | character_set_system     | utf8               |
    | collation_connection     | utf8mb4_unicode_ci |
    | collation_database       | utf8mb4_unicode_ci |
    | collation_server         | utf8mb4_unicode_ci |
    +--------------------------+--------------------+
    10 rows in set (0.00 sec)
  5. Initialize the MySQL DB (if this is a child instance, use the parent tables from the parent server):

    mysql> source [Absolute]/beautycrash-parser/youtube.table.sql;
  6. Running on AWS:

    nohup npm start &

    Do not forget about AWS S3 Credentials: .aws/

  7. If an error occurs on the main server (like process node [HOME]/.../youtube.resolver.tool.js failed), check the youtube-dl version and reinstall:

    sudo apt purge youtube-dl
    sudo pip3 install youtube-dl
    sudo apt install youtube-dl
  8. Before starting the servers, clean the Redis DB.

  9. Count code lines in source/:

    find . -name '*.js' | xargs wc -l
  10. SSH run:

    sudo ssh -i Documents/beautonomy-staging-2.pem [email protected]
  11. Tools env generator:

    Generates path aliases for the available tools and writes them to the shell. E.g. some.service.tool.js, located at /path/to/some.service.tool.js, is resolved as follows:

    /path/to/some.service.tool.js -> some_service

    Now, we can call the service from the environment:

    ~ node $some_service [--options]
  12. Redis error:

    ../beautycrash-parser/node_modules/redis/index.js:859
            command_obj.callback(undefinedArgError);
                                ^

    ...means that a .env argument is missing.
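
To catch the Redis error from comment 12 before the client is created, the environment can be validated at startup. This is a minimal sketch only; `REDIS_HOST` and `REDIS_PORT` are hypothetical variable names (use whatever keys your .env actually defines), and `missingEnv` / `assertEnv` are illustrative helpers, not part of the project:

```javascript
// Hypothetical variable names: replace with the keys your .env defines.
const REQUIRED_ENV = ['REDIS_HOST', 'REDIS_PORT']

// Returns the names that are undefined or empty in the given environment.
const missingEnv = (names, env = process.env) =>
  names.filter((name) => env[name] === undefined || env[name] === '')

// Throws a readable error instead of letting the Redis client fail later.
const assertEnv = (names, env = process.env) => {
  const missing = missingEnv(names, env)
  if (missing.length > 0) {
    throw new Error(`Missing .env arguments: ${missing.join(', ')}`)
  }
}
```

Calling `assertEnv(REQUIRED_ENV)` after the .env file is loaded and before the Redis client is created turns the stack trace above into a message that names the missing variables.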

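The alias naming from comment 11 (tools env generator) can be sketched as follows. The rule is an assumption inferred from the single example given there (drop the `.tool.js` suffix, replace the remaining dots with underscores); `toAlias` and `toExportLine` are hypothetical names, and the real generator may work differently:

```javascript
const path = require('path')

// Assumed rule: "some.service.tool.js" -> "some_service".
const toAlias = (toolPath) =>
  path.basename(toolPath)
    .replace(/\.tool\.js$/, '')        // drop the tool suffix
    .replace(/[^A-Za-z0-9_]/g, '_')    // keep the name a valid shell identifier

// Builds a shell line that exposes the tool path under its alias.
const toExportLine = (toolPath) =>
  `export ${toAlias(toolPath)}="${path.resolve(toolPath)}"`
```

With such a line written to the shell profile, `node $some_service [--options]` resolves to the full tool path.
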