xiddoc / taboo

License: GNU Affero General Public License v3.0

Taboo

Legal / Liability

⚠ Use this program at your OWN RISK and for EDUCATIONAL PURPOSES ONLY ⚠

Summary

A LinkedIn scraper on steroids. Many people accidentally leak confidential information there; this program collects it and tries to identify patterns using statistics.

Installing & Setup

The program needs a few modules to work. Install them with this command:

pip install -r requirements.txt

Taboo offers built-in output formats for the collected data. The packages it uses to create that output can be large on disk (notably pandas), so keep this in mind when installing the Python requirements.

Create a .env file, which the program uses to log in to LinkedIn. It holds the LinkedIn credentials to use, one account per line, with the account email and its password separated by an equals sign, like so:

first.account@example.com=password123
second.account@example.com=otherpass567
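
As a rough illustration of that file format, the sketch below parses such text into (email, password) pairs. It is not Taboo's own loader: the function name `parse_credentials` is hypothetical, and splitting on the first `=` only (so a password may itself contain `=`) is an assumption made for this example.

```python
from typing import List, Tuple


def parse_credentials(text: str) -> List[Tuple[str, str]]:
    """Parse lines of the form email=password.

    Hypothetical helper for illustration; not taken from Taboo's source.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: only the first "=" separates the email from the password.
        email, sep, password = line.partition("=")
        if not sep or not email:
            raise ValueError(f"expected email=password, got: {line!r}")
        pairs.append((email, password))
    return pairs
```

Raising on a malformed line, rather than skipping it, makes a typo in the file visible before any login attempt is made.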

Usage

This README might not be fully up to date. To see all the options and settings, run:

python taboo.py -h

You need to specify an output format; otherwise Taboo downloads the data, caches it, and quits. So far the only output format offered is excel. Specify the target output with the -f flag like this:

-f excel

Taboo offers two modes for running the engine. The first queries Google and scrapes every search result, assuming each one is a LinkedIn URL (the username is taken from the URL path). Here's an example of using this mode to look up cybersecurity researchers at the NSA:

python taboo.py -f excel --query +NSA cyber security researcher site:linkedin.com
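
To make the "username from the path" idea concrete, here is a minimal sketch using only the standard library. It assumes profile URLs have the shape `https://<subdomain>.linkedin.com/in/<username>`; the function name `username_from_url` is hypothetical, and Taboo's actual parsing may differ.

```python
from typing import Optional
from urllib.parse import urlparse


def username_from_url(url: str) -> Optional[str]:
    """Return the profile username from a LinkedIn URL, or None.

    Hypothetical helper for illustration; not taken from Taboo's source.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Accept linkedin.com and any subdomain such as www. or us.
    if host != "linkedin.com" and not host.endswith(".linkedin.com"):
        return None
    # Assumption: profile paths look like /in/<username>[/...]
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "in":
        return parts[1]
    return None
```

Returning None for anything that is not a profile path lets a caller drop search results that are not LinkedIn profiles instead of assuming every result is one.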

The second mode works from a curated list of LinkedIn profiles you want to analyze. Put them into a file with one link on each line, like the following:

https://www.linkedin.com/in/hacker12
https://www.linkedin.com/in/nsa-expert321
https://us.linkedin.com/in/nsa-ceo1234

Then you can point Taboo at the list with the following command:

python taboo.py -f excel --infile input_file_list.txt
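
For reference, reading a one-link-per-line file can be sketched as follows. The helper name `read_profile_links` is hypothetical, and skipping blank lines and repeated links is an assumption made for this example rather than documented Taboo behavior.

```python
from pathlib import Path
from typing import List


def read_profile_links(path: str) -> List[str]:
    """Read one link per line, dropping blank lines and duplicates.

    Hypothetical helper for illustration; not taken from Taboo's source.
    """
    links: List[str] = []
    seen = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        link = line.strip()
        if not link or link in seen:
            continue  # assumption: ignore empty lines and repeats
        seen.add(link)
        links.append(link)
    return links
```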
