```sh
git clone https://github.com/KayvanShah1/thomasnet-scraper.git
```
Enter the repository's root directory:
```sh
cd thomasnet-scraper
```
Create a virtual environment and install the dependencies:
```sh
python -m venv ENV
source ENV/bin/activate
pip install -r requirements.txt
```
- Link to the source
- Find the heading for the product (to be passed as an argument when running the script below)
- Search for the product of interest on the website and locate the `heading` parameter in the resulting URL:
  ```
  https://www.thomasnet.com/nsearch.html?cov=NA&heading=21650809&searchsource=suppliers&searchterm=Hydraulic+Cylinders&searchx=true&what=Hydraulic+Cylinders&which=prod
  ```
- Here the heading is `21650809`
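The `heading` value can also be pulled out of the search URL programmatically; a minimal sketch using Python's standard `urllib.parse` (the URL is the example from above):

```python
from urllib.parse import urlparse, parse_qs

def extract_heading(url: str) -> str:
    """Return the value of the `heading` query parameter from a search URL."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list of values per key; take the first one
    return query["heading"][0]

url = ("https://www.thomasnet.com/nsearch.html?cov=NA&heading=21650809"
       "&searchsource=suppliers&searchterm=Hydraulic+Cylinders"
       "&searchx=true&what=Hydraulic+Cylinders&which=prod")
print(extract_heading(url))  # → 21650809
```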
```
/thomasnet-scraper> py src/main.py -h
usage: Thomasnet Data Scraper [-h] -k KEYWORD -hd HEADING [-f]

Scrape Suppliers Data from Thomas website

optional arguments:
  -h, --help            show this help message and exit
  -k KEYWORD, --keyword KEYWORD
                        Product Name to search
  -hd HEADING, --heading HEADING
                        Heading for the product from website
  -f, --fast            Fast Scraping
```
```sh
py src/main.py -k "hydraulic cylinder" -hd 21650809 -f
```
- Find the exported data as CSV files in the `data` folder
- The file of interest is `{abc}_clean_data.csv`
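The exported CSV can be loaded with Python's standard `csv` module; a minimal sketch, where the file path and column names are placeholders, since the actual schema comes from the scraper's output:

```python
import csv

def load_suppliers(path: str) -> list[dict]:
    """Read a *_clean_data.csv export into a list of row dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        # DictReader uses the CSV header row as the dict keys
        return list(csv.DictReader(f))

# Hypothetical usage; substitute your actual export filename, e.g.
# rows = load_suppliers("data/hydraulic_cylinder_clean_data.csv")
```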
- Create an issue here
- For more details about the scraping process, read the Wiki documentation
Kayvan Shah - Data Engineer | NLP | ML/DL Enthusiast