

linky's People

Contributors

mez-0


linky's Issues

Instructions on how to create the cookie.txt

Hi guys,

Hope you are all well!

Can you add instructions on how to create the cookie.txt file?

I used a Google Chrome extension to export the following cookie.txt:

# HTTP Cookie File for linkedin.com by Genuinous @genuinous.
# To download cookies for this tab click here, or download all cookies.
# Usage Examples:
#   1) wget -x --load-cookies cookies.txt "https://www.linkedin.com/feed/"
#   2) curl --cookie cookies.txt "https://www.linkedin.com/feed/"
#   3) aria2c --load-cookies cookies.txt "https://www.linkedin.com/feed/"
#
.linkedin.com	TRUE	/	TRUE	1618248615	lissc	1
.linkedin.com	TRUE	/	TRUE	1649826467	bcookie	"v=2&xxxxxxx-8363-43b4-8e35-86474cbe7b7a"
.www.linkedin.com	TRUE	/	TRUE	1649826467	bscookie	"v=1&20200412173014413a8dea-b4e9-47c3-8a4c-xxxxxxx-0PWt0"
.linkedin.com	TRUE	/	FALSE	1650194049	_ga	GA1.2.1274459269.1586712618
.linkedin.com	TRUE	/	FALSE	0	xxxxxxx%40AdobeOrg	1
.linkedin.com	TRUE	/	FALSE	1589714054	aam_uuid	xxxxxxx
.linkedin.com	TRUE	/	TRUE	1594488621	liap	true
.www.linkedin.com	TRUE	/	TRUE	1594488621	sl	v=1&7MCd6
.www.linkedin.com	TRUE	/	TRUE	1618248621	li_at	xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-xxxxxxx-xxxxxxx-xxxxxxx
.www.linkedin.com	TRUE	/	TRUE	1594488621	JSESSIONID	"ajax:xxxxxxxxxxxxxx"
.www.linkedin.com	TRUE	/	TRUE	1589304621	lissc1	1
.www.linkedin.com	TRUE	/	TRUE	1589304621	lissc2	1
.linkedin.com	TRUE	/	TRUE	0	lang	v=2&lang=en-us
.www.linkedin.com	TRUE	/	FALSE	0	spectroscopyId	xxxxxxx-11b4-4bbc-87ed-0b94ca1ac8b2
.linkedin.com	TRUE	/	TRUE	1589457227	UserMatchHistory	AQLcZ6t7X8SGswAAAXF4icNceJceoCM3yIrOyZ5LDyxqeYijA1v3bvrL44cubHKRTrzQgabW7eM
.linkedin.com	TRUE	/	TRUE	1589457228	li_oatml	AQGGJyC67vacYgAAAXF4icm74JOhMgG6hSaPHkK4S0OKPIdHmIVoahhXq6gmV_L3n9CGdvSIqoaJOvzbQVcjLciuyBBnpxih
.linkedin.com	TRUE	/	FALSE	1650191868	AMCV_14215E3D5995C57C0A495C55%40AdobeOrg	-1303530583%7CMCIDTS%7C18369%7CMCMID%7C13793629839677649112930854099465204048%7CMCAAMLH-1587724668%7C6%7CMCAAMB-1587724668%7C6G1ynYcLPuiQxYZrsz_pkqfLG9yMXBpb2zX5dvJdYQJzPXImdj0y%7CMCOPTOUT-1587127068s%7CNONE%7CvVersion%7C3.3.0%7CMCCIDH%7C1633815264
.www.linkedin.com	TRUE	/	TRUE	1589714047	UserMatchHistory	AQKKFOW2MEDobgAAAXGH2IBYHTH4Xi8S7FRUDgiL26hEFR_tZ4tSPILYHLh8HiCD4x5LAoWKtO5Du0NNALe83jMbAA8NJDn2JbfwjzhTB8I63Mf1ScxO18qTe8Lktm9jpAPEwsHIyjQhUEDZXoHTASKC9PAVZWGf_A5KiXzbKaFqXTInnNspIJmZu4yZi0yN4K1-T-pzPdCQhFy2V_Pml333G1Enn4eF
.linkedin.com	TRUE	/	FALSE	1587122649	_gat	1
.linkedin.com	TRUE	/	TRUE	1587139709	lidc	"b=TB89:g=2260:u=502:i=1587122050:t=1587139708:s=AQFY26mnb2z2_H8kMmGLt5okV0oI0aYo"

But it did not work.
P.S. I replaced the real values with xxxxxxxx.

How can I fix it?

Cheers,
X
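
The export above is in the Netscape cookie file format: one cookie per line, seven tab-separated fields (domain, subdomain flag, path, secure flag, expiry, name, value), with `#` lines as comments. Which cookie linky needs, and in what form, is not stated in this thread, so the following is only a sketch for pulling a single value (for example `li_at`) out of such a file. The function name `read_cookie_values` is hypothetical.

```python
def read_cookie_values(text):
    """Parse Netscape-format cookie lines into a {name: value} dict.

    Values are kept verbatim (including any surrounding quotes).
    If a name appears on several domains, the last occurrence wins.
    """
    cookies = {}
    for line in text.splitlines():
        # Skip blank lines and '#' comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A well-formed line has exactly seven tab-separated fields.
        if len(fields) != 7:
            continue
        _domain, _subdomains, _path, _secure, _expires, name, value = fields
        cookies[name] = value
    return cookies
```

With the file contents read into a string, `read_cookie_values(contents).get("li_at")` would return the session cookie value, if that turns out to be what the tool expects.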

Please add the cookie to a file

Codename: UhOh365
[20/11/19, 01:35:53] >> Please add the cookie to a file

Hi, where do I have to add the cookie file and in what format should the data be present in it?

Bypassing the 1000 limit

Currently, the API only responds with 1000 results.

url='https://www.linkedin.com/voyager/api/search/cluster?count=40&guides=List(v->PEOPLE,facetCurrentCompany->%s)&origin=OTHER&q=guided&start=0' % company_id

Here, start=0 sets the offset at which the returned results begin. Potentially, once the number of results approaches 1000, this value could be set to 1000 and the process started again.
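
The offset idea can be sketched as a generator that advances `start` by `count` on every request. This is an untested assumption: nothing in this issue confirms that the API honours offsets at or beyond 1000. The names `SEARCH_URL` and `page_urls` are hypothetical; the URL itself is the one quoted above.

```python
SEARCH_URL = (
    "https://www.linkedin.com/voyager/api/search/cluster"
    "?count={count}&guides=List(v->PEOPLE,facetCurrentCompany->{company_id})"
    "&origin=OTHER&q=guided&start={start}"
)


def page_urls(company_id, total, count=40):
    """Yield one search URL per page, advancing `start` by `count` each time."""
    for start in range(0, total, count):
        yield SEARCH_URL.format(count=count, company_id=company_id, start=start)
```

For example, `list(page_urls(1441, 2000))` produces URLs with `start=0`, `start=40`, and so on up to `start=1960`; whether the pages past 1000 return data would have to be checked against the live API.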

I get this error after scraping about 800 records

Traceback (most recent call last):
  File "linky.py", line 147, in <module>
    users=core.run(data)
  File "C:\Users\Clint\Downloads\Linkedin scraper\lib\core.py", line 63, in run
    logger.dump(users,validation)
  File "C:\Users\Clint\Downloads\Linkedin scraper\lib\logger.py", line 211, in dump
    green('%s (%s): %s at %s' % (GREEN(fullname),email,current_role,GREEN(current_company)))
  File "C:\Users\Clint\Downloads\Linkedin scraper\lib\logger.py", line 57, in green
    print('['+log_time+']'+GREEN(' >> ' )+string)
  File "C:\Users\Clint\DOWNLO1\LINKED1\env\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 108-109: character maps to <undefined>
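
The traceback shows `print` failing because the Windows console encoding (cp1252) cannot represent some characters in a name or job title. One possible workaround, sketched here and not taken from linky's code, is to replace unencodable characters before printing. `console_safe` and `safe_print` are hypothetical helpers.

```python
import sys


def console_safe(text, encoding):
    """Return `text` with characters that `encoding` cannot represent replaced by '?'."""
    return text.encode(encoding, errors="replace").decode(encoding)


def safe_print(text):
    """Print `text` without raising UnicodeEncodeError on limited consoles."""
    # sys.stdout.encoding may be None in some environments; fall back to UTF-8.
    encoding = sys.stdout.encoding or "utf-8"
    print(console_safe(text, encoding))
```

Alternatively, setting the environment variable `PYTHONIOENCODING=utf-8` before running the script changes the encoding Python uses for standard output, which may avoid the error without code changes.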

error

python3 -r install requirements.txt
Unknown option: -r
usage: python3 [option] ... [-c cmd | -m mod | file | -] [arg] ...
Try `python -h' for more information.
┌─[✗]─[user@parrot]─[~/Desktop/linky]
└──╼ $python3 install -r requirements.txt
python3: can't open file 'install': [Errno 2] No such file or directory

Can you help me?
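
Both commands above hand pip's arguments to the Python interpreter itself, which is why it first reports an unknown option and then looks for a file named install. pip is usually run as a module: `python3 -m pip install -r requirements.txt`. A minimal Python sketch of the same call, assuming requirements.txt sits in the current directory; the helper names are hypothetical:

```python
import subprocess
import sys


def pip_install_command(requirements="requirements.txt"):
    """Build the command that runs pip as a module of the current interpreter."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def install_requirements(requirements="requirements.txt"):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_install_command(requirements))
```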

Multiple Keywords

--keyword needs to support multiple roles.

Currently, this filth will work:

#!/bin/bash

ROLES='developer engineer director'
ID=1441
COMPANY='google'
DOMAIN='google.com'

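for ROLE in $ROLES; do
	# Braces are required: "$COMPANY_employees_" would be read as one (unset) variable name.
	./linky.py --cookie cookie.txt --company-id "$ID" --domain "$DOMAIN" --output "${COMPANY}_employees_${ROLE}" --format 'firstname.surname' --keyword "$ROLE"
	sleep 5
done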
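
One way `--keyword` could accept several roles directly is argparse's `nargs="+"`. This is a sketch under the assumption that the script parses its options with argparse, which this thread does not confirm; the parser below is a hypothetical subset of the options used in the loop above.

```python
import argparse


def build_parser():
    """Hypothetical subset of linky's options, with a multi-value --keyword."""
    parser = argparse.ArgumentParser(description="multi-keyword sketch")
    parser.add_argument("--company-id", required=True)
    # nargs="+" collects one or more values into a list.
    parser.add_argument("--keyword", nargs="+", default=[], help="one or more roles")
    return parser


def keywords_from(argv):
    """Return the list of keywords parsed from an argument list."""
    return build_parser().parse_args(argv).keyword
```

A call such as `--company-id 1441 --keyword developer engineer director` would then yield the three roles as a list, and the search could be repeated once per role inside the tool instead of in a shell loop.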

Better user validation

As of now, the naming scheme has to be set manually. A potential way around this would be to take a sample of the users and generate multiple naming schemes per name.
Then, attempt to validate these candidates and identify which naming scheme comes back positive.
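
The proposal could be sketched as follows. Only `firstname.surname` appears in this thread; the other scheme names are illustrative assumptions, and `is_valid` stands in for whatever validation check the tool performs (it is passed in as a callable so that the sketch stays self-contained). Names are assumed to be non-empty.

```python
from collections import Counter

# Hypothetical scheme names mapped to formatters.
SCHEMES = {
    "firstname.surname": lambda f, s: f"{f}.{s}",
    "firstnamesurname": lambda f, s: f"{f}{s}",
    "f.surname": lambda f, s: f"{f[0]}.{s}",
    "fsurname": lambda f, s: f"{f[0]}{s}",
}


def candidates(firstname, surname, domain):
    """Return {scheme: email} for every known naming scheme."""
    f, s = firstname.lower(), surname.lower()
    return {scheme: f"{fmt(f, s)}@{domain}" for scheme, fmt in SCHEMES.items()}


def best_scheme(sample, domain, is_valid):
    """Return the scheme with the most validated addresses, or None if none validate."""
    hits = Counter()
    for firstname, surname in sample:
        for scheme, email in candidates(firstname, surname, domain).items():
            if is_valid(email):
                hits[scheme] += 1
    return hits.most_common(1)[0][0] if hits else None
```

Once a scheme wins on the sample, it could be applied to the full user list, so only a small number of extra validation requests are spent on discovery.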
