spam-bot-3000

A Python command-line (CLI) bot for automating research and promotion on popular social media platforms (reddit, twitter, facebook, [TODO: instagram]). With a single command, scrape social media sites with custom queries and/or promote to all relevant results.

Please use with discretion: choose your input arguments wisely, or your bot, along with any associated accounts, could find itself banned from platforms very quickly. The bot has some built-in anti-spam-filter avoidance features to help you remain undetected; however, no amount of avoidance can hide blatantly abusive use of this tool.

features

  • reddit
    • scrape subreddit(s) for lists of keywords, dumping results in a local file (red_scrape_dump.txt)
      • separate keyword lists for AND, OR, NOT search operations (red_subkey_pairs.json)
      • search new, hot, or rising categories
    • reply to posts in red_scrape_dump.txt with random promotion from red_promos.txt
      • ignore posts by marking them in dump file with "-" prefix
    • praw.errors.HTTPException handling
    • write all activity to log (log.txt)
  • twitter
    • maintain separate jobs for different promotion projects
    • update user status
    • unfollow users who don't reciprocate your follow
    • scan twitter for list of custom queries, dump results in local file (twit_scrape_dump.txt)
      • scan continuously or in overwatch mode
    • optional bypassing of proprietary twitter APIs and their inherent limitations
    • promotion abilities
      • tweepy api
        • follow original posters
        • favorite relevant tweets
        • direct message relevant tweets
        • reply to relevant tweets with random promotional tweet from file (twit_promos.txt)
      • Selenium GUI browser
        • favorite, follow, reply to scraped results while bypassing API limits
      • ignore tweets by marking them in dump file with "-" prefix
    • script for new keyword and hashtag research by gleaning scraped results
    • script for filtering out irrelevant keywords, hashtags, screen names
    • script for automating scraping, filtering, and spamming only most relevant results
    • relatively graceful exception handling
    • write all activity to log (log.txt)
  • facebook
    • zero reliance on proprietary facebook APIs and their inherent limitations
    • Selenium GUI browser agent
    • scrape public and private user profiles for keywords using AND, OR, NOT operators
      • note: access to private data requires logging in to an authorized account with the associated access
    • scrape public and private group feeds for keywords using AND, OR, NOT operators

dependencies

  • install the dependencies you likely don't already have; errors will reveal any others you're missing
    • install pip3: sudo apt install python3-pip
    • install Python dependencies: pip3 install --user tweepy bs4 praw selenium

reddit initial setup

  • update 'praw.ini' with your reddit app credentials
  • replace example promotions (red_promos.txt) with your own
  • replace example subreddits and keywords (red_subkey_pairs.json) with your own
    • you'll have to follow the existing JSON format
    • keywords_and: all keywords in this list must be present for a positive match
    • keywords_or: at least one keyword in this list must be present for a positive match
    • keywords_not: none of these keywords can be present in a positive match
    • any of the three lists may be omitted by leaving it empty - e.g. "keywords_not": [] (see the matching sketch after the examples below)

<praw.ini>

...

[bot1]
client_id=Y4PJOclpDQy3xZ
client_secret=UkGLTe6oqsMk5nHCJTHLrwgvHpr
password=pni9ubeht4wd50gk
username=fakebot1
user_agent=fakebot 0.1
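
With praw.ini filled in, the bot can authenticate by section name. A quick sanity check using standard praw (not necessarily the bot's exact code):

import praw

# reads client_id, client_secret, username, password, user_agent from [bot1] in praw.ini
reddit = praw.Reddit("bot1")
print(reddit.user.me())  # prints your username if the credentials work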

<red_subkey_pairs.json>

{"sub_key_pairs": [
{
  "subreddits": "androidapps",
  "keywords_and": ["list", "?"],
  "keywords_or": ["todo", "app", "android"],
  "keywords_not": ["playlist", "listen"]
}
]}
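
For clarity, here is a minimal Python sketch of how the three keyword lists combine on a post's text; this is illustrative only, not the bot's actual matching code:

def matches(text, keywords_and, keywords_or, keywords_not):
    # an empty list imposes no constraint, mirroring the "keywords_not": [] example above
    text = text.lower()
    if any(k.lower() in text for k in keywords_not):
        return False  # NOT: no forbidden keyword may appear
    if keywords_and and not all(k.lower() in text for k in keywords_and):
        return False  # AND: every keyword must appear
    if keywords_or and not any(k.lower() in text for k in keywords_or):
        return False  # OR: at least one keyword must appear
    return True

# e.g. matches("Any list of good todo apps?", ["list", "?"],
#              ["todo", "app", "android"], ["playlist", "listen"]) -> True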

reddit usage

usage: spam-bot-3000.py reddit [-h] [-s N] [-n | -H | -r] [-p]

optional arguments:
  -h,	--help		show this help message and exit
  -s N,	--scrape N	scrape subreddits in subreddits.txt for keywords in red_keywords.txt; N = number of posts to scrape
  -n,	--new		scrape new posts
  -H,	--hot		scrape hot posts
  -r,	--rising	scrape rising posts
  -p,	--promote	promote to posts in red_scrape_dump.txt not marked with a "-" prefix
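
For example, a typical run scrapes 50 new posts first, then promotes after you've reviewed and marked red_scrape_dump.txt:

python3 spam-bot-3000.py reddit -s 50 -n
python3 spam-bot-3000.py reddit -p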

twitter initial setup

  • create new 'credentials.txt' in job directory containing, on separate lines:

<credentials.txt>

your_consumer_key
your_consumer_secret
your_access_token
your_access_token_secret
your_twitter_username
your_twitter_password
  • create new 'twit_promos.txt' in job directory to store your job's promotions to spam
    • individual tweets on separate lines
    • each line must be <= 140 characters long
  • create new 'twit_queries.txt' in job directory to store your job's queries to scrape twitter for
  • create new 'twit_scrape_dump.txt' file to store your job's returned scrape results
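
A minimal sketch of how the first four lines of credentials.txt map onto a tweepy (3.x) client; illustrative, not the bot's actual loader:

import tweepy

# read the first four lines of credentials.txt in the order shown above
with open("credentials.txt") as f:
    consumer_key, consumer_secret, access_token, access_token_secret = (
        f.readline().strip() for _ in range(4))

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
print(api.me().screen_name)  # quick sanity check of authentication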

twitter usage

usage: spam-bot-3000.py twitter [-h] [-j JOB_DIR] [-t] [-u UNF] [-s] [-c] [-e] [-b]
                          [-f] [-p] [-d]
spam-bot-3000
optional arguments:
 -h, --help		show this help message and exit
 -j JOB_DIR, --job JOB_DIR
	                choose job to run by specifying job's relative directory
 -t, --tweet-status 	update status with random promo from twit_promos.txt
 -u UNF, --unfollow UNF
                        unfollow users who aren't following you back, UNF=number to unfollow

 query:
 -s, --scrape		scrape for tweets matching queries in twit_queries.txt
 -c, --continuous	scrape continuously - suppress prompt to continue after 50 results per query
 -e, --english		return only tweets written in English

spam -> browser:
 -b, --browser          favorite, follow, reply to all scraped results and
                        thwart api limits by mimicking human in browser!

spam -> tweepy api:
 -f, --follow		follow original tweeters in twit_scrape_dump.txt
 -p, --promote		favorite tweets and reply to tweeters in twit_scrape_dump.txt with random promo from twit_promos.txt
 -d, --direct-message	direct message tweeters in twit_scrape_dump.txt with random promo from twit_promos.txt
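
The tweepy-backed spam flags boil down to a few API calls per scraped tweet. A rough sketch of the -f/-p path (tweepy 3.x call names; the promote() helper and its arguments are illustrative assumptions, not the bot's internals):

import random

def promote(api, tweet_id, screen_name, promos):
    # api: an authenticated tweepy.API instance, as in the sketch above
    promo = random.choice(promos)                    # random line from twit_promos.txt
    api.create_favorite(tweet_id)                    # -p: favorite the relevant tweet
    api.create_friendship(screen_name=screen_name)   # -f: follow the original poster
    api.update_status("@%s %s" % (screen_name, promo),
                      in_reply_to_status_id=tweet_id)  # -p: reply with the promo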

twitter example workflows

  1. continuous mode
    • -cspf scrape and promote to all tweets matching queries
  2. overwatch mode
    • -s scrape first
    • manually edit twit_scrape_dump.txt
      • add '-' to beginning of line to ignore
      • leave line unaltered to promote to
    • -pf then promote to remaining tweets in twit_scrape_dump.txt
  3. glean common keywords, hashtags, screen names from scrape dumps (equivalent Python sketch below)
    • bash gleen_keywords_from_twit_scrape.bash
      • input file: twit_scrape_dump.txt
      • output file: gleened_keywords.txt
        • results ordered by most occurrences first
  4. filter out keywords/hashtags from scrape dump
    • manually edit gleened_keywords.txt by removing all relevant results
    • filter_out_strings_from_twit_scrape.bash
      • keywords input file: gleened_keywords.txt
      • input file: twit_scrape_dump.txt
      • output file: twit_scrp_dmp_filtd.txt
  5. browser mode
    • -b thwart api limits by promoting to scraped results directly in firefox browser
      • add username and password to lines 5 and 6 of credentials.txt respectively
  6. automatic scrape, filter, spam
    • auto_spam.bash
      • automatically scrape twitter for queries, filter out results to ignore, and spam remaining results
  7. specify job
    • -j studfinder_example/ specify which job directory to execute

Note: if you don't want to maintain individual jobs in separate directories, you may create single credentials, queries, promos, and scrape-dump files in the main working directory.
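
Workflow 3's gleaning step is essentially a word-frequency count over the scrape dump. An equivalent Python sketch (the shipped script is bash; the logic here is an assumption of what it does):

from collections import Counter

with open("twit_scrape_dump.txt") as f:
    words = f.read().lower().split()

# write tokens ordered by most occurrences first, as gleened_keywords.txt is
with open("gleened_keywords.txt", "w") as out:
    for word, count in Counter(words).most_common():
        out.write(word + "\n")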

facebook initial setup

  • create new client folder in 'facebook/clients/YOUR_CLIENT'
  • create new 'jobs.json' file to store your client's job information in the following format:

<jobs.json>

{"client_data":
	{"name": "",
	"email": "",
	"fb_login": "",
	"fb_password": "",
	"jobs": [
		{"type": "groups",
			"urls": ["",""],
			"keywords_and": ["",""],
			"keywords_or": ["",""],
			"keywords_not": ["",""] },
		{"type": "users",
			"urls": [],
			"keywords_and": [],
			"keywords_or": [],
			"keywords_not": [] }
	]}
}

facebook usage

  • scrape user and group feed urls for keywords
    • facebook-scraper.py clients/YOUR_CLIENT/
      • results output to 'clients/YOUR_CLIENT/results.txt'
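
A minimal sketch of the Selenium approach, assuming Firefox/geckodriver (Selenium 3 API) and the jobs.json layout above; the flow is illustrative, not the bot's actual code, and login handling is omitted:

import json
from selenium import webdriver

with open("clients/YOUR_CLIENT/jobs.json") as f:
    client = json.load(f)["client_data"]

driver = webdriver.Firefox()  # requires geckodriver on PATH
for job in client["jobs"]:
    for url in job["urls"]:
        driver.get(url)
        text = driver.find_element_by_tag_name("body").text.lower()
        # apply the same AND/OR/NOT rules sketched in the reddit section
        if matches(text, job["keywords_and"], job["keywords_or"], job["keywords_not"]):
            print(url)  # a real run would append to clients/YOUR_CLIENT/results.txt
driver.quit()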

TODO

  • Flesh out an additional suite of promotion and interaction tools for the facebook platform
  • Organize platforms and their associated data and tools into their own folders and python scripts
  • Future updates will include modules for scraping and promoting to Instagram.


spam-bot-3000's Issues

No module

Traceback (most recent call last):
File "twatBot.py", line 5, in
import tweepy
ImportError: No module named tweepy

praw.exceptions.InvalidURL Invalid URL: https://i.reddit.it/someimage.jpg

Going to try to assist as much as I can to find a solution for this one. Scraped everything alright...I believe. But was unable to submit promotions. Raised the following error:

spam-bot-3000.py:371: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if ret_code is 3:
spam-bot-3000.py:396: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if ret_code is 0:
spam-bot-3000.py:398: SyntaxWarning: "is" with a literal. Did you mean "=="?
  elif ret_code is 2:
spam-bot-3000.py:400: SyntaxWarning: "is" with a literal. Did you mean "=="?
  elif ret_code is 3:
spam-bot-3000.py:498: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if ret_code is 2: # HTTPError returned from query, move onto next query
spam-bot-3000.py:677: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if ret_code is 3:
spam-bot-3000.py:688: SyntaxWarning: "is" with a literal. Did you mean "=="?
  elif ret_code is 2: # automation detected
spam-bot-3000.py:691: SyntaxWarning: "is" with a literal. Did you mean "=="?
  elif ret_code is 3: # serious error returned, terminate activity
Authenticating...
2020Jul22 06:07:38 Authenticated as: Some user
Traceback (most recent call last):
  File "spam-bot-3000.py", line 735, in <module>
    main(sys.argv[1:])
  File "spam-bot-3000.py", line 722, in main
    replyReddit()
  File "spam-bot-3000.py", line 130, in replyReddit
    submission = reddit.submission(url=pd[1])
  File "/usr/local/lib/python3.8/site-packages/praw/reddit.py", line 800, in submission
    return models.Submission(self, id=id, url=url)
  File "/usr/local/lib/python3.8/site-packages/praw/models/reddit/submission.py", line 540, in __init__
    self.id = self.id_from_url(url)
  File "/usr/local/lib/python3.8/site-packages/praw/models/reddit/submission.py", line 427, in id_from_url
    raise InvalidURL(url)
praw.exceptions.InvalidURL: Invalid URL: https://i.redd.it/xxxxx.jpg

I believe that it is scraping not the post link but the contents of the post itself.
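
A plausible guard (an assumption, not a merged fix) would be to skip dump entries whose URL isn't a comments permalink before handing them to praw:

# pd[1] holds the URL column of a red_scrape_dump.txt line (per the traceback above)
if "/comments/" not in pd[1]:
    continue  # direct media links like i.redd.it can't resolve to a submission
submission = reddit.submission(url=pd[1])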

detailed setup

can you explain like I'm 5 how to set this up properly?
