robertoszek / pleroma-bot

Bot for mirroring one or multiple Twitter accounts in Pleroma/Mastodon/Misskey.

Home Page: https://robertoszek.github.io/pleroma-bot

License: MIT License

Python 98.74% HTML 1.26%
pleroma twitter twitter-api fediverse-account mastodon tweets mastodon-bot python pleroma-bot misskey-bot

pleroma-bot's Introduction

Stork (pleroma-bot)

Mirror your favorite Twitter accounts in the Fediverse, so you can follow their updates from the comfort of your favorite instance. Or migrate your own to the Fediverse using a Twitter archive.

Documentation

You can find this project at:

GitHub Gitlab Gitea
pleroma-bot pleroma-bot pleroma-bot

Supports:

Mastodon Pleroma Misskey

Introduction

After using the pretty cool mastodon-bot for a while, I found it was missing some features that were useful to me.

For precisely those cases I've written this Python project, which automates them by requesting the relevant information from Twitter's API and updating the corresponding fields through the Pleroma, Mastodon or Misskey API.

Features

So basically, it does the following:

  • Can parse a Twitter archive, moving all your tweets to the Fediverse
  • Can use an RSS feed as the source of the tweets to post
  • Retrieves latest tweets and posts them on the Fediverse account if their timestamp is newer than the last post.
    • Can filter out RTs or not
    • Can filter out replies or not
    • Supports Twitter threads
  • Media retrieval and upload of multiple attachments. This includes:
    • Video
    • Images
    • Animated GIFs
    • Polls
  • Retrieves profile info from Twitter and updates it on the Fediverse account. This includes:
    • Display name
    • Profile picture
    • Banner image
    • Bio text
  • Adds some metadata fields to the Fediverse account, pointing to the original Twitter account or custom text.

Installation

Using pip

$ pip install pleroma-bot

Using a package manager

Here's a list of the available packages.

Package type Link Maintainer
AUR (Arch) https://aur.archlinux.org/packages/python-pleroma-bot robertoszek

Usage

$ pleroma-bot [-c CONFIG] [-l LOG] [--noProfile] [--daemon] [--forceDate [FORCEDATE]] [-a ARCHIVE]
Bot for mirroring one or multiple Twitter accounts in Pleroma/Mastodon.

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        path of config file (config.yml) to use and parse. If
                        not specified, it will try to find it in the current
                        working directory.
  -d, --daemon          run in daemon mode. By default it will re-run every
                        60min. You can control this with --pollrate
  -p POLLRATE, --pollrate POLLRATE
                        only applies to daemon mode. How often to run the
                        program in the background (in minutes). By default is
                        60min.
  -l LOG, --log LOG     path of log file (error.log) to create. If not
                        specified, it will try to store it at your config file
                        path
  -n, --noProfile       skips Fediverse profile update (no background image,
                        profile image, bio text, etc.)
  --forceDate [FORCEDATE]
                        forces the tweet retrieval to start from a specific
                        date. The twitter_username value (FORCEDATE) can be
                        supplied to only force it for that particular user in
                        the config
  -s, --skipChecks      skips first run checks
  -a ARCHIVE, --archive ARCHIVE
                        path of the Twitter archive file (zip) to use for
                        posting tweets.
  -t THREADS, --threads THREADS
                        number of threads to use when processing tweets
  -L LOCKFILE, --lockfile LOCKFILE
                        path of lock file (pleroma-bot.lock) to prevent
                        collisions with multiple bot instances. By default it
                        will be placed next to your config file.
  --verbose, -v
  --version             show program's version number and exit

Before running

There are multiple options for using the bot.

You can either choose to use:

  • A Twitter archive
  • An RSS feed
  • Guest tokens
  • Twitter tokens with a Developer account

You'll need to create a configuration file and obtain the Fediverse tokens for your accounts no matter what you choose to use.

If you plan on retrieving tweets from an account which has their tweets protected, you'll also need the following:

  • Consumer Key and Secret. You'll find them on your project app keys and tokens section at Twitter's Developer Portal
  • Access Token Key and Secret. You'll also find them on your project app keys and tokens section at Twitter's Developer Portal. Alternatively, you can obtain the Access Token and Secret by running this locally, while being logged in with a Twitter account which follows or is the owner of the protected account

You may also need Elevated access in your Twitter API project in order for the bot to function properly.

Refer to the docs for more info about this.

Configuration

Create a config.yml file in the same path where you are calling pleroma-bot (or use the --config argument to specify a different path).

There's a config example in this repo called config.yml.sample that can help you when filling yours out.

For more information you can refer to the "Configuration" page on the docs.

Here's what a minimal config looks like:

# Change this to your target Fediverse instance
pleroma_base_url: https://pleroma.instance
# How many tweets to gather per-user in every execution
# Twitter's API hard limit is 3,200
max_tweets: 40
# Twitter bearer token
twitter_token: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
users:
- twitter_username: User1
  pleroma_username: MyPleromaUser1
  # Mastodon/Pleroma bearer token
  pleroma_token: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
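
As a rough illustration of the structure above, here is a sketch that checks a parsed config for the keys used in the minimal example. The helper `missing_keys` is hypothetical and not part of pleroma-bot, and treating these keys as mandatory is an assumption based only on the example (the Twitter token, for instance, may not be needed when using an archive or RSS feed).

```python
# Keys taken from the minimal example; treating them as mandatory is an assumption.
REQUIRED_TOP_LEVEL = ("pleroma_base_url", "users")
REQUIRED_PER_USER = ("twitter_username", "pleroma_username", "pleroma_token")


def missing_keys(config: dict) -> list:
    """Return dotted paths of expected keys that are absent from a parsed config."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in config]
    for index, user in enumerate(config.get("users") or []):
        missing += [
            f"users[{index}].{key}" for key in REQUIRED_PER_USER if key not in user
        ]
    return missing


# The minimal config as it would look once parsed from YAML into Python objects
config = {
    "pleroma_base_url": "https://pleroma.instance",
    "max_tweets": 40,
    "twitter_token": "XXXX",
    "users": [
        {
            "twitter_username": "User1",
            "pleroma_username": "MyPleromaUser1",
            "pleroma_token": "XXXX",
        }
    ],
}
print(missing_keys(config))
```

An empty list means every key from the minimal example is present; anything else names the entries still to be filled in.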

Running

If you're running the bot for the first time it will ask you for the date you wish to start retrieving tweets from (it will gather all from that date up to the present). If you leave it empty and just press enter it will default to the oldest date that Twitter's API allows ('2010-11-06T00:00:00Z') for tweet retrieval.

To force this behaviour in future runs you can use the --forceDate argument (be careful: posts already published by that Fediverse account are not checked, so you can end up with duplicate posts/toots!).

Additionally, you can provide a twitter_username if you only want to force the date for one user in your config.

For example:

$ pleroma-bot --forceDate WoolieWoolz

If the --noProfile argument is passed, the profile picture, banner, display name and bio will not be updated on the Fediverse account. However, it will still gather and post the tweets following your config's parameters.

NOTE: An error.log file will be created at the path from which pleroma-bot is being called.

crontab entry example

Post new tweets every 10 minutes and update the profile every day at 6:15 AM:

# Post tweets every 10 min
*/10 * * * * cd /home/robertoszek/myvenv/ && . bin/activate && pleroma-bot --noProfile

# Update pleroma profile with Twitter info every day at 6:15 AM
15 6 * * * cd /home/robertoszek/myvenv/ && . bin/activate && pleroma-bot

NOTE: If you have issues with cron running the bot, you may have to specify the full path of your Python executable:

*/10 * * * * /usr/bin/python /usr/local/bin/pleroma-bot

Acknowledgements

These projects proved to be immensely useful. They are Python wrappers for the Mastodon API and Twitter API respectively:

They were part of the impetus for this project: I challenged myself not to simply import and use them, but to do the heavy lifting directly with built-in Python modules.

That and mastodon-bot not working after upgrading the Pleroma instance I was admin on 😅. This event led to repurposing this project (initially it only updated the profile info) and adding the tweet gathering and posting capabilities.

Contributing

Patches, pull requests, and bug reports are more than welcome. Please keep the style consistent with the original source.

License

MIT License

Copyright (c) 2023 Roberto Chamorro / project contributors

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Tested and confirmed working against:

  • Pleroma BE 2.4.2
  • Mastodon v3.2.1
  • Misskey 12.110.1
  • Twitter API v1.1 and v2

pleroma-bot's People

Contributors

lightnin, nemobis, reorx, robertoszek, zoenglinghou

pleroma-bot's Issues

Append original tweet timestamp to end of pleroma post for archiving purposes.

Enable, in the config file, the ability to append the original datestamp of the tweet to the end of the toot.
The use case for this is for those who want to archive their old posts from the birdsite by copying them over to their pleroma instance. Since the tweets being copied may be old (in my case, they are up to 5 years old), and the new toot gets a timestamp that reflects its creation date and not the date of the original tweet, this would be very useful.

I realize that this may be out of scope (as was my request for filtering by hashtags, which I thank you for!), but I figured I'd throw it in there for those of us who use Twitter as an archive for experiments, etc.
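
A minimal sketch of what this request amounts to, assuming the tweet's creation date arrives in the Twitter API v1.1 text format. The helper `append_original_date`, the default date format and the bracketed layout are illustrative assumptions, not pleroma-bot's actual code.

```python
from datetime import datetime


# Hypothetical helper, not part of pleroma-bot.
def append_original_date(
    text: str, created_at: str, date_format: str = "%Y/%m/%d %H:%M"
) -> str:
    """Append the tweet's original creation date to the post body."""
    # Twitter API v1.1 style timestamp, e.g. 'Thu Oct 08 15:30:00 +0000 2020'
    created = datetime.strptime(created_at, "%a %b %d %H:%M:%S %z %Y")
    return f"{text}\n\n[{created.strftime(date_format)}]"


post = append_original_date("Hello Fediverse", "Thu Oct 08 15:30:00 +0000 2020")
print(post)
```

Keeping the format string as a parameter lets each user choose how the date is rendered.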

Add Option To Ignore @ replies From Twitter

Having an option to ignore @ replies from Twitter would cut down on the number of tweets turned into toots that I am seeing. Generally 95% of the @ replies from Twitter aren't directed at me, so I don't necessarily see any value in adding them to Pleroma/Mastodon. For example, brands do a lot of @ replies as they support users over Twitter.

I made some small changes to _pleroma.py which I believe will skip @ replies. However it doesn't factor in an actual option to enable or disable the functionality. I suppose it's a start though. Around line 79:

    # Only process tweets that are not "@" replies
    if not tweet_text.startswith('@'):
        print('Tweet did not start with "@", processing...')
        if self.media_upload:
            for file in media_files:

Probably don't need the print statement but may be useful for debugging.
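
To make the behaviour switchable, the check could be written as a small predicate driven by an option. This is only a sketch: `should_post` is a hypothetical name, and the `include_replies` flag is modelled on the option of the same name that appears in a config posted in a later issue.

```python
def should_post(tweet_text: str, include_replies: bool) -> bool:
    """Decide whether a tweet should be mirrored."""
    if include_replies:
        return True
    # Treat any tweet whose text starts with "@" as a reply
    return not tweet_text.startswith("@")


tweets = ["@someone thanks for the report!", "New release is out"]
to_post = [text for text in tweets if should_post(text, include_replies=False)]
print(to_post)
```

Checking the text prefix is a heuristic; a tweet's reply metadata would be a more reliable signal where it is available.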

Getting Error in get_date_last_pleroma_post

Greetings!

I was able to run this once and it worked wonderfully, but then it started to fail with:

pleroma_bot - ERROR - Exception occurred
Traceback (most recent call last):
  File "/home/jimmy/.local/lib/python3.8/site-packages/pleroma_bot/cli.py", line 185, in main
    date_pleroma = user.get_date_last_pleroma_post()
  File "/home/jimmy/.local/lib/python3.8/site-packages/pleroma_bot/_pleroma.py", line 41, in get_date_last_pleroma_post
    datetime.strptime(
  File "/usr/lib/python3.8/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.8/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '2021-01-07T16:47:58.915Z' does not match format '%Y-%m-%dT%H:%M:%S.000Z'

However, unless I'm missing something obvious, those do match: 2021-01-07T16:47:58.915Z appears to use the format %Y-%m-%dT%H:%M:%S.000Z.

Any ideas? I really hope it's not something obvious, I'm going to feel so stupid haha 😝
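
A likely explanation, shown as a small sketch: in a strptime format, `.000Z` is literal text, so only timestamps whose fractional part is exactly `.000` match. The `%f` directive accepts one to six fractional digits instead. Whether this is how the project resolved the issue is not stated here.

```python
from datetime import datetime

timestamp = "2021-01-07T16:47:58.915Z"

# '.000Z' is matched literally, so '.915Z' fails to parse
try:
    datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.000Z")
    matched_literal = True
except ValueError:
    matched_literal = False

# %f parses the fractional seconds (one to six digits)
parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
print(matched_literal, parsed.microsecond)
```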

Exception with pinned tweet: TypeError: string indices must be integers

While posting to https://respublicae.eu/@Europarl_FI for the first time, I got an exception upon posting and pinning https://respublicae.eu/@Europarl_FI/108067690963226835 from https://nitter.eu/Europarl_FI/status/1502279955908071425 (even though it seems to have worked).

I'm currently using 1.0.3rc6

ℹ 2022-04-03 13:26:50,744 - pleroma_bot - INFO - ====================================== 
ℹ 2022-04-03 13:26:50,744 - pleroma_bot - INFO - Processing user:       108067243532704868 
ℹ 2022-04-03 13:26:50,745 - pleroma_bot - INFO - It seems like pleroma-bot is running for the first time for this Twitter user: Europarl_FI 
⚠ 2022-04-03 13:26:52,241 - pleroma_bot - WARNING - Mastodon doesn't support display names longer than 30 characters, truncating it and trying again... (_utils.py:501) 
ℹ 2022-04-03 13:26:52,241 - pleroma_bot - INFO - How far back should we retrieve tweets from the Twitter account? 
ℹ 2022-04-03 13:26:52,242 - pleroma_bot - INFO - 
Enter a date (YYYY-MM-DD):
[Leave it empty to retrieve *ALL* tweets or enter 'continue'
if you want the bot to execute as normal (checking date of 
last post in the Fediverse account)]  
⚠ 2022-04-03 13:26:52,531 - pleroma_bot - WARNING - No posts were found in the target Fediverse account (_pleroma.py:43) 
ℹ 2022-04-03 13:26:52,532 - pleroma_bot - INFO - How far back should we retrieve tweets from the Twitter account? 
ℹ 2022-04-03 13:26:52,532 - pleroma_bot - INFO - 
Enter a date (YYYY-MM-DD):
[Leave it empty to retrieve *ALL* tweets or enter 'continue'
if you want the bot to execute as normal (checking date of 
last post in the Fediverse account)]  
ℹ 2022-04-03 13:26:53,440 - pleroma_bot - INFO - tweets gathered:        9 
ℹ 2022-04-03 13:26:59,471 - pleroma_bot - INFO - tweets to post:         9 
ℹ 2022-04-03 13:26:59,472 - pleroma_bot - INFO - (1/9) 
ℹ 2022-04-03 13:27:02,070 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:02,571 - pleroma_bot - INFO - (2/9) 
ℹ 2022-04-03 13:27:02,912 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:03,413 - pleroma_bot - INFO - (3/9) 
ℹ 2022-04-03 13:27:05,887 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:06,388 - pleroma_bot - INFO - (4/9) 
ℹ 2022-04-03 13:27:09,143 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:09,644 - pleroma_bot - INFO - (5/9) 
ℹ 2022-04-03 13:27:10,116 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:10,617 - pleroma_bot - INFO - (6/9) 
ℹ 2022-04-03 13:27:13,170 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:13,671 - pleroma_bot - INFO - (7/9) 
ℹ 2022-04-03 13:27:16,168 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:16,669 - pleroma_bot - INFO - (8/9) 
ℹ 2022-04-03 13:27:19,685 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:20,186 - pleroma_bot - INFO - (9/9) 
ℹ 2022-04-03 13:27:23,307 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:23,808 - pleroma_bot - INFO - Current pinned:        1502279955908071425 
ℹ 2022-04-03 13:27:23,809 - pleroma_bot - INFO - Previous pinned:       None 
ℹ 2022-04-03 13:27:26,853 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2022-04-03 13:27:26,854 - pleroma_bot - INFO - File with previous pinned post ID not found or empty. Checking last posts for pinned post... 
✖ 2022-04-03 13:27:26,854 - pleroma_bot - ERROR - Exception occurred (cli.py:539) 
Traceback (most recent call last):
  File "/home/federico/mw/pleroma-bot/res/lib64/python3.10/site-packages/pleroma_bot/cli.py", line 524, in main
    user.check_pinned(posted)
  File "/home/federico/mw/pleroma-bot/res/lib64/python3.10/site-packages/pleroma_bot/_utils.py", line 367, in check_pinned
    pleroma_pinned_post = self.pin_pleroma(id_post_to_pin)
  File "/home/federico/mw/pleroma-bot/res/lib64/python3.10/site-packages/pleroma_bot/_pin.py", line 127, in pin_pleroma
    self.unpin_pleroma(pinned_file)
  File "/home/federico/mw/pleroma-bot/res/lib64/python3.10/site-packages/pleroma_bot/_pin.py", line 173, in unpin_pleroma
    _find_pinned(self, pinned_file)
  File "/home/federico/mw/pleroma-bot/res/lib64/python3.10/site-packages/pleroma_bot/_pin.py", line 188, in _find_pinned
    if post["pinned"]:
TypeError: string indices must be integers
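
One plausible cause (not confirmed by the report) is that the statuses response was a JSON object, such as an error body, rather than a list of posts. Iterating over a dict yields its string keys, and indexing a string with "pinned" raises exactly this TypeError. A defensive sketch, with `find_pinned` and the error payload being hypothetical:

```python
def find_pinned(statuses):
    """Return the pinned posts from a decoded statuses response."""
    # Anything other than a list of posts (for example an error object)
    # is treated as "no pinned posts" instead of being iterated over.
    if not isinstance(statuses, list):
        return []
    return [
        post for post in statuses if isinstance(post, dict) and post.get("pinned")
    ]


ok_payload = [{"id": "1", "pinned": True}, {"id": "2", "pinned": False}]
error_payload = {"error": "Record not found"}  # hypothetical error body

print(find_pinned(ok_payload))
print(find_pinned(error_payload))
```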

Getting 403 error

Exception in user.check_pinned() for Mastodon 3.3: 404 Client Error: Not Found

ℹ 2021-04-28 09:23:07,426 - pleroma_bot - INFO - (6/6) 
ℹ 2021-04-28 09:23:08,119 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2021-04-28 09:23:48,160 - pleroma_bot - INFO - Current pinned:        1385197439225896962 
ℹ 2021-04-28 09:23:48,161 - pleroma_bot - INFO - Previous pinned:       None 
ℹ 2021-04-28 09:23:49,247 - pleroma_bot - INFO - Post in Pleroma:       <Response [200]> 
ℹ 2021-04-28 09:23:49,249 - pleroma_bot - INFO - File with previous pinned post ID not found or empty. Checking last posts for pinned post... 
✖ 2021-04-28 09:23:49,686 - pleroma_bot - ERROR - Exception occurred (cli.py:432) 
Traceback (most recent call last):
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/cli.py", line 420, in main
    user.check_pinned()
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/_utils.py", line 137, in check_pinned
    pleroma_pinned_post = self.pin_pleroma(id_post_to_pin)
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/_pin.py", line 21, in pin_pleroma
    self.unpin_pleroma(pinned_file)
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/_pin.py", line 64, in unpin_pleroma
    _find_pinned(self, pinned_file)
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/_pin.py", line 95, in _find_pinned
    response.raise_for_status()
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://mastodon.technology/api/v1/accounts/RelexSolutions/statuses

Twitter API v2

Traceback (most recent call last):
  File "/home/joker/.local/lib/python3.8/site-packages/pleroma_bot/cli.py", line 209, in main
    user = User(user_item, config)
  File "/home/joker/.local/lib/python3.8/site-packages/pleroma_bot/cli.py", line 171, in __init__
    self.tweets = self._get_tweets("v2")
  File "/home/joker/.local/lib/python3.8/site-packages/pleroma_bot/_twitter.py", line 122, in _get_tweets
    response.raise_for_status()
  File "/home/joker/.local/lib/python3.8/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://api.twitter.com/2/tweets/search/recent?max_results=40&query=from%3Aparlamentopt&poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld

API v2?

Incorrect parsing of HTML character entities in tweet text

For example, the ampersand & in this tweet https://twitter.com/talebwisdom/status/1352851878267277312 gets improperly parsed:

""Science and reason is the best we have" is a statement that is both pseudoscientific &amp; irrational. It conflates science and scientism. Science is rigorous, makes no claims outside a v. narrow domain. "Tradition, risk management, and Skin in the Game is the best we have"." -

HTML char entities for reference
https://dev.w3.org/html5/html-author/charref

Probably it's as simple as passing the text before posting through this:
https://docs.python.org/3/library/html.html#html.unescape
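
A short sketch of that suggestion, using part of the tweet text quoted above. Where exactly the call would go in the bot's pipeline is not specified here.

```python
import html

raw = (
    '"Science and reason is the best we have" is a statement that is both '
    "pseudoscientific &amp; irrational."
)

# html.unescape converts character references such as &amp; back to characters
clean = html.unescape(raw)
print(clean)
```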

Handle exception when the bio_text is too long

I didn't realise that the original Twitter bio is appended to the text provided via bio_text, so I ended up having too much text. That produced the error:

2021-04-29 08:33:06,479 pleroma_bot ERROR: Exception occurred
Traceback (most recent call last):
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/cli.py", line 435, in main
    user.update_pleroma()
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/_pleroma.py", line 243, in update_pleroma
    response.raise_for_status()
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 422 Client Error: Unprocessable Entity for url: https://mastodon.technology/api/v1/accounts/update_credentials

(Oddly, the update seemed to go through, or at least the profile images were updated at some point.)
Verified by shortening the text and seeing the error went away.

The error is not particularly useful. At a minimum it could be a good idea to remind the user to check the variables which might make the request fail, perhaps?
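
A sketch of the kind of early check that would produce a more useful message. Everything here is an assumption for illustration: the helper name `check_bio_length`, the newline separator, and the limit of 500 characters (Mastodon's default, which instances can change).

```python
# Assumed limit; instances may be configured with a different value.
MAX_BIO_LENGTH = 500


def check_bio_length(bio_text: str, twitter_bio: str, limit: int = MAX_BIO_LENGTH) -> str:
    """Combine the custom text with the Twitter bio and fail early if too long."""
    bio = f"{bio_text}\n{twitter_bio}"
    if len(bio) > limit:
        raise ValueError(
            f"Combined bio is {len(bio)} characters, over the assumed limit of "
            f"{limit}. Consider shortening bio_text in your config."
        )
    return bio


try:
    check_bio_length("x" * 300, "y" * 300)
except ValueError as error:
    message = str(error)
    print(message)
```

Failing before the request is sent points the user at the config value to change rather than at a bare HTTP 422.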

Handle HTTP 429 for Fedi APIs

Handle rate limits if HTTP 429 Too Many Requests is raised when sending requests to the Pleroma/Mastodon/Misskey API endpoints.
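
A minimal sketch of such handling, assuming the server may send a numeric Retry-After header (not every instance does, so a default wait is used otherwise). `send_with_retry` and `FakeResponse` are hypothetical names; `send` stands for any callable returning an object with `status_code` and `headers`, which is the shape of a requests response.

```python
import time


def send_with_retry(send, max_retries=3, default_wait=60, sleep=time.sleep):
    """Call send() and retry while the response status is 429."""
    response = send()
    for _ in range(max_retries):
        if response.status_code != 429:
            break
        # Retry-After may be missing or not a plain number of seconds
        retry_after = response.headers.get("Retry-After", "")
        wait = int(retry_after) if retry_after.isdigit() else default_wait
        sleep(wait)
        response = send()
    return response


class FakeResponse:
    """Stand-in for an HTTP response, used only for this demonstration."""

    def __init__(self, status_code, headers=None):
        self.status_code = status_code
        self.headers = headers or {}


responses = iter([FakeResponse(429, {"Retry-After": "2"}), FakeResponse(200)])
waits = []  # record the requested waits instead of actually sleeping
result = send_with_retry(lambda: next(responses), sleep=waits.append)
print(result.status_code, waits)
```

Passing `sleep` as a parameter keeps the retry logic testable without real delays.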

Media gets lost if one of the attachments returns 404

When a retweet is mirrored, the Fediverse post only includes the text, not the image of the original tweet.

Tested with: 1.0.2

my config.yml

twitter_base_url: https://api.twitter.com/1.1
# Change this to your Fediverse instance
pleroma_base_url: https://pleroma.url
# How many tweets to get in every execution
# Twitter's API hard limit is 3,200
max_tweets: 40
# Twitter bearer token
twitter_token: xxxx
# List of users and their attributes
users:
- twitter_username: xxxx
  pleroma_username: xxxx
  # Mastodon/Pleroma token obtained by following the README.md
  pleroma_token: xxxxx
  # (optional) keys and secrets for using OAuth 1.0a (for protected accounts)
  consumer_key: xxxxx
  consumer_secret: xxxx
  access_token_key: xx-xxx
  access_token_secret: xxxx
  # If you want to add a link to the original status or not
  signature: false
  # (optional) If you want to download Twitter attachments and add them to the Pleroma posts.
  # By default they are not
  media_upload: true
  # (optional) If twitter links should be changed to nitter ones. By default they are not
  nitter: false
  # (optional) If mentions should be transformed to links to the mentioned Twitter profile
  rich_text: true
  # (optional) visibility of the post. Must be one of the following: public, unlisted, private, direct
  # by default is "unlisted"
  visibility: "public"
  # (optional) Force all posts for this account to be sensitive or not
  # The NSFW banner for the instance will be shown for attachments as a warning if true
  # If not defined, the original tweet sensitivity will be used on a tweet by tweet basis
  sensitive: false
  # (optional) Include the creation date of the tweet on the Fediverse post body
  original_date: false
  # (optional) Date format to use when adding the creation date of the tweet to the Fediverse post
  original_date_format: "%Y/%m/%d %H:%"
  # (optional) If RTs are also to be posted in the Fediverse account. By default they are included
  include_rts: true
  # (optional) If replies are to be also posted in the Fediverse account. By default they are included
  include_replies: false
  twitter_bio: false
  # (optional) How big attachments can be before being ignored and not being uploaded to the Fediverse post
  # Examples: "30MB", "1.5GB", "0.5TB"
  file_max_size: 500MB
  bio_text: "my bio text woo"

Retweets should include the image if media upload is true.

Video attachment not being uploaded

On tweets where the attachment is a video only the thumbnail image gets uploaded to the Fediverse account's post.

Like with the following tweet:
JSON:

{
      'created_at':'Thu Oct 08 15:30:00 +0000 2020',
      'id':1314226545418928128,
      'id_str':'1314226545418928128',
      'text':'Get your axe-dodging skills ready because Fall Guys Season 2 is LIVE! \U0001fa93 https://t.co/nw66tI367P',
      'truncated':False,
      'entities':{
         'hashtags':[
            
         ],
         'symbols':[
            
         ],
         'user_mentions':[
            
         ],
         'urls':[
            
         ],
         'media':[
            {
               'id':1314215812849258497,
               'id_str':'1314215812849258497',
               'indices':[
                  72,
                  95
               ],
               'media_url':'http://pbs.twimg.com/amplify_video_thumb/1314215812849258497/img/C1kxYpSXU4MCQnND.jpg',
               'media_url_https':'https://pbs.twimg.com/amplify_video_thumb/1314215812849258497/img/C1kxYpSXU4MCQnND.jpg',
               'url':'https://t.co/nw66tI367P',
               'display_url':'pic.twitter.com/nw66tI367P',
               'expanded_url':'https://twitter.com/GameSpot/status/1314226545418928128/video/1',
               'type':'photo',
               'sizes':{
                  'thumb':{
                     'w':150,
                     'h':150,
                     'resize':'crop'
                  },
                  'medium':{
                     'w':1200,
                     'h':675,
                     'resize':'fit'
                  },
                  'small':{
                     'w':680,
                     'h':383,
                     'resize':'fit'
                  },
                  'large':{
                     'w':1280,
                     'h':720,
                     'resize':'fit'
                  }
               }
            }
         ]
      },
      'extended_entities':{
         'media':[
            {
               'id':1314215812849258497,
               'id_str':'1314215812849258497',
               'indices':[
                  72,
                  95
               ],
               'media_url':'http://pbs.twimg.com/amplify_video_thumb/1314215812849258497/img/C1kxYpSXU4MCQnND.jpg',
               'media_url_https':'https://pbs.twimg.com/amplify_video_thumb/1314215812849258497/img/C1kxYpSXU4MCQnND.jpg',
               'url':'https://t.co/nw66tI367P',
               'display_url':'pic.twitter.com/nw66tI367P',
               'expanded_url':'https://twitter.com/GameSpot/status/1314226545418928128/video/1',
               'type':'video',
               'sizes':{
                  'thumb':{
                     'w':150,
                     'h':150,
                     'resize':'crop'
                  },
                  'medium':{
                     'w':1200,
                     'h':675,
                     'resize':'fit'
                  },
                  'small':{
                     'w':680,
                     'h':383,
                     'resize':'fit'
                  },
                  'large':{
                     'w':1280,
                     'h':720,
                     'resize':'fit'
                  }
               },
               'video_info':{
                  'aspect_ratio':[
                     16,
                     9
                  ],
                  'duration_millis':74700,
                  'variants':[
                     {
                        'bitrate':832000,
                        'content_type':'video/mp4',
                        'url':'https://video.twimg.com/amplify_video/1314215812849258497/vid/640x360/r_A5Zo9bv-Zd9oAh.mp4?tag=13'
                     },
                     {
                        'bitrate':2176000,
                        'content_type':'video/mp4',
                        'url':'https://video.twimg.com/amplify_video/1314215812849258497/vid/1280x720/f_1HmXB7Ltva_fG0.mp4?tag=13'
                     },
                     {
                        'bitrate':288000,
                        'content_type':'video/mp4',
                        'url':'https://video.twimg.com/amplify_video/1314215812849258497/vid/480x270/ivhGrU2x6HcxebGw.mp4?tag=13'
                     },
                     {
                        'content_type':'application/x-mpegURL',
                        'url':'https://video.twimg.com/amplify_video/1314215812849258497/pl/F6pvhblAMaRD-XGx.m3u8?tag=13'
                     }
                  ]
               },
               'additional_media_info':{
                  'title':'',
                  'description':'',
                  'embeddable':True,
                  'monetizable':False
               }
            }
         ]
      },
      'source':'<a href="https://studio.twitter.com" rel="nofollow">Twitter Media Studio</a>',
      'in_reply_to_status_id':None,
      'in_reply_to_status_id_str':None,
      'in_reply_to_user_id':None,
      'in_reply_to_user_id_str':None,
      'in_reply_to_screen_name':None,
      'user':{
         'id':7157132,
         'id_str':'7157132',
         'name':'GameSpot',
         'screen_name':'GameSpot',
         'location':'',
         'description':'Official @GameSpot💥 account. Follow for video game news, livestreams, & giveaways! Also follow @GameSpotDeals for sales/freebies.',
         'url':'https://t.co/zJJ392GpBV',
         'entities':{
            'url':{
               'urls':[
                  {
                     'url':'https://t.co/zJJ392GpBV',
                     'expanded_url':'http://www.GameSpot.com',
                     'display_url':'GameSpot.com',
                     'indices':[
                        0,
                        23
                     ]
                  }
               ]
            },
            'description':{
               'urls':[
                  
               ]
            }
         },
         'protected':False,
         'followers_count':4118803,
         'friends_count':654,
         'listed_count':14963,
         'created_at':'Fri Jun 29 17:54:14 +0000 2007',
         'favourites_count':22350,
         'utc_offset':None,
         'time_zone':None,
         'geo_enabled':True,
         'verified':True,
         'statuses_count':137595,
         'lang':None,
         'contributors_enabled':False,
         'is_translator':False,
         'is_translation_enabled':False,
         'profile_background_color':'131516',
         'profile_background_image_url':'http://abs.twimg.com/images/themes/theme14/bg.gif',
         'profile_background_image_url_https':'https://abs.twimg.com/images/themes/theme14/bg.gif',
         'profile_background_tile':True,
         'profile_image_url':'http://pbs.twimg.com/profile_images/1262429081720836102/wzT3Zdr4_normal.jpg',
         'profile_image_url_https':'https://pbs.twimg.com/profile_images/1262429081720836102/wzT3Zdr4_normal.jpg',
         'profile_banner_url':'https://pbs.twimg.com/profile_banners/7157132/1597864346',
         'profile_link_color':'009999',
         'profile_sidebar_border_color':'EEEEEE',
         'profile_sidebar_fill_color':'EFEFEF',
         'profile_text_color':'333333',
         'profile_use_background_image':True,
         'has_extended_profile':False,
         'default_profile':False,
         'default_profile_image':False,
         'following':None,
         'follow_request_sent':None,
         'notifications':None,
         'translator_type':'none'
      },
      'geo':None,
      'coordinates':None,
      'place':None,
      'contributors':None,
      'is_quote_status':False,
      'retweet_count':27,
      'favorite_count':200,
      'favorited':False,
      'retweeted':False,
      'possibly_sensitive':False,
      'lang':'en'
   }

Handle HTTP 429 from Twitter API

When pleroma-bot cycles through profiles very quickly (for lack of tweets to post), it always tries to update the profiles. The Twitter API request to get the pinned tweet was the first to get an HTTP 429. It would be nice to sleep a bit before continuing with the next user, which otherwise will also fail.

ℹ 2022-04-04 09:00:50,264 - pleroma_bot - INFO - Processing user:       108071095671376766
ℹ 2022-04-04 09:00:50,264 - pleroma_bot - INFO - It seems like pleroma-bot is running for the first time for this Twitter user: ZovkoEU
✖ 2022-04-04 09:00:50,491 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:554) 
Traceback (most recent call last):                                                        
  File "/home/7/federico/mastodon/lib/python3.7/site-packages/pleroma_bot/cli.py", line 435, in main
    user = User(user_item, config, base_path)                                              
  File "/home/7/federico/mastodon/lib/python3.7/site-packages/pleroma_bot/cli.py", line 199, in __init__                 
    self.pinned_tweet_id = self._get_pinned_tweet_id()                                              
  File "/home/7/federico/mastodon/lib/python3.7/site-packages/pleroma_bot/_pin.py", line 252, in _get_pinned_tweet_id
    response.raise_for_status()                                                                     
  File "/home/7/federico/mastodon/lib/python3.7/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)                                                                   
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://api.twitter.com/2/users/by/username/ZovkoEU?user.fields=pinned_tweet_id&expansions=pinned_tweet_id&tweet.fields=entities
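
A minimal sketch of how the pause could be computed, assuming the Twitter API response carries the `x-rate-limit-reset` header (epoch seconds at which the current window resets). The helper name `seconds_until_reset` and the `fallback` and `cap` values are hypothetical and not part of pleroma-bot.

```python
import time


def seconds_until_reset(headers, now=None, fallback=60.0, cap=900.0):
    """Return how many seconds to sleep after an HTTP 429.

    `headers` is any mapping of response headers. `fallback` and `cap`
    are illustrative assumptions, not values taken from pleroma-bot.
    """
    now = time.time() if now is None else now
    raw = headers.get("x-rate-limit-reset")
    if raw is None:
        # Header missing: wait a fixed amount instead of guessing.
        return fallback
    try:
        wait = float(raw) - now
    except ValueError:
        return fallback
    # Never return a negative wait, and never sleep longer than `cap`.
    return min(max(wait, 0.0), cap)
```

In the user loop, the caller could catch the 429, sleep for `seconds_until_reset(response.headers)` seconds, and only then continue with the next user.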

Add help page

It would be helpful if executing the following brought up information about the available arguments and options, with a quick description of each one:

$ pleroma-bot --help

Needs a rewrite of how arguments are handled.
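
One possible shape for that rewrite, sketched with the standard library's `argparse`, which generates `--help` output automatically. `--forceDate` and `--noProfile` are flags mentioned elsewhere on this page; the `--config` option, the help texts and the flag semantics shown here are placeholders, not the project's actual behaviour.

```python
import argparse


def build_parser():
    """Build a command-line parser; argparse adds -h/--help by itself."""
    parser = argparse.ArgumentParser(
        prog="pleroma-bot",
        description="Mirror Twitter accounts in the Fediverse.",
    )
    # Hypothetical option, shown only to illustrate a value-taking argument.
    parser.add_argument("-c", "--config", help="path to the config file")
    # Flag names come from this page; their semantics here are placeholders.
    parser.add_argument(
        "--noProfile",
        action="store_true",
        help="skip the profile update",
    )
    parser.add_argument(
        "--forceDate",
        action="store_true",
        help="ask for the date from which tweets should be retrieved",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```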

Tweets with multiple images losing media

On tweets such as this one (https://twitter.com/WoolieWoolz/status/1314252985426489350) that have multiple images, only the first one gets posted to the Fediverse account.

JSON:

[
   {
      'created_at':'Thu Oct 08 17:15:04 +0000 2020',
      'id':1314252985426489350,
      'id_str':'1314252985426489350',
      'text':'http://twitch.tv/woolieversus https://t.co/LMXdjjUhUF',
      'truncated':False,
      'entities':{
         'hashtags':[
            
         ],
         'symbols':[
            
         ],
         'user_mentions':[
            
         ],
         'urls':[
            {
               'url':'https://t.co/xuC3tPBiKd',
               'expanded_url':'http://twitch.tv/woolieversus',
               'display_url':'twitch.tv/woolieversus',
               'indices':[
                  0,
                  23
               ]
            }
         ],
         'media':[
            {
               'id':1314252981282500609,
               'id_str':'1314252981282500609',
               'indices':[
                  24,
                  47
               ],
               'media_url':'http://pbs.twimg.com/media/Ej0qH45XcAEyJPw.jpg',
               'media_url_https':'https://pbs.twimg.com/media/Ej0qH45XcAEyJPw.jpg',
               'url':'https://t.co/LMXdjjUhUF',
               'display_url':'pic.twitter.com/LMXdjjUhUF',
               'expanded_url':'https://twitter.com/WoolieWoolz/status/1314252985426489350/photo/1',
               'type':'photo',
               'sizes':{
                  'thumb':{
                     'w':150,
                     'h':150,
                     'resize':'crop'
                  },
                  'small':{
                     'w':376,
                     'h':680,
                     'resize':'fit'
                  },
                  'medium':{
                     'w':434,
                     'h':785,
                     'resize':'fit'
                  },
                  'large':{
                     'w':434,
                     'h':785,
                     'resize':'fit'
                  }
               }
            }
         ]
      },
      'extended_entities':{
         'media':[
            {
               'id':1314252981282500609,
               'id_str':'1314252981282500609',
               'indices':[
                  24,
                  47
               ],
               'media_url':'http://pbs.twimg.com/media/Ej0qH45XcAEyJPw.jpg',
               'media_url_https':'https://pbs.twimg.com/media/Ej0qH45XcAEyJPw.jpg',
               'url':'https://t.co/LMXdjjUhUF',
               'display_url':'pic.twitter.com/LMXdjjUhUF',
               'expanded_url':'https://twitter.com/WoolieWoolz/status/1314252985426489350/photo/1',
               'type':'photo',
               'sizes':{
                  'thumb':{
                     'w':150,
                     'h':150,
                     'resize':'crop'
                  },
                  'small':{
                     'w':376,
                     'h':680,
                     'resize':'fit'
                  },
                  'medium':{
                     'w':434,
                     'h':785,
                     'resize':'fit'
                  },
                  'large':{
                     'w':434,
                     'h':785,
                     'resize':'fit'
                  }
               }
            },
            {
               'id':1314252981274112008,
               'id_str':'1314252981274112008',
               'indices':[
                  24,
                  47
               ],
               'media_url':'http://pbs.twimg.com/media/Ej0qH43XcAgTIk9.jpg',
               'media_url_https':'https://pbs.twimg.com/media/Ej0qH43XcAgTIk9.jpg',
               'url':'https://t.co/LMXdjjUhUF',
               'display_url':'pic.twitter.com/LMXdjjUhUF',
               'expanded_url':'https://twitter.com/WoolieWoolz/status/1314252985426489350/photo/1',
               'type':'photo',
               'sizes':{
                  'large':{
                     'w':1094,
                     'h':1500,
                     'resize':'fit'
                  },
                  'thumb':{
                     'w':150,
                     'h':150,
                     'resize':'crop'
                  },
                  'small':{
                     'w':496,
                     'h':680,
                     'resize':'fit'
                  },
                  'medium':{
                     'w':875,
                     'h':1200,
                     'resize':'fit'
                  }
               }
            }
         ]
      },
      'source':'<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
      'in_reply_to_status_id':None,
      'in_reply_to_status_id_str':None,
      'in_reply_to_user_id':None,
      'in_reply_to_user_id_str':None,
      'in_reply_to_screen_name':None,
      'user':{
         'id':2562363992,
         'id_str':'2562363992',
         'name':'Woolie Versus',
         'screen_name':'WoolieWoolz',
         'location':'Montreal',
         'description':"Rewinding sucks. Vergil is life. Kirby is hometown. #Punchgirl is truth. Stands are real. {podcasts} {let's plays} {fighting games} https://t.co/CTAE8bMA1Q",
         'url':'https://t.co/S4CsnvPB95',
         'entities':{
            'url':{
               'urls':[
                  {
                     'url':'https://t.co/S4CsnvPB95',
                     'expanded_url':'http://youtube.com/WoolieVersus',
                     'display_url':'youtube.com/WoolieVersus',
                     'indices':[
                        0,
                        23
                     ]
                  }
               ]
            },
            'description':{
               'urls':[
                  {
                     'url':'https://t.co/CTAE8bMA1Q',
                     'expanded_url':'http://linktr.ee/wooliewoolz',
                     'display_url':'linktr.ee/wooliewoolz',
                     'indices':[
                        132,
                        155
                     ]
                  }
               ]
            }
         },
         'protected':False,
         'followers_count':106567,
         'friends_count':701,
         'listed_count':253,
         'created_at':'Thu Jun 12 00:44:01 +0000 2014',
         'favourites_count':2401,
         'utc_offset':None,
         'time_zone':None,
         'geo_enabled':True,
         'verified':False,
         'statuses_count':15789,
         'lang':None,
         'contributors_enabled':False,
         'is_translator':False,
         'is_translation_enabled':False,
         'profile_background_color':'000000',
         'profile_background_image_url':'http://abs.twimg.com/images/themes/theme14/bg.gif',
         'profile_background_image_url_https':'https://abs.twimg.com/images/themes/theme14/bg.gif',
         'profile_background_tile':False,
         'profile_image_url':'http://pbs.twimg.com/profile_images/1257467851184705542/Uw00CuX__normal.jpg',
         'profile_image_url_https':'https://pbs.twimg.com/profile_images/1257467851184705542/Uw00CuX__normal.jpg',
         'profile_banner_url':'https://pbs.twimg.com/profile_banners/2562363992/1598651481',
         'profile_link_color':'19CF86',
         'profile_sidebar_border_color':'000000',
         'profile_sidebar_fill_color':'000000',
         'profile_text_color':'000000',
         'profile_use_background_image':False,
         'has_extended_profile':True,
         'default_profile':False,
         'default_profile_image':False,
         'following':None,
         'follow_request_sent':None,
         'notifications':None,
         'translator_type':'none'
      },
      'geo':None,
      'coordinates':None,
      'place':None,
      'contributors':None,
      'is_quote_status':False,
      'retweet_count':9,
      'favorite_count':110,
      'favorited':False,
      'retweeted':False,
      'possibly_sensitive':False,
      'lang':'und'
   }
]
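
In the object above, `entities.media` lists a single photo while `extended_entities.media` lists both. A minimal sketch of reading the attachments from `extended_entities` first; the helper names are hypothetical, and whether pleroma-bot currently reads from `entities` is an assumption.

```python
def collect_media(tweet):
    """Return every media item of a v1.1 tweet object.

    Prefers `extended_entities`, which holds all attachments, and falls
    back to `entities`, which only holds the first one.
    """
    extended = (tweet.get("extended_entities") or {}).get("media")
    if extended:
        return list(extended)
    return list((tweet.get("entities") or {}).get("media") or [])


def media_urls(tweet):
    """Return the HTTPS URL of each attachment, in order."""
    return [item["media_url_https"] for item in collect_media(tweet)]
```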

KeyError while processing url_entity

Here is the log:

$ pleroma-bot 

                        `^y6gB@@BBQA{,
                      :fB@@@@@@BBBBBQgU"
                    `f@@@@@@@@BBBBQgg80H~
                    H@@B@BB@BBBB#Qgg&0RNT
                   z@@&B@BBBBBBQgg80RD6HK
                  ;@@@QB@BBBB#Qgg&0RN6WqS
                  q@@@@@BBBBQgg80RN6HAqSo          _             _
                 z@@@@BBBB#Qg8&0RN6WqSUhr         | |           | |
               -H@@@@BBBBQQg80RD6HAqSKh(       ___| |_ ___  _ __| | __
              rB@@@BBBB#6Lm00DN6WqSUhfv       / __| __/ _ \| '__| |/ /
             f@@@@BBBBf= |0RD6HAqSKhfv        \__ \ || (_) | |  |   <
           =g@@@BBBBF=  "RDN6WqSUhff{         |___/\__\___/|_|  |_|\_|
          c@@@@BBgu_   ~WD9HAqSKhfkl`
        _6@@@BBNr     'qN6WqSUhhfXI'     .                           .       .
       rB@@@B0r      `S6HAqSKhfkoCr  ,-. |  ,-. ,-. ,-. ,-,-. ,-.    |-. ,-. |-
     `X@@@BQx       `I6WASShhfXFIy_  | | |  |-' |   | | | | | ,-| -- | | | | |
    _g@@@Q\`        JHAqSKhfXoCwJz_  |-' `' `-' '   `-' ' ' ' `-^    `-' `-' `'
   rB@@#x`         }WASShhfXsIyzuu,  |
 `y@@&|          .IAqSKhfXoCwJzu1lr  '
`D@&|           :KqSUhffXsIyzuu1llc,
ff=            `==:::""",,,,________


ℹ 2021-06-04 15:00:48,547 - pleroma_bot - INFO - ====================================== 
ℹ 2021-06-04 15:00:48,547 - pleroma_bot - INFO - Processing user:       xxx 
ℹ 2021-06-04 15:00:50,202 - pleroma_bot - INFO - tweet count:    16 
Processing tweets...  |✖ 2021-06-04 15:00:50,508 - pleroma_bot - ERROR - Exception occurred (cli.py:482) 
Traceback (most recent call last):
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/pleroma_bot/_processing.py", line 228, in _expand_urls
    for url_entity in tweet["entities"]["urls"]:
KeyError: 'entities'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/pleroma_bot/cli.py", line 454, in main
    tweets_to_post = user.process_tweets(tweets)
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/pleroma_bot/_utils.py", line 93, in wrapper
    input_thread.join()
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/pleroma_bot/_utils.py", line 50, in join
    raise self.exc
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/pleroma_bot/_utils.py", line 38, in run
    self.ret = self._target(*self._args, **self._kwargs)
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/pleroma_bot/_utils.py", line 84, in <lambda>
    target=lambda q, *arg1, **kwarg1: q.put(fct(*arg1, **kwarg1)),
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/pleroma_bot/_processing.py", line 70, in process_tweets
    tweet["text"] = _expand_urls(self, tweet)
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/pleroma_bot/_processing.py", line 250, in _expand_urls
    response = session.head(match.group(), allow_redirects=True)
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/requests/sessions.py", line 577, in head
    return self.request('HEAD', url, **kwargs)
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/requests/sessions.py", line 528, in request
    prep = self.prepare_request(req)
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/requests/sessions.py", line 466, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/requests/models.py", line 316, in prepare
    self.prepare_url(url, params)
  File "/home/mstdn/bot/venv/lib/python3.7/site-packages/requests/models.py", line 390, in prepare_url
    raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL 'Twitch.tv/xxx': No schema supplied. Perhaps you meant http://Twitch.tv/xxx?
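
Both exceptions in this log could be avoided defensively. The sketch below, with hypothetical helper names, treats a missing `entities` key as "no URLs" and prepends a scheme to bare links such as `Twitch.tv/xxx` before they are requested. Defaulting to `http` follows the suggestion in the error message and is otherwise an assumption.

```python
def url_entities(tweet):
    """Return the tweet's URL entities, or an empty list if there are none."""
    return (tweet.get("entities") or {}).get("urls") or []


def ensure_scheme(url, default="http"):
    """Prepend a scheme to URLs that were matched without one."""
    if "://" in url:
        return url
    return f"{default}://{url}"
```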

Blocking

Implement blocking in such a way that only one instance can run or interact with I/O at any given time, so that it does not interfere with a previously launched instance which may still be running.
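
A minimal sketch of one way to do this with a lock file created atomically through `os.O_EXCL`. The names `single_instance` and `AlreadyRunning` and the lock file location are hypothetical. A lock left behind by a killed process would have to be cleaned up separately.

```python
import contextlib
import os


class AlreadyRunning(RuntimeError):
    """Raised when another instance already holds the lock file."""


@contextlib.contextmanager
def single_instance(lock_path):
    """Hold an exclusive lock file for the duration of the `with` block."""
    try:
        # O_EXCL makes the creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(f"lock file exists: {lock_path}") from None
    try:
        # Store the PID so a stale lock can be diagnosed by hand.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

The main routine would then be wrapped in `with single_instance(path):`, and a second invocation would exit early on `AlreadyRunning`.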

requests.exceptions.HTTPError: 422 Client Error: Unprocessable Entity for url:

For some reason, some Twitter accounts now result in an error, although they worked previously. The latest one is @Ultrekillblast, and before that it was the official @startrek account. I got this:

...
ℹ 2022-02-22 18:23:31,681 - pleroma_bot - INFO - tweets gathered: 27
ℹ 2022-02-22 18:23:34,148 - pleroma_bot - INFO - tweets to post: 27
ℹ 2022-02-22 18:23:34,150 - pleroma_bot - INFO - (1/27)
✖ 2022-02-22 18:23:34,402 - pleroma_bot - ERROR - Exception occurred (cli.py:502)
Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.8/site-packages/pleroma_bot/_pleroma.py", line 103, in post_pleroma
    response.raise_for_status()
  File "/home/mastodon/.local/lib/python3.8/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 422 Client Error: Unprocessable Entity for url: https://salocha.online/api/v1/media

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mastodon/.local/lib/python3.8/site-packages/pleroma_bot/cli.py", line 482, in main
    post_id = user.post_pleroma(
  File "/home/mastodon/.local/lib/python3.8/site-packages/pleroma_bot/_pleroma.py", line 117, in post_pleroma
    response.raise_for_status()
  File "/home/mastodon/.local/lib/python3.8/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 422 Client Error: Unprocessable Entity for url: https://salocha.online/api/v1/media
...

UnicodeEncodeError Logger exception with iso8859_2 encoding (non UTF-8)

I've got an error, what can I do? This is the error log:

2022-02-12 08:55:45,426 pleroma_bot ERROR: Exception occurred
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pleroma_bot/cli.py", line 363, in main
    config = yaml.safe_load(stream)
  File "/usr/local/lib/python3.8/dist-packages/yaml/__init__.py", line 125, in safe_load
    return load(stream, SafeLoader)
  File "/usr/local/lib/python3.8/dist-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "/usr/local/lib/python3.8/dist-packages/yaml/constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "/usr/local/lib/python3.8/dist-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/usr/local/lib/python3.8/dist-packages/yaml/composer.py", line 58, in compose_document
    self.get_event()
  File "/usr/local/lib/python3.8/dist-packages/yaml/parser.py", line 118, in get_event
    self.current_event = self.state()
  File "/usr/local/lib/python3.8/dist-packages/yaml/parser.py", line 193, in parse_document_end
    token = self.peek_token()
  File "/usr/local/lib/python3.8/dist-packages/yaml/scanner.py", line 129, in peek_token
    self.fetch_more_tokens()
  File "/usr/local/lib/python3.8/dist-packages/yaml/scanner.py", line 223, in fetch_more_tokens
    return self.fetch_value()
  File "/usr/local/lib/python3.8/dist-packages/yaml/scanner.py", line 577, in fetch_value
    raise ScannerError(None, None,
yaml.scanner.ScannerError: mapping values are not allowed here
  in "/usr/local/lib/python3.8/dist-packages/pleroma_bot/config.yml", line 2, column 17

Handle HTTP 429 error (rate limits)

2021-04-26 12:58:17,140 pleroma_bot ERROR: Exception occurred
Traceback (most recent call last):
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/cli.py", line 406, in main
    user.post_pleroma(
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/_pleroma.py", line 153, in post_pleroma
    response.raise_for_status()
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://mastodon.technology//api/v1/statuses

Maybe it would also be a good idea to raise the wait from 0.5 seconds to 2 seconds or so?
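
Beyond a longer fixed wait, the post could be retried with a growing delay. A minimal sketch, assuming the posting code raises a dedicated exception on HTTP 429; `RateLimited`, `call_with_backoff` and the default values are hypothetical.

```python
import time


class RateLimited(Exception):
    """Raised by the wrapped function when the server answers HTTP 429."""


def call_with_backoff(func, retries=4, base_delay=2.0, sleep=time.sleep):
    """Call `func`, doubling the wait after each rate-limited attempt."""
    delay = base_delay
    for attempt in range(retries + 1):
        try:
            return func()
        except RateLimited:
            if attempt == retries:
                # Out of retries: let the caller decide what to do.
                raise
            sleep(delay)
            delay *= 2
```

The `sleep` parameter is injectable so the behaviour can be tested without actually waiting.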

Exception if not first_time run but no previous post available

Thank you for this bot, I love how easy it is to configure!

I interrupted the first run, so I had no posts on the target mastodon account, and upon the second try I got:

Traceback (most recent call last):
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/cli.py", line 378, in main
    date_pleroma = user.get_date_last_pleroma_post()
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/pleroma_bot/_pleroma.py", line 35, in get_date_last_pleroma_post
    response.raise_for_status()
  File "/home/federico/git/federico/mastodon/lib64/python3.9/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://mastodon.technology//api/v1/accounts/<username>/statuses

I could easily work around this by using --forceDate, which is mentioned in https://robertoszek.github.io/pleroma-bot/gettingstarted/usage/#first-run , so no big deal, but maybe the date should fall back to the default?

Mirror Twitter profile metadata to Mastodon (like website)

Most Twitter accounts (?) link a website, for instance https://nitter.eu/peterliese links https://www.peter-liese.de/ .

There is no direct mapping in Mastodon, but I usually use one of the 4 metadata slots with a label "WWW" and put the website URL there.

The website is reachable through the nitter URLs, and most pleroma-bot users create the fediverse profile manually, so the main use case is someone who's mirroring multiple profiles, wants to give attribution by link to the source and doesn't want to rely on nitter for that.
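
The v1.1 user objects quoted in other issues on this page carry the website as a shortened `url` plus an expanded form under `entities.url.urls`. A minimal sketch of turning that into a metadata field; the function name, the "WWW" label default and the `{"name", "value"}` shape of the result are assumptions.

```python
def website_field(twitter_user, label="WWW"):
    """Build one metadata field from a v1.1 user object, or return None."""
    entities = twitter_user.get("entities") or {}
    urls = (entities.get("url") or {}).get("urls") or []
    if urls:
        # Prefer the expanded URL over the t.co short link.
        target = urls[0].get("expanded_url") or urls[0].get("url")
    else:
        target = twitter_user.get("url")
    if not target:
        return None
    return {"name": label, "value": target}
```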

Gets stuck on "Processing tweets..."

Sometimes this happens:

ℹ 2022-04-04 10:14:59,842 - pleroma_bot - INFO - ====================================== 
ℹ 2022-04-04 10:14:59,842 - pleroma_bot - INFO - Processing user:       108070211019592095 
ℹ 2022-04-04 10:14:59,842 - pleroma_bot - INFO - It seems like pleroma-bot is running for the first time for this Twitter user: linagalvezmunoz 
ℹ 2022-04-04 10:15:00,903 - pleroma_bot - INFO - How far back should we retrieve tweets from the Twitter account? 
ℹ 2022-04-04 10:15:00,903 - pleroma_bot - INFO - 
Enter a date (YYYY-MM-DD):
[Leave it empty to retrieve *ALL* tweets or enter 'continue'
if you want the bot to execute as normal (checking date of 
last post in the Fediverse account)]  
⚠ 2022-04-04 10:15:01,106 - pleroma_bot - WARNING - No posts were found in the target Fediverse account (_pleroma.py:43) 
ℹ 2022-04-04 10:15:01,106 - pleroma_bot - INFO - How far back should we retrieve tweets from the Twitter account? 
ℹ 2022-04-04 10:15:01,106 - pleroma_bot - INFO - 
Enter a date (YYYY-MM-DD):
[Leave it empty to retrieve *ALL* tweets or enter 'continue'
if you want the bot to execute as normal (checking date of 
last post in the Fediverse account)]  
ℹ 2022-04-04 10:15:02,614 - pleroma_bot - INFO - tweets gathered:        50 
Processing tweets...  \^C

It was stuck there for 45 minutes. Maybe a hard timeout is needed somewhere.

The error seems to be somehow user-dependent, in that I've seen it more than once for some users.
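
A minimal sketch of such a hard timeout, running the work in a daemon thread and giving up after a deadline. `run_with_timeout` is a hypothetical helper. Note that the worker thread is abandoned rather than killed, so this only stops the main loop from hanging.

```python
import threading


def run_with_timeout(func, timeout, *args, **kwargs):
    """Run `func` in a daemon thread; raise TimeoutError after `timeout` seconds."""
    outcome = {}

    def target():
        try:
            outcome["value"] = func(*args, **kwargs)
        except Exception as exc:  # re-raised in the calling thread below
            outcome["error"] = exc

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        raise TimeoutError(f"gave up after {timeout} seconds")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

If the hang comes from a network call, passing an explicit `timeout` to the `requests` call would be the more direct fix, since `requests` does not time out by default.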

Allow user to copy tweets ordered by date

Upon executing pleroma-bot to copy tweets with a certain hashtag from my twitter account, I noticed that the order in which the tweets are copied over is seemingly random (when no start date is specified).

For example, when copying tweets with the hashtag #tinkering, like these:
https://twitter.com/search?q=%40amoslightnin%20%23tinkering&src=typed_query&f=live

One of the first statuses copied over was this one:
https://masto.amosamos.net/notice/A9e0sWGhO6Da69j7rc

Which is from a tweet from 2018.
https://twitter.com/AmosLightnin/status/1020098097110806529?s=20

But there are tweets with the #tinkering hashtag from earlier years on my twitter timeline.

Again - this is fantastic software already, and I recognize that the user base this would be useful for may be fairly small. But then again, I think it's nice if we can provide ways for people to migrate their content from the big platforms onto their own little pleroma / activitypub instances, especially when they've used twitter as a repository for documentation, notes, or experiments, as I have.
Thanks again for making and sharing this software!
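
A minimal sketch of ordering tweets oldest first before posting, assuming tweet objects with a v1.1-style `created_at` string such as 'Thu Oct 08 17:15:04 +0000 2020'. The helper names are hypothetical, and API v2 timestamps would need a different format string.

```python
from datetime import datetime

# Format of `created_at` in Twitter API v1.1 objects.
TWITTER_V1_DATE = "%a %b %d %H:%M:%S %z %Y"


def tweet_datetime(tweet):
    """Parse the tweet's creation time into a timezone-aware datetime."""
    return datetime.strptime(tweet["created_at"], TWITTER_V1_DATE)


def oldest_first(tweets):
    """Return the tweets sorted chronologically, oldest first."""
    return sorted(tweets, key=tweet_datetime)
```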

Some tweets have their links or media skipped (unified cards)

Some fancy accounts seem to be using some Twitter feature which pleroma-bot doesn't support yet.

This is typically spotted in tweets which follow the trend of containing a mere "↓" as warning that the main content of the update is actually somewhere else, like this: https://respublicae.eu/@EU_Commission/108092396818818757 https://nitter.eu/EU_Commission/status/1512123194785898503 which is just a link to https://ec.europa.eu/commission/presscorner/detail/en/statement_22_2331 . These tweets look like just any other tweet whose main URL has been "eaten" by Twitter and shown only as attached "card", but they seem to be different.

Others are more complicated like https://respublicae.eu/@EU_Commission/108103776666586079 https://nitter.eu/EU_Commission/status/1512777762909655043 which contains a "broadcast": https://nitter.eu/i/broadcasts/1BRJjnyZoZdJw . I guess there isn't much to do about these, other than documenting it somewhere so that people make informed decisions about the nitter and signature configs.

Allow multiple 'twitter_username' values in config

Support ‘twitter_username’ being a list of Twitter accounts to mirror and consolidate all tweets into a single Fediverse account. It'd be nice if the tweets are posted based on date, not dumped from each Twitter user one after the other.

Don’t update the Pleroma profile if 'twitter_username' is a list, I guess.

Something along the lines of this:

users:
- twitter_username: 
  - TwitterUser1
  - TwitterUser2
  - TwitterUser3
  pleroma_username: NewsBot
  pleroma_token: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
- twitter_username: OnlyOneUser
  pleroma_username: MyUniqueAccount
  pleroma_token: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

This will necessitate the change of the behaviour of some loops or some of the User object attributes.
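
A minimal sketch of the config handling this implies: accept either a string or a list for `twitter_username`, skip the profile update for lists, and interleave the per-account timelines by date. All function names are hypothetical, and each timeline is assumed to be sorted oldest first already.

```python
import heapq


def as_username_list(value):
    """Normalise `twitter_username` (a string or a list) into a list."""
    if isinstance(value, str):
        return [value]
    return list(value)


def should_update_profile(value):
    """Only mirror profile info when exactly one Twitter account is set."""
    return len(as_username_list(value)) == 1


def merge_by_date(timelines, key):
    """Interleave several sorted timelines into one sorted list."""
    return list(heapq.merge(*timelines, key=key))
```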

Do not throw exception for suspended/deleted account

Arguably not a bug but user error:

User "SaferInternetEU" has been suspended

https://nitter.eu/SaferInternetEU

ℹ 2022-04-03 18:30:42,941 - pleroma_bot - INFO - It seems like pleroma-bot is running for the first time for this Twitter user: SaferInternetEU 
✖ 2022-04-03 18:30:43,993 - pleroma_bot - ERROR - Exception occurred (cli.py:539) 
Traceback (most recent call last):
  File "/home/federico/mw/pleroma-bot/res/lib64/python3.10/site-packages/pleroma_bot/cli.py", line 422, in main
    user = User(user_item, config, base_path)
  File "/home/federico/mw/pleroma-bot/res/lib64/python3.10/site-packages/pleroma_bot/cli.py", line 211, in __init__
    self._get_twitter_info()
  File "/home/federico/mw/pleroma-bot/res/lib64/python3.10/site-packages/pleroma_bot/_twitter.py", line 45, in _get_twitter_info
    user = json.loads(response.text)["data"]
KeyError: 'data'

I'd prefer to just log such errors and proceed. During setup I can remove the accounts that shouldn't have been there in the first place, but in the future if a single account gets renamed or suspended everything will fail.
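
A minimal sketch of logging and skipping instead of raising, assuming the API answers with an `errors` list whose items have a `detail` text when the account is unavailable. That payload shape and the helper name `parse_twitter_user` are assumptions.

```python
import json
import logging

logger = logging.getLogger("pleroma_bot")


def parse_twitter_user(response_text, username):
    """Return the user's data, or None (with a warning) if it is missing."""
    payload = json.loads(response_text)
    if "data" in payload:
        return payload["data"]
    details = "; ".join(
        err.get("detail", "unknown error") for err in payload.get("errors", [])
    )
    logger.warning(
        "Skipping %s: %s", username, details or "no 'data' in response"
    )
    return None
```

The caller would then continue with the next configured user whenever `None` is returned.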

Support for OAuth 1.0a (needed for protected accounts)

Hiyo!

First off apologies on not getting around to testing 0.7.0 until now (#28, #29). Things were working initially for the first run it seems, but I am now getting an error:

pleroma_bot - ERROR - Exception occurred
Traceback (most recent call last):
  File "/home/jimmy/.local/lib/python3.8/site-packages/pleroma_bot/cli.py", line 286, in main
    if tweets["meta"]["result_count"] > 0:
KeyError: 'meta'

It appears to be getting "stuck" at the same Twitter account during each run. Is there anything else I can provide to help troubleshoot?

Add archival options

Let the user provide the path to a previously created archive of tweets (zip/tar file) and post them in the target Fediverse account (in chronological order, oldest to newest).

Add some way to gather all tweets of an account and download them locally for archival purposes (zip/tar them up, for example)
Get the tweets from an official archive as an alternative.

If the token has the permissions for it (and the user wants to) allow deleting the last 3200 tweets from the account, so we can fetch the next batch of 3200 (keep in mind rate limits with Twitter's API) until we reach the oldest tweet.
(Verify that the /2/tweets endpoint does not include deleted tweets towards the total 3200 limit)
https://developer.twitter.com/en/docs/twitter-api/tweets/lookup/api-reference/get-tweets

Also, support deleting the tweets from an official archive as an input.
https://twitter.com/settings/your_twitter_data
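
A minimal sketch of the local archive round trip, storing one JSON file per tweet in a zip and reading them back oldest first. The `tweets/<id>.json` layout and the function names are hypothetical, and this is not the format of Twitter's official archive.

```python
import json
import zipfile


def write_archive(path, tweets):
    """Store each tweet as tweets/<id>.json inside a zip file."""
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for tweet in tweets:
            archive.writestr(f"tweets/{tweet['id']}.json", json.dumps(tweet))


def read_archive(path, key):
    """Load every archived tweet and return them sorted by `key`."""
    with zipfile.ZipFile(path) as archive:
        tweets = [
            json.loads(archive.read(name))
            for name in archive.namelist()
            if name.endswith(".json")
        ]
    return sorted(tweets, key=key)
```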

KeyErrors when updating from certain twitter accounts

Hi there,

On 1.0.0 and 1.0.1, I've recently seen that trying to get a profile update from some accounts seems to be resulting in a KeyError. I'm curious what could be triggering it.

For example:

ℹ 2022-02-14 10:29:19,739 - pleroma_bot - INFO - ======================================
ℹ 2022-02-14 10:29:19,739 - pleroma_bot - INFO - Processing user: kamithefishbot
ℹ 2022-02-14 10:29:21,733 - pleroma_bot - INFO - Current pinned: None
ℹ 2022-02-14 10:29:21,733 - pleroma_bot - INFO - Previous pinned: None
✖ 2022-02-14 10:29:21,893 - pleroma_bot - ERROR - Exception occurred (cli.py:502)
Traceback (most recent call last):
  File "/home/pleromabot/myvenv/lib/python3.9/site-packages/pleroma_bot/cli.py", line 498, in main
    user.update_pleroma()
  File "/home/pleromabot/myvenv/lib/python3.9/site-packages/pleroma_bot/_pleroma.py", line 189, in update_pleroma
    if self.profile_banner_url[t_user]:
KeyError: 'kamithefish'

"kamithefish" is the twitter account.

(myvenv) pleromabot@ded1 ~/myvenv $ python -V
Python 3.9.9

(myvenv) pleromabot@ded1 ~/myvenv $ pip list
Package            Version
------------------ ---------
certifi            2021.5.30
charset-normalizer 2.0.3
idna               3.2
oauthlib           3.1.1
pip                22.0.3
pleroma-bot        1.0.1
PyYAML             5.4.1
requests           2.26.0
requests-oauthlib  1.3.0
setuptools         56.0.0
urllib3            1.26.6

If I remove the problematic entries from the config, there's a bunch of others that are updating without issue.

The same issue isn't occurring on runs that have --noProfile

Make RTs optional

Right now every tweet, RT or not is always mirrored into the Fediverse account. Give the option to post RTs or not in the config file.
Not a very involved change.
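
A minimal sketch of the filter, assuming retweets can be recognised either by the `retweeted_status` key (API v1.1 objects) or by a `referenced_tweets` entry of type `retweeted` (API v2 objects). The `include_rts` option name is hypothetical.

```python
def is_retweet(tweet):
    """Tell whether a tweet object is a retweet (v1.1 or v2 shape)."""
    if "retweeted_status" in tweet:
        return True
    return any(
        ref.get("type") == "retweeted"
        for ref in tweet.get("referenced_tweets", [])
    )


def filter_tweets(tweets, include_rts=True):
    """Drop retweets unless the (hypothetical) config option allows them."""
    if include_rts:
        return list(tweets)
    return [tweet for tweet in tweets if not is_retweet(tweet)]
```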

Max upload size

Allow setting a Pleroma/Mastodon attachment size limit on the config.
If the downloaded file from Twitter exceeds this limit, drop it from the tweet (so the text is still posted, without this attachment) and send a warning to the log.

Maybe add some text to the post indicating there's a missing attachment? That seems a tad excessive and intrusive, idk.
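
A minimal sketch of the size check, assuming attachments have already been downloaded to local files. The function name and the idea of passing the limit in bytes are assumptions.

```python
import logging
import os

logger = logging.getLogger("pleroma_bot")


def within_size_limit(paths, max_bytes):
    """Return the attachment paths that fit the limit, warning about the rest."""
    kept = []
    for path in paths:
        size = os.path.getsize(path)
        if size > max_bytes:
            logger.warning(
                "Dropping attachment %s: %d bytes exceeds the limit of %d",
                path, size, max_bytes,
            )
            continue
        kept.append(path)
    return kept
```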

Error When Twitter Display Name Is 31+ Characters

When a Twitter user's display name is 31+ characters you get the following error:

pleroma_bot - ERROR - Exception occurred
Traceback (most recent call last):
  File "/home/jimmy/.local/lib/python3.8/site-packages/pleroma_bot/cli.py", line 229, in main
    user.update_pleroma()
  File "/home/jimmy/.local/lib/python3.8/site-packages/pleroma_bot/_pleroma.py", line 245, in update_pleroma
    response.raise_for_status()
  File "/home/jimmy/.local/lib/python3.8/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 422 Client Error: Unprocessable Entity for url: https://domain.com/api/v1/accounts/update_credentials

After some debugging I was able to work around this by updating:

https://github.com/robertoszek/pleroma-twitter-info-grabber/blob/83da9b65e1e4eac35609900ef165f4deb3e22e59/pleroma_bot/_pleroma.py#L196

to:

data = {"note": self.bio_text, "display_name": self.display_name[0:30]}

In my opinion it's less than ideal to just chop away data. Unfortunately, I believe the best fix would need to come from Mastodon, by raising the character limit on display_name. It looks like Twitter's display name limit is currently 50 characters.
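
Until then, a slightly gentler variant of the workaround could mark the truncation. A minimal sketch; `fit_display_name` is a hypothetical helper, and the default limit of 30 simply follows the slice used above.

```python
def fit_display_name(name, limit=30, marker="..."):
    """Shorten `name` to at most `limit` characters, marking the cut."""
    if len(name) <= limit:
        return name
    # Leave room for the marker so the result never exceeds `limit`.
    return name[: limit - len(marker)].rstrip() + marker
```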

Migrate remaining endpoints using v1.1 to v2

Twitter introduced a new tier of access (Elevated) to their API projects and although existing projects were promoted automatically, new users by default will get Essential access instead, which is not allowed to make requests to API v1.1.

Users can apply for Elevated access here, but it should not be a hard requirement.

We should move the little that's left of v1.1 to v2 and once fully migrated, remove the redundant twitter_base_url mapping from the config samples (and continue to use twitter_base_url_v2 internally, so we don't break old config.yml files created by users before these changes).

Endpoints for which we still use v1.1 as of the time of writing:

https://developer.twitter.com/en/docs/twitter-api/migrate/data-formats/standard-v1-1-to-v2

https://developer.twitter.com/en/docs/twitter-api/getting-started/about-twitter-api#v2-access-level
https://blog.twitter.com/developer/en_us/topics/tools/2021/build-whats-next-with-the-new-twitter-developer-platform

Grab tweets from before bot was initially started

Hello! I was wondering, is there any possibility of an option for the bot to mirror over the tweets (as a one-time thing) from before it was run the first time? This way the Pleroma profile could be a complete recreation of the Twitter profile it's mirroring.

HTTPError: 401 Client Error: Unauthorized for url: https://api.twitter.com/2/users/by/username/retroscifiart?user.fields=pinned_tweet_id

I recently changed my fediverse instance from Mastodon to Pleroma (soapboxfe). I tried to install pleroma-bot and started with one bot, but got this error:
...
ℹ 2022-04-08 09:38:23,560 - pleroma_bot - INFO - config path: /root/config.yml
ℹ 2022-04-08 09:38:23,561 - pleroma_bot - INFO - tweets temp folder: /root/tweets
ℹ 2022-04-08 09:38:23,572 - pleroma_bot - INFO - ======================================
ℹ 2022-04-08 09:38:23,572 - pleroma_bot - INFO - Processing user: Retroscifiart
ℹ 2022-04-08 09:38:23,573 - pleroma_bot - INFO - It seems like pleroma-bot is running for the first time for this Twitter user: retroscifiart
✖ 2022-04-08 09:38:23,858 - pleroma_bot - ERROR - Exception occurred (cli.py:536)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/pleroma_bot/cli.py", line 419, in main
user = User(user_item, config, base_path)
File "/usr/local/lib/python3.6/dist-packages/pleroma_bot/cli.py", line 189, in __init__
self.pinned_tweet_id = self._get_pinned_tweet_id()
File "/usr/local/lib/python3.6/dist-packages/pleroma_bot/_pin.py", line 239, in _get_pinned_tweet_id
response.raise_for_status()
File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 960, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://api.twitter.com/2/users/by/username/retroscifiart?user.fields=pinned_tweet_id&expansions=pinned_tweet_id&tweet.fields=entities
[1]+ Exit 127 requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://api.twitter.com/2/users/by/username/retroscifiart?user.fields=pinned_tweet_id
...

More config options for the user

It would be nice to have options for the scope and sensitivity of the statuses for a user. I could implement this in a PR if you like.

ps. I am also happy to help with other issues & TODOs

Animated gifs cause an exception

When a tweet includes an attachment of type 'animated_gif' it causes an exception.

Traceback (most recent call last):
  File "pleroma-twitter-info-grabber/updateInfoPleroma.py", line 710, in <module>
    main()
  File "pleroma-twitter-info-grabber/updateInfoPleroma.py", line 673, in main
    tweets_to_post = user.process_tweets(tweets_to_post)
  File "pleroma-twitter-info-grabber/updateInfoPleroma.py", line 372, in process_tweets
    media_url = item['url']
KeyError: 'url'
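A defensive lookup along these lines would avoid the KeyError. It is a sketch that assumes media items are dicts in which only some types carry a `url` key, with `preview_image_url` as a possible fallback (as in Twitter API v2 media objects); `pick_media_url` is a hypothetical helper, not a function from the project:

```python
def pick_media_url(item):
    """Return a downloadable URL for a media item, or None if absent.

    Assumes photos expose "url" while other types (such as
    "animated_gif") may only expose "preview_image_url".
    """
    url = item.get("url")
    if url is None:
        # Fall back to the preview image instead of raising KeyError.
        url = item.get("preview_image_url")
    return url
```

Note that the fallback yields a still preview rather than the animation itself, so the caller still has to decide whether to skip the attachment or fetch the real media some other way.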

Do not ask to enter date when --forceDate is provided

I have to populate a number of accounts for the first time, so it's annoying to be required to enter input manually like this:

ℹ 2022-04-03 13:35:08,271 - pleroma_bot - INFO - ====================================== 
ℹ 2022-04-03 13:35:08,271 - pleroma_bot - INFO - Processing user:       108067315577208905 
ℹ 2022-04-03 13:35:08,272 - pleroma_bot - INFO - It seems like pleroma-bot is running for the first time for this Twitter user: eucopresident 
ℹ 2022-04-03 13:35:10,628 - pleroma_bot - INFO - How far back should we retrieve tweets from the Twitter account? 
ℹ 2022-04-03 13:35:10,629 - pleroma_bot - INFO - 
Enter a date (YYYY-MM-DD):
[Leave it empty to retrieve *ALL* tweets or enter 'continue'
if you want the bot to execute as normal (checking date of 
last post in the Fediverse account)]

Am I supposed to use --forceDate together with --skipChecks the first time? Currently I'm using a silly patch like this in _utils.py after logger.info(date_msg):

    # input_date = input()
    if True:

I don't quite understand what cli.py is trying to do here:

            if (
                (args.forceDate and args.forceDate in user.twitter_username)
                or args.forceDate == "all"
                or user.first_time
            ) and not args.skipChecks:
                date_pleroma = user.force_date()
            else:
                if user.instance == "misskey":  # pragma
                    date_pleroma = user.get_date_last_misskey_post()
                else:
                    date_pleroma = user.get_date_last_pleroma_post()

It feels like date_pleroma is never set to args.forceDate.
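Read in isolation, the quoted condition can be restated as a small predicate. This is only a restatement of the snippet above for illustration, with hypothetical function and parameter names; it is not the project's code:

```python
def should_prompt_for_date(force_date, twitter_username, first_time, skip_checks):
    """Mirror the quoted condition: True when force_date() would be called.

    force_date stands in for args.forceDate, skip_checks for
    args.skipChecks, and first_time for user.first_time.
    """
    wants_prompt = (
        (bool(force_date) and force_date in twitter_username)
        or force_date == "all"
        or first_time
    )
    return wants_prompt and not skip_checks
```

In this reading, the `forceDate` value is compared against the username or the literal `"all"`, which suggests it selects which users get the date prompt rather than carrying a date itself. A first run therefore prompts regardless of `forceDate`, and only `skipChecks` suppresses it.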

Expand URLs in profile bio too

Currently the URLs in profile bios are not expanded, for instance https://respublicae.eu/@peterliese contains

Impressum: https://t.co/gOXDmFJjsx

which is a link to https://peter-liese.de/index.php/impressum . I think the reason for this is that URLs in bios count against the character limit. So I guess there are two possibilities if you want to change this:

  • expand the URL but keep the t.co short URL if it wouldn't otherwise fit in the space;
  • support URL shortening via kutt.it (https://pypi.org/project/kutt/ ), add a configuration variable to pick a kutt instance.
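The first option could be sketched as below. It assumes the bio's URL entities are available as a list of dicts with `url` and `expanded_url` keys (the shape of Twitter's `entities.description.urls`), and that the instance's bio limit is passed in as `max_len`; both the function name and that parameter are illustrative:

```python
def expand_bio_urls(bio, urls, max_len):
    """Replace t.co links with their expanded form when the result fits.

    Each entry in urls is expected to carry "url" (the short link) and
    "expanded_url". A link is left short if expanding it would push the
    bio past max_len characters.
    """
    for entry in urls:
        candidate = bio.replace(entry["url"], entry["expanded_url"])
        if len(candidate) <= max_len:
            bio = candidate
    return bio
```

For example, with the bio above and a generous limit:

```python
urls = [{
    "url": "https://t.co/gOXDmFJjsx",
    "expanded_url": "https://peter-liese.de/index.php/impressum",
}]
print(expand_bio_urls("Impressum: https://t.co/gOXDmFJjsx", urls, max_len=500))
# Impressum: https://peter-liese.de/index.php/impressum
```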

Support for polls

Create a poll in Pleroma/Fediverse account with the same values as the original tweet.

Twitter v2 essential api errors

Hi, I am just setting up this bot for the first time.
When I run the bot it gives me an error:

✖ 2021-12-10 15:11:38,072 - pleroma_bot - ERROR - Exception occurred (cli.py:423) 
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pleroma_bot/cli.py", line 303, in main
    for user_item in user_dict[:]:
TypeError: unhashable type: 'slice'

I installed it using pip3 on Ubuntu 20.04.
My config looks like this:

twitter_base_url: https://api.twitter.com/2
pleroma_base_url: https:///bae.st
max_tweets: 40
twitter_token: mytoken
users:
  pleroma_token: mytoken
  consumer_key: mykey
  consumer_secret: mysecret
  access_token_key: myaccess
  access_token_secret: myaccesssecret
  signature: false
  media_upload: true
  nitter: false
  rich_text: true
  visibility: "public"
  sensitive: true
  original_date: false
  original_date_format: "%Y/%m/%d %H:%"
  include_rts: true
  include_replies: false
  file_max_size: 500MB

Please let me know what I should do or how I can help.
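One likely reading of the traceback: in the config above, the entries under `users:` have no leading `- `, so YAML would parse `users` as a single mapping rather than a list, and slicing a mapping with `[:]` fails. A tolerant loader could normalise the value first. The sketch below uses a hypothetical helper name, `normalize_users`, and is not the project's code:

```python
def normalize_users(users):
    """Return the users config value as a list of per-user dicts.

    A single mapping (users written without a leading "- " in YAML) is
    wrapped in a list; a missing value becomes an empty list.
    """
    if users is None:
        return []
    if isinstance(users, dict):
        return [users]
    return list(users)
```

On the config side, the equivalent fix would be to start each user's block under `users:` with `- `, so that the value is a list.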

Enable targeting of particular tweets by ID or hashtag

This project seems immensely useful for people who are transitioning from twitter to the fediverse - or want to maintain presences on both.
I wonder if it could be adapted to enable the transfer of targeted tweets from twitter to an account on the fediverse. The use case I'm imagining is my own: I have been documenting tinkering projects with brief videos shared to twitter, most of which have the hashtag #tinkering . I would like to be able to move just those tweets and associated media over to my pleroma account, which would then be a repository of tinkering projects in the past, as well as going forward.

This seems to have nearly the functionality required to do this except it's not feasible to target tweets with particular hashtags or IDs.
Apologies if this is out of scope. I just thought I would see if such functionality was interesting as it seems only a step or two away from the current implementation, and could be achieved by adding a check against hashtags in the tweets.
Thanks for your attention, and for sharing this tool.
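The hashtag check mentioned above might look something like this. It assumes tweets are dicts in the Twitter API v2 shape, where hashtags appear under `entities.hashtags` with a `tag` field; `filter_by_hashtags` is a made-up name for illustration:

```python
def filter_by_hashtags(tweets, wanted):
    """Keep only tweets that carry at least one of the wanted hashtags.

    Comparison is case-insensitive and a leading "#" in wanted tags is
    ignored. Tweets without hashtag entities are dropped.
    """
    wanted_lower = {tag.lstrip("#").lower() for tag in wanted}
    kept = []
    for tweet in tweets:
        tags = tweet.get("entities", {}).get("hashtags", [])
        found = {t["tag"].lower() for t in tags}
        if found & wanted_lower:
            kept.append(tweet)
    return kept
```

Targeting by ID would be a similar membership test against the tweet's `id` field.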
