
finance-dl's Introduction

Python package for scraping personal financial data from financial institutions.

This package may be useful on its own, but is specifically designed to be used with beancount-import.

Supported data sources

Setup

To install the most recent published package from PyPI, run:

pip install finance-dl

To install from a clone of the repository, type:

pip install .

or for development:

pip install -e .

Configuration

Create a configuration file called something like finance_dl_config.py. For a complete example of this file and some documentation, see example_finance_dl_config.py.

Refer to the documentation of the individual scraper modules for further details.

Basic Usage

You can run a scraping configuration named myconfig as follows:

python -m finance_dl.cli --config-module example_finance_dl_config --config myconfig

The configuration myconfig refers to a function named CONFIG_myconfig in the configuration module.
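
As a rough illustration, a configuration module might look like the sketch below. The `module`, `credentials`, and `output_directory` keys follow the pattern of the Amazon configuration quoted in the issues further down this page. The base directory and the shape of the placeholder credentials are assumptions for illustration only; see example_finance_dl_config.py for the authoritative format.

```python
import os

# Hypothetical base directory for downloaded data; adjust to your setup.
data_dir = os.path.join(os.path.expanduser("~"), "finance-data")


def CONFIG_myconfig():
    """Selected by passing `--config myconfig` on the command line."""
    return dict(
        # Scraper module to run; finance_dl.amazon is used only as an example.
        module="finance_dl.amazon",
        # Placeholder credentials (assumed shape); the keys each scraper
        # expects are documented in that scraper's module.
        credentials={"username": "XXXXXX", "password": "XXXXXX"},
        # Directory where the scraper writes its downloaded files.
        output_directory=os.path.join(data_dir, "amazon"),
    )
```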

Make sure that your configuration module is accessible in your Python sys.path. Since sys.path includes the current directory by default, you can simply run this command from the directory that contains your configuration module.
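
If the command fails because the configuration module cannot be found, a quick check like the following can confirm whether Python can locate it. This is a small standalone sketch, not part of finance-dl; the helper name `config_module_is_importable` is made up for this example.

```python
import importlib.util


def config_module_is_importable(name):
    """Return True if the top-level module `name` is found on sys.path."""
    return importlib.util.find_spec(name) is not None
```

For example, running `config_module_is_importable("example_finance_dl_config")` from the directory that contains the file should return True.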

By default, the scrapers run fully automatically, and the ones based on selenium and chromedriver run in headless mode. If the initial attempt for a selenium-based scraper fails, it is automatically retried with the browser window visible. This allows you to manually complete the login process and enter any multi-factor authentication code that is required.

To debug a scraper, you can run it in interactive mode by specifying the -i command-line argument. This runs an interactive IPython shell that lets you manually invoke parts of the scraping process.

Automatic Usage

To run multiple configurations at once, and keep track of when each configuration was last updated, you can use the finance_dl.update tool.

To display the update status, first create a logs directory and run:

python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs status

Initially, this will indicate that none of the configurations have been updated. To update a single configuration myconfig, run:

python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs update myconfig

With a single configuration specified, this does the same thing as the finance_dl.cli tool, except that the log messages are written to logs/myconfig.txt and a logs/myconfig.lastupdate file is created if it is successful.
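
The status subcommand shown earlier is the supported way to inspect update state. Purely as an illustration of the log directory layout described above, the sketch below lists the configurations that have a .lastupdate file. It assumes the file's modification time is a reasonable proxy for the last successful update; the contents of the file are not described here, so the sketch does not read them. The function name is hypothetical.

```python
from datetime import datetime
from pathlib import Path


def last_update_times(log_dir):
    """Map each configuration name to the mtime of its .lastupdate file."""
    result = {}
    for marker in sorted(Path(log_dir).glob("*.lastupdate")):
        # marker.stem is the file name without the .lastupdate suffix.
        result[marker.stem] = datetime.fromtimestamp(marker.stat().st_mtime)
    return result
```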

If multiple configurations are specified, as in:

python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs update myconfig1 myconfig2

then all specified configurations are run in parallel.

To update all configurations, run:

python -m finance_dl.update --config-module example_finance_dl_config --log-dir logs update --all

Note on Chromedriver Versioning

Chromedriver and Chrome are very tightly coupled; their versions need to match. finance_dl uses Chromedriver from the chromedriver_binary Python package (not your system's installed Chromedriver binary). However, Chromedriver, by default, uses your system's installed version of Chrome. Depending on how you manage the two installations on your system, this combination can cause finance_dl to fail with messages like:

selenium.common.exceptions.SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 97
Current browser version is 96.0.4664.45 with binary path /usr/bin/google-chrome

In this event, you have a few options:

  1. Explicitly manage your version of the chromedriver_binary Python package to match your installed version of Chrome;
  2. Explicitly manage your installed version of Chrome to match your version of the chromedriver_binary Python package; or
  3. Install the version of Chrome matching your version of chromedriver_binary somewhere other than your system's default Chrome version, and set the environment variable CHROMEDRIVER_CHROME_BINARY to point to it. (You can do this from within your finance_dl config script, e.g. with a line like os.environ["CHROMEDRIVER_CHROME_BINARY"] = "/usr/bin/google-chrome-beta").
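
Whichever option you choose, the underlying check is that the major versions agree. The sketch below shows that comparison on plain version strings. How you obtain the strings is up to you (for example, from the output of `google-chrome --version` on Linux and from the installed chromedriver_binary package version); the helper names are made up for this example and are not part of finance_dl.

```python
import re


def major_version(version_string):
    """Extract the leading major version, e.g. '96.0.4664.45' -> 96."""
    match = re.search(r"(\d+)(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"no version number found in {version_string!r}")
    return int(match.group(1))


def versions_match(chrome_version, chromedriver_version):
    """Return True if Chrome and Chromedriver share a major version."""
    return major_version(chrome_version) == major_version(chromedriver_version)
```

With the versions from the error message above, a Chrome at 96.0.4664.45 and a Chromedriver built for version 97 do not match.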

License

Copyright (C) 2014-2018 Jeremy Maitin-Shepard.

Distributed under the GNU General Public License, Version 2.0 only. See LICENSE file for details.

finance-dl's People

Contributors

arition, bayesianmind, carljm, chandler150, filosottile, gateswong, inderpreet99, jbms, jktomer, karlicoss, moritzj29, nlydv, philipsd6, rkok, skoster, tbm, witten, xentac, zburatorul

finance-dl's Issues

Selenium: Can not connect to the Service finance-dl-chromedriver-wrapper

Platform: Windows 10
Selenium Version: 4.2.0

When running finance_dl.cli the program crashes with the below error.

(.venv) PS C:\Users\user\source\repos\beancount> python -m finance_dl.cli --config-module finance_dl_config --config paypal -i
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\finance_dl\cli.py", line 94, in <module>     
    main()
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\finance_dl\cli.py", line 85, in main
    with interactive_func(**spec) as ns:
  File "C:\Python311\Lib\contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\finance_dl\scrape_lib.py", line 435, in interact_with_scraper
    with temp_scraper(scraper_class, **kwargs) as scraper:
  File "C:\Python311\Lib\contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\finance_dl\scrape_lib.py", line 390, in temp_scraper
    scraper = scraper_type(*args, download_dir=download_dir,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\finance_dl\paypal.py", line 124, in __init__ 
    super().__init__(use_seleniumrequests=True, **kwargs)
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\finance_dl\scrape_lib.py", line 176, in __init__
    self.driver = driver_class(
                  ^^^^^^^^^^^^^
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\seleniumrequests\request.py", line 144, in __init__
    super(RequestsSessionMixin, self).__init__(*args, **kwargs)
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\selenium\webdriver\chrome\webdriver.py", line 70, in __init__
    super(WebDriver, self).__init__(DesiredCapabilities.CHROME['browserName'], "goog",
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\selenium\webdriver\chromium\webdriver.py", line 89, in __init__
    self.service.start()
  File "C:\Users\user\source\repos\beancount\.venv\Lib\site-packages\selenium\webdriver\common\service.py", line 105, in start
    raise WebDriverException("Can not connect to the Service %s" % self.path)
selenium.common.exceptions.WebDriverException: Message: Can not connect to the Service finance-dl-chromedriver-wrapper

Error on OFX account with no transactions

Background

I opened a new account at an institution but haven't funded it yet, so it has no transactions whatsoever. I'm running finance-dl anyway, because I have other accounts at the same institution whose transactions I'd like to fetch. When I run finance-dl to fetch transactions from this institution via OFX, I get an error:

$ python3 -m finance_dl.update --config-module finance_dl_config --log-dir logs update institution --force

Actual behavior

Here's the tail end of the traceback I get:

...
[0/1] institution [41s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/ofx.py", line 187, in get_earliest_data
[0/1] institution [41s elapsed]     account.number)
[0/1] institution [41s elapsed] RuntimeError: Failed to retrieve any data for account: 12345678
[1/1] institution [41s elapsed] FAILED with return code 1

Expected behavior

Maybe logging a warning that there are no transactions for that account, and merrily continuing onward to the other accounts. I'm open to ideas about what the right behavior here is!
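
One way to picture the suggested behavior is the warn-and-continue loop below. This is a hypothetical sketch, not finance-dl's actual OFX code: `fetch_account` stands in for whatever per-account download routine raises the RuntimeError shown above.

```python
import logging

logger = logging.getLogger("ofx_sketch")


def fetch_all_accounts(accounts, fetch_account):
    """Call fetch_account on each account, logging and skipping failures.

    Returns the list of accounts that were skipped.
    """
    skipped = []
    for account in accounts:
        try:
            fetch_account(account)
        except RuntimeError as exc:
            # An account without data no longer aborts the whole run.
            logger.warning("Skipping account %s: %s", account, exc)
            skipped.append(account)
    return skipped
```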

Problem with latest selenium-python lib.

I recently set up an environment that has "selenium 4.4.3" installed, pulled in as a direct or indirect dependency of finance-dl. This version of selenium-python includes the following recent change, which breaks all find_element_by_* API calls:

baijum/selenium-python@3b13c2f

Causes:

  File "somepath\.venv\lib\site-packages\finance_dl\scrape_lib.py", line 222, in wait_for_page_load
    old_page = self.driver.find_element_by_tag_name('html')
AttributeError: 'WebDriver' object has no attribute 'find_element_by_tag_name'
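
In Selenium 4 the replacement for the removed find_element_by_* helpers is the generic find_element(by, value) call, usually written as driver.find_element(By.TAG_NAME, 'html'). The sketch below shows the equivalent lookup. To stay runnable without Selenium installed, it spells out the string value of By.TAG_NAME instead of importing it, and the helper name is made up for this example; it is not the patch that was applied to finance-dl.

```python
# Value of selenium.webdriver.common.by.By.TAG_NAME, written out so that
# this sketch runs without Selenium installed.
TAG_NAME = "tag name"


def find_html_element(driver):
    """Locate the <html> element using the generic find_element call."""
    return driver.find_element(TAG_NAME, "html")
```

Because find_element(by, value) is also available in older Selenium releases, this form does not depend on which of the two versions is installed.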

Mint module does not retry in non-headless state to enter MFA

Hi,

When using the mint module to download data from Mint, it never opens a non-headless browser in which I can enter my MFA information.

In the log below, it says Retrying login interactively, but it never opens a browser in which I can select and enter my MFA.

--connect=http://127.0.0.1:49760 --session-id=e915dc4a973f92fe277f10bb69476134
 --connect=http://127.0.0.1:49774 --session-id=b718b621b3cdd23b9130ad83d8bf5042
2019-11-29 21:25:04,884 mint.py:119 [INFO] Logging into mint
2019-11-29 21:25:10,855 mint.py:123 [INFO] Waiting to enter username and password
2019-11-29 21:25:10,904 mint.py:127 [INFO] Entering username and password
2019-11-29 21:25:11,134 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:12,206 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:13,212 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:14,218 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:15,227 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:16,234 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:17,242 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:18,251 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:19,259 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:20,267 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:21,276 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:22,290 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:23,297 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:24,305 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:25,310 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:26,319 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:27,328 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:28,335 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:29,342 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:30,348 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:31,354 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:32,360 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:33,365 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:34,372 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:35,378 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:36,382 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:37,390 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:38,395 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:39,401 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:40,407 mint.py:135 [INFO] Waiting for MFA
Traceback (most recent call last):
  File "/Users/jkf/accounting/.venv/lib/python3.7/site-packages/finance_dl/mint.py", line 182, in connect
    try_login(scraper)
  File "/Users/jkf/accounting/.venv/lib/python3.7/site-packages/finance_dl/mint.py", line 173, in try_login
    scraper.login()
  File "/Users/jkf/accounting/.venv/lib/python3.7/site-packages/finance_dl/mint.py", line 142, in login
    raise TimeoutError("Login failed to complete within timeout")
TimeoutError: Login failed to complete within timeout
2019-11-29 21:25:41,427 mint.py:195 [INFO] Retrying login interactively
 --connect=http://127.0.0.1:49807 --session-id=c89a622843c182d121d10f26afbc5633
 --connect=http://127.0.0.1:49821 --session-id=9432d53cc738ac098854582993c0c5de
2019-11-29 21:25:43,960 mint.py:119 [INFO] Logging into mint
2019-11-29 21:25:49,641 mint.py:123 [INFO] Waiting to enter username and password
2019-11-29 21:25:49,700 mint.py:127 [INFO] Entering username and password
2019-11-29 21:25:50,004 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:51,014 mint.py:135 [INFO] Waiting for MFA
2019-11-29 21:25:52,018 mint.py:135 [INFO] Waiting for MFA
...

Amazon module is failing after reaching 2018 in my order history

My amazon CONFIG:

def CONFIG_amazon():
    return dict(
        module='finance_dl.amazon',
        credentials=amazon.credentials,
        output_directory=os.path.join(data_dir, 'amazon'),
        profile_dir=os.path.join(profile_dir, 'amazon'),
    )

After executing the cli:

python -m finance_dl.cli --config-module finance_dl_config --config amazon --log=INFO

The output shows:

 --connect=http://127.0.0.1:57275 --session-id=93c41049e6af86c2d7d455703aed1a63
2019-04-27 15:50:33,070 amazon.py:102 [INFO] Initiating log in                                                                        
2019-04-27 15:50:35,883 amazon.py:109 [INFO] You must be already logged in!
2019-04-27 15:50:38,120 amazon.py:206 [INFO] Retrieving order group: 'last 30 days'                                                   
2019-04-27 15:50:40,159 amazon.py:185 [INFO] Found no more pages
2019-04-27 15:50:40,309 amazon.py:206 [INFO] Retrieving order group: 'past 6 months'                                                  
2019-04-27 15:50:42,365 amazon.py:185 [INFO] Found no more pages  
2019-04-27 15:50:42,536 amazon.py:206 [INFO] Retrieving order group: '2019'                                                           
2019-04-27 15:50:44,433 amazon.py:185 [INFO] Found no more pages
2019-04-27 15:50:44,529 amazon.py:206 [INFO] Retrieving order group: '2018'                                                           
Traceback (most recent call last):                                
  File "/home/philipsd6/devel/finance-dl/finance_dl/scrape_lib.py", line 416, in retry                                                  
    return func()                          
  File "/home/philipsd6/devel/finance-dl/finance_dl/scrape_lib.py", line 436, in fetch                                                  
    scraper.run()                                                                                                                     
  File "/home/philipsd6/devel/finance-dl/finance_dl/amazon.py", line 265, in run                                                        
    self.get_orders(regular=self.regular, digital=self.digital)
  File "/home/philipsd6/devel/finance-dl/finance_dl/amazon.py", line 224, in get_orders
    retrieve_all_order_groups()                                                                                                       
  File "/home/philipsd6/devel/finance-dl/finance_dl/amazon.py", line 209, in retrieve_all_order_groups
    get_invoice_urls()                                                                                                                
  File "/home/philipsd6/devel/finance-dl/finance_dl/amazon.py", line 159, in get_invoice_urls                                           
    invoices, = self.wait_and_return(invoice_finder)
  File "/home/philipsd6/devel/finance-dl/finance_dl/scrape_lib.py", line 252, in wait_and_return                                        
    WebDriverWait(self.driver, timeout).until(predicate, message=message)
  File "/home/philipsd6/.local/venvs/beancount/lib/python3.6/site-packages/selenium/webdriver/support/wait.py", line 80, in until       
    raise TimeoutException(message, screen, stacktrace) 
selenium.common.exceptions.TimeoutException: Message: Waiting to match conditions                                                     

This module used to work for me in the past when I first set it up, but now it's failing in 2018.

For "last 30 days", "past 6 months", and "2019" I have < 10 orders and no pagination, but when it gets to "2018" I have > 50 orders, and > 5 pages of orders. This module navigates to the second page of orders for "2018" but then it times out.

The call that is polling until it times out is this:

2019-04-27 15:49:40,011 remote_connection.py:388 [DEBUG] POST http://127.0.0.1:37509/session/9a048830f7858ba83482d1a0f641a2bd/elements
{"using": "xpath", "value": "//a[contains(@href, \"summary/print.html\")]", "sessionId": "9a048830f7858ba83482d1a0f641a2bd"}          
2019-04-27 15:49:40,053 connectionpool.py:396 [DEBUG] http://127.0.0.1:37509 "POST /session/9a048830f7858ba83482d1a0f641a2bd/elements HTTP/1.1" 200 70

Amazon(.de) not downloading orders

I'm trying to download amazon(.de) orders, but nothing gets downloaded, and I'm not sure what condition the scraper was waiting on.

I also had to change orderFilter to timeFilter here to get it to scrape the orders:
https://github.com/jbms/finance-dl/blob/4b8e28a29b8f0faf5ab3457b5cded2079e73f3fd/finance_dl/amazon.py#L449C24-L449C24

DevTools listening on ws://127.0.0.1:64033/devtools/browser/d936fdb9-bceb-4579-bd14-74d94bfbbda2
 --connect=http://localhost:64028 --session-id=c1a93491b891b70821a27455a8cb60f8
2023-10-29 21:31:05,228 amazon.py:277 [INFO] Initiating log in
2023-10-29 21:31:05,972 amazon.py:284 [INFO] You must be already logged in!
2023-10-29 21:31:06,464 amazon.py:464 [INFO] Retrieving order group: 'den letzten 30 Tagen'
2023-10-29 21:31:06,871 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:31:06,899 amazon.py:464 [INFO] Retrieving order group: 'den letzten 3 Monaten'
2023-10-29 21:31:07,272 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:31:07,306 amazon.py:464 [INFO] Retrieving order group: '2023'
[28972:31576:1029/213110.343:ERROR:device_event_log_impl.cc(225)] [21:31:10.343] USB: usb_service_win.cc:415 Could not read device interface GUIDs: Het systeem kan het opgegeven bestand niet vinden. (0x2)
[28972:31576:1029/213110.344:ERROR:device_event_log_impl.cc(225)] [21:31:10.343] USB: usb_service_win.cc:104 SetupDiGetDeviceProperty({{A45C254E-DF1C-4EFD-8020-67D146A850E0}, 6}) failed: Kan element niet vinden. (0x490)
2023-10-29 21:31:10,385 amazon.py:427 [INFO] Found order '305-6007193-8432350'
2023-10-29 21:31:11,514 amazon.py:427 [INFO] Found order '305-1364969-2322728'
2023-10-29 21:31:12,697 amazon.py:427 [INFO] Found order '305-7507294-1425151'
2023-10-29 21:31:12,701 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:31:12,725 amazon.py:464 [INFO] Retrieving order group: '2022'
2023-10-29 21:31:14,640 amazon.py:427 [INFO] Found order '304-7286585-0240307'
2023-10-29 21:31:16,847 amazon.py:427 [INFO] Found order '304-0532664-5059518'
2023-10-29 21:31:18,000 amazon.py:427 [INFO] Found order '304-9561231-4685903'
2023-10-29 21:31:19,150 amazon.py:427 [INFO] Found order '304-1342413-3851547'
2023-10-29 21:31:20,295 amazon.py:427 [INFO] Found order '304-4127407-3957141'
2023-10-29 21:31:21,428 amazon.py:427 [INFO] Found order '304-9986951-1473951'
2023-10-29 21:31:22,585 amazon.py:427 [INFO] Found order '304-1950788-6290723'
2023-10-29 21:31:23,732 amazon.py:427 [INFO] Found order '304-7040015-3071518'
2023-10-29 21:31:24,874 amazon.py:427 [INFO] Found order '304-6516850-4829932'
2023-10-29 21:31:26,051 amazon.py:427 [INFO] Found order '304-5372088-9945954'
2023-10-29 21:31:26,069 amazon.py:441 [INFO] Next page.
2023-10-29 21:31:28,034 amazon.py:427 [INFO] Found order '304-7060525-9372365'
2023-10-29 21:31:29,158 amazon.py:427 [INFO] Found order '304-4859404-2549141'
2023-10-29 21:31:30,296 amazon.py:427 [INFO] Found order '304-4924192-0299519'
2023-10-29 21:31:31,412 amazon.py:427 [INFO] Found order '305-8808333-1320322'
2023-10-29 21:31:32,559 amazon.py:427 [INFO] Found order '304-0657323-9229902'
2023-10-29 21:31:33,739 amazon.py:427 [INFO] Found order '305-2521019-8087569'
2023-10-29 21:31:34,939 amazon.py:427 [INFO] Found order '306-6714178-4185146'
2023-10-29 21:31:34,945 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:31:34,975 amazon.py:464 [INFO] Retrieving order group: '2021'
2023-10-29 21:31:37,610 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-5183444-7545952'
2023-10-29 21:31:37,674 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-5183444-7545952'
2023-10-29 21:31:37,739 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-5183444-7545952'
2023-10-29 21:31:37,819 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-5183444-7545952'
2023-10-29 21:31:37,882 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-5183444-7545952'
2023-10-29 21:31:37,958 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-5183444-7545952'
2023-10-29 21:31:39,651 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4252588-7295546'
2023-10-29 21:31:39,709 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4252588-7295546'
2023-10-29 21:31:39,781 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4252588-7295546'
2023-10-29 21:31:39,845 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4252588-7295546'
2023-10-29 21:31:39,869 amazon.py:441 [INFO] Next page.
2023-10-29 21:31:41,930 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-2656410-5349950'
2023-10-29 21:31:42,006 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-2656410-5349950'
2023-10-29 21:31:42,086 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-2656410-5349950'
2023-10-29 21:31:42,151 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-2656410-5349950'
2023-10-29 21:31:42,218 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-2656410-5349950'
2023-10-29 21:31:42,291 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-2656410-5349950'
2023-10-29 21:31:43,989 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4665213-3963505'
2023-10-29 21:31:44,059 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4665213-3963505'
2023-10-29 21:31:44,124 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4665213-3963505'
2023-10-29 21:31:44,194 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4665213-3963505'
2023-10-29 21:31:44,206 amazon.py:441 [INFO] Next page.
2023-10-29 21:31:46,816 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-6833943-6329939'
2023-10-29 21:31:46,879 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-6833943-6329939'
2023-10-29 21:31:46,953 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-6833943-6329939'
2023-10-29 21:31:47,016 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-6833943-6329939'
2023-10-29 21:31:47,081 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-6833943-6329939'
2023-10-29 21:31:47,146 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-6833943-6329939'
2023-10-29 21:31:48,830 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4527795-0562718'
2023-10-29 21:31:48,892 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4527795-0562718'
2023-10-29 21:31:48,970 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4527795-0562718'
2023-10-29 21:31:49,034 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-4527795-0562718'
2023-10-29 21:31:49,050 amazon.py:441 [INFO] Next page.
2023-10-29 21:31:51,003 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8824203-0488334'
2023-10-29 21:31:51,059 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8824203-0488334'
2023-10-29 21:31:51,124 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8824203-0488334'
2023-10-29 21:31:51,177 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8824203-0488334'
2023-10-29 21:31:51,241 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8824203-0488334'
2023-10-29 21:31:51,246 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:31:51,279 amazon.py:464 [INFO] Retrieving order group: '2020'
2023-10-29 21:31:53,301 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8315787-5112305'
2023-10-29 21:31:53,368 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8315787-5112305'
2023-10-29 21:31:53,446 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8315787-5112305'
2023-10-29 21:31:53,510 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8315787-5112305'
2023-10-29 21:31:53,580 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8315787-5112305'
2023-10-29 21:31:53,641 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-8315787-5112305'
2023-10-29 21:31:55,319 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6574931-4521130'
2023-10-29 21:31:55,385 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6574931-4521130'
2023-10-29 21:31:55,451 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6574931-4521130'
2023-10-29 21:31:55,510 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6574931-4521130'
2023-10-29 21:31:55,527 amazon.py:441 [INFO] Next page.
2023-10-29 21:31:57,390 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-3192929-2299507'
2023-10-29 21:31:57,444 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-3192929-2299507'
2023-10-29 21:31:57,499 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-3192929-2299507'
2023-10-29 21:31:57,556 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-3192929-2299507'
2023-10-29 21:31:57,607 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-3192929-2299507'
2023-10-29 21:31:57,659 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-3192929-2299507'
2023-10-29 21:31:57,710 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-3192929-2299507'
2023-10-29 21:31:57,716 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:31:57,744 amazon.py:464 [INFO] Retrieving order group: '2019'
2023-10-29 21:31:59,862 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6447883-6259537'
2023-10-29 21:31:59,931 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6447883-6259537'
2023-10-29 21:31:59,993 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6447883-6259537'
2023-10-29 21:32:00,058 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6447883-6259537'
2023-10-29 21:32:00,118 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6447883-6259537'
2023-10-29 21:32:00,181 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-6447883-6259537'
2023-10-29 21:32:01,322 amazon.py:425 [INFO] Skipping already-downloaded invoice: '306-7975075-7649159'
2023-10-29 21:32:01,382 amazon.py:425 [INFO] Skipping already-downloaded invoice: '306-7975075-7649159'
2023-10-29 21:32:01,450 amazon.py:425 [INFO] Skipping already-downloaded invoice: '306-7975075-7649159'
2023-10-29 21:32:01,514 amazon.py:425 [INFO] Skipping already-downloaded invoice: '306-7975075-7649159'
2023-10-29 21:32:01,534 amazon.py:441 [INFO] Next page.
2023-10-29 21:32:03,450 amazon.py:425 [INFO] Skipping already-downloaded invoice: '306-6615968-4057139'
2023-10-29 21:32:03,453 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:32:03,487 amazon.py:464 [INFO] Retrieving order group: '2018'
2023-10-29 21:32:05,546 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-9120740-4793910'
2023-10-29 21:32:05,611 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-9120740-4793910'
2023-10-29 21:32:05,684 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-9120740-4793910'
2023-10-29 21:32:05,749 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-9120740-4793910'
2023-10-29 21:32:05,814 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-9120740-4793910'
2023-10-29 21:32:05,883 amazon.py:425 [INFO] Skipping already-downloaded invoice: '302-9120740-4793910'
2023-10-29 21:32:07,038 amazon.py:425 [INFO] Skipping already-downloaded invoice: '028-4791616-8929953'
2023-10-29 21:32:07,107 amazon.py:425 [INFO] Skipping already-downloaded invoice: '028-4791616-8929953'
2023-10-29 21:32:07,172 amazon.py:425 [INFO] Skipping already-downloaded invoice: '028-4791616-8929953'
2023-10-29 21:32:07,234 amazon.py:425 [INFO] Skipping already-downloaded invoice: '028-4791616-8929953'
2023-10-29 21:32:07,246 amazon.py:441 [INFO] Next page.
2023-10-29 21:32:09,193 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-7201984-1193117'
2023-10-29 21:32:09,244 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-7201984-1193117'
2023-10-29 21:32:09,301 amazon.py:425 [INFO] Skipping already-downloaded invoice: '305-7201984-1193117'
2023-10-29 21:32:09,308 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:32:09,350 amazon.py:464 [INFO] Retrieving order group: '2017'
2023-10-29 21:32:09,759 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:32:09,783 amazon.py:464 [INFO] Retrieving order group: '2016'
2023-10-29 21:32:10,184 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:32:10,211 amazon.py:464 [INFO] Retrieving order group: '2015'
2023-10-29 21:32:10,612 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:32:10,648 amazon.py:464 [INFO] Retrieving order group: 'Archivierte Bestellungen'
2023-10-29 21:32:11,001 amazon.py:436 [INFO] Found no more pages
Traceback (most recent call last):
  File "D:\Finance\finance-dl-master\finance_dl\scrape_lib.py", line 403, in retry
    return func()
  File "D:\Finance\finance-dl-master\finance_dl\scrape_lib.py", line 423, in fetch
    scraper.run()
  File "D:\Finance\finance-dl-master\finance_dl\amazon.py", line 585, in run
    self.get_orders(
  File "D:\Finance\finance-dl-master\finance_dl\amazon.py", line 478, in get_orders
    retrieve_all_order_groups()
  File "D:\Finance\finance-dl-master\finance_dl\amazon.py", line 448, in retrieve_all_order_groups
    (order_filter,), = self.wait_and_return(
  File "D:\Finance\finance-dl-master\finance_dl\scrape_lib.py", line 239, in wait_and_return
    WebDriverWait(self.driver, timeout).until(predicate, message=message)
  File "D:\Finance\env-financedl\lib\site-packages\selenium\webdriver\support\wait.py", line 87, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: Waiting to match conditions

PayPal issue with CSRF-Token

The PayPal importer seems to fail when trying to get the CSRF token after logging in.
It seems like the whole structure of the web page has changed.
I wasn't able to find the CSRF token manually.

Relevant Traceback:
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/homebrew/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/cli.py", line 94, in <module>
    main()
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/cli.py", line 90, in main
    module.run(**spec)
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/paypal.py", line 262, in run
    scrape_lib.run_with_scraper(Scraper, **kwargs)
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/scrape_lib.py", line 433, in run_with_scraper
    retry(fetch)
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/scrape_lib.py", line 411, in retry
    return func()
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/scrape_lib.py", line 431, in fetch
    scraper.run()
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/paypal.py", line 258, in run
    self.save_transactions()
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/paypal.py", line 200, in save_transactions
    transaction_list = self.get_transaction_list()
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/paypal.py", line 193, in get_transaction_list
    resp = self.make_json_request(url)
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/paypal.py", line 166, in make_json_request
    'x-csrf-token': self.get_csrf_token(),
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/paypal.py", line 177, in get_csrf_token
    body_element, = self.wait_and_locate((By.ID, "__react_data__"))
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/scrape_lib.py", line 263, in wait_and_locate
    return self.wait_and_return(
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/finance_dl/scrape_lib.py", line 247, in wait_and_return
    WebDriverWait(self.driver, timeout).until(predicate, message=message)
  File "/Users/thies/Finanzen/Beancount/venv/lib/python3.10/site-packages/selenium/webdriver/support/wait.py", line 87, in until
    time.sleep(self._poll)

PayPal JSON schema changed

The exception is:

  File "/home/user/.local/lib/python3.8/site-packages/beancount_import/webserver.py", line 493, in _handle_reconciler_loaded
    loaded_reconciler = loaded_future.result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/home/user/.local/lib/python3.8/site-packages/beancount_import/thread_helpers.py", line 13, in wrapper
    f.set_result(fn(*args, **kwargs))
  File "/home/user/.local/lib/python3.8/site-packages/beancount_import/reconcile.py", line 396, in __init__
    all_source_results = self._prepare_sources()
  File "/home/user/.local/lib/python3.8/site-packages/beancount_import/reconcile.py", line 515, in _prepare_sources
    source.prepare(self.editor, source_results)
  File "/home/user/.local/lib/python3.8/site-packages/beancount_import/source/paypal.py", line 624, in prepare
    jsonschema.validate(txn, transaction_schema)
  File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 934, in validate
    raise error
jsonschema.exceptions.ValidationError: 'isCredit' is a required property

It occurs on the following object:

    {'amount': {'feeAmount': '$0.00',
                'grossAmount': '$19.99',
                'isZeroFee': True,
                'netAmount': '$19.99'},
     'cameFromResCenter': False,
     'cameFromSummary': False,
     'counterparty': {'detailsCounterpartyText': 'Company Inc.',
                      'name': 'Company Inc.'},
     'counterpartyAccountNumber': '123456789',
     'counterpartyBizName': 'Company Name.',
     'flags': {'isBuyer': True,
               'isOrder': True,
               'shouldUpgradeAccount': False},
     'fptiTag': 'orderplaced',
     'isNewActivityUIEnabled': True,
     'links': {'reportDispute': {'linkUrl': '/resolutioncenter/O-0T34207418947273H',
                                 'target': '_blank'},
               'upgradeToBusinessAcct': {'linkUrl': '/US/merchantsignup/router',
                                         'target': ''}},
     'merchantLogoUrl': None,
     'printDetailsLink': {'linkUrl': '/myaccount/transactions/print-orders/O-0T34207418947273H',
                          'target': '_blank'},
     'statusInfo': ['Your payment method will be charged when Company Inc. completes your order.'],
     'transactionId': 'O-0T34207418947273H',
     'transactionType': 'Order Placed',
     'viewContext': 'tdfullpage'}

I would appreciate some guidance on how to modify the schema; I'll be happy to submit a PR if I get it fixed.
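
One possible direction, sketched below under stated assumptions: if isCredit is simply absent for "Order Placed" entries, the schema's required list could be relaxed so that the key becomes optional. The schema here is a hypothetical, heavily trimmed stand-in (the real transaction_schema in beancount_import is not shown in this report), and relax_required and missing_required are illustrative helpers, not part of either package.

```python
import copy


def relax_required(schema, optional_keys):
    """Return a copy of `schema` with `optional_keys` dropped from its top-level 'required' list."""
    relaxed = copy.deepcopy(schema)
    relaxed["required"] = [
        key for key in relaxed.get("required", []) if key not in optional_keys
    ]
    return relaxed


def missing_required(obj, schema):
    """List top-level required keys that `obj` lacks (a tiny stand-in for full validation)."""
    return [key for key in schema.get("required", []) if key not in obj]


# Hypothetical, trimmed schema: only the shape of the 'required' list matters here.
example_schema = {
    "type": "object",
    "required": ["transactionId", "amount", "isCredit"],
    "properties": {
        "transactionId": {"type": "string"},
        "amount": {"type": "object"},
        "isCredit": {"type": "boolean"},
    },
}

# Cut-down version of the object from the report; note that it has no 'isCredit' key.
order_placed = {
    "transactionId": "O-0T34207418947273H",
    "amount": {"grossAmount": "$19.99"},
}

relaxed_schema = relax_required(example_schema, {"isCredit"})
print(missing_required(order_placed, example_schema))
print(missing_required(order_placed, relaxed_schema))
```

A relaxed schema dict of this shape could then be passed to jsonschema.validate in place of the original. Any code that later reads isCredit would presumably also need to tolerate the missing key.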

ultipro_google fails with "element not interactable"

When I use ultipro_google, it logs in properly but crashes with:

[0/1] google_payroll [41s elapsed] 2019-09-27 21:17:23,347 ultipro_google.py:144 [INFO] Document datetime.date(2019, 9, 26) : 'UA29063061':  Downloading
[0/1] google_payroll [42s elapsed] Traceback (most recent call last):
[0/1] google_payroll [42s elapsed]   File "/home/mark/p/finances/.venv/lib/python3.7/site-packages/finance_dl/scrape_lib.py", line 402, in retry
[0/1] google_payroll [42s elapsed]     return func()
[0/1] google_payroll [42s elapsed]   File "/home/mark/p/finances/.venv/lib/python3.7/site-packages/finance_dl/scrape_lib.py", line 422, in fetch
[0/1] google_payroll [42s elapsed]     scraper.run()
[0/1] google_payroll [42s elapsed]   File "/home/mark/p/finances/.venv/lib/python3.7/site-packages/finance_dl/ultipro_google.py", line 206, in run
[0/1] google_payroll [42s elapsed]     self.download_statements()
[0/1] google_payroll [42s elapsed]   File "/home/mark/p/finances/.venv/lib/python3.7/site-packages/finance_dl/ultipro_google.py", line 200, in download_statements
[0/1] google_payroll [42s elapsed]     downloaded_statements=downloaded_statements,
[0/1] google_payroll [42s elapsed]   File "/home/mark/p/finances/.venv/lib/python3.7/site-packages/finance_dl/ultipro_google.py", line 152, in get_next_statement
[0/1] google_payroll [42s elapsed]     download_link.click()
[0/1] google_payroll [42s elapsed]   File "/home/mark/p/finances/.venv/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 80, in click
[0/1] google_payroll [42s elapsed]     self._execute(Command.CLICK_ELEMENT)
[0/1] google_payroll [42s elapsed]   File "/home/mark/p/finances/.venv/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
[0/1] google_payroll [42s elapsed]     return self._parent.execute(command, params)
[0/1] google_payroll [42s elapsed]   File "/home/mark/p/finances/.venv/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
[0/1] google_payroll [42s elapsed]     self.error_handler.check_response(response)
[0/1] google_payroll [42s elapsed]   File "/home/mark/p/finances/.venv/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
[0/1] google_payroll [42s elapsed]     raise exception_class(message, screen, stacktrace)
[0/1] google_payroll [42s elapsed] selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable
[0/1] google_payroll [42s elapsed]   (Session info: chrome=77.0.3865.90)

amazon module fails to download digital invoices

finance-dl formats the URL as
https://www.amazon.com/gp/css/summary/print.html?ie=UTF8&orderID=D01-1380792-3469006

which results in an error.

This one, which matches the pattern when I manually visit digital order invoices, works.
https://www.amazon.com/gp/digital/your-account/order-summary.html/ref=ppx_yo_dt_b_dpi_o00?ie=UTF8&orderID=D01-1380792-3469006&print=1

Not sure if the URL has changed since the code was written or if I'm hitting a unique issue.
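
As a rough sketch of what a fix might look like, the invoice URL could be chosen based on the order ID. This assumes that digital order IDs can be recognized by a leading "D" (as in D01-1380792-3469006 above) and that the digital URL pattern reported here, including its ref= path segment, is stable. Neither is confirmed, and invoice_url is a hypothetical helper rather than a function in finance-dl.

```python
from urllib.parse import urlencode

# URL patterns copied from the report above.
REGULAR_INVOICE = "https://www.amazon.com/gp/css/summary/print.html"
DIGITAL_INVOICE = (
    "https://www.amazon.com/gp/digital/your-account/order-summary.html"
    "/ref=ppx_yo_dt_b_dpi_o00"
)


def invoice_url(order_id):
    """Build a printable invoice URL, switching pattern for digital orders.

    Assumption: digital order IDs start with "D".
    """
    if order_id.startswith("D"):
        query = urlencode({"ie": "UTF8", "orderID": order_id, "print": 1})
        return DIGITAL_INVOICE + "?" + query
    query = urlencode({"ie": "UTF8", "orderID": order_id})
    return REGULAR_INVOICE + "?" + query


print(invoice_url("D01-1380792-3469006"))
```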

Amazon(.co.uk) not seeing orders

Can't really provide much more info since there isn't any error.

DevTools listening on ws://127.0.0.1:50082/devtools/browser/f72cd410-a08b-41e8-ab1a-cc6b26aad299
 --connect=http://localhost:50079 --session-id=7d8f0a6373be86bc7ee5a26c1d39f98d
2023-10-29 21:55:41,054 amazon.py:277 [INFO] Initiating log in
2023-10-29 21:55:41,608 amazon.py:288 [INFO] Looking for sign-in link
2023-10-29 21:55:41,943 amazon.py:294 [INFO] Looking for username link
2023-10-29 21:55:42,388 amazon.py:304 [INFO] Looking for password link
2023-10-29 21:55:42,435 amazon.py:310 [INFO] Looking for "remember me" checkbox
2023-10-29 21:55:43,689 amazon.py:319 [INFO] Logged in
2023-10-29 21:55:44,225 amazon.py:464 [INFO] Retrieving order group: 'last 30 days'
2023-10-29 21:55:44,611 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:44,639 amazon.py:464 [INFO] Retrieving order group: 'past three months'
2023-10-29 21:55:45,041 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:45,068 amazon.py:464 [INFO] Retrieving order group: '2023'
2023-10-29 21:55:45,878 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:45,910 amazon.py:464 [INFO] Retrieving order group: '2022'
2023-10-29 21:55:46,303 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:46,331 amazon.py:464 [INFO] Retrieving order group: '2021'
2023-10-29 21:55:47,726 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:47,753 amazon.py:464 [INFO] Retrieving order group: '2020'
2023-10-29 21:55:48,153 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:48,178 amazon.py:464 [INFO] Retrieving order group: '2019'
2023-10-29 21:55:48,568 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:48,599 amazon.py:464 [INFO] Retrieving order group: '2018'
2023-10-29 21:55:49,003 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:49,033 amazon.py:464 [INFO] Retrieving order group: '2017'
2023-10-29 21:55:49,438 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:49,464 amazon.py:464 [INFO] Retrieving order group: '2016'
2023-10-29 21:55:49,875 amazon.py:436 [INFO] Found no more pages
2023-10-29 21:55:49,902 amazon.py:464 [INFO] Retrieving order group: '2015'
2023-10-29 21:55:50,319 amazon.py:436 [INFO] Found no more pages

README command for automatic updates doesn't work

The README describes starting the automatic updates process by checking the status:

python -m finance_dl.cli --config-module example_finance_dl_config --log-dir logs status

On my system this fails:

usage: cli.py [-h] [--config-module CONFIG_MODULE]
              (--config CONFIG | --spec SPEC) [--interactive] [--visible]
              [--log LOG]
cli.py: error: one of the arguments --config/-c --spec/-s is required

If I add my --config google_purchases parameter, then it complains about the --log-dir parameter:

usage: cli.py [-h] [--config-module CONFIG_MODULE]
              (--config CONFIG | --spec SPEC) [--interactive] [--visible]
              [--log LOG]
cli.py: error: unrecognized arguments: --log-dir logs status

I think the entire Automatic Usage section needs to be updated, because I can't figure out how to accomplish those tasks with the current code.
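
For what it's worth, the two error messages can be reproduced with a small argparse sketch. The parser below is reconstructed only from the usage text printed above (it is not the actual cli.py source), and build_cli_like_parser is a hypothetical name. It shows that a parser of this shape has no --log-dir option and no status subcommand, which suggests those arguments belong to a different entry point. Other reports on this page invoke python -m finance_dl.update with --log-dir logs status, and the README text names finance_dl.update as the tool for automatic updates.

```python
import argparse


def build_cli_like_parser():
    """Parser reconstructed from the usage message in the report (an approximation)."""
    ap = argparse.ArgumentParser(prog="cli.py")
    ap.add_argument("--config-module")
    group = ap.add_mutually_exclusive_group(required=True)
    group.add_argument("--config", "-c")
    group.add_argument("--spec", "-s")
    ap.add_argument("--interactive", "-i", action="store_true")
    ap.add_argument("--visible", action="store_true")
    ap.add_argument("--log")
    return ap


parser = build_cli_like_parser()

# parse_known_args returns the leftovers instead of exiting, which makes
# the rejected arguments easy to inspect.
namespace, extras = parser.parse_known_args([
    "--config-module", "example_finance_dl_config",
    "--config", "google_purchases",
    "--log-dir", "logs", "status",
])
print(namespace.config)
print(extras)
```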

CHROMEDRIVER_CHROME_BINARY TypeError: Binary Location Must be a String

I am getting

TypeError: Binary Location Must be a String

when trying to run finance_dl.cli with schwab configuration. It is looking for os.getenv("CHROMEDRIVER_CHROME_BINARY") but could not find it. Could you please tell me how to fix the issue?

Steps to reproduce:
Create an environment file

% cat env_test_finance-dl.yml
name: test_finance-dl
channels:
  - defaults
dependencies:
  - python=3.12
  - pip
  - pip:
    - git+https://github.com/jbms/finance-dl

Create the environment

% conda env create -f env_test_finance-dl.yml

Activate the environment

% conda activate test_finance-dl

Create a configuration file

% cat finance_dl_config.py
import os
profile_dir = os.path.join(os.getenv('HOME'), '.cache', 'finance_dl')
data_dir = '/home/rajulocal/x/x5'
def CONFIG_schwab():
    return dict(
        module='finance_dl.schwab',
        credentials={
            'username': 'XXXXXX',
            'password': 'XXXXXX',
        },
        output_directory=os.path.join(data_dir, 'schwab'),
        profile_dir=profile_dir,
        headless=False,
        min_start_date='2023-11-01'
    )

Run

% python -m finance_dl.cli --config-module finance_dl_config --config schwab
Traceback (most recent call last):
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 411, in retry
    return func()
           ^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 430, in fetch
    with temp_scraper(scraper_class, **kwargs) as scraper:
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 393, in temp_scraper
    scraper = scraper_type(*args, download_dir=download_dir,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/schwab.py", line 87, in __init__
    super().__init__(use_seleniumrequests=True, **kwargs)
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 146, in __init__
    chrome_options.binary_location = os.getenv("CHROMEDRIVER_CHROME_BINARY")
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/selenium/webdriver/chromium/options.py", line 52, in binary_location
    raise TypeError(self.BINARY_LOCATION_ERROR)
TypeError: Binary Location Must be a String
Waiting 0 seconds before retrying
Traceback (most recent call last):
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 411, in retry
    return func()
           ^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 430, in fetch
    with temp_scraper(scraper_class, **kwargs) as scraper:
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 393, in temp_scraper
    scraper = scraper_type(*args, download_dir=download_dir,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/schwab.py", line 87, in __init__
    super().__init__(use_seleniumrequests=True, **kwargs)
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 146, in __init__
    chrome_options.binary_location = os.getenv("CHROMEDRIVER_CHROME_BINARY")
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/selenium/webdriver/chromium/options.py", line 52, in binary_location
    raise TypeError(self.BINARY_LOCATION_ERROR)
TypeError: Binary Location Must be a String
Waiting 0 seconds before retrying
Traceback (most recent call last):
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 411, in retry
    return func()
           ^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 430, in fetch
    with temp_scraper(scraper_class, **kwargs) as scraper:
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 393, in temp_scraper
    scraper = scraper_type(*args, download_dir=download_dir,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/schwab.py", line 87, in __init__
    super().__init__(use_seleniumrequests=True, **kwargs)
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 146, in __init__
    chrome_options.binary_location = os.getenv("CHROMEDRIVER_CHROME_BINARY")
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/selenium/webdriver/chromium/options.py", line 52, in binary_location
    raise TypeError(self.BINARY_LOCATION_ERROR)
TypeError: Binary Location Must be a String
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/cli.py", line 94, in <module>
    main()
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/cli.py", line 90, in main
    module.run(**spec)
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/schwab.py", line 374, in run
    scrape_lib.run_with_scraper(SchwabScraper, **kwargs)
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 433, in run_with_scraper
    retry(fetch)
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 411, in retry
    return func()
           ^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 430, in fetch
    with temp_scraper(scraper_class, **kwargs) as scraper:
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 393, in temp_scraper
    scraper = scraper_type(*args, download_dir=download_dir,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/schwab.py", line 87, in __init__
    super().__init__(use_seleniumrequests=True, **kwargs)
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/finance_dl/scrape_lib.py", line 146, in __init__
    chrome_options.binary_location = os.getenv("CHROMEDRIVER_CHROME_BINARY")
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/selenium/webdriver/chromium/options.py", line 52, in binary_location
    raise TypeError(self.BINARY_LOCATION_ERROR)
TypeError: Binary Location Must be a String

There is no CHROMEDRIVER_CHROME_BINARY environment variable set on my machine.

% ipython
Python 3.12.0 | packaged by Anaconda, Inc. | (main, Oct  2 2023, 17:29:18) [GCC 11.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: 
import os

In [2]: 
os.getenv("CHROMEDRIVER_CHROME_BINARY") is None
Out[2]: 
True

FWIW chromedriver-path shows

% chromedriver-path 
/opt/rajulocal/miniconda3/envs/test_finance-dl/lib/python3.12/site-packages/chromedriver_binary
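
The traceback suggests that scrape_lib.py assigns os.getenv("CHROMEDRIVER_CHROME_BINARY") unconditionally, so an unset variable yields None and selenium rejects it. A minimal sketch of a guard is shown below. FakeChromeOptions is a stand-in that mimics only the string check seen in the traceback, so that the example runs without selenium, and apply_chrome_binary is a hypothetical helper, not existing finance-dl code.

```python
import os


class FakeChromeOptions:
    """Stand-in for selenium's Chrome options; mimics only the string check."""

    def __init__(self):
        self._binary_location = ""

    @property
    def binary_location(self):
        return self._binary_location

    @binary_location.setter
    def binary_location(self, value):
        if not isinstance(value, str):
            raise TypeError("Binary Location Must be a String")
        self._binary_location = value


def apply_chrome_binary(options, environ=os.environ):
    """Set binary_location only when CHROMEDRIVER_CHROME_BINARY is actually set."""
    path = environ.get("CHROMEDRIVER_CHROME_BINARY")
    if path:
        options.binary_location = path
    return options


# With the variable unset, the options are left untouched and nothing is raised.
unset = apply_chrome_binary(FakeChromeOptions(), environ={})
print(repr(unset.binary_location))
```

Until something like this is in place, exporting CHROMEDRIVER_CHROME_BINARY with the path to the Chrome executable before running the command may work around the error, since the assignment would then receive a string.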

Please document requirements

It would be nice to have the requirements documented, with examples for the most common systems. I'm having trouble getting this to work on my XUbuntu VM after installing it in my beancount virtualenv:

selenium.common.exceptions.WebDriverException: Message: 'chromedriver_wrapper.py' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home

There's no description for how #1 was solved. I tried installing google-chrome-beta, google-chrome-stable, as well as the chromium snap, and chromium from the ubuntu repositories. I also tried installing chromedriver and its dependencies:

sudo apt-get install chromium-chromedriver

And still, same error. I'm not sure how to get this working.

Paypal Selenium TimeoutException

It has been more than a year since I last used finance-dl. Back then I ran it on my previous PC, where I downloaded finance-dl and imported it locally as a module. I think I messed up my env at some point, so I recreated it when I wanted to use it again a week ago, and nothing worked (I only use amazon and paypal). I'm now at the point where I can log in to PayPal again, but I'm getting this error.

Can anyone supply a requirements file? I had a lot of trouble getting this far because newer package versions don't work.
I now have a persistent Google profile, which loads PayPal but then runs into this error. This happens while the PayPal page is on the activities page.

I also wonder whether this is an issue with the CSRF token, because I found this in the paypal file. Could the problem be that PayPal is already on the activities page?

    def get_csrf_token(self):
        if self.csrf_token is not None: return self.csrf_token
        logging.info('Getting CSRF token')
        self.driver.get('https://www.paypal.com/myaccount/transactions/')
        # Get CSRF token
        body_element, = self.wait_and_locate((By.XPATH,
                                              '//body[@data-token!=""]'))
        self.csrf_token = body_element.get_attribute('data-token')
        return self.csrf_token

Traceback (most recent call last):
  File "C:\Users\Dieter\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Dieter\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "d:\finance\import beancount\finance-dl\finance_dl\cli.py", line 91, in <module>
    main()
  File "d:\finance\import beancount\finance-dl\finance_dl\cli.py", line 87, in main
    module.run(**spec)
  File "d:\finance\import beancount\finance-dl\finance_dl\paypal.py", line 270, in run
    scrape_lib.run_with_scraper(Scraper, **kwargs)
  File "d:\finance\import beancount\finance-dl\finance_dl\scrape_lib.py", line 424, in run_with_scraper
    retry(fetch)
  File "d:\finance\import beancount\finance-dl\finance_dl\scrape_lib.py", line 402, in retry
    return func()
  File "d:\finance\import beancount\finance-dl\finance_dl\scrape_lib.py", line 422, in fetch
    scraper.run()
  File "d:\finance\import beancount\finance-dl\finance_dl\paypal.py", line 266, in run
    self.save_transactions()
  File "d:\finance\import beancount\finance-dl\finance_dl\paypal.py", line 207, in save_transactions
    transaction_list = self.get_transaction_list()
  File "d:\finance\import beancount\finance-dl\finance_dl\paypal.py", line 196, in get_transaction_list
    resp = self.make_json_request(url)
  File "d:\finance\import beancount\finance-dl\finance_dl\paypal.py", line 169, in make_json_request
    'x-csrf-token': self.get_csrf_token(),
  File "d:\finance\import beancount\finance-dl\finance_dl\paypal.py", line 180, in get_csrf_token
    body_element, = self.wait_and_locate((By.XPATH,
  File "d:\finance\import beancount\finance-dl\finance_dl\scrape_lib.py", line 254, in wait_and_locate
    return self.wait_and_return(
  File "d:\finance\import beancount\finance-dl\finance_dl\scrape_lib.py", line 238, in wait_and_return
    WebDriverWait(self.driver, timeout).until(predicate, message=message)
  File "D:\Finance\Import Beancount\env\lib\site-packages\selenium\webdriver\support\wait.py", line 87, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: Waiting to locate (('xpath', '//body[@data-token!=""]'),)
Stacktrace:
        GetHandleVerifier [0x00007FF61DA08EF2+54786]
        (No symbol) [0x00007FF61D975612]
        (No symbol) [0x00007FF61D82A64B]
        (No symbol) [0x00007FF61D86B79C]
        (No symbol) [0x00007FF61D86B91C]
        (No symbol) [0x00007FF61D8A6D87]
        (No symbol) [0x00007FF61D88BEAF]
        (No symbol) [0x00007FF61D8A4D02]
        (No symbol) [0x00007FF61D88BC43]
        (No symbol) [0x00007FF61D860941]
        (No symbol) [0x00007FF61D861B84]
        GetHandleVerifier [0x00007FF61DD57F52+3524194]
        GetHandleVerifier [0x00007FF61DDAD800+3874576]
        GetHandleVerifier [0x00007FF61DDA5D7F+3843215]
        GetHandleVerifier [0x00007FF61DAA5086+694166]
        (No symbol) [0x00007FF61D980A88]
        (No symbol) [0x00007FF61D97CA94]
        (No symbol) [0x00007FF61D97CBC2]
        (No symbol) [0x00007FF61D96CC83]
        BaseThreadInitThunk [0x00007FFC4336257D+29]
        RtlUserThreadStart [0x00007FFC4428AA78+40]

unexpected keyword argument 'required' in add_subparsers

Updated from 1.0.3 to 1.2.0, and am now getting this error:

python -m finance_dl.update --config-module finance_dl_config --log-dir logs status
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/...snip.../venvs/beancount/lib/python3.6/site-packages/finance_dl/update.py", line 187, in <module>
    main()
  File "/...snip.../venvs/beancount/lib/python3.6/site-packages/finance_dl/update.py", line 153, in main
    subparsers = ap.add_subparsers(dest='command', required=True)
  File "/usr/lib/python3.6/argparse.py", line 1716, in add_subparsers
    action = parsers_class(option_strings=[], **kwargs)
TypeError: __init__() got an unexpected keyword argument 'required'
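
The required keyword of add_subparsers was added in Python 3.7, which would explain the failure on Python 3.6. A commonly used workaround, sketched here, is to set the attribute after creating the subparsers object. The build_parser function and its subcommand names are illustrative assumptions based on the command line above, not the actual update.py code.

```python
import argparse


def build_parser():
    """Parser with a mandatory subcommand that avoids the 3.7-only keyword."""
    ap = argparse.ArgumentParser(prog="update.py")
    ap.add_argument("--config-module")
    ap.add_argument("--log-dir")
    # Instead of add_subparsers(dest='command', required=True), which needs
    # Python 3.7+, set the attribute afterwards.
    subparsers = ap.add_subparsers(dest="command")
    subparsers.required = True
    subparsers.add_parser("status")
    subparsers.add_parser("update")
    return ap


args = build_parser().parse_args(
    ["--config-module", "finance_dl_config", "--log-dir", "logs", "status"]
)
print(args.command)
```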

Fairly broken all around?

I love the idea here but I'm having lots of trouble with these importers.

Here's my summary trying to get a few to work:

  • healthequity: works
  • amazon: very broken, with scraping and navigation problems
  • mint: broken because Chrome crashes weirdly
  • google_purchases: Google thinks it's a suspicious login
  • stockplanconnect: login completion fails (their login forms changed recently)
  • pge: scraper doesn't notice it's logged in
  • anthem: Anthem thinks it's a suspicious login

I'm not throwing shade as we all know how hard it is to maintain scrapers. Just noting this here to potentially save other folks time.

(then again if someone is having success with these, then ignore me :)

OFX downloading is not working properly

I've got both Chase and Amex OFX downloading working. However, the updater downloads the OFX data for min_start_date and then walks backward day by day, never stopping. I end up with an output_dir full of OFX files with the same data. The only thing that differs between the files is the dtstart/dtend; the actual transactions contained are the same.

For reference, here is how I'm running this:

python -m finance_dl.update --config-module finance_dl_config --log-dir logs update amex

The earliest transaction I can get back from the Amex OFX endpoint is from 2017-09-16, but my output_dir contains files from 19850625-19850625--1556073305.ofx to 19900102-19900102--1556070848.ofx

From what I can discern from reading your code and parsing the OFX data, it appears that when you call account.download(days=num_days), the dtstart/dtend returned are based on num_days, not on the actual transactions' dtposted. Since each call returns a valid dtstart/dtend range, the updater thinks it has to keep walking backward to find a range that is invalid, but it never will.

From my testing:

In [42]: num_days = 5353 # this is what the first value of `mid` is
In [43]: data = account.download(days=num_days).read()
In [44]: dtstart = parse_ofx_time(re.findall(r'<DTSTART>([^<]+)', data)[0])
In [45]: dtend = parse_ofx_time(re.findall(r'<DTEND>([^<]+)', data)[0])
In [46]: print('#', dtstart.date(), '--', dtend.date())
# 2004-08-27 -- 2019-04-24
In [47]: txn_dates = [parse_ofx_time(d) for d in re.findall(r'<DTPOSTED>([^<]+)', data)]
In [48]: txn_dates.sort()
In [49]: print('#', txn_dates[0], '--', txn_dates[-1])
# 2017-09-16 00:00:00 -- 2019-04-17 00:00:00
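
Building on that test, one way the updater could avoid being misled is to look at the DTPOSTED values instead of DTSTART/DTEND when deciding whether a request reached older data. The sketch below is only an illustration: parse_ofx_date is a simplified hypothetical parser that reads the leading YYYYMMDD digits (it is not finance-dl's parse_ofx_time), and posted_date_range is likewise a made-up helper.

```python
import datetime
import re


def parse_ofx_date(value):
    """Parse the leading YYYYMMDD of an OFX datetime string into a date."""
    return datetime.datetime.strptime(value[:8], "%Y%m%d").date()


def posted_date_range(ofx_text):
    """Return (earliest, latest) DTPOSTED dates, or None if no transactions exist."""
    dates = sorted(
        parse_ofx_date(raw) for raw in re.findall(r"<DTPOSTED>([^<\s]+)", ofx_text)
    )
    if not dates:
        return None
    return dates[0], dates[-1]


# Tiny hand-written fragment in the spirit of the session above.
sample = (
    "<DTSTART>20040827<DTEND>20190424"
    "<STMTTRN><DTPOSTED>20190417120000.000[-7:MST]</STMTTRN>"
    "<STMTTRN><DTPOSTED>20170916</STMTTRN>"
)
print(posted_date_range(sample))
```

If the earliest posted date stops moving as the requested window grows, the search could stop there rather than continuing to walk backward.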

finance_dl.ofx only returns past 2 years data for Fidelity Investments or Fidelity NetBenefits

I used the default parameters and ran this command on different days for Fidelity, but it seems to return data only for exactly the past 2 years, even though ofx.py searches from 1990-01-01.

Is this an issue with finance-dl, ofxclient, or Fidelity?

def CONFIG_fidelity_investments():
    # To determine the correct values for `id`, `org`, and `url` for your
    # financial institution, search on https://www.ofxhome.com/
    ofx_params = {
        'id': '7776',
        'org': 'fidelity.com',
        'url': 'https://ofx.fidelity.com/ftgw/OFX/clients/download',
        'username': '',
        'password': '',
    }
    return dict(
        module='finance_dl.ofx',
        ofx_params=ofx_params,
        output_directory=os.path.join(data_dir, 'fidelity_investments'),
    )

➜  python -m finance_dl.cli --config-module finance_dl_config --config fidelity_investments
2020-04-14 13:19:49,658 ofx.py:183 [INFO] Binary searching to find earliest data available for account X12345678.
2020-04-14 13:19:49,658 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2005-02-21.
2020-04-14 13:19:50,156 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1997-07-28.
2020-04-14 13:19:50,747 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1993-10-14.
2020-04-14 13:19:51,329 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1991-11-23.
2020-04-14 13:19:51,786 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1990-12-12.
2020-04-14 13:19:52,149 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1990-06-22.
2020-04-14 13:19:52,544 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1990-03-28.
2020-04-14 13:19:52,919 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1990-02-13.
2020-04-14 13:19:53,393 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1990-01-22.
2020-04-14 13:19:53,832 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1990-01-11.
2020-04-14 13:19:54,242 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1990-01-06.
2020-04-14 13:19:54,656 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1990-01-03.
2020-04-14 13:19:55,063 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 1990-01-02.
2020-04-14 13:19:55,474 ofx.py:268 [INFO] Received data 2018-04-15 16:19:55 -- 2018-07-14 16:19:55
2020-04-14 13:19:55,491 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2018-07-12.
2020-04-14 13:19:56,288 ofx.py:268 [INFO] Received data 2018-07-12 00:00:00 -- 2018-10-10 00:00:00
2020-04-14 13:19:56,304 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2018-10-08.
2020-04-14 13:19:56,700 ofx.py:268 [INFO] Received data 2018-10-08 00:00:00 -- 2019-01-05 23:00:00
2020-04-14 13:19:56,718 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2019-01-03.
2020-04-14 13:19:57,517 ofx.py:268 [INFO] Received data 2019-01-03 00:00:00 -- 2019-04-03 01:00:00
2020-04-14 13:19:57,535 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2019-04-01.
2020-04-14 13:19:58,342 ofx.py:268 [INFO] Received data 2019-04-01 00:00:00 -- 2019-06-30 00:00:00
2020-04-14 13:19:58,360 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2019-06-28.
2020-04-14 13:19:59,166 ofx.py:268 [INFO] Received data 2019-06-28 00:00:00 -- 2019-09-26 00:00:00
2020-04-14 13:19:59,183 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2019-09-24.
2020-04-14 13:19:59,567 ofx.py:268 [INFO] Received data 2019-09-24 00:00:00 -- 2019-12-22 23:00:00
2020-04-14 13:19:59,583 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2019-12-20.
2020-04-14 13:19:59,978 ofx.py:268 [INFO] Received data 2019-12-20 00:00:00 -- 2020-03-19 01:00:00
2020-04-14 13:19:59,993 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2020-03-17.
2020-04-14 13:20:00,682 ofx.py:268 [INFO] Received data 2020-03-17 00:00:00 -- 2020-04-14 16:20:00
2020-04-14 13:20:00,702 ofx.py:146 [INFO] Trying to retrieve data for X12345678 starting at 2020-04-12.
2020-04-14 13:20:01,101 ofx.py:268 [INFO] Received data 2020-04-12 00:00:00 -- 2020-04-14 16:20:00

Amazon scraper fails to log in

When I try to use the amazon scraper, amazon presents me with just the email text box, with a Continue button.

Once I click that, the scraper works, as it can then find the password.

It might be nice for it to automatically click on the Continue button if there is one?
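
A minimal sketch of that idea follows. It relies on selenium's driver.find_elements(by, value), which returns an empty list when nothing matches, so no exception handling is needed. The locator ("id", "continue") is purely an assumption about Amazon's page, and click_continue_if_present is a hypothetical helper. The stub classes exist only so that the example runs without a browser.

```python
def click_continue_if_present(driver, locator=("id", "continue")):
    """Click the first element matching `locator`, if any; report whether a click happened.

    Assumption: the Continue button can be found with this locator.
    """
    matches = driver.find_elements(*locator)
    if matches:
        matches[0].click()
        return True
    return False


class StubElement:
    """Minimal stand-in for a selenium WebElement."""

    def __init__(self):
        self.clicked = False

    def click(self):
        self.clicked = True


class StubDriver:
    """Minimal stand-in for a selenium WebDriver with a fixed set of elements."""

    def __init__(self, elements):
        self._elements = elements

    def find_elements(self, by, value):
        return list(self._elements)


button = StubElement()
print(click_continue_if_present(StubDriver([button])))
print(click_continue_if_present(StubDriver([])))
```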

paypal login breaks when "security challenge" is shown

PayPal shows a reCAPTCHA after we enter the username / password. However, the scraper navigates away from the security challenge screen before I can click on the reCAPTCHA button.

Here is the log:

[0/1] paypal [0s elapsed] starting
[0/1] paypal [2s elapsed] 2019-06-11 18:44:27,880 paypal.py:136 [INFO] Finding username field
[0/1] paypal [2s elapsed] 2019-06-11 18:44:28,016 paypal.py:139 [INFO] Entering username
[0/1] paypal [3s elapsed] 2019-06-11 18:44:28,154 paypal.py:142 [INFO] Finding password field
[0/1] paypal [3s elapsed] 2019-06-11 18:44:29,016 paypal.py:145 [INFO] Entering password
[0/1] paypal [4s elapsed] 2019-06-11 18:44:30,065 paypal.py:149 [INFO] Logged in
[0/1] paypal [4s elapsed] 2019-06-11 18:44:30,065 paypal.py:175 [INFO] Getting transaction list
[0/1] paypal [4s elapsed] 2019-06-11 18:44:30,065 paypal.py:163 [INFO] Getting CSRF token
[0/1] paypal [38s elapsed] Traceback (most recent call last):
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/scrape_lib.py", line 402, in retry
[0/1] paypal [38s elapsed]     return func()
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/scrape_lib.py", line 422, in fetch
[0/1] paypal [38s elapsed]     scraper.run()
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/paypal.py", line 246, in run
[0/1] paypal [38s elapsed]     self.save_transactions()
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/paypal.py", line 189, in save_transactions
[0/1] paypal [38s elapsed]     transaction_list = self.get_transaction_list()
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/paypal.py", line 182, in get_transaction_list
[0/1] paypal [38s elapsed]     resp = self.make_json_request(url)
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/paypal.py", line 156, in make_json_request
[0/1] paypal [38s elapsed]     'x-csrf-token': self.get_csrf_token(),
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/paypal.py", line 167, in get_csrf_token
[0/1] paypal [38s elapsed]     '//body[@data-token!=""]'))
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/scrape_lib.py", line 256, in wait_and_locate
[0/1] paypal [38s elapsed]     message='Waiting to locate %r' % (locators, ))
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/finance_dl/scrape_lib.py", line 238, in wait_and_return
[0/1] paypal [38s elapsed]     WebDriverWait(self.driver, timeout).until(predicate, message=message)
[0/1] paypal [38s elapsed]   File "/usr/lib/python3.7/site-packages/selenium/webdriver/support/wait.py", line 80, in until
[0/1] paypal [38s elapsed]     raise TimeoutException(message, screen, stacktrace)
[0/1] paypal [38s elapsed] selenium.common.exceptions.TimeoutException: Message: Waiting to locate (('xpath', '//body[@data-token!=""]'),)
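
The log suggests the scraper treats the login as complete right after entering the password, then times out waiting for the CSRF token while the challenge is still on screen. A possible workaround, sketched here under that assumption, is to poll for much longer so the user has time to solve the challenge by hand. The helper wait_for and its parameters are hypothetical and not part of finance_dl. Only the XPath in the usage note comes from the traceback above.

```python
import time


def wait_for(predicate, timeout=300.0, poll_interval=1.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns a truthy value or timeout expires.

    Returns the truthy value; raises TimeoutError if the deadline passes.
    clock and sleep are injectable so the helper can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if clock() >= deadline:
            raise TimeoutError('condition not met within %r seconds' % timeout)
        sleep(poll_interval)
```

With a Selenium driver, this could be used as wait_for(lambda: driver.find_elements('xpath', '//body[@data-token!=""]')), which keeps checking for the token-bearing body element instead of giving up after the default timeout.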

chromedriver issue

I'm getting the following error when I run the command:

python3 -m finance_dl.cli --config-module finance_dl_config --config mint

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/selenium/webdriver/common/service.py", line 76, in start
    stdin=PIPE)
  File "/usr/lib/python3.6/subprocess.py", line 709, in __init__
    restore_signals, start_new_session)
  File "/usr/lib/python3.6/subprocess.py", line 1344, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
PermissionError: [Errno 13] Permission denied: '/usr/lib/python3.6/site-packages/finance_dl/chromedriver_wrapper.py'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.6/site-packages/finance_dl/cli.py", line 91, in <module>
    main()
  File "/usr/lib/python3.6/site-packages/finance_dl/cli.py", line 87, in main
    module.run(**spec)
  File "/usr/lib/python3.6/site-packages/finance_dl/mint.py", line 463, in run
    balances_output_prefix=balances_output_prefix, **kwargs)
  File "/usr/lib/python3.6/site-packages/finance_dl/mint.py", line 435, in fetch_mint_data
    with connect(credentials, kwargs) as mint:
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/lib/python3.6/site-packages/finance_dl/mint.py", line 169, in connect
    **scraper_args) as scraper:
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/lib/python3.6/site-packages/finance_dl/scrape_lib.py", line 385, in temp_scraper
    headless=headless, **kwargs)
  File "/usr/lib/python3.6/site-packages/finance_dl/mint.py", line 114, in __init__
    super().__init__(use_seleniumrequests=True, **kwargs)
  File "/usr/lib/python3.6/site-packages/finance_dl/scrape_lib.py", line 170, in __init__
    service_args=service_args,
  File "/usr/lib/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
    self.service.start()
  File "/usr/lib/python3.6/site-packages/selenium/webdriver/common/service.py", line 88, in start
    os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'chromedriver_wrapper.py' executable may have wrong permissions. Please see https://sites.google.com/a/chromium.org/chromedriver/home

Any thoughts on how I can fix this?
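
The PermissionError (Errno 13) is raised when the wrapper script is launched, which usually means the file lacks the executable bit. A minimal sketch of one way to set it follows. The helper name make_executable is hypothetical, and because the path in the traceback is under /usr/lib, running it there would likely need root privileges.

```python
import os
import stat


def make_executable(path):
    """Add execute permission for user, group, and others to path.

    Returns True if the file is executable by the current user afterwards.
    """
    mode = os.stat(path).st_mode
    # Keep the existing permission bits and add the three execute bits.
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return os.access(path, os.X_OK)
```

For example, make_executable('/usr/lib/python3.6/site-packages/finance_dl/chromedriver_wrapper.py') targets the file named in the error message. Running chmod +x on that file from a shell has the same effect.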
