The amazon-orders from alexdlaird

amazon-orders's Issues

.gitattributes Should Be Hiding HTML From Linguist

But it isn't, so GitHub lists HTML as the repository's dominant language purely because of the test resources.

Support for international Amazon sites

Acknowledgements

I have written a descriptive issue title
I have searched Issues to see if the feature has already been requested
If relevant, I have enabled debug mode and am attaching relevant console logs and HTML files

Describe the Feature

The code is only compatible with Amazon US.
It would be really nice to make it possible to use it on international Amazon sites.

Describe Alternative Solutions/Workarounds

Setting AMAZON_BASE_URL is not sufficient, because the parsing code expects English text and US Dollar amounts.
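To illustrate the kind of per-site knowledge the parser would need, here is a minimal sketch of locale-aware amount parsing. The `SITE_FORMATS` table, its entries, and `parse_amount` are hypothetical names for illustration and are not part of amazon-orders.

```python
from decimal import Decimal

# Hypothetical per-site formatting table; the entries are illustrative assumptions.
SITE_FORMATS = {
    "amazon.com": {"symbol": "$", "thousands": ",", "decimal": "."},
    "amazon.de": {"symbol": "€", "thousands": ".", "decimal": ","},
}


def parse_amount(text: str, site: str) -> Decimal:
    """Convert a displayed price such as '1.234,56 €' into a Decimal."""
    fmt = SITE_FORMATS[site]
    # Drop the currency symbol and surrounding whitespace first.
    cleaned = text.replace(fmt["symbol"], "").strip()
    # Remove the grouping separator, then normalise the decimal separator.
    cleaned = cleaned.replace(fmt["thousands"], "")
    cleaned = cleaned.replace(fmt["decimal"], ".")
    return Decimal(cleaned)
```

English-only text matching (order status strings, date formats) would need a similar per-site table; this sketch covers amounts only.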

Implement Private Resources Folder for Testing

Implement a private-resources folder that is .gitignored. The contents of this folder will be JSON files that correspond to order numbers. The integration tests will then load these files using kwargs and assert against the contents.

This will allow us to generically define integration tests, but keep the contents private, which will allow us to write even more (or others to write their own generically) in the future.
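A rough sketch of how such a loader might look, assuming each JSON file holds a flat object whose keys match attribute names on the parsed order. The file schema and the helper names `load_private_cases` and `assert_order_matches` are assumptions made for illustration.

```python
import json
from pathlib import Path

# Folder name taken from the proposal; it would be listed in .gitignore.
PRIVATE_RESOURCES_DIR = Path("private-resources")


def load_private_cases(directory: Path = PRIVATE_RESOURCES_DIR) -> list[dict]:
    """Load every JSON file in the folder; each file describes one order."""
    if not directory.is_dir():
        return []  # no private data available, so no cases are generated
    cases = []
    for path in sorted(directory.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            cases.append(json.load(f))
    return cases


def assert_order_matches(order: object, expected: dict) -> None:
    """Compare each expected key against the attribute of the same name."""
    for key, value in expected.items():
        assert getattr(order, key) == value, f"{key} differs"
```

Because the loader returns an empty list when the folder is missing, a test suite built on it would simply generate no private cases on machines without the data.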

Use Requests Cookie Jar to Persist Session Between Runs

Requests's CookieJar can be saved and loaded to persist session data across runs.

https://scrapfly.io/blog/save-and-load-cookies-in-requests-python/
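As a sketch of the save/load half of this idea, the standard library's `LWPCookieJar` can write cookies to a file and read them back. The helper names below are hypothetical, and wiring the jar into a Requests session (for example by assigning it to `session.cookies`) is an assumption that should be checked against the Requests documentation.

```python
from http.cookiejar import LWPCookieJar
from pathlib import Path


def load_cookie_jar(path: str) -> LWPCookieJar:
    """Return a jar bound to `path`, pre-filled if the file already exists."""
    jar = LWPCookieJar(path)
    if Path(path).exists():
        # ignore_discard=True also restores session cookies (no expiry date).
        jar.load(ignore_discard=True)
    return jar


def save_cookie_jar(jar: LWPCookieJar) -> None:
    """Write the jar back to the file it was created with."""
    jar.save(ignore_discard=True)
```

The `ignore_discard=True` flag matters here, since login cookies are often session cookies that would otherwise be skipped on save and load.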

Add Support for Async Auth Flow

Currently it's non-intuitive if someone doesn't want to use console input to answer OTP or Captcha questions. Add a way for them to pass their own lambda to AuthForms (still default to input), then document how this can be used.
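One possible shape for this, shown as a sketch only: the form accepts any callable that takes the question text and returns the answer, and falls back to `input`. `OtpForm` and `PromptHandler` are hypothetical stand-ins and do not reflect the library's actual AuthForm classes.

```python
from typing import Callable

# A prompt handler receives the question text and returns the user's answer.
PromptHandler = Callable[[str], str]


class OtpForm:
    """Hypothetical stand-in for an auth form; not the library's real class."""

    def __init__(self, prompt: PromptHandler = input) -> None:
        self.prompt = prompt  # defaults to console input, as today

    def ask(self) -> str:
        """Ask for the passcode through whichever handler was supplied."""
        return self.prompt("Enter the one-time passcode: ").strip()
```

A caller who does not want console input could then pass something like `OtpForm(prompt=lambda question: my_code_source())`, where `my_code_source` is whatever function they use to obtain the code.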

CSS Selectors

Currently we're relying on beautifulsoup's find() methods and the like. Chaining them together can be brittle, and it also takes more time and testing if/when they need to be updated. Relying on a single CSS selector (when possible) or XPath for every field would also make the library much more extensible (and more quickly patched when things break).
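The "one expression per field" idea can be sketched with a lookup table. To stay dependency-free, this example uses the standard library's ElementTree, which only handles well-formed markup and a limited XPath subset; with BeautifulSoup the same table could hold CSS selectors passed to `select_one`. The field names and class names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical field-to-XPath table; the class names are made up for illustration.
FIELD_XPATHS = {
    "order_number": ".//span[@class='order-id']",
    "grand_total": ".//span[@class='order-total']",
}


def extract_fields(card: ET.Element) -> dict:
    """Look up every field with its single XPath expression."""
    fields = {}
    for name, xpath in FIELD_XPATHS.items():
        text = card.findtext(xpath)  # None when nothing matches
        fields[name] = text.strip() if text is not None else None
    return fields
```

When the page markup changes, only the table entry for the affected field would need patching, rather than a chain of find() calls.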

Improve docs

The auto-generated docs cover only the basic functionality right now. They could be expanded greatly, starting with filling in every blank docstring.

I love this project! Unable to pickle orders

Acknowledgements

I have written a descriptive issue title
I have searched Issues to see if the feature has already been requested
If relevant, I have enabled debug mode and am attaching relevant console logs and HTML files

Describe the Feature

I'd like to be able to easily cache/memoize my Amazon orders. This makes sense to me since every order from any year besides the current one will essentially never change. At the moment I'm working on a personal script and am constantly re-fetching them from Amazon. I worry that doing this too much might trigger some hacking/abuse detection stuff from Amazon and they'll lock my account. Also it's just a bit slow.

I'd love it if there was a built-in cache. Of course, full caching support could be a big feature, so a simpler request would be just to make an Order object serializable so that I can cache them myself. If I try pickle.dump(some_order, file_handler) I get "RecursionError: maximum recursion depth exceeded". I assume this is because Orders contain some sort of circular reference, though I haven't investigated why.

Describe Alternative Solutions/Workarounds

At the moment I don't have a clear workaround. If I were to approach this myself I'd probably dig into the code and figure out how to put some sort of cache on all HTTP requests. There are caching wrappers for requests that can be configured to cache everything (ignoring normal HTTP caching rules), so that would be one alternative approach, though making this a first-class feature is more work.
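Until Order objects can be pickled, one possible stopgap is to cache plain data instead of the objects themselves. The sketch below assumes a caller-supplied `fetch` function that returns JSON-serializable dicts (for example, a few fields copied off each order); `cached_orders` and the cache file naming are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable


def cached_orders(year: int, fetch: Callable[[int], list], cache_dir: Path) -> list:
    """Return the orders for `year`, fetching only if no cache file exists."""
    cache_file = cache_dir / f"orders-{year}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    orders = fetch(year)  # expected to return JSON-serializable dicts
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(orders), encoding="utf-8")
    return orders
```

Since only past years are effectively immutable, a caller would probably want to bypass this cache for the current year.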

AmazonOrdersAuthError on amazon.com (sometimes)

If you rerun the integration tests too quickly, you get an error like this:

>               raise AmazonOrdersAuthError(
                    "An error occurred, this is an unknown page: {}. To capture the page to a file, set the `debug` flag.".format(
                        self.last_response.url))
E               amazonorders.exception.AmazonOrdersAuthError: An error occurred, this is an unknown page: https://www.amazon.com. To capture the page to a file, set the `debug` flag.

../amazonorders/session.py:189: AmazonOrdersAuthError

The stored session cookies should be recognizing we're already logged in, which is why it sends us to amazon.com instead of the sign-in page. Normally this works, and in fact if you wait 30-60 seconds to rerun, it will work, but there must be some edge case we should debug if you rerun too quickly (maybe something adjacent to Captcha, another flow we need to deduce?).

Returns no orders when more than zero were expected.

Acknowledgements

I have written a descriptive issue title
I have searched Issues to see if the bug has already been reported
I have searched Stack Overflow to ensure the issue I'm experiencing has not already been discussed
If possible, I have enabled debug mode and am attaching relevant console logs and HTML files

Operating System

EndeavourOS

Python Version

3.11

amazon-orders Version

1.0.6

Describe the Bug

from amazonorders.session import AmazonSession
from amazonorders.orders import AmazonOrders
import csv
import os
email = os.environ['AMAZON_EMAIL']
password = os.environ['AMAZON_PASSWORD']
amazon_session = AmazonSession(email, password)

amazon_session.login()

amazon_orders = AmazonOrders(amazon_session)
orders = amazon_orders.get_order_history(year=2023)
print(orders)

The above prints []
I wrote the response out to a file from get_order_history in orders.py:

        while next_page:
            self.amazon_session.get(next_page)
            response_parsed = self.amazon_session.last_response_parsed
            with open("idk.html", 'w', newline='') as f:
                f.write(str(response_parsed))

It looks like the expected HTML is very different from what it gets back.
There is no "order-card" but there is a "js-order-card"

I haven't used Beautiful Soup much, so I'm not familiar with how the selectors work.

FYI, PyPI only has version 1.0.4

Steps to Reproduce

Run code with an account that has orders in 2023.

Expected Behavior

To pull all the orders from 2023

Handle SMS OTP when captchas fail

Acknowledgements

I have written a descriptive issue title
I have searched Issues to see if the feature has already been requested

Describe the Feature

When the captcha is failed enough times, Amazon will switch to doing an SMS OTP. This request is to add support for that.

Describe Alternative Solutions/Workarounds

No response

Finish build-test-resources.py

Finish implementing the build-test-resources.py script, which takes integration test data and obfuscates it for use in a unit test. For some reason this script presently also gets stuck on Captcha messages.

On CLI, username and password aren't actually needed every time now that sessions are persisted

If a previous run of the application still has a valid session, username/password are actually ignored (and this can be a little confusing in the user experience). Find a way to surface this so it's obvious to the user what is happening, and also in that case username/password can become optional.
