
choonster / catalogue-scanner


Scans catalogues for specific items

C# 51.24% HTML 45.72% CSS 0.47% JavaScript 0.10% Bicep 1.99% PowerShell 0.27% Smalltalk 0.22%
azure azure-functions blazor catalog catalogue csharp dotnet dotnet6

catalogue-scanner's People

Contributors

choonster

catalogue-scanner's Issues

Fix Functions App deployment errors

Deployment of the Functions App often fails with the following error:

When request Azure resource at PublishContent, Sync Trigger Functionapp : Failed to perform sync trigger on function app. Function app may have malformed content. Please manually restart your function app and inspect the package from WEBSITE_RUN_FROM_PACKAGE.

Based on this comment, it seems that Zip Deploy is recommended over the RBAC deployment we're currently using.

Add option to reset scan state for catalogue

If scanning a catalogue gets stuck on an error, the user should be able to reset the scan state to allow the next invocation of the scan function to run completely.

Either add this to the configuration UI via an HTTP trigger function, or add a manual trigger function that can be triggered from the Azure Portal.

API requests from UI return 401 some time after first request in session

When a user opens the Configuration UI for the first time in a browsing session, Azure's automatic app service authentication authenticates the user and creates a session cookie. Some time after this (roughly a few hours), the token obtained from ITokenAcquisition.GetAccessTokenForUserAsync in _Host.cshtml seems to expire and requests to the Web API functions start returning 401, despite the session still being valid for the UI. This persists until the user ends their browsing session (e.g. by closing the browser) or manually clears the session cookie.

The UI should automatically detect this and obtain a new token, possibly by clearing the session cookie to force re-authentication.
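The detect-and-retry behaviour described above could look roughly like the following. This is a minimal, language-agnostic sketch in Python rather than the project's C#; `send`, `get_token` and `call_with_token_refresh` are hypothetical names, and the real fix may instead need to clear the session cookie to force re-authentication.

```python
from typing import Callable, Tuple

# Hypothetical shape: `send` performs the API call with a bearer token and
# returns (status_code, body); `get_token` obtains a fresh access token.
Send = Callable[[str], Tuple[int, str]]


def call_with_token_refresh(
    send: Send, get_token: Callable[[], str], token: str
) -> Tuple[int, str, str]:
    """Call the API once; on a 401, fetch a new token and retry a single time.

    Returns (status_code, body, token_in_use).
    """
    status, body = send(token)
    if status == 401:
        token = get_token()  # assumption: a fresh token can be acquired here
        status, body = send(token)
    return status, body, token


# Stand-in API for illustration: only the token "fresh" is accepted.
def fake_send(token: str) -> Tuple[int, str]:
    return (200, "ok") if token == "fresh" else (401, "")


result = call_with_token_refresh(fake_send, lambda: "fresh", "stale")
```

Retrying only once avoids an endless loop when the replacement token is rejected as well; in that case the 401 is returned to the caller.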

Set up CI/CD

Need to set up automatic build and publish with GitHub Actions.

Add nested match rules

It should be possible to configure nested match rules with operators like AND (all child rules must match) or OR (any child rule can match).

Need to figure out how to edit them in the configuration UI. Possibly a dialog with the same table layout as the main Matching Configuration page?
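One possible way to model nested rules is a small tree in which leaf rules test a single item property and compound rules combine their children with AND or OR. The sketch below is in Python for brevity (the project itself is C#), and the class and field names (`PropertyRule`, `CompoundRule`, `prop`, `contains`) are hypothetical rather than taken from the codebase.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class PropertyRule:
    """Leaf rule: matches when an item property contains a substring."""
    prop: str
    contains: str

    def matches(self, item: dict) -> bool:
        return self.contains.lower() in str(item.get(self.prop, "")).lower()


@dataclass
class CompoundRule:
    """Nested rule: combines child rules with AND or OR."""
    operator: str  # "AND" or "OR"
    children: List[Union["PropertyRule", "CompoundRule"]] = field(default_factory=list)

    def matches(self, item: dict) -> bool:
        results = (child.matches(item) for child in self.children)
        if self.operator == "AND":
            return all(results)  # note: an empty AND matches everything
        if self.operator == "OR":
            return any(results)  # note: an empty OR matches nothing
        raise ValueError(f"Unknown operator: {self.operator}")


# "coffee" AND ("beans" OR "pods")
rule = CompoundRule("AND", [
    PropertyRule("name", "coffee"),
    CompoundRule("OR", [
        PropertyRule("name", "beans"),
        PropertyRule("name", "pods"),
    ]),
])
```

Because compound rules can contain other compound rules, the same structure could back a recursive editor such as the dialog idea mentioned above.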

Fix DownloadColesOnlineSpecialsPage function constantly timing out

DownloadColesOnlineSpecialsPage often times out, sometimes causing the orchestration to fail. One possible fix would be to throttle the number of concurrent executions of the function using the durableTask/maxConcurrentActivityFunctions setting described here, but this would also apply to other functions.

Add support for scanning multiple catalogues per store

In addition to scanning the current catalogue, it should also be possible to scan next week's catalogue and any additional catalogues when these are available.

The notification email should include the start and end dates of the catalogue and indicate whether it's current or future.
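Deciding whether a catalogue is current or future from its start and end dates could be as simple as the sketch below (Python, for illustration only). The function name `catalogue_status` and the extra "expired" label are assumptions; the issue only mentions current and future.

```python
from datetime import date


def catalogue_status(start: date, end: date, today: date) -> str:
    """Classify a catalogue by its validity period (both dates inclusive)."""
    if today < start:
        return "future"
    if today > end:
        return "expired"  # assumption: not mentioned in the issue
    return "current"


status = catalogue_status(date(2022, 3, 9), date(2022, 3, 15), today=date(2022, 3, 8))
```

Passing `today` explicitly keeps the function easy to test and avoids time zone surprises from reading the clock inside it.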

Can't run Playwright installation script on Linux: permission denied

With the change in 41bad70, the application fails at startup on a Linux App Service plan with this error:

System.Diagnostics.Process: An error occurred trying to start process '/home/site/wwwroot/bin/.playwright/node/linux/playwright.sh' with working directory '/'. Permission denied.

This could be due to the directory the script is in, or the file permissions on the script itself (e.g. execute permission not set).
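If the cause turns out to be a missing execute bit, one workaround is to set it before starting the process. The Python sketch below only illustrates the idea (the project is C#); `ensure_executable` is a hypothetical helper, and it will not help if the containing directory is read-only or mounted with execution disabled.

```python
import os
import stat
import tempfile


def ensure_executable(path: str) -> bool:
    """Add execute permission for user, group and others if it is missing.

    Returns True if the file mode was changed, False if it was already set.
    """
    mode = os.stat(path).st_mode
    exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if mode & exec_bits == exec_bits:
        return False
    os.chmod(path, mode | exec_bits)
    return True


# Demonstration on a temporary script that starts without execute permission.
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as script:
    script.write("#!/bin/sh\necho hello\n")
os.chmod(script.name, 0o644)

changed = ensure_executable(script.name)
is_executable = os.access(script.name, os.X_OK)
os.remove(script.name)
```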

Update to .NET 6/Azure Functions 4

The update to .NET 6 may remove the need for CatalogueScanner.WebScraping.API to be a separate Web API application, if Playwright works inside the Azure Functions host process (Durable Functions still aren't supported in the isolated process model).

This will also allow CatalogueScanner.DefaultHost, CatalogueScanner.ConfigurationUI and all the class libraries to target the same framework version, rather than a mix of .NET Standard/Core/5 like they do now.

Add pricing information to match rules and digest emails

It should be possible to add match rules to filter on item prices, and item prices should be included in the digest email.

This should be relatively easy to implement for Coles/Woolworths Online as their response data includes prices, but it may be more difficult for SaleFinder catalogues.

Fix Coles Online API errors

The build ID in the Coles Online Data URLs has been updated. We need to automatically fetch the current build ID from the website instead of hardcoding it.
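The issue does not say where the build ID appears on the website. Assuming the site is a Next.js-style application that embeds a `__NEXT_DATA__` JSON script containing a `buildId` field (an assumption, not confirmed by the source), extracting it from the downloaded HTML could look like this Python sketch. `extract_build_id` and the sample HTML are hypothetical.

```python
import json
import re
from typing import Optional

# Assumption: the page embeds its build metadata the way Next.js does.
NEXT_DATA_PATTERN = re.compile(
    r'<script id="__NEXT_DATA__"[^>]*>(.*?)</script>', re.DOTALL
)


def extract_build_id(html: str) -> Optional[str]:
    """Return the buildId from an embedded __NEXT_DATA__ script, or None."""
    match = NEXT_DATA_PATTERN.search(html)
    if match is None:
        return None
    try:
        data = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    build_id = data.get("buildId")
    return build_id if isinstance(build_id, str) else None


sample_html = (
    '<html><body><script id="__NEXT_DATA__" type="application/json">'
    '{"props": {}, "buildId": "abc123"}'
    "</script></body></html>"
)
build_id = extract_build_id(sample_html)
```

Returning `None` instead of raising lets the caller decide whether to fall back to a cached build ID or fail the scan.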

Add location support for Woolworths Online

Currently, the Woolworths Online specials scanning will always use the default location, which is probably based on a geo IP lookup of the Functions app. Ideally there should be an option to configure this in the configuration UI, but it looks like Woolworths Online only supports this for logged-in users rather than using a simple cookie like other sites.

Fix Woolworths Online errors

Woolworths Online requests often time out, causing the scan function to fail. We may be able to mitigate this and #56 by throttling the number of individual functions that can run concurrently, probably by splitting the download functions into "pages" of 25(?) and waiting for each page to complete before starting on the next one.
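The paging idea can be sketched as follows: split the work into pages, run one page concurrently, and start the next page only once the previous one has finished. This is an illustrative Python/asyncio sketch rather than Durable Functions code; `run_in_pages` and `fake_download` are hypothetical names, and the page size of 25 is the tentative figure from the issue.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_in_pages(
    items: Sequence[T],
    worker: Callable[[T], Awaitable[R]],
    page_size: int = 25,
) -> List[R]:
    """Run `worker` over `items`, at most `page_size` at a time, page by page."""
    results: List[R] = []
    for start in range(0, len(items), page_size):
        page = items[start:start + page_size]
        # Wait for the whole page to finish before starting the next one.
        results.extend(await asyncio.gather(*(worker(item) for item in page)))
    return results


async def _demo() -> Tuple[List[int], int]:
    active = 0
    peak = 0

    async def fake_download(n: int) -> int:
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0)  # stand-in for the real network request
        active -= 1
        return n * 2

    results = await run_in_pages(list(range(60)), fake_download, page_size=25)
    return results, peak


demo_results, demo_peak = asyncio.run(_demo())
```

The trade-off is that one slow request holds up its entire page; a semaphore-based limit would keep the pipeline fuller at the cost of slightly more bookkeeping.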
