cookidump_paprika3's Introduction

cookidump

List and export recipes from your Cookidoo collections.

Description

This program allows you to list and export recipes from your created and saved collections on Cookidoo. The export is in JSON compatible with Paprika 3.

In order to list or export the recipes, a valid subscription is needed.

This program is derived from auino/cookidump; see that program for origins, citations, disclaimers, etc.

Installing dependencies

  1. Install Python 3

  2. Install Python dependencies for the dumpCollections.py script:

     pip3 install -U -r requirements.txt
    
  3. Download the Chrome WebDriver, naming it appropriately for the current architecture, and update the script cookidump if needed for your architecture.

  4. Install npm

  5. Install prettier, and the plugin prettier-plugin-sort-json

Running the script

cookidump [-r recipes_folder] [-j json_folder] [-p pattern]

where:

  • -r recipes_folder names a folder (directory) where the lists of recipes in each collection will go. If this option is not specified, the default is ./recipes.

    Note: If this option is not specified and no -p pattern option is specified, then the default recipes folder ./recipes is deleted and recreated; this includes the nested folder containing the JSON recipe files. In other words, if you do not want to start from scratch, you must specify a pattern or a recipes folder.

  • -j json_folder names a folder (directory) where the JSON files for each recipe are written. This folder is named relative to the recipes folder. If the -j json_folder option is not specified, no JSON is written.

  • -p pattern provides a filter for deciding which collections and/or recipes will be dumped. The pattern is of the form regular_expression[::regular_expression]. The first regular expression is used to match collection names; only collections matching it are listed. The second regular expression (if given) is used to match recipes; only recipes matching it are dumped to JSON files (and then only if the -j json_folder option is specified). If you want to dump certain recipes from any collection, use a filter of the form .::recipe_pattern (the initial dot matches all collections). If you want to dump all recipes from certain collections, use a filter of the form collection_pattern::. (the trailing dot matches all recipes).
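The two-part pattern described above can be illustrated with a short Python sketch. This is not the code from dumpCollections.py: the helper names split_pattern and name_matches are hypothetical, and the use of re.search (a match anywhere in the name) is an assumption based on the note that a single dot matches every collection.

```python
import re
from typing import Optional, Tuple


def split_pattern(pattern: str) -> Tuple[str, Optional[str]]:
    """Split 'collection_re[::recipe_re]' into its two parts.

    The recipe part is None when no '::' separator is present.
    (Hypothetical helper; the real script may parse this differently.)
    """
    collection_re, sep, recipe_re = pattern.partition("::")
    return collection_re, (recipe_re if sep else None)


def name_matches(regex: str, name: str) -> bool:
    """Return True if the regex matches anywhere in the name.

    Assumption: search semantics, so that '.' matches any non-empty name.
    """
    return re.search(regex, name) is not None


# Select recipes containing "Chicken" from any collection.
collection_re, recipe_re = split_pattern(".::Chicken")
print(name_matches(collection_re, "Weeknight Dinners"))   # collection is kept
print(name_matches(recipe_re, "Sesame Orange Chicken"))   # recipe is kept
```

What happens when the recipe part is omitted is left to the caller in this sketch; the description above only states what the second expression does when it is given.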

The program opens a Google Chrome window and waits until you are logged in to your Cookidoo account. The script currently hard-codes the starting URL for the USA Cookidoo; change it as appropriate for your locale.

The program creates one file per created or saved collection, plus one for your bookmarks and one for your created recipes. Each file contains lines such as:

r470647 https://cookidoo.thermomix.com/recipes/recipe/en-US/r470647 Sesame Orange Chicken

Each line consists of the recipe ID (such as r470647), which is globally unique (a recipe has the same ID in all instances of Cookidoo); the recipe URL (which starts differently in each Cookidoo regional instance but ends with the same recipe ID); and the recipe name.
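If you want to post-process these listing files, a line can be split into its three fields as sketched below. This is only an illustration: RecipeEntry and parse_entry are hypothetical names, and the sketch assumes the fields are separated by whitespace (spaces or tabs), with the recipe name taking up the rest of the line.

```python
from typing import NamedTuple


class RecipeEntry(NamedTuple):
    recipe_id: str
    url: str
    name: str


def parse_entry(line: str) -> RecipeEntry:
    """Parse one listing line into (recipe ID, URL, recipe name).

    Assumption: whitespace-separated fields; the name may contain spaces,
    so only the first two separators are used for splitting.
    """
    recipe_id, url, name = line.strip().split(maxsplit=2)
    return RecipeEntry(recipe_id, url, name)


entry = parse_entry(
    "r470647 https://cookidoo.thermomix.com/recipes/recipe/en-US/r470647 "
    "Sesame Orange Chicken"
)
print(entry.recipe_id)
print(entry.name)
# The URL ends with the same recipe ID, as described above.
print(entry.url.endswith("/" + entry.recipe_id))
```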

The program also creates a couple of index files containing the names of your collections and the number of recipes in each collection.

Output is represented by an index.html file, included in outputdir, plus a set of recipes inside structured folders. Opening the generated index.html file in your browser shows the list of downloaded recipes and lets you navigate to the desired recipe.

The number of exported recipes is limited to around 1000 per execution. Filters can help here by reducing the number of recipes exported in a single run.

Other approaches

A different approach, adopted previously, is based on retrieving structured recipe data. More information can be found on the datastructure branch. In this case the output is in a different (structured) format and has to be interpreted; that interpretation is not implemented in the linked previous commit.

TODO

  • Bypass the limited number of exported recipes
  • Parse downloaded recipes to store them on a database, or to generate a unique linked PDF
  • Make Chrome run headless for better speeds
  • Set up a dedicated container for the program

Supporters

  • @vikramsoni2, regarding JSON saves plus minor enhancements
  • @mrwogu, regarding additional information to be extracted on the generated JSON file, plus suggestions on the possibility to save recipes on dedicated JSON files
  • @nilskrause, regarding argument parsing and updates on the link to download the Chrome WebDriver
  • @NightProgramming, regarding the use of selenium version 3
  • @morela, regarding the update of the tool to support a newer version of Selenium
  • @ndjc, fixing some deprecation warnings

Disclaimer

The authors of this program are not responsible for its usage. This program is released only for research and dissemination purposes. The program also gives users the ability to locally and temporarily store recipes accessible through a legitimate subscription. Before using this program, check the Cookidoo subscription terms of service for the country of the subscription being used. Sharing the obtained recipes is not a legitimate activity, and the authors of this program are not responsible for any illicit or sharing activity carried out by users.

Contacts

You can find me on Twitter as @auino.
