For starters, there must be an interactive way to see how to create and use different

Add Jupyter Notebooks to demonstrate pipelines for business use cases and demos (augraphy issue, CLOSED)

sparkfish commented on May 30, 2024

Add Jupyter Notebooks to demonstrate pipelines for business use cases and demos

from augraphy.

Comments (19)

kwcckw commented on May 30, 2024

I'm thinking of using augraphy to improve on existing work in old document restoration. Those documents are quite rare, so producing synthetic samples would help with training restoration models.

Example of old document restoration:
https://arxiv.org/pdf/2001.08742.pdf

Do you guys think this is a good idea?

I tried to create some old documents by using different fonts and papers. Here are the current results:

Do these look realistic enough? Let me know if you see any areas for improvement.

One constraint right now is that the page is in perfect condition; it would be nice if we could create a torn-paper effect.

proofconstruction commented on May 30, 2024

> I think this should be a "pipeline level" demo where we apply several augmentations to produce a more relatable real-life application. Eventually we can have several predefined pipelines (a sort of utility class), each with a different purpose, for example a photocopy machine effect, an old document effect, etc. (can't think of more now).

I like this idea, and I think we can use the pipelines for the Reproducing Archetypes work as the predefined ones. Once we get those working well, we can add them to augraphy.default.prebuilt or something, so users can select one of those or the default_augraphy_pipeline containing all augmentations.
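
A minimal sketch of what a set of predefined pipelines could look like, assuming a recent augraphy release where `AugraphyPipeline` accepts `ink_phase`, `paper_phase`, and `post_phase` lists. The registry, the `get_prebuilt` helper, the pipeline names, and the particular augmentations chosen are all hypothetical illustrations; this is not the contents of `augraphy.default.prebuilt`.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a short name to a factory that builds a pipeline.
# Names and augmentation choices are illustrative assumptions only.
PREBUILT: Dict[str, Callable[[], object]] = {}


def register(name: str):
    """Decorator that records a pipeline factory under `name`."""
    def wrap(factory):
        PREBUILT[name] = factory
        return factory
    return wrap


@register("photocopy")
def photocopy_pipeline():
    # Imported lazily so the registry can be inspected without augraphy installed.
    from augraphy import AugraphyPipeline, BadPhotoCopy, DirtyDrum
    return AugraphyPipeline(
        ink_phase=[],
        paper_phase=[],
        post_phase=[DirtyDrum(), BadPhotoCopy()],
    )


@register("old_document")
def old_document_pipeline():
    from augraphy import AugraphyPipeline, InkBleed, Letterpress, NoiseTexturize
    return AugraphyPipeline(
        ink_phase=[InkBleed(), Letterpress()],
        paper_phase=[NoiseTexturize()],
        post_phase=[],
    )


def get_prebuilt(name: str):
    """Build the named pipeline, failing loudly on unknown names."""
    if name not in PREBUILT:
        raise KeyError(f"unknown pipeline {name!r}; choose from {sorted(PREBUILT)}")
    return PREBUILT[name]()
```

A user would then pick a pipeline by name, for example `pipeline = get_prebuilt("photocopy")`, and apply it to an image loaded as a numpy array. Keeping factories rather than ready-made objects means each call builds a fresh pipeline.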

kwcckw commented on May 30, 2024

I added a demo here using Google Colab:
https://colab.research.google.com/drive/1g0hDeneIGcy0sYtIzZSnNmZfesFdu5Sn?usp=sharing

jboarman commented on May 30, 2024

Thanks @kwcckw

Can you update that notebook to use gdown?

Once you've compiled all the files into a Google Drive folder, you will need the file IDs for each file: https://clay-atlas.com/us/blog/2021/07/01/python-en-gdown-package-download-file-google-drive/

If we are sharing more than a few files, then it might be best to ZIP the files together first so that it's a single file to be downloaded and unpacked within the notebook.
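
The ZIP approach could be sketched roughly as follows, assuming the `gdown` package is installed and the archive is shared publicly. The function names and the `assets.zip` file name are hypothetical; the file ID has to come from your own Drive share link.

```python
import zipfile
from pathlib import Path


def drive_url(file_id: str) -> str:
    """Direct-download URL form that gdown accepts for a shared Drive file."""
    return f"https://drive.google.com/uc?id={file_id}"


def unpack(archive: Path, dest: Path) -> list:
    """Extract `archive` into `dest` and return the extracted member names."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()


def fetch_and_unpack(file_id: str, dest: Path) -> list:
    """Download one ZIP by file ID and unpack it into `dest`."""
    import gdown  # third-party: pip install gdown

    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / "assets.zip"  # hypothetical local file name
    gdown.download(drive_url(file_id), str(archive), quiet=False)
    return unpack(archive, dest)
```

In a notebook cell this would be called as `fetch_and_unpack("<your-file-id>", Path("data"))`, so only one file ID needs to be maintained however many assets the demo uses.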

kwcckw commented on May 30, 2024

> Thanks @kwcckw
>
> Can you update that notebook to use gdown?
>
> Once you've compiled all the files into a Google Drive folder, you will need the file IDs for each file: https://clay-atlas.com/us/blog/2021/07/01/python-en-gdown-package-download-file-google-drive/
>
> If we are sharing more than a few files, then it might be best to ZIP the files together first so that it's a single file to be downloaded and unpacked within the notebook.

Thanks, I will look into this and apply the changes accordingly.

kwcckw commented on May 30, 2024

I added a new example with gdown, so users can now download the files and run the same thing on their end. Could someone try it out and let me know whether it runs?

Link: https://colab.research.google.com/drive/1IDkfmKKajMb2dlchW-rWN6kmrnGk2TWX?usp=sharing

I just tried it with another account and a fresh copy of the notebook, and so far there isn't any issue. Please let me know if you face any problems running the notebook.

jboarman commented on May 30, 2024

Can you remove all dependency on connecting to a personal Google Drive (outside of using gdown I mean)?

For example, we should be able to just use the local folder on the Colab server instead of mounting the “My Drive” reference.
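
As a rough illustration of working from the runtime's own disk, the sketch below keeps everything under the current working directory (on Colab this is typically `/content`) and never mounts Drive. The helper names, the `augraphy_demo` folder name, and the suffix list are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Hypothetical set of image types the demo notebook would look for.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def local_workdir(name: str = "augraphy_demo", root: Optional[Path] = None) -> Path:
    """Create (if needed) and return a folder on the runtime's own disk."""
    base = Path.cwd() if root is None else Path(root)
    workdir = base / name
    workdir.mkdir(parents=True, exist_ok=True)
    return workdir


def list_images(folder: Path) -> list:
    """Return image files in `folder`, sorted by name for reproducible runs."""
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
```

Files downloaded with gdown can be written into `local_workdir()` and picked up with `list_images`, so the notebook needs no access to anyone's personal Drive.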

kwcckw commented on May 30, 2024

> Can you remove all dependency on connecting to a personal Google Drive (outside of using gdown I mean)?
>
> For example, we should be able to just use the local folder on the Colab server instead of mounting the “My Drive” reference.

Alright, at this point I'm not aware of such a function yet; let me check again.

kwcckw commented on May 30, 2024

Here's the updated notebook, which runs without connecting to any Google Drive, but you still need to log in to your Google account to run it:
https://colab.research.google.com/drive/1MXNhPQ_37aiPmAOkfKNCDhKnlaPc1zDt?usp=sharing

Also, if the code is updated in the pip package, we can skip the section that downloads the zip file too.

kwcckw commented on May 30, 2024

I added a new notebook with an example pipeline for photocopied paper, and I've updated the previous old-document example to use the code directly from the repo. So right now there are 2 examples from my end:

Photocopy paper:
https://colab.research.google.com/drive/1hLHyyvNi-cRRDOcqtPnZPCw4j25oZMDG?usp=sharing
Old document:
https://colab.research.google.com/drive/1MXNhPQ_37aiPmAOkfKNCDhKnlaPc1zDt?usp=sharing

tbrettallen commented on May 30, 2024

> Photocopy paper:
> https://colab.research.google.com/drive/1hLHyyvNi-cRRDOcqtPnZPCw4j25oZMDG?usp=sharing

I'm having an issue running this notebook. I've attached the issue I've run into in the image below.

kwcckw commented on May 30, 2024

> Photocopy paper:
> https://colab.research.google.com/drive/1hLHyyvNi-cRRDOcqtPnZPCw4j25oZMDG?usp=sharing
>
> I'm having an issue running this notebook. I've attached the issue I've run into in the image below.

Sorry, this was probably caused by a typo in the init file. I've updated the code and tested it, so it should run without issues now.

proofconstruction commented on May 30, 2024

I get a type error running this.

kwcckw commented on May 30, 2024

> I get a type error running this.

Sorry, this notebook is still pending an update, since it uses the previous badphotocopy code. I will post here again when it is updated.

kwcckw commented on May 30, 2024

> Sorry, this notebook is still pending an update, since it uses the previous badphotocopy code. I will post here again when it is updated.

@proofconstruction I just updated the notebook. Please try again and let me know if you face any issues running it.

proofconstruction commented on May 30, 2024

These look really good!

Is the "Photocopy machine" code now part of BadPhotoCopy?

The "Synthetic old documents" code can probably become a new augmentation, or maybe a utility class. We should add a new Issue to discuss this.

kwcckw commented on May 30, 2024

> These look really good!
>
> Is the "Photocopy machine" code now part of BadPhotoCopy?
>
> The "Synthetic old documents" code can probably become a new augmentation, or maybe a utility class. We should add a new Issue to discuss this.

I think this should be a "pipeline level" demo where we apply several augmentations to produce a more relatable real-life application. Eventually we can have several predefined pipelines (a sort of utility class), each with a different purpose, for example a photocopy machine effect, an old document effect, etc. (can't think of more now).

proofconstruction commented on May 30, 2024

Closed with PR #154
