andreagemelli / doc2graph

Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.

Home Page: https://link.springer.com/chapter/10.1007/978-3-031-25069-9_22

License: MIT License

Python 26.85% Jupyter Notebook 73.15%

deep-learning document-understanding geometric-deep-learning gnn key-information-extraction layout-analysis nlp pytorch table-detection

doc2graph's Introduction

💫 About Me:

Hi 👋,
I am Andrea, a passionate person who loves to study and discover.
I started a Ph.D. on 🌱 Graph Neural Networks and Document Understanding in 2020, based at 📍 Università degli Studi di Firenze, Florence. I also spent one year as a visiting researcher at the Computer Vision Center (CVC) in Barcelona, as part of a joint program between our labs. At the start of 2023 I was named an IAPR International Scholar, winning an IAPR Research Scholarship.

📐 My Github Projects

On my GitHub you can find several repos / projects that I have worked on over the last five years.
I have worked on Machine Learning 🧠, Games 🕹️ and Software 💾

During Ph.D.

🧠 Doc2Graph: transforms your documents into graphs and exploits a GNN to solve several tasks. (🔗 repo | 📄 paper)
🧠 GNN-TableExtraction: a graph-based technique to extract tables from scientific papers. (🔗 repo | 📄 paper)
🧠 DA-GraphTab: data augmentation for graph structures (🔗 repo | 📄 paper)
🧠 cte-dataset: a dataset for Contextualized Table Extraction (🔗 repo | 📄 paper)
🕹️ guessmylanguage: a team-based game to annotate handwritings (🔗 repo)

During MCS

🧠 Action-recognition-by-2D-skeleton-analysis: a computer vision project to recognize actions in a scene by analyzing the movement of people's skeletons. (🔗 repo | 📄 report)
🧠 Flying-Objects-Detection-and-Recognition: a computer vision project aiming at detecting UAVs in airport spaces. (🔗 repo | 📄 report)

🧠 Floorplan-Text-Detection-and-Recognition: automatic detection and recognition of text in floorplan images, e.g. rooms' names (🔗 repo | 📄 report)

💾 lapis: a tool to organize professors and students meetings (🔗 repo | 📄 report)

💾 OnlineShopSimulator: a TDD project, simulating an online shop (🔗 repo)

🕹️ Escape-Room-VR: a 3D escape room game developed using an Oculus Rift (🔗 repo)

🕹️ gameoflife: the famous game developed using Kivy (🔗 repo)

During free time

🕹️ FightGPT: an RPG game developed using ChatGPT (🔗 repo)
💾 andreagemelli.github.io: my portfolio website (🔗 repo)

📬 Keep in touch!

You can write me at ✉️ [email protected]
Get to know more about me on my Web Page


doc2graph's Issues

python src/main.py giving KeyError: 'feat'

Training on DocVQA?

Super interesting project you have here!

I'm interested in training for DocVQA. I've used models like LiLT and LayoutLMV1/2/3 in the past, but graph approaches like this are pretty new to me.

Traditionally, DocVQA models are set up to predict start/end positions of the answer within the input text. Any tips on setting this up with Doc2Graph?

Passing linking for test images

Hi,

At prediction time, are we passing the linking of the graph, which is the very thing we need to predict?
In graph builder.py the linking for graph nodes is created during training, and the same is done for testing.

How to set tg batch_size

When I train e2e on FUNSD, an error appears at the relu function line:

RuntimeError: CUDA out of memory. Tried to allocate 4.21 GiB (GPU 0; 11.74 GiB total capacity; 4.55 GiB already allocated; 3.44 GiB free; 4.56 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF.

batch_size: 16
epochs: 10000
lr: 1e-3
weight_decay: 1e-4
val_size : 0.1
optimizer: Adam # or AdamW, SGD - NOT YET IMPLEMENTED
scheduler: ReduceLROnPlateau # CosineAnnealingLR, None - NOT YET IMPLEMENTED
stopper_metric: acc # loss or acc
seed: 42
The batch_size above seems to have no effect, because all training graphs are batched at once:
train_graphs = [data.graphs[i] for i in train_index]
tg = dgl.batch(train_graphs)
tg = tg.int().to(device)

If I don't use k-fold, train_graphs contains 149 graphs, so tg is a single batch of 149 graphs and CUDA runs out of memory. I have tested both fully and knn, and both fail. How do I set the batch size of tg to avoid the RuntimeError? The batch_size in train.yaml above seems to have no effect.

My GPU is a 12 GB 3060. Thank you!
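
One possible workaround, sketched under assumptions: instead of passing all 149 graphs to a single `dgl.batch` call, split `train_graphs` into chunks and batch each chunk separately. The helper name `iter_minibatches` is hypothetical and not part of Doc2Graph, and the training loop in the comments is only an outline of where it might be used.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_minibatches(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Hypothetical use inside the training loop (requires dgl and torch):
# for chunk in iter_minibatches(train_graphs, batch_size=16):
#     tg = dgl.batch(chunk).int().to(device)
#     ...forward pass, loss, backward and optimizer step on `tg`...
```

With 149 graphs and a batch size of 16 this yields nine batches of 16 graphs and one of 5. Fully connected graphs grow quadratically in edges, so a smaller batch size may still be needed on a 12 GB card.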

I have predictions for 50 images, but how do I map the predictions back to each individual result?
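
A minimal sketch, assuming the 50 graphs were merged with `dgl.batch` as in the training code quoted earlier: a batched DGL graph keeps its nodes in the order of the input graphs, and `batch_num_nodes()` reports how many nodes each original graph contributed, so a flat prediction vector can be cut back into per-image pieces. The helper `split_by_counts` is hypothetical, not a Doc2Graph function.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_by_counts(flat: Sequence[T], counts: Sequence[int]) -> List[List[T]]:
    """Cut a flat sequence into consecutive pieces with the given lengths."""
    if sum(counts) != len(flat):
        raise ValueError("counts must sum to the number of predictions")
    pieces, start = [], 0
    for n in counts:
        pieces.append(list(flat[start:start + n]))
        start += n
    return pieces


# Hypothetical use with a batched graph `bg` and node predictions `node_preds`:
# counts = bg.batch_num_nodes().tolist()
# per_image = split_by_counts(node_preds.tolist(), counts)
# per_image[i] then holds the node predictions for the i-th input graph.
```

Edge predictions could be split the same way using `bg.batch_num_edges()`.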

RuntimeError: float division by zero in main.py

When I run python main.py I get the above error. Kindly help me with this.

Environment installation problem

Hi! Thank you for the repo!
I have a problem installing the dependencies. Could you help solve this?

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
dglgo 0.0.2 requires pydantic>=1.9.0, but you have pydantic 1.8.2 which is incompatible.

And when I run the scripts:

Traceback (most recent call last):
  File "/home/addudkin/doc2graph/doc2graph/src/main.py", line 4, in <module>
    from src.inference import inference
  File "/home/addudkin/doc2graph/doc2graph/src/inference.py", line 7, in <module>
    from src.data.feature_builder import FeatureBuilder
  File "/home/addudkin/doc2graph/doc2graph/src/data/feature_builder.py", line 3, in <module>
    import spacy
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/__init__.py", line 14, in <module>
    from . import pipeline  # noqa: F401
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/pipeline/__init__.py", line 1, in <module>
    from .attributeruler import AttributeRuler
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/pipeline/attributeruler.py", line 6, in <module>
    from .pipe import Pipe
  File "spacy/pipeline/pipe.pyx", line 8, in init spacy.pipeline.pipe
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/training/__init__.py", line 11, in <module>
    from .callbacks import create_copy_from_base_model  # noqa: F401
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/training/callbacks.py", line 3, in <module>
    from ..language import Language
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/language.py", line 25, in <module>
    from .training.initialize import init_vocab, init_tok2vec
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/training/initialize.py", line 14, in <module>
    from .pretrain import get_tok2vec_ref
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/training/pretrain.py", line 16, in <module>
    from ..schemas import ConfigSchemaPretrain
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/schemas.py", line 216, in <module>
    class TokenPattern(BaseModel):
  File "pydantic/main.py", line 299, in pydantic.main.ModelMetaclass.__new__
  File "pydantic/fields.py", line 411, in pydantic.fields.ModelField.infer
  File "pydantic/fields.py", line 342, in pydantic.fields.ModelField.__init__
  File "pydantic/fields.py", line 451, in pydantic.fields.ModelField.prepare
  File "pydantic/fields.py", line 545, in pydantic.fields.ModelField._type_analysis
  File "pydantic/fields.py", line 550, in pydantic.fields.ModelField._type_analysis
  File "/home/addudkin/miniconda3/envs/doc2graph/lib/python3.9/typing.py", line 852, in __subclasscheck__
    return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class

My packages:
my_requirements.txt

OSError: [E050] Can't find model 'en_core_web_lg'.
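
spaCy's E050 error usually means the model package has not been installed in the active environment. spaCy models are distributed as ordinary Python packages, so a quick check is whether the package can be found. The sketch below assumes only that; the helper name `spacy_model_installed` is hypothetical.

```python
import importlib.util


def spacy_model_installed(name: str) -> bool:
    """Return True if a package with this name is importable."""
    return importlib.util.find_spec(name) is not None


MODEL = "en_core_web_lg"
if not spacy_model_installed(MODEL):
    # spaCy's own CLI downloads and installs model packages.
    print(f"Model missing; try: python -m spacy download {MODEL}")
```

The download has to be run inside the same environment (for example the doc2graph conda environment) that runs main.py.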

get RuntimeError: Error(s) in loading state_dict for E2E: when doing test on model from training

Here's my config for training; I use the FUNSD dataset:
%run 'main.py' --add-geom --add-embs --add-hist --add-visual --add-eweights --src-data 'FUNSD' --gpu 0 --edge-type 'fully' --node-granularity 'gt' --model 'e2e' --weights *.pt

Then I run the best model using this:
%run 'main.py' -addG -addT -addE -addV --gpu 0 --test --weights e2e-20230213-0530.pt

Then I got the error:
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1497, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for E2E:
Unexpected key(s) in state_dict: "projector.modalities.3.0.weight", "projector.modalities.3.0.bias", "projector.modalities.3.1.weight", "projector.modalities.3.1.bias".
size mismatch for projector.modalities.1.0.weight: copying a param with shape torch.Size([300, 4]) from checkpoint, the shape in current model is torch.Size([300, 300]).
size mismatch for projector.modalities.2.0.weight: copying a param with shape torch.Size([300, 300]) from checkpoint, the shape in current model is torch.Size([300, 1448]).
size mismatch for message_passing.linear.weight: copying a param with shape torch.Size([1200, 2400]) from checkpoint, the shape in current model is torch.Size([900, 1800]).
size mismatch for message_passing.linear.bias: copying a param with shape torch.Size([1200]) from checkpoint, the shape in current model is torch.Size([900]).
size mismatch for message_passing.lynorm.weight: copying a param with shape torch.Size([1200]) from checkpoint, the shape in current model is torch.Size([900]).
size mismatch for message_passing.lynorm.bias: copying a param with shape torch.Size([1200]) from checkpoint, the shape in current model is torch.Size([900]).
size mismatch for edge_pred.W1.weight: copying a param with shape torch.Size([300, 2414]) from checkpoint, the shape in current model is torch.Size([300, 1814]).
size mismatch for node_pred.0.weight: copying a param with shape torch.Size([4, 1200]) from checkpoint, the shape in current model is torch.Size([4, 900]).
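
The shapes in this error hint at the cause: the checkpoint's node head takes 1200 inputs (4 x 300) and it has a fourth `projector.modalities` entry, while the model built at test time takes 900 (3 x 300). That may mean the test command enables one feature group fewer than training did; the training command passes `--add-hist`, which has no visible counterpart in the test command. To compare the two sides explicitly, something like the following sketch could help. `diff_shapes` is a hypothetical helper working on plain name-to-shape dictionaries.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_shapes(checkpoint: Dict[str, Shape], model: Dict[str, Shape]) -> List[str]:
    """List keys that are missing, unexpected, or differ in shape."""
    report = []
    for key in sorted(set(checkpoint) | set(model)):
        if key not in model:
            report.append(f"unexpected in checkpoint: {key} {checkpoint[key]}")
        elif key not in checkpoint:
            report.append(f"missing in checkpoint: {key} {model[key]}")
        elif checkpoint[key] != model[key]:
            report.append(
                f"size mismatch: {key} checkpoint={checkpoint[key]} model={model[key]}"
            )
    return report


# Two entries copied from the error message above:
ckpt = {"projector.modalities.1.0.weight": (300, 4), "node_pred.0.weight": (4, 1200)}
current = {"projector.modalities.1.0.weight": (300, 300), "node_pred.0.weight": (4, 900)}
for line in diff_shapes(ckpt, current):
    print(line)

# With torch available, the dictionaries could be built like this:
# ckpt = {k: tuple(v.shape) for k, v in torch.load(path, map_location="cpu").items()}
# current = {k: tuple(v.shape) for k, v in model.state_dict().items()}
```

If the report shows whole modalities missing or extra, rebuilding the model with the same feature flags used for training is the first thing to try.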

What does this prediction actually tell?

This is the prediction that I am getting with E2E on the FUNSD dataset. What does it actually tell me?

FUNSD test result is very low!

Hi, I tried your public model on the FUNSD test set but got low results. Is something wrong?
I tried both the edge model and the e2e model.

How to train on my own data?

Pydantic problems with dependency mismatches

I get dependency conflicts that prevent me from running the main.py script.


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spacy 3.3.0 requires pydantic!=1.8,!=1.8.1,<1.9.0,>=1.7.4, but you have pydantic 2.5.3 which is incompatible.
thinc 8.0.17 requires pydantic!=1.8,!=1.8.1,<1.9.0,>=1.7.4, but you have pydantic 2.5.3 which is incompatible.
doc2graph 0.2.0b0.post7+git.99ac9e69 requires pydantic==1.8.2, but you have pydantic 2.5.3 which is incompatible.

Then, if I install pydantic 1.8.2, I get:


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
dglgo 0.0.2 requires pydantic>=1.9.0, but you have pydantic 1.8.2 which is incompatible.

And so on, back and forth.

[edit]
Here is what I get when I run main.py for set-up.

Traceback (most recent call last):
  File "/Users/z1ggy/projects/forma/doc2graph/src/main.py", line 4, in <module>
    from src.inference import inference
  File "/Users/z1ggy/projects/forma/doc2graph/src/inference.py", line 7, in <module>
    from src.data.feature_builder import FeatureBuilder
  File "/Users/z1ggy/projects/forma/doc2graph/src/data/feature_builder.py", line 3, in <module>
    import spacy
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/__init__.py", line 14, in <module>
    from . import pipeline  # noqa: F401
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/pipeline/__init__.py", line 1, in <module>
    from .attributeruler import AttributeRuler
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/pipeline/attributeruler.py", line 6, in <module>
    from .pipe import Pipe
  File "spacy/pipeline/pipe.pyx", line 8, in init spacy.pipeline.pipe
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/training/__init__.py", line 11, in <module>
    from .callbacks import create_copy_from_base_model  # noqa: F401
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/training/callbacks.py", line 3, in <module>
    from ..language import Language
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/language.py", line 25, in <module>
    from .training.initialize import init_vocab, init_tok2vec
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/training/initialize.py", line 14, in <module>
    from .pretrain import get_tok2vec_ref
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/training/pretrain.py", line 16, in <module>
    from ..schemas import ConfigSchemaPretrain
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/spacy/schemas.py", line 216, in <module>
    class TokenPattern(BaseModel):
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/pydantic/main.py", line 299, in __new__
    fields[ann_name] = ModelField.infer(
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/pydantic/fields.py", line 411, in infer
    return cls(
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/pydantic/fields.py", line 342, in __init__
    self.prepare()
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/pydantic/fields.py", line 451, in prepare
    self._type_analysis()
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/pydantic/fields.py", line 545, in _type_analysis
    self._type_analysis()
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/site-packages/pydantic/fields.py", line 550, in _type_analysis
    if issubclass(origin, Tuple):  # type: ignore
  File "/Users/z1ggy/anaconda3/envs/doc2graph/lib/python3.9/typing.py", line 852, in __subclasscheck__
    return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class

Any pro tips here?
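
As declared, the two constraints cannot both be met: spacy 3.3.0 and thinc 8.0.17 ask for pydantic below 1.9.0, while dglgo 0.0.2 asks for 1.9.0 or newer. Before changing pins it helps to see exactly what is installed. The sketch below only reads package metadata; the helper name `installed_versions` is hypothetical, and listing typing-extensions is an assumption, since this `issubclass()` TypeError has been reported for older pydantic 1.x releases combined with newer typing-extensions releases.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


to_check = ["spacy", "thinc", "pydantic", "dglgo", "typing-extensions"]
for name, version in installed_versions(to_check).items():
    print(f"{name}: {version or 'not installed'}")
```

If dglgo is not actually needed to run main.py, its pip warning may be harmless; that is a guess to verify rather than a confirmed fix.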

Anyone successfully had this run on M1 Mac CPU?

As the title says 👍

Curious what I need to change as I keep running into run-time errors:

RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
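
The error message itself names the fix: pass `map_location` to `torch.load` so CUDA-saved tensors are mapped to the CPU. A minimal sketch of choosing the device once and reusing it everywhere follows; `pick_device` is a hypothetical helper, and the commented lines show where it might plug into loading code.

```python
def pick_device(cuda_available: bool, gpu_index: int = 0) -> str:
    """Return a torch-style device string, falling back to the CPU."""
    if cuda_available and gpu_index >= 0:
        return f"cuda:{gpu_index}"
    return "cpu"


# Hypothetical use (requires torch):
# device = pick_device(torch.cuda.is_available())
# state = torch.load(weights_path, map_location=torch.device(device))
# model.load_state_dict(state)
# model.to(device)
```

Any hard-coded `.to('cuda:0')` elsewhere in the code path would need the same treatment to run on a CPU-only machine.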

RuntimeError: weight tensor should be defined either for all 4 classes or no classes but got weight tensor of shape: [3]

When I set train_batch_size=1, I get this error.
It happens in:

 n_loss = compute_crossentropy_loss(n_scores.to(device), tg.ndata['label'].to(device)) 
==>  
def compute_crossentropy_loss(scores: torch.Tensor, labels: torch.Tensor):
    w = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(labels.cpu().numpy()),
                                          y=labels.cpu().numpy())
    return torch.nn.CrossEntropyLoss(weight=torch.tensor(w, dtype=torch.float32).to('cuda:0'))(scores, labels)

RuntimeError: weight tensor should be defined either for all 4 classes or no classes but got weight tensor of shape: [3]

I think the error occurs because one image contains only three of the four labels. Could you tell me the reason and how to fix it? Thank you so much.
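
That reading matches the code quoted above: `classes=np.unique(labels...)` yields one weight per label present in the batch, while `CrossEntropyLoss` expects one weight per output class. One possible fix, sketched here as an assumption rather than the repository's method, is to always return a weight for every class and give absent classes a weight of 0. The helper `balanced_weights` is hypothetical; it follows the same "balanced" formula, n_samples / (n_present_classes * count).

```python
from collections import Counter
from typing import List, Sequence


def balanced_weights(labels: Sequence[int], num_classes: int) -> List[float]:
    """Balanced class weights with a fixed length of `num_classes`.

    Classes that do not occur in `labels` get weight 0.0, which is harmless
    because no sample in the batch can select them as a target.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_present = len(counts)
    return [
        n_samples / (n_present * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


# Class 2 is absent here, yet the result still has four entries.
weights = balanced_weights([0, 1, 1, 3], num_classes=4)

# Hypothetical use in the loss (requires torch):
# w = torch.tensor(balanced_weights(labels.cpu().tolist(), scores.shape[1]),
#                  dtype=torch.float32, device=scores.device)
# loss = torch.nn.CrossEntropyLoss(weight=w)(scores, labels)
```

Using `scores.device` instead of the hard-coded `'cuda:0'` would also let the same function run on CPU-only machines.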

RuntimeError: CUDA out of memory

How can I fix this error? I have a GPU with 23 GB of memory.

e2e-funsd-best.pt Error(s) in loading state_dict

```
import torch
from src.models.graphs import SetModel
from src.paths import CHECKPOINTS

sm = SetModel(name='e2e', device=device)
model = sm.get_model(4, 2, chunks, False) # 4 and 2 refers to nodes and edge classes, check paper for details!
model.load_state_dict(torch.load(CHECKPOINTS / 'e2e-funsd-best.pt', map_location=torch.device('cpu'))) # load pretrained model
model.eval() # set the model for inference only
```

MODEL

-> Using E2E
-> Total params: 7674914
-> Device: False

RuntimeError Traceback (most recent call last)
Cell In[19], line 7
5 sm = SetModel(name='e2e', device=device)
6 model = sm.get_model(4, 2, chunks, False) # 4 and 2 refers to nodes and edge classes, check paper for details!
----> 7 model.load_state_dict(torch.load(CHECKPOINTS / 'e2e-funsd-best.pt', map_location=torch.device('cpu'))) # load pretrained model
8 model.eval() # set the model for inference only

File /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/torch/nn/modules/module.py:1671, in Module.load_state_dict(self, state_dict, strict)
1666 error_msgs.insert(
1667 0, 'Missing key(s) in state_dict: {}. '.format(
1668 ', '.join('"{}"'.format(k) for k in missing_keys)))
1670 if len(error_msgs) > 0:
-> 1671 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
1672 self.__class__.__name__, "\n\t".join(error_msgs)))
1673 return _IncompatibleKeys(missing_keys, unexpected_keys)

RuntimeError: Error(s) in loading state_dict for E2E:
Missing key(s) in state_dict: "projector.modalities.3.0.weight", "projector.modalities.3.0.bias", "projector.modalities.3.1.weight", "projector.modalities.3.1.bias", "projector.modalities.4.0.weight", "projector.modalities.4.0.bias", "projector.modalities.4.1.weight", "projector.modalities.4.1.bias", "projector.modalities.5.0.weight", "projector.modalities.5.0.bias", "projector.modalities.5.1.weight", "projector.modalities.5.1.bias".
size mismatch for projector.modalities.0.0.weight: copying a param with shape torch.Size([300, 4]) from checkpoint, the shape in current model is torch.Size([300, 0]).
size mismatch for projector.modalities.1.0.weight: copying a param with shape torch.Size([300, 300]) from checkpoint, the shape in current model is torch.Size([300, 0]).
size mismatch for projector.modalities.2.0.weight: copying a param with shape torch.Size([300, 1448]) from checkpoint, the shape in current model is torch.Size([300, 0]).
size mismatch for message_passing.linear.weight: copying a param with shape torch.Size([900, 1800]) from checkpoint, the shape in current model is torch.Size([1800, 3600]).
size mismatch for message_passing.linear.bias: copying a param with shape torch.Size([900]) from checkpoint, the shape in current model is torch.Size([1800]).
size mismatch for message_passing.lynorm.weight: copying a param with shape torch.Size([900]) from checkpoint, the shape in current model is torch.Size([1800]).
size mismatch for message_passing.lynorm.bias: copying a param with shape torch.Size([900]) from checkpoint, the shape in current model is torch.Size([1800]).
size mismatch for edge_pred.W1.weight: copying a param with shape torch.Size([300, 1814]) from checkpoint, the shape in current model is torch.Size([300, 3614]).
size mismatch for node_pred.0.weight: copying a param with shape torch.Size([4, 900]) from checkpoint, the shape in current model is torch.Size([4, 1800]).

Don't know where the output/prediction is saved

python src/main.py -addG -addT -addE -addV --gpu 0 --test --weights e2e-funsd-best.pt

Using this command I got the BEST and AVERAGE results for the model, but I don't know where the output/prediction is saved.