# VIMER: Visual Pre-training Foundation Model Repository
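General-purpose self-supervised visual pre-training model
OCR structured pre-training model with field-level multimodal feature enhancement
Unified feature representation pre-training model
Product image-text representation pre-training model with unified multi-source information modeling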
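How do I run layout analysis in the v2 version? The code does not seem to include a task for it, although the README suggests that this task is supported.
Is there no inference code for end-to-end information extraction?
Could you provide the necessary files? I have run into many problems and would appreciate some guidance on getting UMS to run. Respect!
A question:
1. Are the images real-world scene photos, or PDF-like document images?
2. If they are real-world photos, have they been rectified with a perspective transform, or are they still skewed?
The documentation is all in English. Is Chinese an embarrassment to you?
Which tool was used to annotate the XFUND and FUNSD datasets?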
Hi,
Do you plan to release the pre-training code? Thanks!
First of all, thank you for sharing this amazing work!
I ran into something confusing while reading the code and would be very grateful if the authors could clarify it.
The description of the Relationship Extraction Module in the paper is as follows:
However, the implementation at https://github.com/PaddlePaddle/VIMER/blob/main/StrucTexT/external/linking/modules/model.py#L101 differs: it is a linear transformation of the absolute value of the difference between the features of the two nodes.
Is there any special consideration here?
Thanks again for reading and for your help!
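For readers following the question above, here is a minimal NumPy sketch of the operation as the post describes it: a linear transformation applied to the absolute difference of two node features. The helper name `pair_score`, the feature sizes, and the random weights are hypothetical; this is not the code from `model.py`.

```python
import numpy as np

def pair_score(feat_a, feat_b, weight, bias):
    """Linear transformation of |feat_a - feat_b| (hypothetical helper)."""
    diff = np.abs(feat_a - feat_b)   # element-wise absolute difference
    return diff @ weight + bias      # linear layer: (d,) @ (d, k) + (k,)

rng = np.random.default_rng(0)
d, k = 8, 1                          # assumed feature and output sizes
weight = rng.normal(size=(d, k))
bias = np.zeros(k)
node_a = rng.normal(size=d)
node_b = rng.normal(size=d)

score_ab = pair_score(node_a, node_b, weight, bias)
score_ba = pair_score(node_b, node_a, weight, bias)
```

Because the absolute difference is symmetric, `score_ab` and `score_ba` are equal in this sketch, so such a score by itself does not depend on the order of the two nodes.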
As in the title.
First of all, thank you for sharing this amazing work!
Do you plan to release the training script for multi-task models? I am most interested in the details of how the models are trained.
CAE is missing encoder_weight.pd and decoder_weight.pd. Could you provide download links?
As in the title.
Is an Entity Linking inference model for StrucTexT v2 available for download?
In the table recognition code of StrucTexT v2, what do link_up, link_down, link_left, and link_right represent?
link_up = link_probs[:, 0:1, :, :]
link_down = link_probs[:, 1:2, :, :]
link_left = link_probs[:, 2:3, :, :]
link_right = link_probs[:, 3:4, :, :]
Looking forward to your answer!
Running the following script in v2 produces many problems:
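As a small aid to the question above, the NumPy sketch below shows only what the slicing does, not what the four channels mean. The array shape `(N, 4, H, W)` and the random values are assumptions, and the original code presumably operates on framework tensors rather than NumPy arrays.

```python
import numpy as np

# Assumed layout: (batch, 4 link channels, height, width).
link_probs = np.random.default_rng(0).random((2, 4, 5, 6))

link_up = link_probs[:, 0:1, :, :]
link_down = link_probs[:, 1:2, :, :]
link_left = link_probs[:, 2:3, :, :]
link_right = link_probs[:, 3:4, :, :]

# Range slicing keeps the channel axis; integer indexing would drop it.
print(link_up.shape)            # (2, 1, 5, 6)
print(link_probs[:, 0].shape)   # (2, 5, 6)

# Concatenating the four slices along axis 1 restores the original array.
restored = np.concatenate([link_up, link_down, link_left, link_right], axis=1)
```

Each variable is therefore a single-channel map with the same spatial size as the input, one per channel index 0 to 3.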
python -u ./tools/eval.py --config_file=configs/end2end_ocr/ocr_funsd_base.json --task_type=end2end_ocr --label_path=./data/funsd/dataset/testing_data/otations --image_path=./data/funsd/dataset/testing_data/images --weights_path=StrucTexT_v2_end2end_ie_base.pdparams
Hi, I noticed that the FUNSD entity linking scores reported in this repo are higher than the numbers in the paper. For example, "StrucTexT-chn&eng base" scores 0.7045, while "StrucTexT-eng base (paper)" scores 0.4410. Could you let me know what accounts for the improvement? Or is something wrong with the original paper's approach? Thanks!
I have read the code and am confused by the snippet below:
utils/metrics/rescore_metric.py, lines 38~45:
    for row in range(rows):
        for col in range(cols):
            if label_b[row, col] == 1:
                rel = {"head": (row, row+1),
                       "tail": (col, col+1),
Why is "head" set to (row, row+1) rather than (row)?
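To make the question concrete, here is a self-contained sketch of the loop pattern quoted above. The matrix values and the two-key dictionary are hypothetical (the original dictionary is truncated in the post), and reading `(row, row+1)` as a half-open `(start, end)` span covering one item is only one possible interpretation, not a confirmed answer from the authors.

```python
import numpy as np

# Hypothetical 3x3 link matrix: label_b[i, j] == 1 means item i links to item j.
label_b = np.array([[0, 1, 0],
                    [0, 0, 1],
                    [0, 0, 0]])
rows, cols = label_b.shape

relations = []
for row in range(rows):
    for col in range(cols):
        if label_b[row, col] == 1:
            # Assumed reading: (start, end) half-open span of length one.
            relations.append({"head": (row, row + 1),
                              "tail": (col, col + 1)})

# Under that reading, each span expands back to exactly one index.
heads = [list(range(*rel["head"])) for rel in relations]
```

Note that in Python `(row)` is simply the integer `row`, not a tuple, so the two forms differ in type as well as in content.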
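While reading the code I found that the implementations of many classes and functions are missing, for example NameAdapter and many others. I hope the authors can fill them in. Thanks.
Hi,
Are you planning to provide a fine-tuning example for information extraction?
Hi, I am trying to run the inference code on the FUNSD labeling task, but I am hitting a loading problem. Can anyone comment on this?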
in load
    load_result = pickle.load(f, encoding='latin1')
MemoryError
@linan142857 Has the prediction code for a single image been open-sourced? Thanks.
As in the title.
Hi! Nice work on https://github.com/PaddlePaddle/VIMER/tree/main/StrucTexT/v2.
What is the license of the pretrained model and the code?
Is it the Apache License (the same as the Paddle and PaddleOCR repos)?
CC: @zhouwei25
Hi VIMER Team, I am trying to visualize the predictions using the script from "Infer fine-tuned models", but it only returns the metrics of the model itself. Could you please guide me on how to produce a visualization like the demo picture at the bottom of the README? Thanks!
Is there more information about UFO: a paper, a Docker image, a training script, the GPUs used for training?
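Following https://aistudio.baidu.com/aistudio/datasetdetail/147611, I created a project and saw that 4 model files were already stored in data. I then installed the requirements as described in the README. When I tried to download the RVL-CDIP document image classification files, the download failed, apparently because the files are hosted on Google Docs. For files hosted at addresses that cannot be downloaded from, could the maintainers place them in advance somewhere downloadable on aistudio.baidu.com?
This project is very close to real application scenarios and very promising, but the missing documentation and examples make it hard to get involved. I hope the maintainers take this seriously. If it had a complete test environment and detailed documentation, as PaddleOCR does, contributors would surely be enthusiastic. I urgently hope the maintainers will first provide a test example on aistudio.baidu.com. Many thanks!
Are there plans to release the pre-trained models for download along with the training code? Also, how should we prepare our own datasets?
What platform is UFO trained on? What are the hardware requirements and limitations?
Does anyone have further information about the detailed network model structure of UFO? Could you share it for reference? Thank you!
Is there complete fine-tuning and inference code for object detection?
Hi, after finishing supernet training, I found that the saved models are per task: one model is saved in each task's model directory. This differs from the supernet parameter structure shown in extract_task_specific_model.py. Is there an additional step that merges all the task models? If so, could you provide the merging procedure or code?
Looking forward to your answer.
How is pre-training for detection tasks carried out? Are downstream detection tasks supported, and is there any documentation for reference? Thank you!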
Thanks for your contribution to the development of the unified model.
I notice that the config file for ViT base model is not uploaded.
The model configuration described in the paper is based on ViT base, so I really hope you can upload it.
Hi, I am trying to test inference, but I am not able to download the pretrained model artifact. Could you please verify its link? Thanks!