
Comments (12)

lck1201 commented on June 14, 2024

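Thanks for the reply.
So the overall pipeline, as I understand it, is:
1) Obtain the multi-person poses P for the image by some method (predicted or G.T.)
2) Pass the image through the backbone network to get the feature map F
3) Compute H from the pose template and each person's pose, then apply H to F
4) Draw the PAFs and joint heatmaps from P
5) Feed feature map + PAF + joint heatmaps into SegModule to get the masks of the people in the image
6) Apply AlignReverse to the masks to get the final result
Is this understanding correct?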
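
For readers following along, step 4 can be illustrated with a minimal NumPy sketch. It assumes one Gaussian blob per joint, which is a common convention in pose estimation but is not confirmed anywhere in this thread; `joint_heatmaps` and `sigma` are hypothetical names, and PAFs are left out.

```python
import numpy as np

def joint_heatmaps(keypoints, height, width, sigma=2.0):
    """Draw one Gaussian heatmap per joint; keypoints is (K, 2) in (x, y) order.

    The Gaussian shape and sigma are assumptions made for illustration only.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    maps = np.zeros((len(keypoints), height, width), dtype=np.float32)
    for k, (x, y) in enumerate(keypoints):
        squared_dist = (xs - x) ** 2 + (ys - y) ** 2
        maps[k] = np.exp(-squared_dist / (2.0 * sigma ** 2))
    return maps

# Two made-up joints on a 16 x 32 canvas.
pose = np.array([[8.0, 4.0], [20.0, 12.0]])
heatmaps = joint_heatmaps(pose, height=16, width=32)
print(heatmaps.shape)
```

Each map peaks at its joint location, so the stack has one channel per joint and can be concatenated with other per-person features.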

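PS: The leftmost image in Figure 4 (together with the phrase "image + pose as input") was somewhat ambiguous to me. I had assumed that heatmaps are generated from the joints, concatenated with the image, and then passed through the backbone.

I also have two more questions:
a) Figure 4 has two "+ Concat" blocks. What is the difference between them? Do they refer to different people?
b) I'm not very familiar with segmentation work, so I'd like to ask: this paper does multi-person segmentation. Supposing there are N people in the image, does SegModule output the masks of all N people directly, or does it output each person's mask one at a time?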

from pose2seg.

lck1201 commented on June 14, 2024

Also, the website http://www.liruilong.cn/Pose2Seg/index.html returns a 404...


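liruilong940607 commented on June 14, 2024

1.1) Each person's pose is represented as a vector; see Section 4.2.1 of the paper.
1.2) Figure 4 of the paper states that it is a concat, i.e. concatenation along the channel dimension.
2) Simple Baselines@MSRA is a top-down method. Because of NMS, top-down methods cannot in principle handle occlusion. See the paper's Introduction for the discussion.
3) There are not two predictions. Generating the skeleton features requires no prediction: they are drawn directly after the pose vector's coordinates have been aligned with affine-align.
4) Figure 4 is e2e (the input is image + pose).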
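
To make the alignment in point 3 concrete, here is a minimal NumPy sketch that fits a 2x3 matrix H mapping a person's keypoints onto a template by ordinary least squares. This is an illustration under assumptions, not the repository's code: the paper may constrain H differently, and `estimate_affine`, `apply_affine`, and the toy coordinates are hypothetical.

```python
import numpy as np

def estimate_affine(pose, template):
    """Least-squares 2x3 matrix H such that H applied to pose approximates template.

    pose and template are (K, 2) arrays of (x, y) keypoints.
    An unconstrained affine fit is an assumption made for this sketch.
    """
    pose = np.asarray(pose, dtype=float)
    template = np.asarray(template, dtype=float)
    ones = np.ones((pose.shape[0], 1))
    A = np.hstack([pose, ones])                     # (K, 3) homogeneous coordinates
    X, *_ = np.linalg.lstsq(A, template, rcond=None)  # solves A @ X = template
    return X.T                                      # (2, 3)

def apply_affine(H, points):
    """Apply a 2x3 affine matrix to (K, 2) points."""
    points = np.asarray(points, dtype=float)
    return points @ H[:, :2].T + H[:, 2]

# Toy template, and a pose that is the template scaled by 2 and shifted.
template = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 20.0], [10.0, 20.0]])
pose = template * 2.0 + np.array([5.0, -3.0])

H = estimate_affine(pose, template)
aligned = apply_affine(H, pose)
print(np.round(H, 3))
```

Once the keypoints are in template coordinates, skeleton features can be drawn from them without any further prediction, which matches the explanation above.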

Also, thanks for the heads-up; the website link has been fixed.


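liruilong940607 commented on June 14, 2024

Right! You got the basic idea!

a) No difference; the concat is done for every person.
b) It outputs all N people directly. The input is a feature of shape (N, C1, H, W) and the output is a score map of shape (N, C2, H, W), corresponding to the N people.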
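
The shapes in (b) can be checked with a small NumPy sketch. All sizes below are hypothetical placeholders, and `seg_module_stub` is a random 1x1 channel mixing that stands in for the real SegModule only to show the tensor layout.

```python
import numpy as np

# Hypothetical sizes for illustration; only the 4-D layout comes from the thread.
N, C_FEAT, C_SKEL, C2, H, W = 3, 256, 8, 2, 16, 16

aligned_features = np.zeros((N, C_FEAT, H, W), dtype=np.float32)
skeleton_features = np.zeros((N, C_SKEL, H, W), dtype=np.float32)

# Concatenate along the channel axis (axis=1); the person axis (axis=0) is untouched.
seg_input = np.concatenate([aligned_features, skeleton_features], axis=1)

def seg_module_stub(x, out_channels):
    """Stand-in for SegModule: a random 1x1 channel mixing, NOT the real network."""
    rng = np.random.default_rng(0)
    weight = rng.standard_normal((out_channels, x.shape[1])).astype(np.float32)
    return np.einsum("oc,nchw->nohw", weight, x)

scores = seg_module_stub(seg_input, C2)
print(seg_input.shape, scores.shape)
```

In this layout the N people sit on the first (batch) axis, so C1 is the sum of the two channel counts and does not change with the number of people.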


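lck1201 commented on June 14, 2024

Hello, one last question.
Section 4.2.2 mentions using a BBOX to crop the ROI. Where does the BBOX come from, other than using the G.T. or expanding it from the GT KPT as mentioned in the experiments?
For examples like those in Figure 7(a), where do the bboxes come from?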


liruilong940607 commented on June 14, 2024

What Figure 7(a) shows is not a bounding box; it is the region on the original image that corresponds to the affine-align operation.
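
One way to picture such a region: if H maps image coordinates into the aligned frame, the corners of the aligned window can be sent back through the inverse of H. The sketch below assumes a plain 2x3 affine H and a made-up 64 x 64 aligned size; `invert_affine` and `roi_in_image` are hypothetical helper names, not functions from the repository.

```python
import numpy as np

def invert_affine(H):
    """Invert a 2x3 affine matrix by lifting it to 3x3 homogeneous form."""
    H3 = np.vstack([H, [0.0, 0.0, 1.0]])
    return np.linalg.inv(H3)[:2]

def roi_in_image(H, width, height):
    """Corners of the aligned (width x height) window, mapped back to the image."""
    corners = np.array(
        [[0, 0], [width, 0], [width, height], [0, height]], dtype=float
    )
    H_inv = invert_affine(H)
    return corners @ H_inv[:, :2].T + H_inv[:, 2]

# Made-up H (image -> aligned frame): scale by 0.5, then shift by (-10, -20).
H = np.array([[0.5, 0.0, -10.0],
              [0.0, 0.5, -20.0]])
region = roi_in_image(H, width=64, height=64)
print(region)
```

With rotation in H the mapped region is a tilted quadrilateral rather than an axis-aligned box, which is why it differs from a bounding box in general.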


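lck1201 commented on June 14, 2024

OK, thanks, I basically understand the pipeline now.
Also, the feature map shown for the Affine-Align Operation in Figure 4 looks a lot like a heatmap of the human body; in other words, it already carries some semantics. My question is: do the features coming out of the backbone really look like that?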


liruilong940607 commented on June 14, 2024

The feature map has 256 channels. For Fig. 4 we fused the 256 channels into a 3-channel heatmap for visualization. In practice, because training is e2e, the backbone is bound to learn some semantics.
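
The thread does not say how the 256 channels were fused, so the following is only one possible approach, sketched in NumPy: split the channels into three groups, average each group, and rescale to [0, 1]. `fuse_to_rgb` is a hypothetical name and the random input stands in for a real backbone output.

```python
import numpy as np

def fuse_to_rgb(feature_map):
    """Collapse a (C, H, W) feature map into an (H, W, 3) image in [0, 1].

    Group-averaging is an assumption; the authors' actual fusion is not
    described in this thread.
    """
    groups = np.array_split(feature_map, 3, axis=0)
    fused = np.stack([g.mean(axis=0) for g in groups], axis=-1)
    lo, hi = fused.min(), fused.max()
    if hi == lo:
        # Constant input: nothing to rescale.
        return np.zeros_like(fused)
    return (fused - lo) / (hi - lo)

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((256, 32, 32))  # stand-in for backbone features
rgb = fuse_to_rgb(feature_map)
print(rgb.shape)
```

Other reductions, such as projecting onto the top three principal components, would serve the same visualization purpose.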


lck1201 commented on June 14, 2024

Understood. Thanks for your reply, and good luck with CVPR!


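lxtGH commented on June 14, 2024

Hello! I have a few questions too. Without ground-truth poses, for example for the 0.222 result in Table 2, where do your poses come from? Also, the test set has no ground-truth poses, so how do you compute the templates?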


lxtGH commented on June 14, 2024

#3


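lxtGH commented on June 14, 2024

One more question: since multiple people are predicted over the whole image at once, what happens during training when the number of people varies from image to image? Wouldn't the concatenated feature channels end up inconsistent?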

