Did you make a mistake in 'class ResidualLayer'? It seems you make two identical layers? (pytorch-cyclegan-vc2, open issue, 8 comments)

taichunyen commented on June 24, 2024

Did you make a mistake in 'class ResidualLayer'? It seems you make two identical layers?

from pytorch-cyclegan-vc2.

Comments (8)

Georgehappy1 commented on June 24, 2024

This is for the gated CNN, and of course it needs two identical conv layers: one produces the features and the other produces the gate.
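To make the "two identical layers" point concrete, here is a minimal sketch of a gated convolution block. The class name `GatedConv1d` and all sizes are hypothetical and chosen for illustration; this is not the repository's `ResidualLayer`, only the general gating pattern it relies on.

```python
import torch
import torch.nn as nn


class GatedConv1d(nn.Module):
    """Hypothetical gated conv block: two convs with the same shape, one gates the other."""

    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        padding = kernel_size // 2
        # Same hyperparameters, but each conv learns its own weights.
        self.feature_conv = nn.Conv1d(in_channels, out_channels, kernel_size, padding=padding)
        self.gate_conv = nn.Conv1d(in_channels, out_channels, kernel_size, padding=padding)

    def forward(self, x):
        # Gated linear unit: features multiplied by sigmoid(gate).
        return self.feature_conv(x) * torch.sigmoid(self.gate_conv(x))


x = torch.randn(2, 24, 128)   # [batch, channels, frames], illustrative sizes
layer = GatedConv1d(24, 48)
y = layer(x)
print(y.shape)  # torch.Size([2, 48, 128])
```

The two convolutions look like duplicates in the constructor, but they are separate parameters, so the duplication is intentional rather than a copy-paste mistake.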

drawingsnow commented on June 24, 2024

Thanks for your answer~ I am a student working on my final project, and I ran into some problems with the original paper. I hope you can help me.
1. The paper says they use PixelShuffle to upsample. But in the upsample module the input has 512 channels, the conv layer outputs 1024 channels, and after the PS layer the output has 512 channels? I think PixelShuffle would make it 256.
2. What happens in the last conv layer? How can they change the channels from 35 to 1 and keep h and w the same as the original input?
Thanks, I hope to hear your answer.

Georgehappy1 commented on June 24, 2024

This implementation uses interpolate rather than pixelshuffle for upsampling. During upsampling, the input is first fed into a conv2d layer with 1024 output channels. The next step interpolates that output, which only doubles the width and height and doesn't change the number of channels (for details, see the model_vc2.py file).
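The difference between the two upsampling choices can be checked directly on tensor shapes. The sketch below is an assumption-laden illustration, not code from model_vc2.py: the kernel size, the spatial size, and the `nearest` interpolation mode are all picked for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 512, 4, 8)                          # [batch, channels, H, W], illustrative sizes
conv = nn.Conv2d(512, 1024, kernel_size=5, padding=2)  # kernel size is an assumption
h = conv(x)                                            # channels: 512 -> 1024, H and W unchanged

# Interpolation doubles H and W and keeps all 1024 channels.
up_interp = F.interpolate(h, scale_factor=2, mode="nearest")

# PixelShuffle(2) doubles H and W by moving channels into space: C -> C / 4.
up_shuffle = nn.PixelShuffle(upscale_factor=2)(h)

print(up_interp.shape)   # torch.Size([1, 1024, 8, 16])
print(up_shuffle.shape)  # torch.Size([1, 256, 8, 16])
```

With an upscale factor of 2, PixelShuffle divides the channel count by 4, which is why 1024 channels become 256 rather than 512.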
The original input is 3-dimensional ([batchsize, width, height]). To do 2D convolution, the input needs to be unsqueezed in the second dimension (the size is now [batchsize, 1, width, height]). The input fed into the last conv layer is also 4-dimensional, with more than one channel, so the last 1*1 conv reduces the channel count to 1. In the end, the output is squeezed in the second dimension to remove the channel dimension, and the output now has the same size as the original input.
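The unsqueeze, 1*1 conv, squeeze sequence described above can be sketched as follows. The single `body` convolution is a hypothetical stand-in for the whole generator body, and the 35 channels are taken from the question only as an example value.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 24, 128)       # [batchsize, width, height], illustrative sizes
x4d = x.unsqueeze(1)              # [2, 1, 24, 128]: add a channel dimension for Conv2d

# Hypothetical stand-in for the generator body: any stack that keeps H and W.
body = nn.Conv2d(1, 35, kernel_size=3, padding=1)
# A 1x1 conv mixes channels only, so it maps 35 channels to 1 without touching H and W.
last_conv = nn.Conv2d(35, 1, kernel_size=1)

out4d = last_conv(body(x4d))      # [2, 1, 24, 128]
out = out4d.squeeze(1)            # [2, 24, 128]: drop the channel dimension again
print(out.shape)  # torch.Size([2, 24, 128])
```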

drawingsnow commented on June 24, 2024

thanks~
1. I know you use interpolate rather than pixelshuffle for upsampling. What I mean is: is there a problem with the structure in the original paper?
2. I get it, thanks~
3. Why should we swap the dimensions in the discriminator? What we want is just a feature map, right?
I cannot understand this code:
downSample4 = downSample4.contiguous().permute(0, 2, 3, 1).contiguous()

Georgehappy1 commented on June 24, 2024

I get what you said, and I also think the structure in the original paper seems to have a problem with the number of output channels after pixelshuffle.
Yep, I think it will still be OK without the swap. And contiguous() lays the tensor's elements out next to each other in memory, in the order given by its shape. If a tensor has gone through transpose() or permute() and you then want to call view() on it, you must call .contiguous() first.

drawingsnow commented on June 24, 2024

thanks a lot ~

drawingsnow commented on June 24, 2024

Hi~ I have a new question...

Here is the code in trainingdataset:

import numpy as np
import torch

# trainingDataset is the class defined earlier in the same file.
if __name__ == '__main__':
    trainA = np.random.randn(162, 24, 554)
    trainB = np.random.randn(158, 24, 554)
    dataset = trainingDataset(trainA, trainB)
    trainLoader = torch.utils.data.DataLoader(dataset=dataset,
                                              batch_size=2,
                                              shuffle=True)
    for epoch in range(10):
        for i, (trainA, trainB) in enumerate(trainLoader):
            print(trainA.shape, trainB.shape)

What is the first dimension of trainA and trainB? I think the shape may be (?, number of features, length).

drawingsnow commented on June 24, 2024

I think it means: I have 162 voice segments, and I extract features of shape (24, 554) from every segment, so the input is (162, 24, 554). Did I make a mistake?
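One way to check this reading is to run a DataLoader over arrays of the same shapes. `PairedDataset` below is a hypothetical stand-in for trainingDataset that simply returns one item from each array; the repository's class may process the data differently (for example by cropping frames), so only the role of the first dimension is illustrated here.

```python
import numpy as np
from torch.utils.data import Dataset, DataLoader


class PairedDataset(Dataset):
    """Hypothetical stand-in for trainingDataset: one sample from each domain per index."""

    def __init__(self, data_a, data_b):
        self.data_a = data_a
        self.data_b = data_b

    def __len__(self):
        # Use the shorter array so every index is valid for both domains.
        return min(len(self.data_a), len(self.data_b))

    def __getitem__(self, index):
        return self.data_a[index], self.data_b[index]


train_a = np.random.randn(162, 24, 554)   # 162 segments, 24 features, 554 frames
train_b = np.random.randn(158, 24, 554)
loader = DataLoader(PairedDataset(train_a, train_b), batch_size=2, shuffle=True)

batch_a, batch_b = next(iter(loader))
print(batch_a.shape, batch_b.shape)  # torch.Size([2, 24, 554]) torch.Size([2, 24, 554])
print(len(loader))                   # 79
```

Under these assumptions the first dimension of the raw arrays counts segments (162 and 158), while the first dimension of each batch from the loader is batch_size.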
