adding optimizations (bulkDGD issue, open)

rachadele commented on June 12, 2024
adding optimizations

from bulkdgd.

Comments (10)

iprada commented on June 12, 2024

Hi Rachel!

It is great to see people using bulkDGD. We are happy to help!

It would be useful if you could describe your experimental set-up a bit (how many samples? which tissue?) and what you are trying to achieve. We can take it from there.

If you do not want to describe your set-up publicly, you can find my email here: https://di.ku.dk/english/staff/?pure=en/persons/525785

best,

Iñigo

rachadele commented on June 12, 2024

Hi Iñigo,

Thanks for responding :-) I am using breast epithelium RNA-seq data from this study: GSE141828. I am only focusing on the "susceptible" breast tissue samples (for now).

After converting HGNC symbols to Ensembl IDs, dropping non-uniquely mapped genes, and preprocessing the samples using ioutil.preprocess_samples, I have a data frame of 7 samples × 16883 genes.
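
The "dropping non-uniquely mapped genes" step above can be sketched in plain Python. This is only an illustration, not the code used in the thread: it assumes "non-uniquely mapped" means any symbol that maps to more than one Ensembl ID, or any Ensembl ID reached from more than one symbol. The function name `unique_symbol_to_ensembl` and all gene identifiers below are hypothetical placeholders.

```python
from collections import Counter


def unique_symbol_to_ensembl(pairs):
    """Keep only one-to-one (symbol, Ensembl ID) mappings.

    Assumption: a mapping is dropped if its symbol or its Ensembl ID
    appears in more than one distinct pair.
    """
    pairs = set(pairs)  # collapse exact duplicate rows first
    symbol_counts = Counter(symbol for symbol, _ in pairs)
    ensembl_counts = Counter(ensembl for _, ensembl in pairs)
    return {
        symbol: ensembl
        for symbol, ensembl in pairs
        if symbol_counts[symbol] == 1 and ensembl_counts[ensembl] == 1
    }


# Hypothetical mapping table (placeholder identifiers, not real genes).
pairs = [
    ("GENE_A", "ENSG_HYPOTHETICAL_1"),
    ("GENE_B", "ENSG_HYPOTHETICAL_2"),
    ("GENE_B", "ENSG_HYPOTHETICAL_3"),  # one symbol, two IDs: dropped
    ("GENE_C", "ENSG_HYPOTHETICAL_4"),
    ("GENE_D", "ENSG_HYPOTHETICAL_4"),  # two symbols, one ID: both dropped
]

mapping = unique_symbol_to_ensembl(pairs)
print(mapping)
```

Only the columns of the count table whose symbol survives in `mapping` would then be renamed and kept.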

iprada commented on June 12, 2024

I see. We are working on providing loss curves as a dataframe, but that might take some time.

In the meantime, I would plot the loss curves (x axis = epoch; y axis = loss) to see how they behave. From what you sent, the loss does not look unusually high.

best,

Iñigo

rachadele commented on June 12, 2024

Hi Iñigo,

Thanks for your suggestions. I've included the plots for the loss curves here.
As you can see, the loss is decreasing. I'm just a little confused because in other models I've used, the loss is always <1, so I'm not sure whether a loss of 14 is acceptable for DGD.

[Images: loss curves for opt1 and opt2]

rachadele commented on June 12, 2024

Another important question: should I be providing DGD with raw or normalized counts?

iprada commented on June 12, 2024

Hi!

DGD should be provided with raw counts. It takes care of the normalization internally (mean scaling). The loss curves look fine, and the high numbers you observe are probably due to the GMM penalty, but I would increase the number of epochs a bit to ensure they converge. I think if you set the opt1 epochs to 20 and the opt2 epochs to 100, they will look stable.

best,

Iñigo
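
One way to judge whether the curves "look stable" after raising the number of epochs is a simple plateau check on the recorded per-epoch losses. The sketch below is not part of bulkDGD; the function name `has_plateaued`, the window size, and the tolerance are all hypothetical choices, and the loss values are made up for illustration.

```python
def has_plateaued(losses, window=5, rel_tol=0.01):
    """Return True if the mean loss over the last `window` epochs differs
    from the mean over the preceding `window` epochs by at most `rel_tol`
    (relative to the earlier mean)."""
    if len(losses) < 2 * window:
        return False  # not enough epochs to compare two windows
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    return abs(previous - recent) <= rel_tol * abs(previous)


# Made-up loss histories: one still decreasing, one hovering around 14.
still_falling = [20.0 - 0.5 * epoch for epoch in range(10)]
flat = [14.0 + 0.01 * (epoch % 2) for epoch in range(10)]

print(has_plateaued(still_falling))
print(has_plateaued(flat))
```

The check only looks at the relative change between windows, so an absolute value such as 14 is not a problem in itself; what matters is whether the curve is still moving.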

rachadele commented on June 12, 2024

Thanks for your help! :- )

I'm also wondering if you can help me generate a figure similar to Figure 3B of your paper: [image].

I assume I have to somehow save the outputs of the probability mass function from the decoder class, but I'm unsure if there's an easy way to do this.

iprada commented on June 12, 2024

Hi,

Definitely. The decoder has a negative binomial layer with a log_prob method implemented. You can use that to calculate the probability of your samples: simply pass your data through the log_prob method and you should get the sample probabilities.

good luck!

Iñigo
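
To make the quantity concrete, here is a standalone sketch of a negative binomial log-probability in plain Python. It does not call bulkDGD and does not reproduce the signature of its log_prob method. It uses the standard mean/dispersion parameterization (variance = mean + mean²/r), and it assumes that "mean scaling" means multiplying the decoder's rescaled means by the per-sample mean of the raw counts, which the thread does not spell out. The names `nb_log_prob`, `sample_log_prob`, and `rescaled_means`, and all numbers, are hypothetical.

```python
import math


def nb_log_prob(count, mean, r):
    """Log-probability of `count` under a negative binomial distribution
    with the given mean (> 0) and dispersion r (> 0)."""
    return (
        math.lgamma(count + r) - math.lgamma(r) - math.lgamma(count + 1)
        + r * math.log(r / (r + mean))
        + count * math.log(mean / (r + mean))
    )


def sample_log_prob(counts, rescaled_means, r_values):
    """Sum of per-gene log-probabilities for one sample.

    Assumption: the scaling factor is the mean of the sample's raw counts,
    and the negative binomial mean is scaling_factor * rescaled_mean.
    """
    scaling_factor = sum(counts) / len(counts)
    return sum(
        nb_log_prob(k, scaling_factor * m, r)
        for k, m, r in zip(counts, rescaled_means, r_values)
    )


# Made-up values for a single sample with three genes.
counts = [3, 0, 12]
rescaled_means = [0.5, 0.1, 2.0]
r_values = [2.0, 1.0, 5.0]

log_p = sample_log_prob(counts, rescaled_means, r_values)
print(log_p)
```

Because the per-gene terms are summed in log space, the result for a sample with thousands of genes is a large negative number; exponentiating it directly would underflow, so it is usually kept as a log-probability.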

rachadele commented on June 12, 2024

Thank you so much! :)
What are the "scaling factors" for this function, and how can I compute them? I tried passing the r_values as in the perform_dea function, but it doesn't accept them.

I do have another question: is there a way to figure out which components correspond to which tissue?
All of our (breast) samples were assigned to component 28, except for one, which was assigned to component 23.

iprada commented on June 12, 2024

Sorry this took so long. Component 28 is the breast component, which is good news. Component 23 is a brain component. We should publish this data, but we need to think about how. In the meantime, if you email me, I can share the data. You can find my email here:

https://di.ku.dk/english/staff/?pure=en/persons/525785
