robertaboukhalil / ginkgo

Cloud-based single-cell copy-number variation analysis tool

Home Page: qb.cshl.edu/ginkgo

License: BSD 2-Clause "Simplified" License

Makefile 0.10% Shell 7.48% PHP 38.55% CSS 6.09% JavaScript 26.90% HTML 0.57% C++ 6.82% R 10.70% Python 0.12% WDL 1.58% Perl 0.48% Dockerfile 0.62%
bioinformatics sequencing single-cell-genomics

ginkgo's People

Contributors

jherrero, jpritt, mschatz, robertaboukhalil, tgarvin

ginkgo's Issues

server down?

Hi, we haven't been able to upload anything since yesterday, either to existing instances or to new ones. The upload seems to work, but when it reaches 100% it says "Error Failed to write file to disk". Thanks.

Phylogenetic Tree output?

Dear Robert,

I'm currently using the standalone version of Ginkgo and am wondering how I can build a phylogenetic tree from the data.
In the web-based platform, there's a tree in the results (which according to the paper is estimated by "first computing the Pearson correlation between all samples and using these dissimilarity values to cluster the samples"). However, in the standalone version, I don't get such a tree. Instead, I just get the different dendrograms derived from the heatmaps.
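
The procedure quoted from the paper (Pearson correlation between samples, then clustering on the resulting dissimilarities) can be tried outside Ginkgo. The following is a minimal Python sketch, not Ginkgo's own code. It assumes each cell's copy-number profile has already been loaded as a list of per-bin values, takes 1 - r as the dissimilarity, and uses average-linkage clustering; all three are assumptions made for illustration, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (fails on constant profiles)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def cluster_to_newick(profiles):
    """Cluster cells by 1 - Pearson r (assumed dissimilarity) and return a Newick string.

    profiles: dict mapping a unique cell name to its list of per-bin values.
    Average linkage is an assumption, not necessarily what Ginkgo uses.
    """
    # Each cluster is keyed by its Newick label and stores its member cells.
    clusters = {name: [name] for name in profiles}
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def average_distance(c1, c2):
        pairs = [(a, b) for a in clusters[c1] for b in clusters[c2]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average dissimilarity.
        a, b = min(combinations(clusters, 2), key=lambda p: average_distance(*p))
        members = clusters.pop(a) + clusters.pop(b)
        clusters[f"({a},{b})"] = members
    return next(iter(clusters)) + ";"


if __name__ == "__main__":
    toy = {
        "A": [2, 2, 3, 3, 2],
        "B": [2, 2, 3, 4, 2],
        "C": [2, 1, 1, 2, 2],
        "D": [2, 1, 1, 2, 3],
    }
    print(cluster_to_newick(toy))
```

The Newick string can be opened in any tree viewer. Branch lengths are omitted to keep the sketch short.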

Individual MAD score with command line (standalone)

I am running the analyze-subset.R command to get individual MAD scores, but it gives the following error:

Error in FUN(left, right) : non-numeric argument to binary operator
Calls: sweep -> Ops.data.frame -> eval -> eval
Execution halted

The data file inside new_run_123 contains the list of bed.gz files; I have put the bed.gz files for 35 cells inside the data file. Could you please confirm whether this is correct? I suspect that the content of my data file is wrong, because when I print the number of bins and the number of samples (shown below), I get 35 and 1.

raw = read.table('data', header=TRUE, sep="\t")
l = dim(raw)[1] # Number of bins
w = dim(raw)[2] # Number of samples
#
print(dim(raw))
print(l)
print(w)
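
The dimensions printed above (35 rows, 1 column) would be consistent with a file that lists file names rather than a bins-by-samples table, and text values in such a table could also explain the "non-numeric argument" error. As a rough check, the sketch below (Python rather than R) reads a tab-separated file with a header row and reports its shape and how many cells are not numbers. That the data file should be a numeric bins-by-samples table is inferred only from the comments in the snippet above, and `inspect_counts_table` is a hypothetical helper, not part of Ginkgo.

```python
import csv


def inspect_counts_table(path):
    """Return (n_bins, n_samples, non_numeric_cells) for a tab-separated table.

    The first non-empty row is treated as the header, similar in spirit to
    read.table(..., header=TRUE, sep="\t").
    """
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    header, body = rows[0], rows[1:]
    non_numeric = 0
    for row in body:
        for value in row:
            try:
                float(value)
            except ValueError:
                # File names or other text end up here.
                non_numeric += 1
    return len(body), len(header), non_numeric


if __name__ == "__main__":
    import sys

    bins, samples, bad = inspect_counts_table(sys.argv[1])
    print(f"bins={bins} samples={samples} non-numeric cells={bad}")
```

A single column with many non-numeric cells points to a list of file names rather than a count matrix.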

standalone ginkgo: SegCopy not found?

I am running a test run via web interface after setting up ginkgo.

The analysis is stuck at step 2: 0% Calling copy number events... ( Initializing variables)

Checking ginkgo.out under the analysis folder, I can see several attempts to read the SegCopy file.

awk: fatal: cannot open file '/mnt/data/ginkgo/uploads/80OSyT7WskAjdEZslogZ/SegCopy' for reading (No such file or directory)

bounds_variable files

Hi Robert,

I am trying to run Ginkgo with a list of bad bins generated outside of Ginkgo. I was trying to understand the nature of the chromosome bounds listed in the two different bounds_variable_{binSize}etc. files. Could you help? Thank you!

--Kunal

Analysis time

Hi,

I am wondering what the approximate analysis time would be. I uploaded 3 .bed.gz files (~90 MB each) and it says 80% complete after 6 h. Is this normal? I also got the Gateway Time Out error several times when uploading the files. Here is the link to my analysis. Thanks!

MAD exact

Hello,

I have been using Ginkgo on several data sets and I would like to correlate the MAD values. Therefore I would like to have the MAD value for each cell individually.
Is there any Ginkgo output that gives the MAD values apart from the plots (which are approximate)? If I set up Ginkgo on my own server, would I be able to get the values?

Thank you very much,
Maria Kalyva
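
For readers who want a per-cell number while waiting for an answer, a median absolute deviation can be computed directly from normalized bin counts. The sketch below is a minimal illustration, not Ginkgo's code: it assumes the MAD is taken over differences between neighboring bins (with an adjustable lag) and applies the 1.4826 scaling that R's `mad()` uses by default. Whether this matches the values behind Ginkgo's plots is not confirmed here, and both function names are hypothetical.

```python
from statistics import median


def mad(values, constant=1.4826):
    """Median absolute deviation, scaled like R's mad() default."""
    values = list(values)
    center = median(values)
    return constant * median([abs(v - center) for v in values])


def neighbor_mad(counts, lag=1):
    """MAD of differences between bins that are `lag` positions apart (assumed definition)."""
    diffs = [counts[i + lag] - counts[i] for i in range(len(counts) - lag)]
    return mad(diffs)


if __name__ == "__main__":
    # Toy normalized bin counts for two hypothetical cells.
    cells = {
        "quiet_cell": [1.0, 1.0, 1.0, 1.0, 1.0],
        "noisy_cell": [1.0, 2.0, 1.0, 2.0, 1.0],
    }
    for name, counts in cells.items():
        print(name, neighbor_mad(counts))
```

Applying `neighbor_mad` to each column of a bins-by-cells table yields one value per cell, which can then be correlated across data sets.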

Gateway timeout

Hello,

I wanted to know if the server is down. I was able to use the Ginkgo web platform successfully until yesterday morning, and since then I keep getting a gateway timeout error at the upload stage. I'm sure the problem is not on my end, since I have tried different PCs and internet connections. Any insight would be appreciated!

Best,
Sherry

Lorenz-based cutoff metric?

Hello,

I was wondering if there is a built-in metric (or, e.g., a way to use the normalized bin counts post-analysis) to identify "bad cells" (similar to having a BIC score for each cell)?

Thanks,

Kunal

CNV1 vs CNV2?

I was wondering what the difference is between these 2 CNV outputs? Thanks!

Bad Bin -> coordinate map?

I was wondering if there is a file somewhere that maps bin numbers to genomic hg19 coordinates (e.g. for 500 kb variable bins), so that I can tell which regions are masked by the bad-bin analysis. Thank you,

Kunal

bed.gz files not visible

Hello,

When I upload my 4 compressed bed files to the web app, I only see the first one listed after moving to the next step. I then cannot proceed with the analysis, as it requires at least 3 files. Could you help?

Thanks,

Kunal

Something about reference

Hi Robert! I have recently been using Ginkgo to call CNV patterns with different references. The results were the same whether I used normal or CNV samples as controls. I am sure my parameter settings were correct. Can you give me some suggestions?

How to define the reference sample?

I have some MALBAC single-cell data and I'd like to analyze the CNVs with Ginkgo. I do not know how to define the normal sample in the config file. Could you give me an example to follow?

duplicates/secondaries

Hi Robert,

Does Ginkgo remove duplicate and/or secondary alignments before processing reads?

Thanks!

problem in the R code

Hi there,

I'm setting up a local install of Ginkgo and running a set of just over 900 single-cell libraries. After installing all the required libraries, I had to make a small change to the code:

Line 25 of analyze.sh looks for a folder called "genomes". This isn't in my install, so I took it out and got much farther.

Now I get to the part where lots of .jpegs are being created, and I get an error that I have narrowed down to this location:

/opt/ginkgo/scripts/process.R /opt/ginkgo/genomes/hg19 /opt/ginkgo/uploads/CIC1 status.xml data 0 variable_100000_150_bwa ward.D2 euclidean 1 refDummy.bed_mapped 0 ploidyDummy.txt 1 0
Loading required package: amap
Loading required package: methods

Attaching package: ‘gplots’

The following object is masked from ‘package:stats’:

    lowess

Error in clip(tu[1], mean(temp) - (diff(reads$mids)/2), tu[3], tu[4]) : 
  invalid 'x2' argument
In addition: There were 50 or more warnings (use warnings() to see the first 50)
Execution halted

I get this error after getting through 102 libraries, for which I get lots of pretty .jpeg images.

Looking in the status.xml file, I am able to narrow this down to just one of my libraries. That particular library doesn't look very different from the others.

Do you have any suggestions I could try ?

The input file format

Hi, thank you for providing this tool for CNV analysis. I want to use Ginkgo to infer the CNVs of single cells, and I have a question about the format of the input files, that is, the bed file and the FACS file. Does one bed file contain one cell? Can you provide an example? Thank you so much!
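
Independently of what Ginkgo itself requires, a file can at least be checked against the general BED convention, in which the first three tab-separated columns are the chromosome name, a 0-based start, and an end coordinate. The sketch below is a hypothetical pre-upload check written for illustration; it is not Ginkgo's validation and says nothing about any additional columns or the FACS file.

```python
import gzip


def check_bed(path, max_lines=1000):
    """Return a list of problems found in the first `max_lines` lines of a .bed or .bed.gz file."""
    opener = gzip.open if path.endswith(".gz") else open
    problems = []
    with opener(path, "rt") as handle:
        for number, line in enumerate(handle, start=1):
            if number > max_lines:
                break
            # Skip blank lines and optional header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:
                problems.append(f"line {number}: fewer than 3 tab-separated fields")
                continue
            start, end = fields[1], fields[2]
            if not (start.isdigit() and end.isdigit()):
                problems.append(f"line {number}: start/end are not non-negative integers")
            elif int(start) > int(end):
                problems.append(f"line {number}: start is greater than end")
    return problems


if __name__ == "__main__":
    import sys

    for problem in check_bed(sys.argv[1]):
        print(problem)
```

An empty result only means the first lines look like BED records; it does not guarantee that the upload will be accepted.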

Cannot select any cells in Step1

After uploading the .bed files, I cannot select any cells in Step1, therefore when trying to start the analysis the error message 'Please choose at least 3 cells for your analysis.' pops up. Is it something that can be fixed?
Thank you so much for your help!
Best
Ronja

Failed to upload the .bed or .bed.gz

Hello,

I am currently experiencing an issue while attempting to upload multiple .bed or .bed.gz files to the website. Even though the website indicates that the files have been uploaded successfully, they do not appear in the selection panel post-clicking the 'Analyze' button. I have observed this issue on multiple PCs, suggesting that it may not be a local issue.

Given the consistent occurrence of this problem, it would be greatly appreciated if you could investigate the potential root causes behind this situation.

Thank you for your time and assistance with this matter.

Using Ginkgo on a cell by bin matrix

I would like to use Ginkgo to call copy numbers from data obtained with the 10X Genomics CNV solution, which outputs a cell by genomic bin matrix of (GC and mappability-normalized) read counts. However, your interface only seems to accept BED files as input. How would I go about applying Ginkgo to my matrix?

Analysis stuck at 66%

I have started various analyses since last week and none of them gets past 66% completion. Uploading the files was also a challenge, with many unsuccessful uploads (error: true / error: bad gateway). Smaller gzip-compressed files seem to upload a little more easily. Sometimes files that didn't upload at first suddenly get displayed in the "STEP 1" overview.

Is it the server, or could my files be the issue? (although, files that I analyzed successfully in the past show the same behavior)

Thanks!

Web platform - processing issues

Hi,

I am trying to run Ginkgo on some data published by the Navin lab for the purpose of comparing performance.
The data are described in https://www.nature.com/articles/s41586-021-03357-x

I am having some issues first with uploading the data, but even after that succeeds, the process appears to stop at
0% in the Calling copy number events (Initializing variables).... step. Do you have any advice or solution?
It might be worth adding the data from above to the set of pre-processed data, and I'd be happy to contribute to that effort.

hg38

Hi!

I have been trying for several days to generate genome files for hg38. Although I think I replaced all the necessary paths in all the scripts that required it, it still fails. Could you generate and share these files, as you do for hg19?

Best,
Paweł

Error in clip(tu[1], mean(temp) - (diff(reads$mids)/2), tu[3], tu[4])

Hi

I am trying to run the standalone Ginkgo on my local computer using the hg38 bin data, and I got the following error. Could you please check what could be causing this?

(base) behera@Administrators-MacBook-Pro ginkgo % ./scripts/analyze.sh new_run_123
Launching process.R /Users/behera/Brain_data/ginkgo/genomes/hg19/original /Users/behera/Brain_data/ginkgo/uploads/new_run_123 status.xml data 1 variable_100000_150_bwa ward.D2 euclidean 3 refDummy.bed_mapped 0 ploidyDummy.txt 1 0
Loading required package: amap

Attaching package: ‘gplots’

The following object is masked from ‘package:stats’:

    lowess

Error in clip(tu[1], mean(temp) - (diff(reads$mids)/2), tu[3], tu[4]) : 
  invalid 'x2' argument
In addition: Warning messages:
1: In regularize.values(x, y, ties, missing(ties), na.rm = na.rm) :
  collapsing to unique 'x' values
2: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead. 
Execution halted
awk: can't open file /Users/behera/Brain_data/ginkgo/uploads/new_run_123/SegCopy
 source line number 1
./scripts/analyze.sh: line 129: ((: i<=: syntax error: operand expected (error token is "=")
Launching /Users/behera/Brain_data/ginkgo/scripts/CNVcaller /Users/behera/Brain_data/ginkgo/uploads/new_run_123/SegCopy /Users/behera/Brain_data/ginkgo/uploads/new_run_123/CNV1 /Users/behera/Brain_data/ginkgo/uploads/new_run_123/CNV2
Unable to open input file: /Users/behera/Brain_data/ginkgo/uploads/new_run_123/SegCopy

Thanks,
Sairam

No server available for the tasks

Hi,

It looks like the web platform cannot handle new tasks; it shows the message "No server is available to handle this request." Ginkgo is a great tool and I am looking forward to using it again.

Thanks!

Running cells with second-best SoS solution?

Hi there,

I am using Ginkgo (CLI mostly) and am wondering if there is a way to re-run a small subset of cells to provide data from the second-best SoS solution. I have some cases where the least error and the second-least error are extremely close, and I wish to look at certain cells and their output data under another copy-number solution.

If not, could you point me to the appropriate place in the code to try making my own alterations? (Sorry, I am new to bioinformatics!!)

Thanks so much!

Lauren

Web-platform: Error Gateway Time-out

Hello all,
I am trying to upload some bed.gz files (25-150 MB) to use the Ginkgo web platform, but I keep getting the Error Gateway Time-out. Could you please let me know how to fix it?
Many thanks for your help.

Ginkgo Web Platform: Copy number calls stalling with FACS input.

I'm analyzing a set of about 300 single cells on the Ginkgo web app. The karyotypes are very strange, so I input a table of predicted ploidy for the cells based on DAPI content. The problem is that the data processing stalls at certain cells. I can exclude these cells and it will skip them, but it then stalls again on a different cell. Some example ploidy values on which this has occurred are 1.3, 3.4 and 3.9. Any ideas? Thanks!

Uploading issue

Hi,
Ginkgo worked well in my previous analyses, but now I always fail to upload the .bed inputs. The files seem to be uploaded successfully, but when I click "next step", the "select cells" box is empty. Do you have any idea what causes this issue?

Thanks,
RW

Web-platform: doesn't work

I want to upload some bed files (small size) to the web platform for calling CNVs, but I found that it does not work. When I finish the upload and continue to the next step, it returns to the upload step.

Also, the sample analyses, which can be clicked to select the previous datasets for testing, are broken as well.

Command line Ginkgo

Hi!

I would like to integrate Ginkgo into my workflow. I see that I can install Ginkgo on my computer, but can I run it from the command line? Or does it only provide the graphical interface? Or is there perhaps an API for Ginkgo?

Regards,
Paweł
