cytominer_scripts's People

Contributors

cells2numbers, shntnu

Forkers

gwaybio

cytominer_scripts's Issues

Should backend creation use first image set as a "template"?

While trying to create a backend with process.sh, I ran into problems because my first image had failed QC and my pipeline was set to skip the rest when that happens. As a result, that folder contained only a small subset of the usual data: an Image.csv with fewer columns than normal, and no object CSVs. The backend creation therefore added only the common columns for all images, which led to understandable errors when it went to aggregate the object tables at the end and could not find any.

I hacked around it by deleting all the folders before the first well that had actually been plated with cells, but it seems to me that:

  1. when creating the backend, we want to pull in all the data that is there; we can deal with NaNs from missing data later
  2. there is a reasonable chance of this happening again: well A01 (like other edge/corner wells) is often left unplated in smaller experiments, but in a large fraction of those cases the whole plate is imaged anyway.

Feel free to disagree though, that's why I phrased it as a question.
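To make point 1 concrete, here is a minimal Python sketch of building the Image table from the union of all column names instead of using the first image set as a template. It is not the code that process.sh or the ingest step actually runs; the function names and the sample data are hypothetical, and it only covers the column mismatch in Image.csv, not the missing object tables.

```python
import csv
import io


def union_header(csv_texts):
    """Return every column name seen in any CSV, in first-seen order."""
    columns = []
    seen = set()
    for text in csv_texts:
        header = next(csv.reader(io.StringIO(text)), [])
        for name in header:
            if name not in seen:
                seen.add(name)
                columns.append(name)
    return columns


def rows_with_union(csv_texts, columns):
    """Yield each row aligned to `columns`, using '' where a value is missing."""
    for text in csv_texts:
        for record in csv.DictReader(io.StringIO(text)):
            yield [record.get(name, "") for name in columns]


# Hypothetical data: the first image set failed QC and has fewer columns.
failed_qc = "ImageNumber,Metadata_Well\n1,A01\n"
normal = "ImageNumber,Metadata_Well,Count_Cells\n2,A02,153\n"

columns = union_header([failed_qc, normal])
rows = list(rows_with_union([failed_qc, normal], columns))
```

With this approach the short first file no longer decides the table schema; its missing values simply end up empty and can be handled as NaNs later.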

The end of the error string is below; I doubt it's helpful, but just in case:

(builtins.OSError) /home/ubuntu/efs/{redacted}/workspace/software/cytominer_scripts/.4b21aa7e-6e45-11e7-8ea1-0e60212e428a:1: expected 218 columns but found 568 - extras ignored
 [SQL: 'sqlite3 -nullvalue \'\' -separator , -cmd .import "/home/ubuntu/efs/{redacted}/workspace/software/cytominer_scripts/.4b21aa7e-6e45-11e7-8ea1-0e60212e428a" "Image" /home/ubuntu/ebs_tmp/2017_07_12_Batch1/AU00027623//AU00027623.sqlite']
[Fri Jul 21 16:41:41 UTC 2017] Looking up AU00027623.sqlite on permanent store
[Fri Jul 21 16:41:41 UTC 2017] /home/ubuntu/bucket/projects/{redacted}/workspace/backend/2017_07_12_Batch1/AU00027623/AU00027623.sqlite not found
[Fri Jul 21 16:41:41 UTC 2017] Creating /home/ubuntu/ebs_tmp/2017_07_12_Batch1/AU00027623//AU00027623.sqlite

real    127m40.736s
user    62m50.242s
sys     4m23.562s
[Fri Jul 21 18:49:21 UTC 2017] Indexing /home/ubuntu/ebs_tmp/2017_07_12_Batch1/AU00027623//AU00027623.sqlite
Error: near line 3: no such table: main.Cells
Error: near line 4: no such table: main.Cytoplasm
Error: near line 5: no such table: main.Nuclei

real    0m0.054s
user    0m0.009s
sys     0m0.023s
[Fri Jul 21 18:49:22 UTC 2017] Aggregating /home/ubuntu/ebs_tmp/2017_07_12_Batch1/AU00027623//AU00027623.sqlite
Error in rsqlite_send_query(conn@ptr, statement) : no such table: cells
Calls: %>% ... initialize -> initialize -> rsqlite_send_query -> .Call
Execution halted

real    0m0.993s
user    0m0.603s
sys     0m0.056s
[Fri Jul 21 18:49:23 UTC 2017] /home/ubuntu/ebs_tmp/2017_07_12_Batch1/AU00027623//AU00027623.csv not created / does not exist. Exiting.

Create option to explicitly specify all paths

Currently, most scripts assume a specific folder structure. This keeps the options compact (most cases only need batchname and plate_id) but makes the scripts inflexible. Keep the current options, but also add options to specify paths explicitly.

See the http://docopt.org docs to make sure we do it the right way.

These are the scripts that need to be updated:

select.R
sample.R
preselect.R
normalize.R
compare_plates.R
collapse.R
audit.R
annotate.R
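One way the "compact options plus explicit paths" behaviour could look is sketched below. This is an illustration in Python with argparse, not docopt and not the R scripts themselves. The default layout `../../backend/<batch_id>/<plate_id>/<plate_id>.csv` is an assumption inferred from the paths quoted in other issues, and `resolve_input` is a hypothetical helper.

```python
import argparse
from pathlib import PurePosixPath


def build_parser():
    parser = argparse.ArgumentParser(
        description="Sketch: compact options with an explicit path override"
    )
    parser.add_argument("--batch_id")
    parser.add_argument("--plate_id")
    parser.add_argument("--input", help="explicit path; overrides the derived default")
    return parser


def resolve_input(args, backend_root="../../backend"):
    """Prefer the explicit path; otherwise derive it from batch and plate."""
    if args.input:
        return args.input
    if args.batch_id and args.plate_id:
        # Assumed folder structure, used only when no explicit path is given.
        plate_dir = PurePosixPath(backend_root) / args.batch_id / args.plate_id
        return str(plate_dir / f"{args.plate_id}.csv")
    raise SystemExit("specify either --input or both --batch_id and --plate_id")


compact = build_parser().parse_args(
    ["--batch_id", "2017_07_12_Batch1", "--plate_id", "AU00027623"]
)
explicit = build_parser().parse_args(["--input", "/data/custom/profiles.csv"])
```

The existing compact invocation keeps working unchanged, and the explicit option wins whenever it is present.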

`preselect.R` Don't assume multiple identical plates

preselect.R assumes that replicates can be found by looking for two plates with an identical "Metadata_Plate_Map_Name", and then treats wells at the same position on those identical plates as replicates.

In some experiments, however, each plate may be unique, and replicates may sit in a different location on the same plate or even on another plate. An optional flag that lets the user specify how replicates are identified would be helpful.
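As a rough sketch of what such a flag could control, the Python below groups rows into replicate sets by a user-supplied list of columns, defaulting to the current plate map name plus well pairing. It is not how preselect.R or cytominer implement this; `group_replicates` and the `Metadata_Treatment` column are hypothetical names used only for illustration.

```python
from collections import defaultdict

DEFAULT_KEYS = ("Metadata_Plate_Map_Name", "Metadata_Well")


def group_replicates(rows, replicate_keys=DEFAULT_KEYS):
    """Group rows sharing the same values in `replicate_keys`.

    Only groups with at least two members count as replicate sets.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in replicate_keys)
        groups[key].append(row)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical layout: every plate is unique, replicates share a treatment.
rows = [
    {"Metadata_Plate_Map_Name": "map1", "Metadata_Well": "A01", "Metadata_Treatment": "drugX"},
    {"Metadata_Plate_Map_Name": "map1", "Metadata_Well": "B05", "Metadata_Treatment": "drugX"},
    {"Metadata_Plate_Map_Name": "map2", "Metadata_Well": "A01", "Metadata_Treatment": "DMSO"},
]

by_default = group_replicates(rows)
by_treatment = group_replicates(rows, ("Metadata_Treatment",))
```

With the default keys this layout yields no replicate sets at all, while grouping on the treatment column finds the two drugX wells on the same plate.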

Specify isolated vs. colony definition

In #30 we introduce an additional aggregate option: --sc_type.

Currently, the definitions are based on a specific Cell Painting variable (Cells_Neighbors_NumberOfNeighbors_Adjacent). They are defined in more detail in broadinstitute/cmQTL#9.

It would be great if these were not hardcoded! The change is probably beyond the scope of #30, since, to my knowledge, no other project requires this flag.
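A parameterized version might look like the Python sketch below, where both the feature and the cutoff are arguments rather than constants. The actual definitions live in broadinstitute/cmQTL#9 and are not reproduced here: the rule "isolated means at most `isolated_max` adjacent neighbors", the default of 0, and the alternative column `Cells_Neighbors_Custom` are assumptions made purely for illustration.

```python
def classify_cell(
    row,
    feature="Cells_Neighbors_NumberOfNeighbors_Adjacent",
    isolated_max=0,
):
    """Label a cell 'isolated' or 'colony' from a configurable feature.

    The threshold rule is an assumption, not the cmQTL definition.
    """
    return "isolated" if float(row[feature]) <= isolated_max else "colony"


# Hypothetical single-cell rows.
lonely = {"Cells_Neighbors_NumberOfNeighbors_Adjacent": "0"}
crowded = {"Cells_Neighbors_NumberOfNeighbors_Adjacent": "3"}
custom = {"Cells_Neighbors_Custom": "1"}

labels = [
    classify_cell(lonely),
    classify_cell(crowded),
    classify_cell(custom, feature="Cells_Neighbors_Custom", isolated_max=1),
]
```

Exposing the feature name and cutoff as options would let other projects reuse --sc_type without inheriting one project's definition.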

Implement QC functions for profiling

The purpose of this issue is to create a list. Once we settle on a list, we will close the issue and create an issue per QC item. We also need to decide where to implement this – here or in http://github.com/CellProfiler/cytominer

  • Plot illumination correction functions
  • Plot salient features on a plate map to see if there are any trends
    • Cell count
    • IntegratedIntensity
    • PercentMaximal
  • Show excluded wells on a plate map
  • Check for rotation of the plate layout
    • Plate map with cell counts
    • Cluster all the wells across plates
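Several items on this list start from the same step: laying per-well values out on a plate grid. The Python sketch below shows that step and a 180-degree rotation of the grid, which could be compared against the expected layout when checking for a rotated plate. It assumes single-letter row names (96- or 384-well plates), and all function names are hypothetical; it is not an existing cytominer function.

```python
import string


def well_to_index(well):
    """Convert a well name such as 'A01' to zero-based (row, column)."""
    row = string.ascii_uppercase.index(well[0].upper())
    col = int(well[1:]) - 1
    return row, col


def plate_grid(values, n_rows=8, n_cols=12):
    """Place per-well values on a grid; wells without data stay None."""
    grid = [[None] * n_cols for _ in range(n_rows)]
    for well, value in values.items():
        r, c = well_to_index(well)
        grid[r][c] = value
    return grid


def rotate_180(grid):
    """Return the grid as it would look if the plate were rotated 180 degrees."""
    return [list(reversed(row)) for row in reversed(grid)]


# Hypothetical cell counts: A01 is empty, H12 is well populated.
counts = {"A01": 0, "H12": 150}
grid = plate_grid(counts)
rotated = rotate_180(grid)
```

The same grid can feed a heat map of cell count, IntegratedIntensity, or PercentMaximal, and excluded wells show up naturally as None entries.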

Resolve `evaluation nested too deeply` issues

@bethac07 reported this:

ubuntu@ip-10-0-3-243:~/efs/2019_06_04_Cardiomyocytes_AnantChopra_Bayer/workspace/software/cytominer_scripts$ ./preselect.R \
>   --batch_id ${BATCH_ID} \
>   --input ../../parameters/${BATCH_ID}/sample/${BATCH_ID}_normalized_sample.feather \
>   --operations correlation_threshold
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Execution halted

This was reported for preselect.R, but it can occur in other functions that use feather.

It appears to be related to wesm/feather#372.

Updating the packages mentioned in wesm/feather#372 (comment) should make the problem go away.

`annotate.R` fails when plate name has underscores in it

It seems to ignore everything after the first underscore: if my backend is in ../../backend/batch/Experiment1_Day1_1/plate, annotate.R fails because it looks in ../../backend/batch/Experiment1/plate. It does seem to write out to the correct place, and the steps after that seem to work OK, as far as I recall.

Confusing Error in `preselect.R`

I am performing a replicate_correlation variable selection with preselect.R.

The error I receive is:

INFO [2019-05-17 15:44:25] Subsetting using Metadata_Well != 'dummy'
INFO [2019-05-17 15:44:25] Performing replicate_correlation...
Joining, by = c("Metadata_Plate_Map_Name", "Metadata_Well")
Error in grouped_df_impl(data, unname(vars), drop) :
  Column `variable` is unknown
Calls: %>% ... group_by.data.frame -> grouped_df -> grouped_df_impl -> .Call
Execution halted

I believe the error is generated in the call to cytominer::replicate_correlation.

In cytominer::replicate_correlation, perhaps the error is happening here. Either way, this is something that I need to look into and fix.
