
cytoqc's Introduction

cytoqc – A standardization tool for openCyto


cytoqc checks and standardizes the channels, markers, keywords, and gates of cytometry data.

Installation

remotes::install_github("RGLab/cytoqc")

Get started

library(flowCore)
library(flowWorkspace)
library(cytoqc)

Load the FCS

files <- list.files(data_dir, pattern = "\\.fcs$", full.names = TRUE)
cqc_data <- cqc_load_fcs(files)
cqc_data
## cytoqc data: 
## 21 samples

The basic workflow can be summarised as three steps:

  1. check the consistency of markers, keywords and gating schemes.

  2. match inconsistent annotations to their nearest correct samples.

  3. fix the inconsistent samples.

1. Check the consistency across samples

check_results <- cqc_check(cqc_data, type = "channel")
check_results

group_id  nFCS  channel
       3    18  FL1-H, FL2-A, FL2-H, FL3-H, FL4-H, FSC-H, SSC-H, Time
       1     1  channelA, FL1-H, FL2-A, FL2-H, FL3-H, FL4-H, FSC-H, SSC1-H, Time
       2     1  FL1-H, FL2-A, FL2-H, FL3-H, FL4-H, fsc-h, SSC-H
       4     1  FL1-H, FL2-A, FL2-H, FL3-H, FL4-H, fsc-h, SSC1-H, Time

2. Match the reference

res <- cqc_match(check_results, ref = 3) 
res
##                Ref             1     2      4
## 1            FL1-H             ✓     ✓      ✓
## 2            FL2-A             ✓     ✓      ✓
## 3            FL2-H             ✓     ✓      ✓
## 4            FL3-H             ✓     ✓      ✓
## 5            FL4-H             ✓     ✓      ✓
## 6            FSC-H             ✓ fsc-h  fsc-h
## 7            SSC-H        SSC1-H     ✓ SSC1-H
## 8             Time             ✓  <NA>      ✓
## 10 To Delete  Time channelA,Time         Time

3. Apply the fix

cqc_fix(res)

Update check report

check_results <- cqc_check(cqc_data, type = "channel")
check_results

group_id  nFCS  channel
       1    21  FL1-H, FL2-A, FL2-H, FL3-H, FL4-H, FSC-H, SSC-H

Return the cleaned data

cqc_data <- cqc_get_data(check_results)
cqc_data
## cytoqc data: 
## 21 samples

Coerce it into a cytoset

cytoset(cqc_data)
## A cytoset with 21 samples.
## 
##   column names:
##     FSC-H, SSC-H, FL1-H, FL2-H, FL3-H, FL2-A, FL4-H

Or output to FCS

cqc_write_fcs(cqc_data, outdir)

cytoqc's People

Contributors

gfinak, jacobpwagner


cytoqc's Issues

Resolve package check warnings

While I'm aware we're still in early development, the state of the package docs makes it next to impossible to collaborate.
I've done a bit of work to clean it up, but there are still many issues visible via R CMD check.

  • Undocumented code objects.
    • since we're using S3, do the S3 object specific methods even need to be exported? Can we just export the default methods and have those dispatch correctly?
  • Undocumented parameters
    • signatures differ between many default methods and their object specific methods.
      • We need to make sure all of the parameters are documented correctly and unambiguously, and the appropriate usage sections are in place.
  • No object documentation for S3 objects.
    • How do the pieces fit together?

Addressing these now and keeping the docs up to date will save time and trouble later.

Further clarification on `cqc_fix()` error message

Thanks for a great package. I was assuming the res$solution object provided would be able to find and rename old markers as it's in my res object. I'm getting an error message as below. I haven't found any examples that demonstrate how to work around this so writing here for more help.

Hope to hear from you!

error message

> cqc_fix(res)
Error in cf_rename_marker(x, from, to) : 
  old marker is not found: PE - TOX

code tried & output

library(flowCore)
library(flowWorkspace)
library(cytoqc)

> cqc_data <- cqc_load_fcs(files)
> check_results <- cqc_check(cqc_data, type = "marker")
> check_results
# A tibble: 3 × 3
  group_id  nFCS marker                                                                                                    
     <int> <int> <chr>                                                                                                     
1        2    22 CD44, CD45-1, CD62L, CD8, CX3CR1, CXCR5, Granzyme-B, Ki-67, L_D, PD-1, Slamf6, TCF-1, Tim-3, TOX          
2        1    20 A594 - TCF-1, APC - Granzyme-B, APCCy7 - CD62L, BUV395 - CD8, BUV737 - CD44, BV421 - Slamf6, BV605 - Tim-…
3        3    19 CD44, CD62L, CD8, CX3CR1, CXCR5, Granzyme-B, Ki-67, L_D, Ly5-1, PD-1, Slamf6, TCF-1, Tim-3, TOX           
> res <- cqc_match(check_results, ref = 2) 
Warning message:
Unmatched items remain after cqc_match. Before using cqc_fix, please resolve these unmatched items manually using cqc_match_update/remove/delete_unmatched or re-attempt automatic matching with cqc_match with a larger max.distance argument. 
> res <- cqc_match_delete_unmatched(res, c("PerCP-Cy5-5","Ly5-1"))
> cqc_fix(res)
Error in cf_rename_marker(x, from, to) : 
  old marker is not found: PE - TOX
> res
                    Ref                1     3
1                  CD44    BUV737 - CD44     ✓
2                CD45-1    FITC - CD45-1  <NA>
3                 CD62L   APCCy7 - CD62L     ✓
4                   CD8     BUV395 - CD8     ✓
5                CX3CR1   BV650 - CX3CR1     ✓
6                 CXCR5    PECy7 - CXCR5     ✓
7            Granzyme-B APC - Granzyme-B     ✓
8                 Ki-67             <NA>     ✓
9                   L_D       V500 - L_D     ✓
10                 PD-1     BV785 - PD-1     ✓
11               Slamf6   BV421 - Slamf6     ✓
12                TCF-1     A594 - TCF-1     ✓
13                Tim-3    BV605 - Tim-3     ✓
14                  TOX         PE - TOX     ✓
16 To Delete                 PerCP-Cy5-5 Ly5-1

> res$solution
# A tibble: 15 × 3
   group_id from             to        
      <int> <chr>            <chr>     
 1        1 PE - TOX         TOX       
 2        1 APC - Granzyme-B Granzyme-B
 3        1 A594 - TCF-1     TCF-1     
 4        1 FITC - CD45-1    CD45-1    
 5        1 V500 - L_D       L_D       
 6        1 BV421 - Slamf6   Slamf6    
 7        1 BV605 - Tim-3    Tim-3     
 8        1 BV650 - CX3CR1   CX3CR1    
 9        1 BV785 - PD-1     PD-1      
10        1 PECy7 - CXCR5    CXCR5     
11        1 APCCy7 - CD62L   CD62L     
12        1 BUV395 - CD8     CD8       
13        1 BUV737 - CD44    CD44      
14        1 PerCP-Cy5-5      NA        
15        3 Ly5-1            NA 

How to combine multiple lists of cytoframes/cytoqc data objects into one

library(flowCore)
library(flowWorkspace)
library(cytoqc)

#Load the FCS
files <- list.files(data_dir, ".fcs", full.names = TRUE)
cqc_data <- cqc_load_fcs(files)
cqc_data

## cytoqc data:
## 21 samples

# However, I have more than 1500 FCS files and have to split the task into several steps to avoid errors. I therefore load a subset of the files and repeat this several times, which gives approximately 10 cqc_data objects that I want to combine at the end:

cqc_data_1 <- cqc_load_fcs(files[1:3])
cqc_data_2 <- cqc_load_fcs(files[4:6])
cqc_data_3 <- cqc_load_fcs(files[7:21])

# How do I combine cqc_data_1, cqc_data_2 and cqc_data_3 into one list of cytoframes (cqc_cf_list) for further analysis in the cytoqc pipeline?

Fixing Parameter Order to make FlowSet

Hi, I have matched markers and fluorochrome names using cytoqc:

> cqc_data <- cqc_load_fcs(files)
> res <- cqc_check(cqc_data, type = "panel", by = "channel")
> res
# A tibble: 8 x 2
  channel                  `group 1(n=112)`
  <chr>                    <chr>           
1 FJComp-Alexa Fluor 700-A CCL4            
2 FJComp-APC-A             NKG2A           
3 FJComp-APC-Cy7-A         CD16            
4 FJComp-BV510-A           IFNg            
5 FJComp-BV605-A           CD56            
6 FJComp-BV711-A           CD107a          
7 FJComp-PE-A              CD57            
8 FJComp-PE-Cy7-A          NKG2C           

However, I was still not able to construct a flowSet:

> PBMC_fs <- read.flowSet(files,
+                         transformation = F,
+                         truncate_max_range = F
+ )
000015_ADNKA_C_E55_STIM.fcs doesn't have the identical colnames as the other samples!
000016_ADNKA_C_E55_UNSTIM.fcs doesn't have the identical colnames as the other samples!
000017_ADNKA_C_E66_STIM.fcs doesn't have the identical colnames as the other samples!
000018_ADNKA_C_E66_UNSTIM.fcs doesn't have the identical colnames as the other samples!
000019_ADNKA_C_F107_STIM.fcs doesn't have the identical colnames as the other samples!
000020_ADNKA_C_F107_UNSTIM.fcs doesn't have the identical colnames as the other samples!
Error in validObject(.Object) : 
  invalid class “flowSet” object: Some items identified in the data environment either have the wrong dimension or type.

On closer examination, I found that the order of the parameters still differs (and so does the number of keywords stored in the 'description' slot):

> fcs1 <- read.FCS(files[1],
+                 transformation = F,
+                 truncate_max_range = F
+ )
> fcs1
flowFrame object 'ADNKA_A_E36_STIM.fcs'
with 12960 cells and 8 observables:
                        name   desc  range minRange maxRange
$P1             FJComp-APC-A  NKG2A 262144     -111   262144
$P2         FJComp-APC-Cy7-A   CD16 262144     -111   262144
$P3 FJComp-Alexa Fluor 700-A   CCL4  82897     -111    82897
$P4              FJComp-PE-A   CD57 262144     -111   262144
$P5          FJComp-PE-Cy7-A  NKG2C 262144     -111   262144
$P6           FJComp-BV711-A CD107a 262144     -111   262144
$P7           FJComp-BV605-A   CD56 262144     -111   262144
$P8           FJComp-BV510-A   IFNg  82897     -111    82897
111 keywords are stored in the 'description' slot
> fcs2 <- read.FCS(files[18],
+                  transformation = F,
+                  truncate_max_range = F
+ )
> fcs2
flowFrame object 'ADNKA_C_E66_UNSTIM.fcs'
with 169321 cells and 8 observables:
                        name   desc  range minRange maxRange
$P1             FJComp-APC-A  NKG2A 262144     -111   262144
$P2         FJComp-APC-Cy7-A   CD16 262144     -111   262144
$P3 FJComp-Alexa Fluor 700-A   CCL4 262144     -111   262144
$P4           FJComp-BV510-A   IFNg 262144     -111   262144
$P5           FJComp-BV605-A   CD56 262144     -111   262144
$P6           FJComp-BV711-A CD107a 262144     -111   262144
$P7              FJComp-PE-A   CD57 262144     -111   262144
$P8          FJComp-PE-Cy7-A  NKG2C 262144     -111   262144
107 keywords are stored in the 'description' slot

I also tried read.flowSet with column.pattern = "FJComp", but no luck.

Is the reason for failure to make flowset related to the fact that the parameters are not ordered appropriately or due to differing number of keywords?
How can I fix / circumvent this issue?
My ultimate goal is to prepare a flowset, that's it.
Thanks!
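
One possible workaround, sketched here with flowCore only: read each file separately, reorder its columns to the order of the first file, and build the flowSet from the reordered frames. This assumes the failure comes from the column order rather than from the differing keyword counts, which the error message suggests but the thread does not confirm. `frames_to_flowset` is a hypothetical helper, and the two synthetic frames stand in for real FCS files.

```r
library(flowCore)

# Reorder every flowFrame to the column order of the first one, then
# combine them into a flowSet. A frame lacking a reference channel stops early.
frames_to_flowset <- function(frames) {
  ref_cols <- colnames(frames[[1]])
  reordered <- lapply(frames, function(fr) {
    missing_cols <- setdiff(ref_cols, colnames(fr))
    if (length(missing_cols) > 0) {
      stop("missing channels: ", paste(missing_cols, collapse = ", "))
    }
    fr[, ref_cols]
  })
  flowSet(reordered)
}

# Synthetic stand-ins for two FCS files whose columns differ only in order.
m <- matrix(runif(40), ncol = 4,
            dimnames = list(NULL, c("APC-A", "PE-A", "BV510-A", "BV605-A")))
frames <- list(
  a.fcs = flowFrame(m),
  b.fcs = flowFrame(m[, c("APC-A", "BV510-A", "BV605-A", "PE-A")])
)

fs <- frames_to_flowset(frames)
colnames(fs)
```

With real data, the list could be built with `lapply(files, read.FCS, transformation = FALSE, truncate_max_range = FALSE)` and named with `basename(files)` before calling the helper.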

WorkFlow to fix missing marker names?

@gfinak and @mikejiang, cytoqc is amazing and I can't wait for its Bioconductor launch, thank you!
I was able to fix channel names, however, got stuck with marker names:

res <- cqc_check(cqc_data, type = "marker")
> res
# A tibble: 2 x 3
  group_id  nFCS marker                                              
     <int> <int> <chr>                                               
1        2   100 "CCL4, CD107a, CD16, CD56, CD57, IFNg, NKG2A, NKG2C"
2        1    12 ""                                        
> table(res$marker,res$group_id)
                   1     2
                   12    0
CCL4               0    100
CD107a             0    100
CD16               0    100
CD56               0    100
CD57               0    100
IFNg               0    100
NKG2A              0    100
NKG2C              0    100
> res1 <- cqc_match(res, ref = 2)
Warning message:
Unmatched items remain after cqc_match. Before using cqc_fix, please resolve these unmatched items manually using
cqc_match_update/remove/delete_unmatched or re-attempt automatic matching with cqc_match with a larger max.distance
argument. 
> res1 <- cqc_match_update(res1, map = c(""="CCL4"),
+                          group_id = 1)
Error: attempt to use zero-length variable name

How could I fill-in the missing marker names?
OR
Is there a way to read.flowSet ignoring $PnS?
OR
How can I just delete marker information ($PnS) completely from the 100 (group 2) files?
All I need to do is prepare a flowset so that I can proceed with my analysis.
Many thanks!
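
A minimal sketch of one way to fill in missing `$PnS` values outside cytoqc, using `markernames<-` from flowCore. It assumes the unannotated files use the same channels as the annotated ones. `marker_map` and `annotate_markers` are hypothetical names, only two channel/marker pairs are shown, and the pairing itself is an assumption to be checked against the real panel. The synthetic frame stands in for a file read with `read.FCS`.

```r
library(flowCore)

# Assumed channel-to-marker map; replace with the pairs from your own panel.
marker_map <- c(
  "FJComp-APC-A" = "NKG2A",
  "FJComp-PE-A"  = "CD57"
)

# Set the marker name for every mapped channel that exists in the frame.
annotate_markers <- function(fr, marker_map) {
  present <- marker_map[names(marker_map) %in% colnames(fr)]
  markernames(fr) <- present
  fr
}

# Synthetic stand-in for one of the files that has no marker names.
m <- matrix(runif(20), ncol = 2,
            dimnames = list(NULL, c("FJComp-APC-A", "FJComp-PE-A")))
fr <- annotate_markers(flowFrame(m), marker_map)
markernames(fr)
```

The annotated frames could then be written back with `write.FCS` or combined directly into a flowSet.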

After cytoqc, error in CytoML::flowjo_to_gatingset (missing compensation parameter)

@gfinak and @mikejiang, great package, thanks for making it publicly available!

Our wet-lab collaborators acquired some samples with and others without an open autofluorescence channel.

I was able to clean up the dataset with cytoqc (amazing btw!) but now when I try to open their FlowJo workspace, I get the following error:

gs <- CytoML::flowjo_to_gatingset(ws,
                                   name = 1,
                                   subset = basename(sampleFCS_path),
                                   extend_val = -Inf,
                                   cytoset = cs,
                                   additional.sampleID = TRUE)

Error in (function (ws, group_id, subset, execute, path, cytoset, backend_dir,  : 
  compensation parameter 'AF-A' not found in cytoframe parameters!

How would you guys go about editing the compensation matrices now?

Any help would be much appreciated, thanks in advance!
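
One way to approach this, sketched in base R: take the spillover matrix, drop the row and column of the channel that is no longer in the cytoset, and supply the reduced matrix yourself. The matrix values, the channel names other than 'AF-A', and the helper `drop_spill_channels` are made up for illustration. Compensating with a reduced matrix is not numerically identical to compensating with the full matrix and then discarding the channel, so treat the result as an approximation.

```r
# Hypothetical 3 x 3 spillover matrix; the values are illustrative only.
chs <- c("FITC-A", "PE-A", "AF-A")
spill <- matrix(
  c(1.00, 0.10, 0.02,
    0.05, 1.00, 0.03,
    0.01, 0.04, 1.00),
  nrow = 3, byrow = TRUE,
  dimnames = list(chs, chs)
)

# Remove the rows and columns of the named channels from a square
# spillover matrix, matching on column names.
drop_spill_channels <- function(spill, channels) {
  keep <- !(colnames(spill) %in% channels)
  spill[keep, keep, drop = FALSE]
}

spill_fixed <- drop_spill_channels(spill, "AF-A")
spill_fixed
```

`flowjo_to_gatingset` has a `compensation` argument for supplying a custom matrix; whether that is sufficient to bypass the check that raises this error has not been verified here.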

Unexpected error when running cqc_fix(res)

Hi,

I am trying to run cytoqc to standardize channels across different fcs datasets. I am running the code as described but receive the error: "Error: Invalid input type, expected 'integer' actual 'double'" when running cqc_fix. Can you please help? I have the newest version of R (4.2.2) and updated all my packages. Thanks a lot!!

#if fcs files have different columns try using cytoqc to standardize
files <- list.files(fcs.dir, ".fcs", full.names = TRUE)
cqc_data <- cqc_load_fcs(files)
cqc_data

## cytoqc data:
## 5 samples

#check consistency across samples
check_results <- cqc_check(cqc_data, type="channel")
check_results

# A tibble: 2 × 3
  group_id  nFCS channel
1        2     4 FJComp-AF-A, FJComp-Alexa Fluor 488-A, FJComp-Alexa Fluor 647-A, FJComp-Alexa Fluor 700-A, FJComp-Alexa Fluor 750-A, FJComp-APC-A, FJComp-…
2        1     1 FJComp-AF-A, FJComp-Alexa Fluor 488-A, FJComp-Alexa Fluor 594-A, FJComp-Alexa Fluor 647-A, FJComp-Alexa Fluor 700-A, FJComp-Alexa Fluor 75…

#match the reference
res <- cqc_match(check_results,ref=2)
res

                                  Ref                        1
1                         FJComp-AF-A                        ✓
2            FJComp-Alexa Fluor 488-A                        ✓
3            FJComp-Alexa Fluor 647-A                        ✓
4            FJComp-Alexa Fluor 700-A                        ✓
5            FJComp-Alexa Fluor 750-A                        ✓
6                        FJComp-APC-A                        ✓
7                      FJComp-BV510-A                        ✓
8                      FJComp-BV605-A                        ✓
9                      FJComp-BV711-A                        ✓
10                     FJComp-BV785-A                        ✓
11                FJComp-eFluor 450-A                        ✓
12                    FJComp-PE-Cy7-A                        ✓
13               FJComp-PerCP-Cy5.5-A                        ✓
14                FJComp-Zombie NIR-A                        ✓
15                              FSC-A                        ✓
16                              FSC-H                        ✓
17                              SSC-A                        ✓
18                            SSC-B-A                        ✓
19                            SSC-B-H                        ✓
20                              SSC-H                        ✓
21                               Time                        ✓
23 To Delete                          FJComp-Alexa Fluor 594-A

#apply the fix
cqc_fix(res)

Error: Invalid input type, expected 'integer' actual 'double'

Misc

Hi,
Interesting that I should investigate.
Readme: three steps are listed not four
lyoplate directory at the root of the package does not seem standard to me.
Best.
