Comments (8)
You can simply ignore raw peak stats.
We will hide them from the report in the next release, v1.1.6 (commit e68b6b8).
They are redundant.
from atac-seq-pipeline.
Can you also include both IDR and naive peak set stats in the JSON?
Hi @leepc12, was this issue fixed across all implementations of the pipeline? I no longer get erroneous raw peak ataqc stats when I run the pipeline locally with conda, but a collaborator who ran the v1.1.7 pipeline on GCP has the same bug in her JSON report.
Yes, this is already fixed in v1.1.6.
Example qc.json from a v1.1.6 run:
https://github.com/ENCODE-DCC/atac-seq-pipeline/blob/master/test/test_workflow/ref_output/v1.1.6/ENCSR356KRQ_subsampled/qc.json#L427
Can you upload her qc.json file?
Just this line:
"Raw peaks": [
74269,
"OK"
],
{
"general": {
"date": "2019-03-20 10:11:14",
"pipeline_ver": "v1.1.7",
"peak_caller": "macs2",
"genome": "rn6.tsv",
"description": "ATAC-seq on MoTrPAC",
"title": "20180725_5_Liver_002_powder_S4",
"paired_end": [
true
]
},
"flagstat_qc": {
"rep1": {
"total": 92694432,
"total_qc_failed": 0,
"duplicates": 0,
"duplicates_qc_failed": 0,
"mapped": 92123522,
"mapped_qc_failed": 0,
"mapped_pct": 99.38,
"paired": 51743818,
"paired_qc_failed": 0,
"read1": 25871909,
"read1_qc_failed": 0,
"read2": 25871909,
"read2_qc_failed": 0,
"paired_properly": 50741198,
"paired_properly_qc_failed": 0,
"paired_properly_pct": 98.06,
"with_itself": 51103330,
"with_itself_qc_failed": 0,
"singletons": 69578,
"singletons_qc_failed": 0,
"singletons_pct": 0.13,
"diff_chroms": 78236,
"diff_chroms_qc_failed": 0
}
},
"dup_qc": {
"rep1": {
"unpaired_reads": 0,
"paired_reads": 20329998,
"unmapped_reads": 0,
"unpaired_dupes": 0,
"paired_dupes": 6863253,
"paired_opt_dupes": 6353,
"dupes_pct": 0.337592
}
},
"pbc_qc": {
"rep1": {
"total_read_pairs": 20329998,
"distinct_read_pairs": 13483546,
"one_read_pair": 10166059,
"two_read_pair": 2492700,
"NRF": 0.663234,
"PBC1": 0.75396,
"PBC2": 4.078332
}
},
"nodup_flagstat_qc": {
"rep1": {
"total": 26933490,
"total_qc_failed": 0,
"duplicates": 0,
"duplicates_qc_failed": 0,
"mapped": 26933490,
"mapped_qc_failed": 0,
"mapped_pct": 100.0,
"paired": 26933490,
"paired_qc_failed": 0,
"read1": 13466745,
"read1_qc_failed": 0,
"read2": 13466745,
"read2_qc_failed": 0,
"paired_properly": 26933490,
"paired_properly_qc_failed": 0,
"paired_properly_pct": 100.0,
"with_itself": 26933490,
"with_itself_qc_failed": 0,
"singletons": 0,
"singletons_qc_failed": 0,
"singletons_pct": 0.0,
"diff_chroms": 0,
"diff_chroms_qc_failed": 0
}
},
"overlap_reproducibility_qc": {
"Nt": 0,
"N1": 127461,
"Np": 0,
"N_opt": 127461,
"N_consv": 127461,
"opt_set": "rep1-pr",
"consv_set": "rep1-pr",
"rescue_ratio": 0.0,
"self_consistency_ratio": 1.0,
"reproducibility": "pass"
},
"idr_reproducibility_qc": {
"Nt": 0,
"N1": 74269,
"Np": 0,
"N_opt": 74269,
"N_consv": 74269,
"opt_set": "rep1-pr",
"consv_set": "rep1-pr",
"rescue_ratio": 0.0,
"self_consistency_ratio": 1.0,
"reproducibility": "pass"
},
"frip_macs2_qc": {
"rep1": {
"FRiP": 0.312998501123
},
"rep1-pr1": {
"FRiP": 0.317969240676
},
"rep1-pr2": {
"FRiP": 0.322046516961
}
},
"overlap_frip_qc": {
"rep1-pr": {
"FRiP": 0.241691255014
}
},
"idr_frip_qc": {
"rep1-pr": {
"FRiP": 0.190524844719
}
},
"ataqc": {
"rep1": {
"Genome": "Rattus_norvegicus.Rnor_6.0.dna.toplevel.fa.gz",
"Paired/Single-ended": "Paired-ended",
"Read length": 75,
"Read count from sequencer": 92694432,
"Read count successfully aligned": 92123522,
"Read count after filtering for mapping quality": 59637527,
"Read count after removing duplicate reads": 52774274,
"Read count after removing mitochondrial reads (final read count)": 26933490,
"NRF": [
0.663234,
"out of range [0.8, inf]"
],
"PBC1": [
0.75396,
"out of range [0.8, inf]"
],
"PBC2": [
4.078332,
"OK"
],
"picard_est_library_size": 31429662,
"Fraction of reads in NFR": [
0.499886971921,
"OK"
],
"NFR / mono-nuc reads": [
1.53787093978,
"out of range [2.5, inf]"
],
"Presence of NFR peak": "OK",
"Presence of Mono-Nuc peak": "OK",
"Presence of Di-Nuc peak": "OK",
"Raw peaks": [
74269,
"OK"
],
"Naive overlap peaks": [
127461,
"OK"
],
"IDR peaks": [
74269,
"OK"
],
"Naive peak stats: Min size": 150.0,
"Naive peak stats: 25 percentile": 325.0,
"Naive peak stats: 50 percentile (median)": 495.0,
"Naive peak stats: 75 percentile": 760.0,
"Naive peak stats: Max size": 4715.0,
"Naive peak stats: Mean": 577.525611756,
"IDR peak stats: Min size": 150.0,
"IDR peak stats: 25 percentile": 441.0,
"IDR peak stats: 50 percentile (median)": 651.0,
"IDR peak stats: 75 percentile": 913.0,
"IDR peak stats: Max size": 4715.0,
"IDR peak stats: Mean": 706.594002881,
"TSS_enrichment": 6.25728227255
}
}
}
In the example you linked to, why is the number of raw peaks less than the number of IDR peaks?
Thanks for reporting this. The term "Raw peaks" is internally equal to "IDR peaks" in the ataqc module. We will remove it in the next release or hotfix.
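
For anyone who wants to check their own report for this, here is a minimal sketch that flags replicates whose "Raw peaks" entry simply repeats "IDR peaks". It assumes the qc.json layout shown in the report posted above (an "ataqc" section keyed by replicate). The function name `redundant_raw_peaks` is made up for illustration and is not part of the pipeline.

```python
import json


def redundant_raw_peaks(qc):
    """Return replicate names whose ataqc "Raw peaks" entry duplicates "IDR peaks".

    Hypothetical helper: assumes the qc.json structure posted in this thread.
    """
    flagged = []
    for rep, stats in qc.get("ataqc", {}).items():
        raw = stats.get("Raw peaks")
        idr = stats.get("IDR peaks")
        # Only flag when the key is present and its value matches the IDR entry.
        if raw is not None and raw == idr:
            flagged.append(rep)
    return flagged


# Excerpt of the ataqc section from the v1.1.7 report posted above.
qc_excerpt = json.loads("""
{
  "ataqc": {
    "rep1": {
      "Raw peaks": [74269, "OK"],
      "Naive overlap peaks": [127461, "OK"],
      "IDR peaks": [74269, "OK"]
    }
  }
}
""")

print(redundant_raw_peaks(qc_excerpt))  # prints ['rep1']
```

To run it on a full report, load the file with `json.load(open("qc.json"))` and pass the result to the function. An empty list means the "Raw peaks" key is either absent or differs from "IDR peaks".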