freec's People

Contributors

calkan, matthdsm, valeu, xusailor

freec's Issues

polynomial error

With this config file:
[general]

BedGraphOutput=TRUE
chrFiles=/exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Fasta
chrLenFile=/exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Bams/Ovis_aries.Oar_v3.1.dna.toplevel.len
maxThreads=4
outputDir=/exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Control-freec/WA1343
ploidy=2
window=1000
step=1

[sample]
mateFile=/exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Bams/Washera/batch_X16081_WA1343_reheaded_mkdups_rg.bam
inputFormat=BAM
mateOrientation=FR

[control]

I had this error:

Control-FREEC v11.4 : a method for automatic detection of copy number alterations, subclones and for accurate estimation of contamination and main ploidy using deep-sequencing data
Multi-threading mode using 4 threads
..Breakpoint threshold for segmentation of copy number profiles is 0.8
..telocenromeric set to 50000
..FREEC is not going to adjust profiles for a possible contamination by normal cells
..Window = 1000 was set
..Step: 1
..Output directory: /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Control-freec/WA1343
..Directory with files containing chromosome sequences: /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Fasta
..Sample file: /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Bams/Washera/batch_X16081_WA1343_reheaded_mkdups_rg.bam
..Sample input format: BAM
..will use this instance of samtools: 'samtools' to read BAM files
..minimal expected GC-content (general parameter "minExpectedGC") was set to 0.35
..maximal expected GC-content (general parameter "maxExpectedGC") was set to 0.55
..Polynomial degree for "ReadCount ~ GC-content" normalization is 3 or 4: will try both
..Minimal CNA length (in windows) is 1
..File with chromosome lengths: /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Bams/Ovis_aries.Oar_v3.1.dna.toplevel.len
..Using the default minimal mappability value of 0.85
..uniqueMatch = FALSE
..average ploidy set to 2
..break-point type set to 2
..noisyData set to 0
..Control-FREEC will not look for subclones
..File /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Bams/Ovis_aries.Oar_v3.1.dna.toplevel.len was read
..Starting reading /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Bams/Washera/batch_X16081_WA1343_reheaded_mkdups_rg.bam
..samtools should be installed to be able to read BAM files; will use the following command for samtools: samtools view /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Bams/Washera/batch_X16081_WA1343_reheaded_mkdups_rg.bam
..finished reading /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Bams/Washera/batch_X16081_WA1343_reheaded_mkdups_rg.bam
PROFILING [tid=47201871498240]: /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Bams/Washera/batch_X16081_WA1343_reheaded_mkdups_rg.bam read in 4168 seconds [fillMyHash]
776139194 lines read..
697836260 reads used to compute copy number profile
printing counts into /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Control-freec/WA1343/batch_X16081_WA1343_reheaded_mkdups_rg.bam_sample.cpn
..Window size: 1000
..using GC-content to normalize copy number profiles
CG-content printed into /exports/cmvm/eddie/eb/groups/CTLGH_GCRF/WGS_sheep/test_temp/Control-freec/WA1343/GC_profile.1000bp.cnp
..Running FREEC with ploidy set to 2
ERROR: there was a problem in the initial guess of the polynomial. Please contact the support team of change your input parameters. Exit.

It would be great if you can help me with it.

Thank you

recommended value for breakPointThreshold

Hi,

In FREEC-9.6 there is a file config_WGS.txt that contains the following text:
"

This should be a positive value. The closer it is to Zero, the more breakpoints will be called. Its recommended value is between 0.01 and 0.08.

breakPointThreshold = .8"

I also checked http://boevalab.com/FREEC/tutorial.html#CONFIG, where the default value for breakPointThreshold is 0.8.

Should 0.01 and 0.08 be 0.1 and 0.8?

Best regards,
Hongen

What does "-" mean in the `genotype` column of the CNV output from FREEC?

I found some "-" entries in the genotype field of the _CNVs file; for these rows the corresponding field "percentage of uncertainty of the predicted genotype, max=100" is -1.
Is there an explanation for this?

Thanks!

Part of the result file:

1       248928623       248937047       1       loss    A       -1      somatic 0
2       24270656        24295781        3       gain    -       -1      somatic 0
2       26176539        26186470        3       gain    -       -1      somatic 0
2       219285923       219291538       3       gain    -       -1      somatic 0
3       197667  1148078 1       loss    A       -1      somatic 0
3       75668939        75741784        1       loss    A       0.887484        somatic 0
3       75783300        85912597        1       loss    A       -1      somatic 0
3       125582695       126671834       3       gain    AAB     100     somatic 0
3       169115377       170866770       3       gain    AAB     27.351  somatic 0
3       184684414       185229364       1       loss    A       100     somatic 0
3       187668939       189320849       1       loss    A       25.5344 somatic 0
3       192798385       194498812       3       gain    AAB     20.7959 somatic 0
3       194592901       197149811       3       gain    AAB     3.06852 somatic 0
3       198120549       198219553       1       loss    A       -1      somatic 0
4       189957347       190082634       2       neutral AA      1.3469  somatic 0
5       107861526       109381773       1       loss    A       -1      somatic 0
6       73758922        73781377        3       gain    -       -1      somatic 0
6       158158197       159761719       2       neutral AA      13.0907 somatic 0
6       159762026       159767160       4       gain    -       -1      somatic 0
6       170543602       170640583       1       loss    A       -1      somatic 0
7       151566489       151632181       3       gain    -       -1      somatic 0
8       41265160        41530466        1       loss    A       53.8801 somatic 0
8       73020771        73046006        3       gain    -       -1      somatic 0
9       12193   173479  1       loss    A       8.20683 somatic 0
9       214914  325771  1       loss    A       31.5994 somatic 0
9       327985  27548708        1       loss    A       -1      somatic 0
9       132896178       132986789       0       loss    -       -1      somatic 0
9       136245468       137200152       1       loss    A       12.9155 somatic 0
9       137505637       138177332       1       loss    A       21.1893 somatic 0
10      47062   825304  1       loss    A       -1      somatic 0
10      16898976        18262183        1       loss    A       -1      somatic 0
10      27741308        30850188        1       loss    A       -1      somatic 0
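
To see how widespread the "-" entries are, the rows can be tallied by genotype. The following is a minimal Python sketch, assuming the columns are in the order shown in the excerpt above (chromosome, start, end, copy number, status, genotype, uncertainty, then two further fields). The function name `summarize_genotypes` and the column labels are chosen here for illustration and are not taken from the FREEC documentation.

```python
from collections import Counter

def summarize_genotypes(lines):
    """Count CNV rows per genotype and collect rows without a genotype call."""
    counts = Counter()
    missing = []
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        chrom, start, end, copy_number, status, genotype, uncertainty = fields[:7]
        counts[genotype] += 1
        # "-" together with an uncertainty of -1 marks a row with no genotype call
        if genotype == "-" and float(uncertainty) == -1:
            missing.append((chrom, int(start), int(end), int(copy_number), status))
    return counts, missing

# Four rows copied from the excerpt above
example = [
    "2       24270656        24295781        3       gain    -       -1      somatic 0",
    "3       125582695       126671834       3       gain    AAB     100     somatic 0",
    "3       198120549       198219553       1       loss    A       -1      somatic 0",
    "9       132896178       132986789       0       loss    -       -1      somatic 0",
]

counts, missing = summarize_genotypes(example)
print(counts)
for row in missing:
    print(row)
```

In the excerpt, every "-" row also has an uncertainty of -1, while some rows with a called genotype (A) carry -1 as well, so the two fields are worth inspecting separately.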

run error

Hi,

I used Control-FREEC (v9.6) to analyse my targeted sequencing data,
but I got the following error:

..............
..using GC-content to normalize copy number profiles
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr
Aborted (core dumped)

Some people say this is an out-of-bounds memory read.
How can I solve this error?
Thank you very much!

My config is as follows:
[general]

chrLenFile = /share/f/03.soft/FREEC/test/hg19.fa.fai
window = 0
ploidy = 2
outputDir = /share/f/03.soft/FREEC/test/out/
#sex=XY
breakPointType=4
chrFiles = /share/f/03.soft/FREEC/test/chromosomes/
maxThreads=6
breakPointThreshold=0.8
noisyData=TRUE
printNA=FALSE
readCountThreshold=50

[sample]

mateFile = /share/f/03.soft/FREEC/test/cancer.mpileup.txt
inputFormat = pileup
mateOrientation = FR

[control]

mateFile = /share/f/03.soft/FREEC/test/normal.mpileup.txt
inputFormat = pileup
mateOrientation = FR

[BAF]

#SNPfile = /bioinfo/users/vboeva/Desktop/annotations/hg19_snp131.SingleDiNucl.1based.txt
#minimalCoveragePerPosition = 5

[target]

captureRegions = /share/f/03.soft/FREEC/test/sorted.bed
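
In C++, `std::out_of_range` from `basic_string::substr` is thrown when the requested start position lies past the end of the string. One possible cause here, offered as an assumption rather than a confirmed diagnosis, is a capture region that points beyond the end of a chromosome, or chromosome names that differ between the BED file and the `.fai` file. The Python sketch below checks for both; the function names are hypothetical and the inputs are small in-memory stand-ins for `hg19.fa.fai` and `sorted.bed`.

```python
def read_fai_lengths(lines):
    """Map chromosome name -> length from .fai-style lines (name, length, ...)."""
    lengths = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            lengths[fields[0]] = int(fields[1])
    return lengths

def find_bad_regions(bed_lines, lengths):
    """Return BED regions with an unknown chromosome or out-of-range coordinates."""
    bad = []
    for line in bed_lines:
        # skip blank lines and BED header lines
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if chrom not in lengths:
            bad.append((chrom, start, end, "unknown chromosome"))
        elif start < 0 or start >= end or end > lengths[chrom]:
            bad.append((chrom, start, end, "outside chromosome bounds"))
    return bad

# Stand-in data: one chromosome of length 1000 and three capture regions
lengths = read_fai_lengths(["chr1\t1000\t6\t60\t61"])
bad = find_bad_regions(
    ["chr1\t0\t500", "chr1\t900\t1200", "1\t10\t20"],
    lengths,
)
for region in bad:
    print(region)
```

With real files, the same functions can be fed `open(path)` handles instead of the lists used here.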

_CNVs file versus _ratio.txt file versus capture regions BED file

I am trying to understand the difference between the regions in _CNVs and _ratio.txt and capture regions BED files. From what I understand, _CNVs will have all the regions with alterations after merging neighboring regions. That would make it a subset of _ratio.txt, but I see regions in _CNVs that aren't present in _ratio.txt. Is that expected?

I also compared _ratio.txt to the capture regions BED file. They seem to be identical, but _ratio.txt is heavily filtered (more than half the regions are filtered). The filtering seems to be based on the matched normal (all _ratio.txt files using the same matched normal have the same length). The regions with CopyNumber set to -1 do not make it to _CNVs, since there is insufficient data there. What is the difference between -1 regions and completely missing regions and why are so many missing? I am looking at the BAMs at some of the missing regions and they seem okay.
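
The three cases described above (regions with a call, regions with CopyNumber -1, and regions absent from _ratio.txt) can be separated with a short script. This is a sketch under stated assumptions: the _ratio.txt file is taken to be tab-separated with a header containing `Chromosome`, `Start` and `CopyNumber`, chromosome names may differ only by a `chr` prefix, and `start_offset` is a hypothetical parameter for reconciling 0-based BED starts with the coordinates used in _ratio.txt.

```python
import csv
import io

def strip_chr(name):
    """Drop a leading 'chr' so that 'chr1' and '1' compare equal."""
    return name[3:] if name.startswith("chr") else name

def classify_regions(bed_text, ratio_text, start_offset=0):
    """Label each BED region 'called', 'no_data' (CopyNumber == -1) or 'missing'."""
    copy_number = {}
    reader = csv.DictReader(io.StringIO(ratio_text), delimiter="\t")
    for row in reader:
        key = (strip_chr(row["Chromosome"]), int(row["Start"]))
        copy_number[key] = float(row["CopyNumber"])

    labels = {}
    for line in bed_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        key = (strip_chr(fields[0]), int(fields[1]) + start_offset)
        if key not in copy_number:
            labels[key] = "missing"
        elif copy_number[key] == -1:
            labels[key] = "no_data"
        else:
            labels[key] = "called"
    return labels

# Made-up stand-in data, for illustration only
bed = "chr1\t100\t200\nchr1\t300\t400\nchr2\t50\t80\n"
ratio = (
    "Chromosome\tStart\tRatio\tMedianRatio\tCopyNumber\n"
    "1\t100\t0.98\t1.0\t2\n"
    "1\t300\t-1\t-1\t-1\n"
)
labels = classify_regions(bed, ratio)
for key, label in labels.items():
    print(key, label)
```

Listing the `missing` regions next to their read depth in the matched normal is one way to test whether the filtering really follows the control sample.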

Error of makeGraph.R

Hi there,

The error `Error in plot.window(...) : need finite 'xlim' values` was reported when I ran the command `cat makeGraph.R | R --slave --args 2 *ratio_noNA.txt`.

I have already deleted all the 'Y' entries in the R script, since the Y chromosome is not included in my data. I am really wondering what happened.

Best,
Tina

How to handle a tumor sample that is not pure?

Dear BoevaLab,
I have a mixed tumor sample that includes some normal cells; I know the percentage of tumor cells is 64%.
In the config file, I set the parameters "contamination=0.36" and "contaminationAdjustment=TRUE", but I don't know whether this is correct. What algorithm is used to correct for the contamination? Thanks. Looking forward to your reply.
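
As general background, the effect of normal-cell contamination on a copy number ratio is often described with a simple linear mixture of tumor and normal signal. The Python sketch below illustrates that textbook model only; it is not taken from the FREEC source, and the function names and the default of a diploid normal genome are assumptions made for the example.

```python
def observed_ratio(tumor_copy_number, contamination, tumor_ploidy=2, normal_copy_number=2):
    """Expected ratio in a window when a fraction `contamination` of cells is normal."""
    signal = (1 - contamination) * tumor_copy_number + contamination * normal_copy_number
    baseline = (1 - contamination) * tumor_ploidy + contamination * normal_copy_number
    return signal / baseline

def infer_tumor_copy_number(ratio, contamination, tumor_ploidy=2, normal_copy_number=2):
    """Invert the mixture model: recover the tumor copy number from an observed ratio."""
    baseline = (1 - contamination) * tumor_ploidy + contamination * normal_copy_number
    return (ratio * baseline - contamination * normal_copy_number) / (1 - contamination)

# 64% tumor cells, as in the question, means contamination = 0.36
contamination = 0.36
ratio_for_loss = observed_ratio(1, contamination)   # single-copy loss in the tumor
recovered = infer_tumor_copy_number(ratio_for_loss, contamination)
print(round(ratio_for_loss, 4), round(recovered, 4))
```

Under this model a single-copy loss in a diploid tumor with 36% normal cells shows a ratio of about 0.68 instead of 0.5, which is why an uncorrected profile looks compressed toward 1.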

Cannot "make" freec from the version 9.7 source files

Hi,

I tried to build freec from the source files with make on three different computers, and it did not work on any of them.
Below is the output:

make -f Makefile.freec init
make[1]: Entering directory `/home/users/xu/FREEC-9.7/src'
make[1]: Nothing to be done for `init'.
make[1]: Leaving directory `/home/users/xu/FREEC-9.7/src'
make -f Makefile.freec depend
make[1]: Entering directory `/home/users/xu/FREEC-9.7/src'
make[1]: Leaving directory `/home/users/xu/FREEC-9.7/src'
make -f Makefile.freec all
make[1]: Entering directory `/home/users/xu/FREEC-9.7/src'
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o main.o main.cpp
main.cpp: In function ‘int main(int, char**)’:
main.cpp:541:39: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i < strs.size(); i++) {
^
main.cpp:577:60: warning: statement has no effect [-Wunused-value]
if (seekSubclones==0 ||seekSubclones==1) {seekSubclones==100;}
^
main.cpp:904:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i=0;i < ploidies.size(); i++ ) {
^
main.cpp:915:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i=0;i < ploidies.size(); i++ ) {
^
main.cpp:536:10: warning: variable ‘isPloidyKnown’ set but not used [-Wunused-but-set-variable]
bool isPloidyKnown = false;
^
main.cpp: In function ‘void runWithDefinedPloidy(int, GenomeCopyNumber&, GenomeCopyNumber&, bool, int, bool, bool, bool, int, int, bool, float, float, float, float, int, int, int, std::vector&, std::vector&, bool, std::vector&, ThreadPool*, ThreadPoolManager*, std::string, float, std::string, std::vector&, bool)’:
main.cpp:1020:27: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
if ((!forceGC && !(has_BAF) || (ifTargeted&&forceGC!=1) || (WESanalysis == true &&forceGC==0))) { //normalize sample density with control density
^
main.cpp:1179:48: warning: ‘contamValue’ may be used uninitialized in this function [-Wmaybe-uninitialized]
contamination.push_back(contamValue);
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ConfigFile.o ConfigFile.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o Chameleon.o Chameleon.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o GenomeDensity.o GenomeDensity.cpp
GenomeDensity.cpp: In constructor ‘GenomeDensity::GenomeDensity(const string&, const string&, const string&, const string&)’:
GenomeDensity.cpp:63:10: warning: unused variable ‘length’ [-Wunused-variable]
int length = strs[1].length();
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o Help.o Help.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o myFunc.o myFunc.cpp
myFunc.cpp: In function ‘unsigned int split(char*, char, char**)’:
myFunc.cpp:54:18: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
for (; c = str++; ++jj) {
^
myFunc.cpp: In function ‘long int getReadNumberFromPileup(const string&)’:
myFunc.cpp:370:56: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
if (toadd=strccnt(strs[4].c_str(), '^')) {
^
myFunc.cpp:393:56: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
if (toadd=strccnt(strs[4].c_str(), '^')) {
^
myFunc.cpp: In function ‘void strkeepOnly(char*, const char*)’:
myFunc.cpp:1639:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i< strlen(s);i++) {
^
myFunc.cpp:1640:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int j = 0; j< strlen(c);j++){
^
myFunc.cpp: In function ‘void strkeepOnly(std::string&, const char*)’:
myFunc.cpp:1651:33: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i< s.length(); i++) {
^
myFunc.cpp:1652:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int j = 0; j< strlen(c);j++){
^
myFunc.cpp: In function ‘void deleteChar(std::string&, char, int)’:
myFunc.cpp:1672:33: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i< s.length(); i++) {
^
myFunc.cpp: In function ‘void deleteChar(std::string&, char)’:
myFunc.cpp:1686:33: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i< s.length(); i++) {
^
myFunc.cpp: In function ‘void filterWithQualities(std::string&, std::string&, int)’:
myFunc.cpp:1709:43: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i< pileupShort.length(); i++) {
^
myFunc.cpp: In function ‘void getBAFinfo(std::string, float, float&, float&, std::string&, float&, float, int, bool, bool, bool)’:
myFunc.cpp:1886:141: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
if (copyNumbers[0]==1 && copyNumbers[1]==2 && copyNumber<1.5 && (medianBAFSym.compare("AA")==0 || medianBAFSym.compare("AB")==0 && uncertainty >0.1)) {
^
myFunc.cpp:1908:48: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for(int jj=0; jj<testedCN.size();jj++) {
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o KernelVector.o KernelVector.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ChrDensity.o ChrDensity.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ChrCopyNumber.o ChrCopyNumber.cpp
ChrCopyNumber.cpp: In constructor ‘ChrCopyNumber::ChrCopyNumber(int, int, const string&, int, std::string)’:
ChrCopyNumber.cpp:210:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if(tried != length_) {
^
ChrCopyNumber.cpp: In member function ‘std::string ChrCopyNumber::getGeneNameAtBin(int)’:
ChrCopyNumber.cpp:359:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (genes_names.size()>=i)
^
ChrCopyNumber.cpp: In member function ‘void ChrCopyNumber::setNotNprofileAt(int, float)’:
ChrCopyNumber.cpp:383:29: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (i<notNprofile_.size())
^
ChrCopyNumber.cpp: In member function ‘void ChrCopyNumber::setAllNormal()’:
ChrCopyNumber.cpp:581:50: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (readCount_[i] != NA && i<notNprofile_.size() &&notNprofile_[i]>0) {
^
ChrCopyNumber.cpp: In member function ‘void ChrCopyNumber::calculateCopyNumberMedian(int, int, bool, bool)’:
ChrCopyNumber.cpp:732:32: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (medianProfile_.size()!=length_) {
^
ChrCopyNumber.cpp:792:26: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (data.size()>=minCNAlength && data.size()>0)
^
ChrCopyNumber.cpp: In member function ‘float ChrCopyNumber::getEstimatedBAFuncertaintyAtI(int)’:
ChrCopyNumber.cpp:1552:47: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (i>0 &&i<estimatedBAFuncertainty_.size())
^
ChrCopyNumber.cpp: In member function ‘void ChrCopyNumber::setRCountToZeroForNNNN()’:
ChrCopyNumber.cpp:1704:42: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i< notNprofile_.size(); i++) {
^
ChrCopyNumber.cpp:1709:58: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = notNprofile_.size(); i< readCount_.size(); i++) {
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o GenomeCopyNumber.o GenomeCopyNumber.cpp
GenomeCopyNumber.cpp: In member function ‘double GenomeCopyNumber::calculateMedianAround(GenomeCopyNumber&, float, float)’:
GenomeCopyNumber.cpp:1012:53: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (controlcounts.size()!=it->getLength()) {
^
GenomeCopyNumber.cpp: In member function ‘void GenomeCopyNumber::calculateRatio(GenomeCopyNumber&, int, bool, bool)’:
GenomeCopyNumber.cpp:1134:57: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (controlcounts.size()!=it->getLength()) {
^
GenomeCopyNumber.cpp:1345:57: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (controlcounts.size()!=it->getLength()) {
^
GenomeCopyNumber.cpp: In member function ‘float GenomeCopyNumber::evaluateContaminationwithLR()’:
GenomeCopyNumber.cpp:3229:8: warning: unused variable ‘contam’ [-Wunused-variable]
float contam = 0;
^
GenomeCopyNumber.cpp: In member function ‘int GenomeCopyNumber::processRead(const string&, const string&, std::string)’:
GenomeCopyNumber.cpp:3623:55: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
if (valueToReturn=strccnt(strs[4].c_str(), '^')) {
^
GenomeCopyNumber.cpp: In member function ‘int GenomeCopyNumber::processRead(InputFormat, MateOrientation, const char*, int&, std::string, std::string)’:
GenomeCopyNumber.cpp:3691:29: warning: unused variable ‘maxpos’ [-Wunused-variable]
int maxpos=0;
^
GenomeCopyNumber.cpp:4072:20: warning: unused variable ‘orient2_1’ [-Wunused-variable]
MateOrientation orient2_1 = getMateOrientation(orient2+orient1);
^
GenomeCopyNumber.cpp:4192:43: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
if (valueToReturn=strccnt(strs[4], '^')) {
^
GenomeCopyNumber.cpp: In member function ‘int GenomeCopyNumber::focusOnCapture(const string&)’:
GenomeCopyNumber.cpp:4359:27: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (RegLength<minRegion)
^
GenomeCopyNumber.cpp: In member function ‘double GenomeCopyNumber::Percentage_GenomeExplained(int&)’:
GenomeCopyNumber.cpp:4453:8: warning: unused variable ‘startFragment’ [-Wunused-variable]
int startFragment = 0;
^
GenomeCopyNumber.cpp:4454:8: warning: unused variable ‘endFragment’ [-Wunused-variable]
int endFragment=0;
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o chisquaredistr.o chisquaredistr.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ap.o ap.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o igammaf.o igammaf.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o gammafunc.o gammafunc.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o normaldistr.o normaldistr.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ablasf.o ablasf.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ablas.o ablas.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ortfac.o ortfac.cpp
ortfac.cpp: In function ‘void rmatrixbd(ap::real_2d_array&, int, int, ap::real_1d_array&, ap::real_1d_array&)’:
ortfac.cpp:1421:9: warning: variable ‘minmn’ set but not used [-Wunused-but-set-variable]
int minmn;
^
ortfac.cpp: In function ‘void rmatrixlqbasecase(ap::real_2d_array&, int, int, ap::real_1d_array&, ap::real_1d_array&, ap::real_1d_array&)’:
ortfac.cpp:2927:9: warning: variable ‘minmn’ set but not used [-Wunused-but-set-variable]
int minmn;
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o sblas.o sblas.cpp
sblas.cpp: In function ‘void symmetricmatrixvectormultiply(const real_2d_array&, bool, int, int, const real_1d_array&, double, ap::real_1d_array&)’:
sblas.cpp:40:9: warning: variable ‘ba2’ set but not used [-Wunused-but-set-variable]
int ba2;
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o rotations.o rotations.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o reflections.o reflections.cpp
reflections.cpp: In function ‘void applyreflectionfromtheleft(ap::real_2d_array&, double, const real_1d_array&, int, int, int, int, ap::real_1d_array&)’:
reflections.cpp:199:9: warning: variable ‘vm’ set but not used [-Wunused-but-set-variable]
int vm;
^
reflections.cpp: In function ‘void applyreflectionfromtheright(ap::real_2d_array&, double, const real_1d_array&, int, int, int, int, ap::real_1d_array&)’:
reflections.cpp:270:9: warning: variable ‘vm’ set but not used [-Wunused-but-set-variable]
int vm;
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o linreg.o linreg.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o hblas.o hblas.cpp
hblas.cpp: In function ‘void hermitianmatrixvectormultiply(const complex_2d_array&, bool, int, int, const complex_1d_array&, ap::complex, ap::complex_1d_array&)’:
hblas.cpp:40:9: warning: variable ‘ba2’ set but not used [-Wunused-but-set-variable]
int ba2;
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o descriptivestatistics.o descriptivestatistics.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o creflections.o creflections.cpp
creflections.cpp: In function ‘void complexapplyreflectionfromtheleft(ap::complex_2d_array&, ap::complex, const complex_1d_array&, int, int, int, int, ap::complex_1d_array&)’:
creflections.cpp:206:9: warning: variable ‘vm’ set but not used [-Wunused-but-set-variable]
int vm;
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o blas.o blas.cpp
blas.cpp: In function ‘int vectoridxabsmax(const real_1d_array&, int, int)’:
blas.cpp:71:12: warning: variable ‘a’ set but not used [-Wunused-but-set-variable]
double a;
^
blas.cpp: In function ‘int columnidxabsmax(const real_2d_array&, int, int, int)’:
blas.cpp:90:12: warning: variable ‘a’ set but not used [-Wunused-but-set-variable]
double a;
^
blas.cpp: In function ‘int rowidxabsmax(const real_2d_array&, int, int, int)’:
blas.cpp:109:12: warning: variable ‘a’ set but not used [-Wunused-but-set-variable]
double a;
^
blas.cpp: In function ‘void matrixmatrixmultiply(const real_2d_array&, int, int, int, int, bool, const real_2d_array&, int, int, int, int, bool, double, ap::real_2d_array&, int, int, int, int, double, ap::real_1d_array&)’:
blas.cpp:388:9: warning: variable ‘ccols’ set but not used [-Wunused-but-set-variable]
int ccols;
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o bdsvd.o bdsvd.cpp
bdsvd.cpp: In function ‘bool bidiagonalsvddecompositioninternal(ap::real_1d_array&, ap::real_1d_array, int, bool, bool, ap::real_2d_array&, int, int, ap::real_2d_array&, int, int, ap::real_2d_array&, int, int)’:
bdsvd.cpp:234:12: warning: variable ‘sminlo’ set but not used [-Wunused-but-set-variable]
double sminlo;
^
bdsvd.cpp:252:10: warning: variable ‘rightside’ set but not used [-Wunused-but-set-variable]
bool rightside;
^
bdsvd.cpp:1030:5: warning: ‘tsign’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if( ap::fp_greater_eq(b,0) )
^
bdsvd.cpp:1146:12: note: ‘tsign’ was declared here
double tsign;
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o svd.o svd.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ialglib.o ialglib.cpp
ialglib.cpp: In function ‘bool ialglib::i_cmatrixgemmf(int, int, int, ap::complex, const complex_2d_array&, int, int, int, const complex_2d_array&, int, int, int, ap::complex, ap::complex_2d_array&, int, int)’:
ialglib.cpp:837:48: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
vcopy_complex(k, arow, 1, abuf, 1, "No conj");
^
ialglib.cpp:842:62: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
vcopy_complex(k, arow, stride, abuf, 1, "No conj");
^
ialglib.cpp:847:59: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
vcopy_complex(k, arow, stride, abuf, 1, "Conj");
^
ialglib.cpp: In function ‘bool ialglib::i_cmatrixrighttrsmf(int, int, const complex_2d_array&, int, int, bool, bool, int, ap::complex_2d_array&, int, int)’:
ialglib.cpp:917:76: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
vcopy_complex(i, abuf+2*i, alglib_c_block, tmpbuf, 1, "No conj");
^
ialglib.cpp:930:94: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
vcopy_complex(n-1-i, pdiag+2*alglib_c_block, alglib_c_block, tmpbuf, 1, "No conj");
^
ialglib.cpp: In function ‘bool ialglib::i_cmatrixlefttrsmf(int, int, const complex_2d_array&, int, int, bool, bool, int, ap::complex_2d_array&, int, int)’:
ialglib.cpp:1069:66: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
vcopy_complex(m-1-i, pdiag+2, 1, tmpbuf, 1, "No conj");
^
ialglib.cpp:1081:59: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
vcopy_complex(i, arow, 1, tmpbuf, 1, "No conj");
^
ialglib.cpp: In function ‘bool ialglib::i_cmatrixsyrkf(int, int, double, const complex_2d_array&, int, int, int, double, ap::complex_2d_array&, int, int, bool)’:
ialglib.cpp:1229:56: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
vcopy_complex(k, arow, 1, tmpbuf, 1, "Conj");
^
ialglib.cpp:1237:56: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
vcopy_complex(k, arow, 1, tmpbuf, 1, "Conj");
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o EntryCNV.o EntryCNV.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o SNPinGenome.o SNPinGenome.cpp
SNPinGenome.cpp: In member function ‘int SNPinGenome::processSNPLine(bool, char*, std::string&, int&, int&)’:
SNPinGenome.cpp:58:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (SNP_atChr_->size()>index) {
^
SNPinGenome.cpp: In member function ‘long int SNPinGenome::processPileUPLine(int&, char*, std::string&, int&, int, int&, int, GenomeCopyNumber*)’:
SNPinGenome.cpp:273:62: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
if (valueToReturn = strccnt(strs[4], '^')) {
^
SNPinGenome.cpp:287:62: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
if (valueToReturn = strccnt(strs[4], '^')) {
^
SNPinGenome.cpp:316:19: warning: unused variable ‘localBAF’ [-Wunused-variable]
float localBAF=addInfoFromAPileUp(atoi(strs[3]),minimalTotalLetterCountPerPosition,(SNP_atChr)[index].getNucleotideAt(positionCount),
^
SNPinGenome.cpp:338:27: warning: unused variable ‘localBAF’ [-Wunused-variable]
float localBAF=addInfoFromAPileUp(atoi(strs[3]),minimalTotalLetterCountPerPosition,(SNP_atChr)[index].getNucleotideAt(positionCount),
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o SNPatChr.o SNPatChr.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o SNPposition.o SNPposition.cpp
SNPposition.cpp: In constructor ‘SNPposition::SNPposition(int, char*)’:
SNPposition.cpp:31:18: warning: unused variable ‘strs_cnt’ [-Wunused-variable]
unsigned strs_cnt = split(alt, ',', strs);
^
SNPposition.cpp: In constructor ‘SNPposition::SNPposition(int, char*, const char*, const char*)’:
SNPposition.cpp:46:14: warning: unused variable ‘c_ref’ [-Wunused-variable]
char c_ref = ref[0];
^
SNPposition.cpp:47:14: warning: unused variable ‘reverse’ [-Wunused-variable]
bool reverse = strcmp(strand, "-") == 0;
^
SNPposition.cpp:52:18: warning: unused variable ‘strs_cnt’ [-Wunused-variable]
unsigned strs_cnt = split(letters, '/', strs);
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o binomialdistr.o binomialdistr.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ibetaf.o ibetaf.cpp
ibetaf.cpp: In function ‘double invincompletebeta(double, double, double)’:
ibetaf.cpp:258:9: warning: ‘dir’ may be used uninitialized in this function [-Wmaybe-uninitialized]
int dir;
^
ibetaf.cpp:433:25: warning: ‘rflg’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if( rflg==1 )
^
ibetaf.cpp:397:21: warning: ‘dithresh’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if( ap::fp_less(fabs(yp),dithresh) )
^
ibetaf.cpp:579:62: warning: ‘lgm’ may be used uninitialized in this function [-Wmaybe-uninitialized]
d = (aaa-1.0)*log(x)+(bbb-1.0)*log(1.0-x)+lgm;
^
ibetaf.cpp:546:54: warning: ‘x’ may be used uninitialized in this function [-Wmaybe-uninitialized]
yyy = incompletebeta(aaa, bbb, x);
^
ibetaf.cpp:548:17: warning: ‘yyy’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if( ap::fp_less(yyy,yl) )
^
ibetaf.cpp:396:30: warning: ‘y0’ may be used uninitialized in this function [-Wmaybe-uninitialized]
yp = (yyy-y0)/y0;
^
ibetaf.cpp:579:42: warning: ‘bbb’ may be used uninitialized in this function [-Wmaybe-uninitialized]
d = (aaa-1.0)*log(x)+(bbb-1.0)*log(1.0-x)+lgm;
^
ibetaf.cpp:579:25: warning: ‘aaa’ may be used uninitialized in this function [-Wmaybe-uninitialized]
d = (aaa-1.0)*log(x)+(bbb-1.0)*log(1.0-x)+lgm;
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ThreadPool.o ThreadPool.cpp
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o BAFpileup.o BAFpileup.cpp
BAFpileup.cpp: In member function ‘float BAFpileup::calculateFlankLength(const string&, const string&, const string&, std::string)’:
BAFpileup.cpp:54:20: warning: unused variable ‘t0’ [-Wunused-variable]
time_t t0 = time(NULL);
^
BAFpileup.cpp:57:25: warning: unused variable ‘matesOrientation’ [-Wunused-variable]
MateOrientation matesOrientation = getMateOrientation(matesOrientation_str);
^
BAFpileup.cpp: In member function ‘void BAFpileup::createBedFileWithChromosomeLengths(std::string, std::string)’:
BAFpileup.cpp:212:40: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (int i = 0; i < chr_names.size(); i++)
^
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o RSSerror.o RSSerror.cpp
In file included from RSSerror.h:13:0,
from RSSerror.cpp:1:
GenomeCopyNumber.h: In function ‘long double calculateRSS(GenomeCopyNumber&, int)’:
GenomeCopyNumber.h:134:29: error: ‘std::map<std::basic_string<char>, int> GenomeCopyNumber::chromosomesInd_’ is private
std::map<std::string, int> chromosomesInd_; //should stay private
^
RSSerror.cpp:15:28: error: within this context
for ( it=samplecopynumber.chromosomesInd_.begin() ; it != samplecopynumber.chromosomesInd_.end(); it++ ) {
^
In file included from RSSerror.h:13:0,
from RSSerror.cpp:1:
GenomeCopyNumber.h:134:29: error: ‘std::map<std::basic_string<char>, int> GenomeCopyNumber::chromosomesInd_’ is private
std::map<std::string, int> chromosomesInd_; //should stay private
^
RSSerror.cpp:15:77: error: within this context
for ( it=samplecopynumber.chromosomesInd_.begin() ; it != samplecopynumber.chromosomesInd_.end(); it++ ) {
^
In file included from RSSerror.h:13:0,
from RSSerror.cpp:1:
GenomeCopyNumber.h:133:32: error: ‘std::vector<ChrCopyNumber> GenomeCopyNumber::chrCopyNumber_’ is private
std::vector<ChrCopyNumber> chrCopyNumber_; //should stay private !!! why is it public now, Carino????
^
RSSerror.cpp:24:33: error: within this context
int length = samplecopynumber.chrCopyNumber_[index].getLength();
^
In file included from RSSerror.h:13:0,
from RSSerror.cpp:1:
GenomeCopyNumber.h:133:32: error: ‘std::vector<ChrCopyNumber> GenomeCopyNumber::chrCopyNumber_’ is private
std::vector<ChrCopyNumber> chrCopyNumber_; //should stay private !!! why is it public now, Carino????
^
RSSerror.cpp:29:45: error: within this context
observed = samplecopynumber.chrCopyNumber_[index].getRatioAtBin(i);
^
In file included from RSSerror.h:13:0,
from RSSerror.cpp:1:
GenomeCopyNumber.h:133:32: error: ‘std::vector<ChrCopyNumber> GenomeCopyNumber::chrCopyNumber_’ is private
std::vector<ChrCopyNumber> chrCopyNumber_; //should stay private !!! why is it public now, Carino????
^
RSSerror.cpp:31:38: error: within this context
if (samplecopynumber.chrCopyNumber_[index].isMedianCalculated()) {
^
In file included from RSSerror.h:13:0,
from RSSerror.cpp:1:
GenomeCopyNumber.h:133:32: error: ‘std::vector<ChrCopyNumber> GenomeCopyNumber::chrCopyNumber_’ is private
std::vector<ChrCopyNumber> chrCopyNumber_; //should stay private !!! why is it public now, Carino????
^
RSSerror.cpp:32:49: error: within this context
expected = samplecopynumber.chrCopyNumber_[index].getMedianProfileAtI(i);
^
In file included from RSSerror.h:13:0,
from RSSerror.cpp:1:
GenomeCopyNumber.h:133:32: error: ‘std::vector<ChrCopyNumber> GenomeCopyNumber::chrCopyNumber_’ is private
std::vector<ChrCopyNumber> chrCopyNumber_; //should stay private !!! why is it public now, Carino????
^
RSSerror.cpp:33:42: error: within this context
if (samplecopynumber.chrCopyNumber_[index].isSmoothed())
^
In file included from RSSerror.h:13:0,
from RSSerror.cpp:1:
GenomeCopyNumber.h:133:32: error: ‘std::vector<ChrCopyNumber> GenomeCopyNumber::chrCopyNumber_’ is private
std::vector<ChrCopyNumber> chrCopyNumber_; //should stay private !!! why is it public now, Carino????
^
RSSerror.cpp:34:53: error: within this context
expected = samplecopynumber.chrCopyNumber_[index].getSmoothedProfileAtI(i);
^
make[1]: *** [RSSerror.o] Error 1
make[1]: Leaving directory `/home/users/xu/FREEC-9.7/src'
make: *** [all] Error 2

update config demo

Many versions have been released since the demo was written. Please update the config files under "data".

is freec fit for HLA LOH detect?

Dear professor,
“Because the HLA locus is highly polymorphic, very few sequencing reads align well with the human reference genome, making it quite difficult to assess whether LOH has occurred.”
So is FREEC suitable for detecting LOH at the HLA locus?
Thanks a lot

FREEC was not able to extract reads

Hello,
I am trying to use FREEC on a paired (tumor/normal) WGS data set of Illumina paired-end BAM files. I keep getting the following error:

finished reading T62.sorted.dup.recal.bam
PROFILING [tid=46912512328960]:  T62.sorted.dup.recal.bam read in 8929 seconds [fillMyHash]
990013582 lines read..
0 reads used to compute copy number profile
Error: FREEC was not able to extract reads from T62.sorted.dup.recal.bam

Check your parameters: inputFormat and matesOrientation
Use "matesOrientation=0" if you have single end reads
Check the list of possible input formats at http://bioinfo-out.curie.fr/projects/freec/tutorial.html#CONFIG

If you use sorted SAM or BAM, please set "mateOrientation=0"; then FREEC will not try to detect pairs with normal orientation and insert size. Instead, it will keep all pairs from the input file

I have repeated the analysis with all possible input options for matesOrientation (FR, RF and 0) and get the same error. I am using BAM for my inputFormat and I am using Control-FREEC v9.3. Any help would be much appreciated.

Thanks
Arun
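
One possible reason for `0 reads used` is that the sequence names in the BAM header differ from the names listed in chrLenFile (for example `1` versus `chr1`). Whether that applies to this report is an assumption; nothing in the log confirms it. A minimal Python sketch of the check follows. The function names and the `name_column` parameter are made up for illustration, and the header text is whatever `samtools view -H` prints for the BAM file.

```python
def sam_sequence_names(header_text):
    """Return reference names from the @SQ lines of a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def len_file_names(len_text, name_column=0):
    """Return names from a chromosome-length table.

    Assumption: the name sits in column `name_column` (0 for a .fai-like
    table); adjust it if your length file is laid out differently.
    """
    names = []
    for line in len_text.splitlines():
        fields = line.split()
        if len(fields) > name_column:
            names.append(fields[name_column])
    return names


def names_missing_from_bam(len_text, header_text, name_column=0):
    """List length-file names that never occur in the BAM header."""
    bam_names = set(sam_sequence_names(header_text))
    return [n for n in len_file_names(len_text, name_column) if n not in bam_names]


# Toy data: the header uses "1", "2" while the length table uses "chr1", "chr2".
header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:2\tLN:800\n"
len_text = "chr1\t1000\nchr2\t800\n"
missing = names_missing_from_bam(len_text, header)
print(missing)
```

An empty list means every name in the length table also exists in the BAM header, so the cause would have to be looked for elsewhere.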

Config file

Hi, I am trying to use the FREEC software to find CNVs in WGS data. We don't have a control sample.

We aligned our data to an insect cell genome (Sf-9). However, no chromosomal locations are available for this genome.

What should I specify for chrLenFile in the file config_GC.txt?

I have a SAM and a BAM file from the alignment.

Please help me
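
For an assembly that consists of scaffolds rather than chromosomes, the sequence names and lengths can be read straight from the reference FASTA. The sketch below is a hedged illustration, not part of FREEC: `fasta_lengths` is a hypothetical helper, and the two-column `name<TAB>length` output is an assumption about what the length file should contain. Another report on this page passes a `.fai` index (as produced by `samtools faidx`) as chrLenFile, and the first two columns of that index are exactly name and length.

```python
def fasta_lengths(lines):
    """Yield (name, length) for every record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name = parts[0] if parts else ""  # first word of the header line
            length = 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


# Toy records; with a real genome, pass an open file handle instead of a list.
example = [">scaffold_1 some description", "ACGTAC", "GT", ">scaffold_2", "NNNNACGT"]
lengths = list(fasta_lengths(example))
for name, length in lengths:
    print(f"{name}\t{length}")
```

The names printed here have to match the sequence names used in the SAM/BAM file.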

"*ratio.txt"file question ?

Hello dear friend,

I don't know what a value of -1 in the Ratio column of the *ratio.txt file means.

My result looks like this:
[image]
My sample has a CNV on chr13, but I can't see it in the result graph.

I'm very much looking forward to your reply.
Thank you very much!
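
As far as I understand the manual's description of the `printNA` option, `-1` in the ratio file marks windows for which no value could be computed, so those rows carry no copy number signal. Treat that reading as an assumption. Under it, a small Python sketch for dropping such rows before plotting could look like this; the column names in the toy table are also assumed, so check them against the header of your own file.

```python
import csv
import io


def usable_windows(ratio_text):
    """Return the rows of a tab-separated ratio table whose Ratio is not -1."""
    reader = csv.DictReader(io.StringIO(ratio_text), delimiter="\t")
    return [row for row in reader if float(row["Ratio"]) != -1]


# Toy table with assumed column names.
example = (
    "Chromosome\tStart\tRatio\tMedianRatio\tCopyNumber\n"
    "13\t1\t-1\t-1\t2\n"
    "13\t50001\t1.48\t1.5\t3\n"
    "13\t100001\t0.98\t1.0\t2\n"
)
kept = usable_windows(example)
print(len(kept), [row["Start"] for row in kept])
```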

support for CRAM files through samtools

Hi,

Would it be possible to include support for CRAM compressed files through samtools?
If I'm not mistaken, this is already natively supported by the latest samtools/HTSlib.

Cheers
M

Exome-seq Copy number analysis

Hello,

Can you please confirm if we can run exome-seq data through FREEC without a paired normal? In the documentation you've mentioned that "Starting from Control-FREEC v10.6, Control-FREEC can work on exome-seq data without a control sample"

The tasks ran to completion but I also got the following warnings:
WARNING: You did not provide a control sample ('mateFile' or 'mateCopyNumberFile') but you are working with targeted sequencing data that has a capture bias.
WARNING: Will proceed without any normalization (on your own risk)!!
Warning : You did not provide a control sample for WES data. No normalization will be applied to read counts!

Thanks!

FREEC cannot open .vcf.gz file

Hi,

I have a WGS .BAM file that I want to interrogate for CNA and LOH.

My config file is pretty much set up with default settings, with files specified where necessary. I'm primarily using FREEC to generate the LOH analysis but I get an error when reading the SNP VCF file.

Config file:

chrLenFile = /home/HG38/Hg38_chromosomeLengthFile.txt
window = 50000
step = 3000
ploidy = 2
breakPointThreshold = 0.75
chrFiles = /home/HG38/chromosomes/
intercept = 1
outputDir = /home/FREEC-ANALYSIS
sex = XY
breakPointType = 2
coefficientOfVaration = 0.05

[sample]
mateFile = /HOME/WGS.mpileup.pileup.gz
inputFormat = pileup
mateOrientation = FR

[BAF]
SNPfile = /home/common_all_20180418.vcf
minimalCoveragePerPosition = 0
fastaFile = /home/HG38/hg38.fa

The programme runs fine, with no errors or warnings until

Starting to read common_all_20180418.vcf
Error: unable to open /home/common_all_20180418.vcf

At first the SNP file was a .vcf.gz, which resulted in the same error. I wondered whether the compression was the cause, which is why I then unzipped it.

I'm using FREEC v11.4.

Can you help?

Best,
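
A message like `unable to open` usually concerns the path or its permissions rather than the file content, although that is an assumption here. Before changing formats, it can help to confirm that every path in the config is readable by the user who runs FREEC. The Python sketch below does that with the standard `configparser` module. `PATH_KEYS` is a hand-picked list of keys that appear in configs on this page, and `unreadable_paths` is a hypothetical helper, not a FREEC feature.

```python
import configparser
import os
import tempfile

# Keys whose values are file or directory paths (hand-picked, not exhaustive).
PATH_KEYS = {"chrLenFile", "chrFiles", "mateFile", "SNPfile", "fastaFile", "captureRegions"}


def unreadable_paths(config_text):
    """Return (section, key, path) for every listed path that cannot be read."""
    parser = configparser.ConfigParser(
        strict=False, allow_no_value=True, interpolation=None
    )
    parser.optionxform = str  # keep the original upper/lower case of keys
    parser.read_string(config_text)
    problems = []
    for section in parser.sections():
        for key, value in parser.items(section):
            if key in PATH_KEYS and value and not os.access(value, os.R_OK):
                problems.append((section, key, value))
    return problems


# Demo: one path that exists (a temporary file) and one that does not.
with tempfile.NamedTemporaryFile(suffix=".pileup") as existing:
    demo = (
        "[sample]\n"
        f"mateFile = {existing.name}\n"
        "[BAF]\n"
        "SNPfile = /no/such/dir/snps.vcf\n"
    )
    problems = unreadable_paths(demo)
print(problems)
```

Note that `configparser` expects every key to sit under a section header such as `[general]`, so a config pasted without its first header needs that line restored before it is checked.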

Build warnings, potential bugs

When building FREEC with a modern toolchain, I see quite a few suspicious warnings:

https://gist.github.com/sambrightman/dbb4b3d3e260672b0ed57a89c8869c79

  • the first == instead of = looks like a clear bug
  • some of the later assignments in place of comparisons look intentional but useless (value not used) - perhaps good to avoid this in at least most places?
  • without knowing the code, some of the &&/|| precedence issues look okay but some look suspicious (probably best to clarify all of them with parentheses)
  • lots of unused variables clutter the output, which hinders identification of real issues.

For me the compiler is:

[sam@Sams-MacBook-Pro src ((776efec...) *%)]$ g++ --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin16.1.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

but recent versions of GCC should also warn about such things.

Warning: minimal recommended polynomial degree for "ReadCount ~ GC-content" is 3

I am getting the following warning:

..Polynomial degree for "ReadCount ~ GC-content" is 1
Warning: minimal recommended polynomial degree for "ReadCount ~ GC-content" is 3
Comment or remove the corresponding line in the config file to try both degree==3 and degree==4

However, the documentation states:

Default: 3&4 (GC-content based normalization, WGS) or 1 (control-read-count-based normalization, WES)

I set forceGCcontentNormalization to 1 as the documentation suggests for WES. That might be triggering the warning, since that is forcing GC-content normalization.

So what is the right combination to use for WES?

captureRegions in configure

Hello, thanks for developing FREEC.
I am confused about one part of the config description. I don't understand the meaning of
"chr 0-based start 1-based end" for captureRegions. I know what 0-based and 1-based mean individually, but I can't understand why you would mix them.
Looking forward to your reply.
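
The wording matches the usual BED convention: intervals are half-open, with a 0-based start and an exclusive end. An exclusive 0-based end is numerically the same as an inclusive 1-based end, which is probably why the description reads as a mix of both. A short Python illustration (the helper names are made up):

```python
def one_based_to_bed(start_1based, end_1based):
    """Convert a closed 1-based interval to BED (0-based start, exclusive end)."""
    return start_1based - 1, end_1based


def bed_length(start, end):
    """Number of bases covered by a BED interval."""
    return end - start


# Bases 101..200 (1-based, both ends included) cover 100 bases.
start, end = one_based_to_bed(101, 200)
print(start, end, bed_length(start, end))
```

So only the start coordinate changes when converting from 1-based inclusive coordinates, and the interval length is simply `end - start`.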

Cannot generate BAF files

Dear FREEC developer,

I have some issues with generating BAF files. I tested several combinations of input files. First, I provided a BAM file for the sample and a BAM file for the control, used the makePileup option in [BAF], and provided a VCF for SNPfile.
Second, I provided a BAM file for the sample, a pileup file for the control, and a VCF for SNPfile.
Third, I provided pileup files for both the sample and the control, and a VCF for SNPfile.
None of these gave me the BAF files. Did I miss something?

Also, the manual says that mateOrientation=0, RF, FR and FF correspond to single-end, Illumina mate-pair, Illumina paired-end and SOLiD mate-pair reads. However, the WGS example config file says to use 0 for sorted SAM and BAM. I am confused here. I am using sorted BAM files from Illumina paired-end sequencing as input. What should I put here?

Thank you very much!
Yu

readCountThreshold calculation

I am using Control-FREEC for exome data. I noticed that the ratio.txt files had a lot fewer regions than the target regions BED file. I decided to change the readCountThreshold from 50 to 25. That yielded a lot more regions in ratio.txt file. However, I am not sure why. Most of the regions that were missing had coverage above 50, even at the lowest point. Is that an error or am I just not understanding how readCountThreshold works? For example, does the read have to be entirely inside the window to count or if there are two adjacent windows and the read spans both, does it only get assigned to one?

explanations for makeGraph.R output

Hi,

It's great to provide such a script for visualization. Would you please provide some explanations, especially for BAF plot (different color lines, dots)?

Best regards,
Hongen

terminate called after throwing an instance of 'std::out_of_range' what(): basic_string::substr

Hi there,
I am running FREEC on WES data chromosome by chromosome. I get an error with chr2, but the other chromosomes are OK.
I guess there is a problem with the BED file, but I can't find the specific reason.
I hope you can help me.
I'm very much looking forward to your reply.
Thank you

log:

Control-FREEC v11.5 : a method for automatic detection of copy number alterations, subclones and for accurate estimation of contamination and main ploidy using deep-sequencing data
Multi-threading mode using 2 threads
..consider the sample being male
..Breakpoint threshold for segmentation of copy number profiles is 0.8
..telocenromeric set to 50000
..FREEC is not going to output normalized copy number profiles into a BedGraph file (for example, for visualization in the UCSC GB). Use "[general] BedGraphOutput=TRUE" if you want a BedGraph file
..FREEC is not going to adjust profiles for a possible contamination by normal cells
..Note, the Coefficient Of Variation won't be used since "window" = 0 was set
..Output directory: /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp
..Directory with files containing chromosome sequences: /export/database/WGS/database/b37/chrFiles
..Sample file: /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375LN.BQSR.2.bam.pileup
..Sample input format: pileup
..Control file: /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375N.BQSR.2.bam.pileup
..Input format for the control file: pileup
..forceGCcontentNormalization was set to 1: will use GC-content to normalize the read count data
..minimal expected GC-content (general parameter "minExpectedGC") was set to 0.35
..maximal expected GC-content (general parameter "maxExpectedGC") was set to 0.55
..Will use intercept==1 for the GC-content normalization and intercept==0 for the second normalization using the control
..Polynomial degree for "ReadCount ~ GC-content" is 1
Warning: minimal recommended polynomial degree for "ReadCount ~ GC-content" is 3
Comment or remove the corresponding line in the config file to try both degree==3 and degree==4
..Minimal CNA length (in windows) is 3
..File with chromosome lengths: /export/database/WGS/database/Control_freec/b37/human_g1k_v37_decoy.fasta.chr2.length
..File /export/database/WGS/database/Control_freec/b37/human_g1k_v37_decoy.fasta.chr2.length was read
..Using the minimal mappability of: 0.85
..uniqueMatch = FALSE
..average ploidy set to 2
..break-point type set to 2
..noisyData set to 1
..minimal number of reads per window in the control sample is set to 50
..Control-FREEC will not look for subclones
..will use SNP positions from /export/database/WGS/database/Control_freec/hg19_snp142.SingleDiNucl.1based.txt to calculate BAF profiles
..Starting reading /export/database/WGS/database/Control_freec/hg19_snp142.SingleDiNucl.1based.txt to get SNP positions
..read 101778434 SNP positions
PROFILING [tid=140516515325760]: /export/database/WGS/database/Control_freec/hg19_snp142.SingleDiNucl.1based.txt read in 36 seconds [readSNPs]
avoid double pileup read: reading sample matefile
avoid double pileup read: reading control matefile
..File /export/database/WGS/database/Control_freec/b37/human_g1k_v37_decoy.fasta.chr2.length was read
..Reading /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/bedDir/freec.exonm.rmdup.2.bed
..Your file must be in .BED format, and it must be sorted
Number of exons analysed in chromosome 2 : 15623
Average exon length in chromosome 2 : 182.577
..use "pileup" format of reads to calculate BAF profiles
..Starting reading /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375LN.BQSR.2.bam.pileup to calculate BAF profiles
..File /export/database/WGS/database/Control_freec/b37/human_g1k_v37_decoy.fasta.chr2.length was read
..Reading /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/bedDir/freec.exonm.rmdup.2.bed
..Your file must be in .BED format, and it must be sorted
Number of exons analysed in chromosome 2 : 15623
Average exon length in chromosome 2 : 182.577
..use "pileup" format of reads to calculate BAF profiles
..Starting reading /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375N.BQSR.2.bam.pileup to calculate BAF profiles
6181476 reads used to compute copy number profile
26614257 lines read
PROFILING [tid=140516403517184]: /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375LN.BQSR.2.bam.pileup read in 232 seconds [assignValues]
11524929 reads used to compute copy number profile
42612615 lines read
PROFILING [tid=140516395124480]: /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375N.BQSR.2.bam.pileup read in 315 seconds [assignValues]
terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr
printing counts into /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375LN.BQSR.2.bam.pileup_sample.cpn
printing counts into /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375N.BQSR.2.bam.pileup_control.cpn
..FREEC will take into account only regions from /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/bedDir/freec.exonm.rmdup.2.bed
..using GC-content to normalize copy number profiles

config:

[general]
breakPointThreshold = 0.8
breakPointType = 2
bedtools=/export/software/conda/miniconda3/bin/bedtools
samtools=/export/software/conda/miniconda3/bin/samtools
chrFiles=/export/database/WGS/database/b37/chrFiles
chrLenFile=/export/database/WGS/database/Control_freec/b37/human_g1k_v37_decoy.fasta.chr2.length
coefficientOfVariation = 0.05
ploidy=2
outputDir=/export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp
sex=XY
forceGCcontentNormalization=1
degree=1
intercept=0
minCNAlength=3
noisyData=TRUE
printNA=FALSE
readCountThreshold=50
minMappabilityPerWindow=0.85
minimalSubclonePresence = 100
maxThreads=2
telocentromeric = 50000
window=0

[sample]
mateFile = /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375LN.BQSR.2.bam.pileup
inputFormat = pileup
mateOrientation = FR

[control]
mateFile = /export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/S-375LN_S-375N.temp/S-375N.BQSR.2.bam.pileup
inputFormat = pileup
mateOrientation = FR

[BAF]
SNPfile=/export/database/WGS/database/Control_freec/hg19_snp142.SingleDiNucl.1based.txt
minimalCoveragePerPosition=0
minimalQualityPerPosition=0
shiftInQuality=33

[target]

captureRegions=/export/project/0.NGS/BJXWZ-201812011/output_group2/7.FreecOutDir/bedDir/freec.exonm.rmdup.2.bed

bed

freec.exonm.rmdup.2.bed.txt
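
`std::out_of_range` from `basic_string::substr` means that some code asked for a substring starting beyond the end of a string. Which input line triggers it is not visible in the log. Since the failure is specific to chr2 and the log insists on a sorted BED file, one cheap step is a line-by-line sanity check of the capture regions. The Python below is only a sketch of such a check. The rules it applies are generic BED expectations, not a description of what FREEC itself validates.

```python
def bed_problems(lines):
    """Return (line_number, message) for lines that break basic BED rules."""
    problems = []
    previous = None
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append((number, "fewer than 3 tab-separated columns"))
            continue
        chrom, start_s, end_s = fields[:3]
        if not (start_s.isdigit() and end_s.isdigit()):
            problems.append((number, "start/end are not non-negative integers"))
            continue
        start, end = int(start_s), int(end_s)
        if start >= end:
            problems.append((number, "start is not smaller than end"))
        if previous and previous[0] == chrom and (start, end) < previous[1:]:
            problems.append((number, "not sorted by start within chromosome"))
        previous = (chrom, start, end)
    return problems


# Toy input: line 1 is fine, lines 2 to 5 each break one rule.
example = ["2\t100\t200", "2\t150", "2\t90\t95", "2\t300\t300", "2 400 500"]
problems = bed_problems(example)
for number, message in problems:
    print(number, message)
```

The same idea can be applied to the pileup files, since the log does not show whether the BED file or a pileup line is at fault.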

window=0 warning

I don't understand this warning:

     if (WESanalysis==false && window!=0) {
         cerr << "Warning: we recommend setting \"window=0\" for exome sequencing data\n";
     }

I understand from the documentation that window=0 is supposed to be used for WES and read counts are then performed per exon. Is the first part of conditional the wrong way around (should be WESanalysis==true)? Inverting this would make the warning redundant though, as it's not currently possible for the WESanalysis to be true with window!=0:

     bool WESanalysis = false;
     if (ifTargeted && window == 0)   {
         WESanalysis = true;
     }

I notice this because I'm getting the warning all the time without using targets (WGS with window 1000).
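
The quoted logic can be tabulated. The Python below is only a model of the two C++ snippets quoted above, not FREEC code. `warns_targeted_only` is a hypothetical alternative condition that would warn only when capture regions are given together with a non-zero window. It is not a patch endorsed by the maintainers.

```python
def wes_analysis(if_targeted, window):
    """Model of the quoted WESanalysis assignment."""
    return if_targeted and window == 0


def warns_as_written(if_targeted, window):
    """Model of the quoted condition: WESanalysis==false && window!=0."""
    return (not wes_analysis(if_targeted, window)) and window != 0


def warns_targeted_only(if_targeted, window):
    """Hypothetical alternative: warn only for targeted data with window!=0."""
    return if_targeted and window != 0


cases = [(False, 1000), (False, 0), (True, 1000), (True, 0)]
table = [
    (t, w, warns_as_written(t, w), warns_targeted_only(t, w)) for t, w in cases
]
for row in table:
    print(row)
```

In this model the first case, WGS with window 1000, triggers the warning as written but not under the alternative, which matches the behaviour described in the report.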

Error on duplicated target regions

Dear Valentina,

when running FREEC (v11.4), I currently get an error because of potentially duplicated target regions.
Here is what part of the output looks like:

..Starting reading bam/***.bam
..sambamba should be installed to be able to read BAM files; will use the following command for sambamba: sambamba view -t 30 bam/***.bam
..finished reading bam/***.bam
PROFILING [tid=140085019785024]: bam/***.bam read in 3309 seconds [fillMyHash]
77240649 lines read..
57024716 reads used to compute copy number profile
printing counts into cnvs/***.bam_control.cpn
..Will not consider chrY..
..Erased chrY from the list of chromosomes
..FREEC will take into account only regions from agilent_SureSelect/S07604624_Covered.hg38_whitelist_unique.bed
..using GC-content to normalize copy number profiles
Error: your BED file with coordinates of targeted regions may contain duplicates
Check chromosome 1

I already tried to resolve duplicated entries by running

sort -k1,1 -k2,2n -k3,3n -u 

on the target regions bed file, but the error persists. How does FREEC determine a unique set of target regions? Is the strand important as well?

Best and thanks,
Jens

P.S.: Here is the configuration I used:

[general]
minCNAlength = 3
printNA = FALSE
maxThreads = 30
minimalSubclonePresence = 100
noisyData = TRUE
BedGraphOutput = TRUE
chrFiles = /mnt/flatfiles/organisms/human/hg38_GRCh38/chromosomes/
sambamba = sambamba
breakPointThreshold = 1.2
ploidy = 2
sex = XX
breakPointType = 4
window = 0
chrLenFile = chrNameLength.txt
readCountThreshold = 50
outputDir = cnvs
[sample]
inputFormat = BAM
mateOrientation = FR
mateFile = bam/***.bam
[control]
inputFormat = BAM
mateOrientation = 0
mateFile = bam/***.bam
[BAF]
fastaFile = /mnt/flatfiles/organisms/human/hg38_GRCh38/indices/star_GRCh38_gencode.v27/Homo_sapiens.GRCh38.27.dna_sm.toplevel.fa
[target]
captureRegions = agilent_SureSelect/S07604624_Covered.hg38_whitelist_unique.bed
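
`sort -u` with those keys only removes lines whose chromosome, start and end are all identical. Regions that overlap, or that share a start but differ in end, survive it. Whether FREEC's duplicate check reacts to such regions is an assumption, but merging them is a cheap experiment. The Python sketch below merges overlapping and directly adjacent intervals, similar in spirit to what `bedtools merge` does by default. `merge_intervals` is a hypothetical helper name.

```python
def merge_intervals(intervals):
    """Merge overlapping or book-ended (chrom, start, end) intervals."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(item) for item in merged]


# Toy regions: two share a start, one overlaps the result of merging them.
regions = [
    ("chr1", 100, 200),
    ("chr1", 100, 250),
    ("chr1", 240, 300),
    ("chr1", 400, 500),
    ("chr2", 100, 200),
]
merged_regions = merge_intervals(regions)
print(merged_regions)
```

Strand is ignored in this sketch, so regions on opposite strands that overlap are merged as well.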

compilation error in Mac OSX

I tried to install FREEC-10.2 on Mac OS X but it fails with the following error. Could you help me with this issue?

/Applications/Xcode.app/Contents/Developer/usr/bin/make -f Makefile.freec init
make[1]: Nothing to be done for `init'.
/Applications/Xcode.app/Contents/Developer/usr/bin/make -f Makefile.freec depend
/Applications/Xcode.app/Contents/Developer/usr/bin/make -f Makefile.freec all
c++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o main.o main.cpp
c++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o ConfigFile.o ConfigFile.cpp
c++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o Chameleon.o Chameleon.cpp
c++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o GenomeDensity.o GenomeDensity.cpp
c++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o Help.o Help.cpp
c++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o myFunc.o myFunc.cpp
myFunc.cpp:54:12: warning: using the result of an assignment as a condition
without parentheses [-Wparentheses]
for (; c = *str++; ++jj) {
~~^~~~~~~~
myFunc.cpp:54:12: note: place parentheses around the assignment to silence this
warning
for (; c = *str++; ++jj) {
^
( )
myFunc.cpp:54:12: note: use '==' to turn this assignment into an equality
comparison
for (; c = *str++; ++jj) {
^
==
myFunc.cpp:281:13: error: use of undeclared identifier '_popen'; did you mean
'popen'?
_popen(command.c_str(), "r");
^~~~~~
popen
/usr/include/stdio.h:325:7: note: 'popen' declared here
FILE *popen(const char *, const char *) __DARWIN_ALIAS_STARTING(__MAC...
^
myFunc.cpp:289:5: error: use of undeclared identifier '_pclose'; did you mean
'pclose'?
_pclose(stream);
^~~~~~~
pclose
/usr/include/stdio.h:321:6: note: 'pclose' declared here
int pclose(FILE *) __swift_unavailable_on("Use posix_spawn APIs or ...
^
myFunc.cpp:309:5: error: use of undeclared identifier '_popen'; did you mean
'popen'?
_popen(command.c_str(), "r");
^~~~~~
popen
/usr/include/stdio.h:325:7: note: 'popen' declared here
FILE *popen(const char *, const char *) __DARWIN_ALIAS_STARTING(__MAC...
^
myFunc.cpp:318:5: error: use of undeclared identifier '_pclose'; did you mean
'pclose'?
_pclose(stream);
^~~~~~~
pclose
/usr/include/stdio.h:321:6: note: 'pclose' declared here
int pclose(FILE *) __swift_unavailable_on("Use posix_spawn APIs or ...
^
myFunc.cpp:357:5: error: use of undeclared identifier '_popen'; did you mean
'popen'?
_popen(command.c_str(), "r");
^~~~~~
popen
/usr/include/stdio.h:325:7: note: 'popen' declared here
FILE *popen(const char *, const char *) __DARWIN_ALIAS_STARTING(__MAC...
^
myFunc.cpp:370:26: warning: using the result of an assignment as a condition
without parentheses [-Wparentheses]
if (toadd=strccnt(strs[4].c_str(), '^')) {
~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
myFunc.cpp:370:26: note: place parentheses around the assignment to silence this
warning
if (toadd=strccnt(strs[4].c_str(), '^')) {
^
( )
myFunc.cpp:370:26: note: use '==' to turn this assignment into an equality
comparison
if (toadd=strccnt(strs[4].c_str(), '^')) {
^
==
myFunc.cpp:377:5: error: use of undeclared identifier '_pclose'; did you mean
'pclose'?
_pclose(stream);
^~~~~~~
pclose
/usr/include/stdio.h:321:6: note: 'pclose' declared here
int pclose(FILE *) __swift_unavailable_on("Use posix_spawn APIs or ...
^
myFunc.cpp:393:26: warning: using the result of an assignment as a condition
without parentheses [-Wparentheses]
if (toadd=strccnt(strs[4].c_str(), '^')) {
~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
myFunc.cpp:393:26: note: place parentheses around the assignment to silence this
warning
if (toadd=strccnt(strs[4].c_str(), '^')) {
^
( )
myFunc.cpp:393:26: note: use '==' to turn this assignment into an equality
comparison
if (toadd=strccnt(strs[4].c_str(), '^')) {
^
==
3 warnings and 6 errors generated.
make[1]: *** [myFunc.o] Error 1
make: *** [all] Error 2

error: calculatedMedianAround

WGS 60X paired-end data.
I use tumor and normal pileup.gz files to run FREEC.
I got this error: calculateMedianAround(). What does it mean?

Thanks a lot.

amplicon data

Hello, I am trying to use the FREEC software to find CNVs in amplicon sequencing data. I used pileup files as mateFile for both the control and the sample, but I get an error when it calculates the BAF profiles:

..use "pileup" format of reads to calculate BAF profiles
..Starting reading /home/ywliao/project/Gengyan/mpileup/pileup/spileup/wbcpileup/health1.pileup to calculate BAF profiles
Segmentation fault (core dumped)

Please help me

Warning: control length is not equal to the sample length for chromosome M

Dear staff at Boevalab,
I ran into trouble when running FREEC to analyse my WGS data. The problem is:

"Warning: control length is not equal to the sample length for chromosome M
Segmentation fault (core dumped)"

I checked the output (cpn) after the failure; the chrM segment in the sample is similar to that in the control. I am confused and don't know how to solve it. Looking forward to your reply!

Best wishes,
moshl
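
If the warning refers to the number of windows per chromosome, which is an assumption about its meaning, the two .cpn files can be compared directly. The Python sketch below counts rows per chromosome in each file and reports the chromosomes whose counts differ. It assumes only that the first tab-separated column of a .cpn file is the chromosome name. The toy rows are invented.

```python
from collections import Counter


def rows_per_chromosome(text):
    """Count rows per value of the first tab-separated column."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split("\t")
        if fields and fields[0]:
            counts[fields[0]] += 1
    return counts


def differing_chromosomes(sample_text, control_text):
    """Map chromosome -> (sample rows, control rows) where the counts differ."""
    sample = rows_per_chromosome(sample_text)
    control = rows_per_chromosome(control_text)
    return {
        chrom: (sample.get(chrom, 0), control.get(chrom, 0))
        for chrom in sorted(set(sample) | set(control))
        if sample.get(chrom, 0) != control.get(chrom, 0)
    }


# Invented rows: chromosome M has one window in the sample and two in the control.
sample_cpn = "1\t0\t35\n1\t1000\t40\nM\t0\t900\n"
control_cpn = "1\t0\t30\n1\t1000\t41\nM\t0\t800\nM\t1000\t750\n"
difference = differing_chromosomes(sample_cpn, control_cpn)
print(difference)
```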

bad alloc when running with GC correction

I'm running FREECv9.7b on an exome sequencing paired sample and getting a failure after the chromosome.lengths file is read.
The end of the log file is below.
Number of exons analysed in chromosome Y : 1321
..Starting reading /exports/igmm/eddie/TCGA_exome/WOLF/pileups/SOL1648_1.recal.pileup.gz
..finished reading /exports/igmm/eddie/TCGA_exome/WOLF/pileups/SOL1648_1.recal.pileup.gz
PROFILING [tid=47795198732096]: /exports/igmm/eddie/TCGA_exome/WOLF/pileups/SOL1648_1.recal.pileup.gz read in 453 seconds [fillMyHash]
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
49634942 lines read..
8281392 reads used to compute copy number profile
printing counts into ./freecout.GC1/tumour_normal/SOL1648_1.recal.pileup.gz_sample.cpn
..File /exports/igmm/eddie/TCGA_exome/WOLF/Homo_sapiens_assembly19.1.lengths was read

I'm attaching the config file, the output cpn file, the freec output (freec.log) and the file of chromosome lengths in this zip.
for_val2.zip

Empty output file ".bam_CNVS"

Hi Dr. Boeva,

I encountered some difficulties while finding CNVs using Control-FREEC. Running FREEC did not generate any error message; however, the output file ".bam_CNVs" was empty. I have already tested my data with other software and can confirm that there are plenty of CNVs in my data. So I'm wondering why this happened and hope you may be able to give me some help.

Here is my config file:

[general]
chrLenFile = /public/home/kai/test_kai/NGS_cnv/freec_test/hg19.len.txt
window = 3000
step = 1000
ploidy = 2
outputDir = /public/home/kai/test_kai/NGS_cnv/freec_test/test3/output
samtools = /public/home/kai/softwares/samtools/samtools

[sample]
mateFile = /public/home/kai/test_kai/NGS_cnv/cnvkit_test/testdata/tumour1.bam
inputFormat = BAM
matesOrientation = 0

[control]
mateFile = /public/home/kai/test_kai/NGS_cnv/cnvkit_test/testdata/control.bam
inputFormat = BAM
matesOrientation = 0

[target]
captureRegions = /public/home/kai/test_kai/NGS_cnv/cnvkit_test/testdata/YJ.bed

compile error in 9.7

Hi there,

I compiled 9.6 without any errors, but when I "make" 9.7 I get:
...
BAFpileup.cpp: In member function ‘void BAFpileup::createBedFileWithChromosomeLengths(std::string, std::string)’:
BAFpileup.cpp:212: warning: comparison between signed and unsigned integer expressions
g++ -O3 -g -DPROFILE_TRACE -Wall -m64 -c -o RSSerror.o RSSerror.cpp
GenomeCopyNumber.h: In function ‘long double calculateRSS(GenomeCopyNumber&, int)’:
GenomeCopyNumber.h:134: error: ‘std::map<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<const std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > GenomeCopyNumber::chromosomesInd_’ is private
RSSerror.cpp:15: error: within this context
GenomeCopyNumber.h:134: error: ‘std::map<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, std::less<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<const std::basic_string<char, std::char_traits<char>, std::allocator<char> >, int> > > GenomeCopyNumber::chromosomesInd_’ is private
RSSerror.cpp:15: error: within this context
GenomeCopyNumber.h:133: error: ‘std::vector<ChrCopyNumber, std::allocator<ChrCopyNumber> > GenomeCopyNumber::chrCopyNumber_’ is private
RSSerror.cpp:24: error: within this context
GenomeCopyNumber.h:133: error: ‘std::vector<ChrCopyNumber, std::allocator<ChrCopyNumber> > GenomeCopyNumber::chrCopyNumber_’ is private
RSSerror.cpp:29: error: within this context
GenomeCopyNumber.h:133: error: ‘std::vector<ChrCopyNumber, std::allocator<ChrCopyNumber> > GenomeCopyNumber::chrCopyNumber_’ is private
RSSerror.cpp:31: error: within this context
GenomeCopyNumber.h:133: error: ‘std::vector<ChrCopyNumber, std::allocator<ChrCopyNumber> > GenomeCopyNumber::chrCopyNumber_’ is private
RSSerror.cpp:32: error: within this context
GenomeCopyNumber.h:133: error: ‘std::vector<ChrCopyNumber, std::allocator<ChrCopyNumber> > GenomeCopyNumber::chrCopyNumber_’ is private
RSSerror.cpp:33: error: within this context
GenomeCopyNumber.h:133: error: ‘std::vector<ChrCopyNumber, std::allocator<ChrCopyNumber> > GenomeCopyNumber::chrCopyNumber_’ is private
RSSerror.cpp:34: error: within this context
make[1]: *** [RSSerror.o] Error 1
make[1]: Leaving directory `/home/rcorbett/bin/FREEC-9.7/src'
make: *** [all] Error 2

No big deal as I'm just testing this out and 9.6 should work for me, just flagged for you in case it helps.

The exact meaning of the "telocentromeric=" option in the configuration file

Dear Developer(s),

I would like to ask what is the exact meaning of the "telocentromeric=" option in the configuration file? According to this manual (http://boevalab.com/FREEC/tutorial.html#CONFIG), this is "the length of pre-telomeric and pre-centromeric regions". Do you mean the length of transition regions flanking telomeres and centromeres? I am asking this because I want to determine a sensible value for my specific organism (non-human).

Also, two suggestions for potential future enhancements:

  1. As this thread (#1) suggested, it would be great if FREEC could read a multi-FASTA reference genome directly and calculate chromosome lengths itself, without requiring extra work from the users.
  2. It would also be great if FREEC could take arguments directly via command-line options, in addition to the current implementation that requires a separate configuration file.

Thanks in advance!

Best,
Jia-Xing

Status Column not present in _CNVs file

Hello,

I have used the following config file:


[general]
chrLenFile = /media/prodriguez/disk1/Databases/genomes/Hsapiens/hg19/seq/chromosomes/hg19.fa.fai
window = 0
ploidy = 2,3
outputDir = /media/prodriguez/disk1/data/prodriguez/Projects/RISC3CAT_cfDNA/tests/test_FREEC

#sex=XY
breakPointType=4
chrFiles =  /media/prodriguez/disk1/Databases/genomes/Hsapiens/hg19/seq/chromosomes

maxThreads=6

breakPointThreshold=1.2
noisyData=TRUE
printNA=FALSE

readCountThreshold=50

[sample]

mateFile = /media/prodriguez/disk1/data/prodriguez/Projects/RISC3CAT_cfDNA/Analysis/615/work/align/615_tumor/615_tumor-sort.bam
inputFormat = bam
mateOrientation = FR

[control]

mateFile = /media/prodriguez/disk1/data/prodriguez/Projects/RISC3CAT_cfDNA/Analysis/615/work/align/615_normal/615_normal-sort.bam
inputFormat = bam
mateOrientation = FR

[BAF]

makePileup = /media/prodriguez/disk1/data/prodriguez/Projects/RISC3CAT_cfDNA/Analysis/615/final/2018-07-12_615/615-germline-ensemble-annotated.vcf.gz
minimalCoveragePerPosition = 5
fastaFile = /media/prodriguez/disk1/Databases/genomes/Hsapiens/hg19/seq/hg19.fa

[target]

captureRegions = /media/prodriguez/disk1/data/prodriguez/Projects/RISC3CAT_cfDNA/Analysis/615/work/bedprep/cleaned-MedExome_hg19_empirical_targets.Plus75.intervals_pluschr.bed

As I have used the BAF and control options, I expect, according to the documentation, to find the Status and CNA/LOH columns in the _CNVs file. However, these columns are missing.

Any idea of what is happening?

Thank you in advance,

Pau.

Difference between pre-made pileup and mini-pileups

Our pipeline already spends quite some time producing pileups for VarScan and we also wanted to enable FREEC BAF. I decided to use the pre-existing pileups with inputFormat=pileup and SNPfile instead of re-producing pileups with makePileup. This has the nice side-effect of making FREEC substantially faster as well.

However, I notice quite a discrepancy in operation between the two methods. The main difference is that using SNPfile automatically switches FREEC to GC-content normalisation and degree 3/4 in WGS sample-control mode. A few questions:

  • Is this intentional?
  • Why do the two methods differ in their defaults?
  • Am I correct in thinking that it's not possible to disable GC-content normalisation using pre-prepared pileups unless also using targets? (i.e. if using targets you can set degree=1 and forceGC=0 to get the normal sample-control behaviour but without targets you can only control degree).
  • Did I miss some part of the documentation about these differences or does it need updating?

Also, #14 probably needs modification as the asymmetry wasn't clear to me at the time.

basic_string::substr: __pos (which is 139) > this->size() (which is 136)

I am using a new targets BED file and am getting a core dump with all my files. These are the last few lines of the output:

PROFILING [tid=46912502222144]: /path/normal.bam read in 3838 seconds [fillMyHash]
terminate called after throwing an instance of 'std::out_of_range'
  what():  basic_string::substr: __pos (which is 139) > this->size() (which is 136)
145842385 lines read..
69747417 reads used to compute copy number profile
..using GC-content to normalize copy number profiles

Do you know what it may be referring to?

I am guessing there is something wrong with that particular BED file. It looks like something is 139 instead of 136, but I don't know what that could be.
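
One way to test that guess is to scan the BED file for lines that do not look like plain tab-delimited BED. This is a minimal sketch and not a diagnosis of FREEC's own parsing; the function name `check_bed_lines` and the specific checks (at least three tab-separated fields, integer coordinates, start below end, no Windows line endings) are assumptions about what might trip a parser.

```python
def check_bed_lines(lines):
    """Yield (line_number, problem) for lines that look malformed."""
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        # Skip blank lines and the usual BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if raw.endswith("\r\n"):
            yield number, "Windows line ending"
        fields = line.split("\t")
        if len(fields) < 3:
            yield number, f"expected at least 3 tab-separated fields, got {len(fields)}"
            continue
        start, end = fields[1], fields[2]
        if not (start.isdigit() and end.isdigit()):
            yield number, "start/end are not non-negative integers"
        elif int(start) >= int(end):
            yield number, "start is not smaller than end"


if __name__ == "__main__":
    sample = ["chr1\t10\t20\n", "chr1 10 20\n", "chr2\t30\t5\n"]
    for number, problem in check_bed_lines(sample):
        print(f"line {number}: {problem}")
```

When reading a real file, open it with `open(path, newline="")` so that Python does not silently convert `\r\n` to `\n` before the check sees it.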

Install error

Hi,

I downloaded the archive from GitHub, decompressed it on Windows, and then uploaded the files to Linux. When I type "make" on the command line, many errors occur.
For example:

'''
make[1]: Entering directory `/stor9000/apps/users/NWSUAF/2017050306/biosoft/FREEC/src'
make[1]: Nothing to be done for `init'.
make[1]: Leaving directory `/stor9000/apps/users/NWSUAF/2017050306/biosoft/FREEC/src'
......
make[1]: *** [depend] Error 1
make[1]: Leaving directory `/stor9000/apps/users/NWSUAF/2017050306/biosoft/FREEC/src'
make: *** [all] Error 2
'''

Is something wrong with the package, or did I make a mistake somewhere?

Thanks!
Install error.txt

FREEC was not able to extract reads

Dear FREEC support,

My data is not from human.

My config file is:
###For more options see: http://boevalab.com/FREEC/tutorial.html#CONFIG ###

[general]

##parameters chrLenFile and ploidy are required.

chrLenFile = /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/02.ref-config/ref.fa.fai
ploidy = 2

##Parameter "breakPointThreshold" specifies the maximal slope of the slope of residual sum of squares.
##This should be a positive value. The closer it is to Zero, the more breakpoints will be called. Its recommended value is between 0.01 and 0.08.

breakPointThreshold = .8

##Either coefficientOfVariation or window must be specified for whole genome sequencing data. Set window=0 for exome sequencing data.

#coefficientOfVariation = 0.01
window = 50000
#step=10000

##Either chrFiles or GCcontentProfile must be specified too if no control dataset is available.
##If you provide a path to chromosome files, Control-FREEC will look for the following fasta files in your directory (in this order):
##1, 1.fa, 1.fasta, chr1.fa, chr1.fasta; 2, 2.fa, etc.

Please ensure that you don't have other files but sequences having the listed names in this directory.

chrFiles = /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/02.ref-config/
#GCcontentProfile = test/GC_profile_50kb.cnp

##if you are working with something non-human, we may need to modify these parameters:
#minExpectedGC = 0.35
#maxExpectedGC = 0.55

#readCountThreshold=10

numberOfProcesses = 4
outputDir = /mnt/ilustre/users/long.huang/freec/
#contaminationAdjustment = TRUE
#contamination = 0.4
#minMappabilityPerWindow = 0.95

##If the parameter gemMappabilityFile is not specified, then the fraction of non-N nucleotides per window is used as Mappability.

#gemMappabilityFile = /GEM_mappability/out76.gem

#breakPointType = 4
#forceGCcontentNormalization = 0
#sex=XY

##set BedGraphOutput=TRUE if you want to create a BedGraph track for visualization in the UCSC genome browser:
#BedGraphOutput=TRUE

[sample]

mateFile = /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/04.bam-sort/Red.sort.bam
inputFormat = BAM
matesOrientation=0

##use "mateOrientation=0" for sorted .SAM and .BAM

[control]
#mateFile = /path/control.pileup.gz
#mateCopyNumberFile = path/control.cpn
#inputFormat = pileup

#mateOrientation = RF

#[BAF]

##use the following options to calculate B allele frequency profiles and genotype status. This option can only be used if "inputFormat=pileup"

#SNPfile = /bioinfo/users/vboeva/Desktop/annotations/hg19_snp131.SingleDiNucl.1based.txt
#minimalCoveragePerPosition = 5

##use "minimalQualityPerPosition" and "shiftInQuality" to consider only high quality position in calculation of allelic frequencies (this option significantly slows down re

#minimalQualityPerPosition = 5
#shiftInQuality = 33

[target]

##use a tab-delimited .BED file to specify capture regions (control dataset is needed to use this option):

#captureRegions = /bioinfo/users/vboeva/Desktop/testChr19/capture.bed

My error is:
Control-FREEC v11.3 : a method for automatic detection of copy number alterations, subclones and for accurate estimation of contamination and main ploidy using deep-sequencing data
Non Multi-threading mode
..Breakpoint threshold for segmentation of copy number profiles is 0.8
..telocenromeric set to 50000
..FREEC is not going to output normalized copy number profiles into a BedGraph file (for example, for visualization in the UCSC GB). Use "[general] BedGraphOutput=TRUE" if you want a BedGraph file
..FREEC is not going to adjust profiles for a possible contamination by normal cells
..Window = 50000 was set
..Output directory: /mnt/ilustre/users/long.huang/freec/
..Directory with files containing chromosome sequences: /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/02.ref-config/
..Sample file: /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/04.bam-sort/Red.sort.bam
..Sample input format: BAM
..will use this instance of samtools: 'samtools' to read BAM files
..minimal expected GC-content (general parameter "minExpectedGC") was set to 0.35
..maximal expected GC-content (general parameter "maxExpectedGC") was set to 0.55
..Polynomial degree for "ReadCount ~ GC-content" normalization is 3 or 4: will try both
..Minimal CNA length (in windows) is 1
..File with chromosome lengths: /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/02.ref-config/ref.fa.fai
..Using the default minimal mappability value of 0.85
..uniqueMatch = FALSE
..average ploidy set to 2
..break-point type set to 2
..noisyData set to 0
..Control-FREEC will not look for subclones
..File /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/02.ref-config/ref.fa.fai was read
..Starting reading /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/04.bam-sort/Red.sort.bam
..samtools should be installed to be able to read BAM files; will use the following command for samtools: samtools view /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/04.bam-sort/Red.sort.bam
..finished reading /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/04.bam-sort/Red.sort.bam
PROFILING [tid=47383814016416]: /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/04.bam-sort/Red.sort.bam read in 835 seconds [fillMyHash]
124465060 lines read..
0 reads used to compute copy number profile

Error: FREEC was not able to extract reads from /mnt/ilustre/users/minghao.zhang/newmdt/Project/MJ20180427034_zhujingle/variant_20180531/04.bam-sort/Red.sort.bam

Check your parameters: inputFormat and matesOrientation
Use "matesOrientation=0" if you have single end reads
Check the list of possible input formats at http://bioinfo-out.curie.fr/projects/freec/tutorial.html#CONFIG

My BAM file is paired-end and was sorted with samtools.

Thank you.
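
One thing that can be checked independently of FREEC is whether the sequence names in the `chrLenFile` match those in the BAM header. Whether a mismatch is what causes "0 reads used" here is an assumption, not something confirmed in this thread. A minimal sketch: the header text is what `samtools view -H Red.sort.bam` prints, and the helper names are hypothetical.

```python
def names_from_fai(fai_text):
    """Sequence names are the first tab-separated column of a .fai index."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def names_from_sam_header(header_text):
    """Collect the SN: values of all @SQ lines in a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def compare(fai_names, bam_names):
    """Return (names only in the .fai, names only in the BAM header)."""
    fai_set, bam_set = set(fai_names), set(bam_names)
    return sorted(fai_set - bam_set), sorted(bam_set - fai_set)


if __name__ == "__main__":
    fai = "chr1\t100\t6\t60\t61\nchr2\t50\t110\t60\t61\n"
    header = "@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:100\n@SQ\tSN:scaffold_7\tLN:50\n"
    only_fai, only_bam = compare(names_from_fai(fai), names_from_sam_header(header))
    print("only in .fai:", only_fai)
    print("only in BAM :", only_bam)
```

If both lists come back empty, the names agree and the cause lies elsewhere, for example in the orientation setting that the error message points to.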

empty pileup file produced

I am not sure whether this is a bug in FREEC.
My input files are two sorted BAM files that have gone through base quality recalibration, one used as the control and one as the sample.
The config file is:

[general]
chrLenFile = /HOME/sysu_rj_1/CLS/database/hg19/freecLib/genome.fa.fai
window = 0
ploidy = 2
outputDir = ./
sex=XX
breakPointType=4
chrFiles = /HOME/sysu_rj_1/CLS/database/hg19/freecLib/chromosomes
bedtools = /HOME/sysu_rj_1/CLS/software/bedtools2/bin/bedtools
sambamba = ~/bin/sambamba
SambambaThreads = 23
samtools = samtools
maxThreads=23
breakPointThreshold=1.2
noisyData=TRUE
printNA=FALSE
readCountThreshold=50
[sample]
mateFile = '311252-S_sort_dedup_realigned_recal.bam'
inputFormat = BAM
mateOrientation = 0
[control]
mateFile = '311252-N-1_sort_dedup_realigned_recal.bam'
inputFormat = BAM
mateOrientation = 0
[BAF]
makePileup = /HOME/sysu_rj_1/CLS/database/hg19/freecLib/hg19_snp142.SingleDiNucl.1based.bed
fastaFile = /HOME/sysu_rj_1/CLS/database/hg19/freecLib/genome.fa
SNPfile = /HOME/sysu_rj_1/CLS/database/hg19/freecLib/hg19_snp142.SingleDiNucl.1based.txt
minimalCoveragePerPosition = 5
[target]
captureRegions = /HOME/sysu_rj_1/CLS/database/hg19/freecLib/freec_nuohe_target_V6.bed

My samtools version is 1.3.1.
Any suggestions?

Segmentation fault (core dumped)

Hi,
I tried to run freec with this config file to analyze WGS data without control.

[general]
chrLenFile = /home/data/resources/gatk_resources/hg19/ucsc.hg19.fasta.fai
chrFiles = /home/data/analysis/CTC/FREEC/files/hg19/
gemMappabilityFile = /home/data/analysis/CTC/FREEC/files/GEM_mappability/hg19/out76_hg19.gem

minCNAlength = 1
coefficientOfVariation = 0.06
printNA = FALSE
maxThreads = 4
sex = XX
uniqueMatch = TRUE
ploidy = 4
outputDir = /home/data/analysis/CTC/FREEC/FREEC-11.0/data/IonXpress_008/support/gem76/
BedGraphOutput = TRUE

[sample]
mateFile = /home/data/runs/Archivio/da_archiviare/Ampli1_LowPass_01/IonXpress_008_R_2017_06_08_13_23_10_user_SN2-41-Pietro_LowPass_prova_Auto_user_SN2-41-Pietro_LowPass_prova_91.bam
inputFormat = BAM
mateOrientation = 0

The software returned this error at this step:

Number of EM iterations :18
root mean square error = 11.5541
Y = -11482.7x^4+56229.2x^3+-69675.9x^2+32512.7x^1+-4982.38
Segmentation fault (core dumped)

Could you help me please??

Thanks,
Michela.
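
The `Y = ...` line in the log appears to be the fitted polynomial for the "ReadCount ~ GC-content" normalization that other logs on this page mention. As a sanity check it can be evaluated over the expected GC range to see whether the curve stays positive. A minimal sketch using Horner's rule; the coefficients are copied from the log above, while the grid of GC values and the function name `evaluate` are assumptions for illustration.

```python
# Coefficients from the log line, highest degree first:
# Y = -11482.7x^4 + 56229.2x^3 - 69675.9x^2 + 32512.7x - 4982.38
COEFFS = [-11482.7, 56229.2, -69675.9, 32512.7, -4982.38]


def evaluate(coeffs, x):
    """Evaluate a polynomial with Horner's rule (highest degree first)."""
    result = 0.0
    for coefficient in coeffs:
        result = result * x + coefficient
    return result


if __name__ == "__main__":
    # Illustrative grid of GC fractions; adjust to the genome at hand.
    for gc in (0.20, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60):
        value = evaluate(COEFFS, gc)
        flag = "" if value > 0 else "  <- non-positive expected read count"
        print(f"GC={gc:.2f}  Y={value:10.2f}{flag}")
```

A curve that dips below zero inside the GC range actually present in the data would suggest a poor fit, although this alone does not explain the segmentation fault.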

Some question about FREEC algorithm

I am a FREEC user from China. After testing FREEC and reading your publication, I have some questions about it.

1. I am confused about the definition of LOH in your tool. Your R script shows any region whose B allele frequency does not equal 0.5 in light blue, which represents LOH.
    Other literature I have reviewed defines LOH as a change of genotype from AB to AA or BB.
    So I am confused. My understanding is the following:
       1.1 You filter regions that show AA, BB, AAA, ...
       1.2 So you report allelic content, but not LOH?
       1.3 There are also regions that show AB, AAB or other types, and your algorithm considers only these; but some fragments within these regions show AA, BB, AAA, ... Are those part of real LOH?

2. I am confused about the arguments window and step. Could you explain them in more detail?

I would appreciate it if anyone could answer my questions.

Thanks,
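
On question 2, a small sketch may help. It assumes the usual sliding-window reading of the two options: `window` is the size of each bin in which reads are counted, and `step` is the distance between the starts of consecutive bins, so `step < window` gives overlapping bins. How FREEC treats the last, partial window is not covered here; truncating it at the chromosome end is an assumption of this sketch.

```python
def windows(chrom_length, window, step=None):
    """Return (start, end) pairs, 0-based and end-exclusive.

    With step omitted, windows are adjacent and non-overlapping.
    """
    if step is None:
        step = window
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, min(start + window, chrom_length))
        for start in range(0, chrom_length, step)
    ]


if __name__ == "__main__":
    # Values taken from the template config quoted on this page.
    for start, end in windows(200_000, window=50_000, step=10_000)[:5]:
        print(start, end)
```

With `window=50000` and `step=10000`, each position is covered by up to five windows, which smooths the profile at the cost of five times as many data points.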

Segmentation fault (core dumped) at "will remove all windows with read count in the control less than 50"

I am running Control-FREEC. If I use the BAM files, I run into this error:

CG-content printed into ./GC_profile.targetedRegions.cnp
..using GC-content to normalize the control profile
file ./GC_profile.targetedRegions.cnp is read
..will remove all windows with read count in the control less than 50
Warning: control length is not equal to the sample length for chromosome 1
Warning: control length is not equal to the sample length for chromosome 2
Warning: control length is not equal to the sample length for chromosome 3
Warning: control length is not equal to the sample length for chromosome 4
Warning: control length is not equal to the sample length for chromosome 5
Warning: control length is not equal to the sample length for chromosome 6
Warning: control length is not equal to the sample length for chromosome 7
Warning: control length is not equal to the sample length for chromosome 8
Warning: control length is not equal to the sample length for chromosome 9
Warning: control length is not equal to the sample length for chromosome 10
Warning: control length is not equal to the sample length for chromosome 11
Warning: control length is not equal to the sample length for chromosome 12
Warning: control length is not equal to the sample length for chromosome 13
Warning: control length is not equal to the sample length for chromosome 14
Warning: control length is not equal to the sample length for chromosome 15
Warning: control length is not equal to the sample length for chromosome 16
Warning: control length is not equal to the sample length for chromosome 17
Warning: control length is not equal to the sample length for chromosome 18
Warning: control length is not equal to the sample length for chromosome 19
Warning: control length is not equal to the sample length for chromosome 20
Warning: control length is not equal to the sample length for chromosome 21
Warning: control length is not equal to the sample length for chromosome 22
Warning: control length is not equal to the sample length for chromosome X
Segmentation fault (core dumped)

In my experience, a core dump happens when there is a memory issue, but it's only 45MB, so that can't be the reason.

If I repeat the analysis, but change from mateFile to mateCopyNumberFile using the .cpn files from the failed analysis, the problem goes away:

CG-content printed into ./GC_profile.targetedRegions.cnp
..using GC-content to normalize the control profile
file ./GC_profile.targetedRegions.cnp is read
..will remove all windows with read count in the control less than 50
..will process the control file as well: removing all windows with read count in the control less than 50
..Set ploidy for the control genome equal to 2
..Running FREEC with ploidy set to 2

As far as I can tell, it is using the same .cpn files for that calculation. Why does it work in one case but not the other? I tried multiple times with the same results.

Your GC-content file is empty or is in a wrong format Please use chomosome sequences (option "chrFiles") to recreate it!

Hi, I'd like to use FREEC without a control to detect CNVs in the goat genome. I prepared my chromosome length file (goat_chr.len) and my GC-content file, without mappability, using bedtools on the goat genome (example_GC_content.txt), and I prepared my config file:
[general]
chrLenFile = /illumina/runs/DNAPipeline/girgentana1/goat_chr.len
ploidy = 2
GCcontentProfile = /illumina/runs/DNAPipeline/girgentana1/GC_girgentana_mod.cnp
window = 50000
minExpectedGC = 0.30
maxExpectedGC = 0.70

[sample]

mateFile = /illumina/runs/DNAPipeline/outRD.bam
inputFormat = BAM
mateOrientation = 0

[control]

I ran FREEC, and after reading the BAM file it gave me the error: Your GC-content file /illumina/runs/DNAPipeline/girgentana1/GC_girgentana_mod.cnp is empty or is in a wrong format

Please use chomosome sequences (option "chrFiles") to recreate it!

I have attached the chromosome lengths file, an example of the GC-content file, and log.txt with some information about the analysis. I need some help to understand what I did wrong. Any suggestion would be much appreciated.
Regards,
Marco

example_GC_content.txt
goat_chr.txt
log.txt

explanation of ratio in *_ratio.txt

Hi,

When I read the Control-FREEC manual, I found that ratio = (Sample RC / Control RC). Is this the meaning of Ratio and MedianRatio in the *_ratio.txt results?

Thanks

Accept multifasta file instead of chrFiles directory

Since reference FASTA files are usually multi-FASTA files, it would make more sense to me to have an option to pass a single file instead of a directory full of files. There are plenty of libraries for handling indexed FASTA files. Maybe you could link to htslib for this?

Thanks for putting this on GitHub - and for the continued development!
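
In the meantime, a multi-FASTA can be split into the per-chromosome files that `chrFiles` expects. A minimal sketch, assuming that naming each output file `<sequence name>.fa` matches the lookup order quoted in the template config on this page ("1, 1.fa, 1.fasta, chr1.fa, chr1.fasta"). The function name `split_multifasta` is hypothetical, and sequence names containing path separators are not handled.

```python
import os
import tempfile


def split_multifasta(fasta_path, out_dir):
    """Write one <name>.fa file per record and return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    out = None
    try:
        with open(fasta_path) as handle:
            for line in handle:
                if line.startswith(">"):
                    # A header line starts a new output file.
                    if out is not None:
                        out.close()
                    tokens = line[1:].split()
                    name = tokens[0] if tokens else "unnamed"
                    path = os.path.join(out_dir, name + ".fa")
                    out = open(path, "w")
                    written.append(path)
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "genome.fa")
        with open(source, "w") as handle:
            handle.write(">1 first\nACGT\nAC\n>2\nGGG\n")
        for path in split_multifasta(source, os.path.join(tmp, "chromosomes")):
            print(os.path.basename(path))
```

The output directory should contain nothing but these files, as the template config asks.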

bam file used in control-FREEC

Which BAM should I use when detecting CNVs with Control-FREEC: duplicate-masked, indel-realigned, or post-BQSR?
