ebivariation / eva-pipeline
Genomic variation pipeline for the European Variation Archive, implemented using Spring Batch
Home Page: http://www.ebi.ac.uk/eva
License: Apache License 2.0
Hi. I followed the instructions using the v2.0 source code.
When running the tests with mvn test package, I got the following results.
Results :
Failed tests:
CreateDatabaseIndexesStepTest.testIndexesAreCreated:75 expected:<[{ "v" : [1 , "key" : { "_id" : 1} , "name" : "_id_" , "ns" : "ed18730b-debc-479c-b625-1b9893dcfa0c.features"}, { "v" : 1] , "key" : { "name" ...> but was:<[{ "v" : [2 , "key" : { "_id" : 1} , "name" : "_id_" , "ns" : "ed18730b-debc-479c-b625-1b9893dcfa0c.features"}, { "v" : 2] , "key" : { "name" ...>
StatisticsMongoWriterTest.shouldCreateIndexesInCollection:138 expected:<[{ "v" : 1 , "key" : { "_id" : 1} , "name" : "_id_" , "ns" : "8d03943c-db27-4dea-b3e3-421bf090acbe.populationStatistics"}, { "v" : 1 , "unique" : true , "key" : { "chr" : 1 , "start" : 1 , "ref" : 1 , "alt" : 1 , "sid" : 1 , "cid" : 1} , "name" : "vscid" , "ns" : "8d03943c-db27-4dea-b3e3-421bf090acbe.populationStatistics"}]> but was:<[{ "v" : 2 , "key" : { "_id" : 1} , "name" : "_id_" , "ns" : "8d03943c-db27-4dea-b3e3-421bf090acbe.populationStatistics"}, { "v" : 2 , "unique" : true , "key" : { "chr" : 1 , "start" : 1 , "ref" : 1 , "alt" : 1 , "sid" : 1 , "cid" : 1} , "name" : "vscid" , "ns" : "8d03943c-db27-4dea-b3e3-421bf090acbe.populationStatistics"}]>
Tests run: 475, Failures: 2, Errors: 0, Skipped: 1
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:14 min
[INFO] Finished at: 2020-02-17T17:09:13+09:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project eva-pipeline: There are test failures.
[ERROR]
[ERROR] Please refer to /home/kimoton/eva-pipeline/2.0/eva-pipeline-2.0/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Please give me some advice. My environment is as follows.
$ java -version
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10)
OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
$ mvn --version
Apache Maven 3.6.0
Maven home: /usr/share/maven
Java version: 1.8.0_222, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre
Default locale: en, platform encoding: UTF-8
OS name: "linux", version: "4.4.0-18362-microsoft", arch: "amd64", family: "unix"
$ mongo --version
MongoDB shell version v3.6.17
git version: 3d6953c361213c5bfab23e51ab274ce592edafe6
OpenSSL version: OpenSSL 1.0.2n 7 Dec 2017
allocator: tcmalloc
modules: none
build environment:
distmod: ubuntu1604
distarch: x86_64
target_arch: x86_64
variation-commons and eva-pipeline have different definitions of models and converters between Java classes and MongoDB documents. variation-commons has been recently reworked to facilitate reading from MongoDB.
Make eva-pipeline use the new classes from variation-commons.
The application fails if the config.chunk.size parameter is not provided. A better approach would be to use a default value like 500, which allows the pipeline to progress without too much contention but at the same time ensures the memory usage is not too high.
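A minimal sketch of the fallback described above, written in plain Java so it runs on its own. The class name ChunkSizeDefault and the method resolveChunkSize are hypothetical and not part of eva-pipeline; only the parameter name config.chunk.size and the default of 500 come from the text above.

```java
import java.util.Properties;

// Hypothetical helper, not part of eva-pipeline: resolves the chunk size
// from a set of properties and falls back to a default when it is absent.
class ChunkSizeDefault {
    // Default value suggested in the issue text.
    static final int DEFAULT_CHUNK_SIZE = 500;

    /** Returns config.chunk.size if it is set, otherwise DEFAULT_CHUNK_SIZE. */
    public static int resolveChunkSize(Properties props) {
        String raw = props.getProperty("config.chunk.size");
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_CHUNK_SIZE;
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException(
                    "config.chunk.size must be positive, got " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Parameter not provided: the default is used.
        System.out.println(resolveChunkSize(props));
        // Parameter provided: the explicit value wins.
        props.setProperty("config.chunk.size", "1000");
        System.out.println(resolveChunkSize(props));
    }
}
```

If the parameter is injected through a Spring property placeholder (an assumption about how the pipeline reads it), the same effect can usually be had with the placeholder default syntax, for example ${config.chunk.size:500}.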
Letting a VCF with duplicate samples run through the pipeline causes issues when accessing those samples through the API. The reason is that we use the sample name, not the column index it occupies in the file, as an accessor, so it is not possible to differentiate between two occurrences of the same sample.
In addition to this, the VCF specification doesn't allow samples with duplicate names to be listed in a file.
An example of a valid VCF header line would be:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE1 SAMPLE2 SAMPLE3
And an invalid VCF header line would be:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE1 SAMPLE2 SAMPLE1
The pipeline must abort when an input like the latter is provided; this requires modifying the class VcfHeaderReader.
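The check could look roughly like the sketch below. It is a standalone illustration, not the actual change to VcfHeaderReader: the class DuplicateSampleCheck and its methods are hypothetical names. It assumes the #CHROM header line is tab-delimited, as in VCF files, and that sample names start after the nine fixed columns (#CHROM to FORMAT).

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical standalone check, not the real VcfHeaderReader code.
class DuplicateSampleCheck {
    // #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT precede the sample columns.
    private static final int FIXED_COLUMNS = 9;

    /** Returns the sample names that occur more than once, in order of first repetition. */
    public static Set<String> findDuplicateSamples(String headerLine) {
        String[] columns = headerLine.split("\t");
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (int i = FIXED_COLUMNS; i < columns.length; i++) {
            // Set.add returns false when the name was already present.
            if (!seen.add(columns[i])) {
                duplicates.add(columns[i]);
            }
        }
        return duplicates;
    }

    /** Aborts with an exception if any sample name is repeated. */
    public static void assertNoDuplicateSamples(String headerLine) {
        Set<String> duplicates = findDuplicateSamples(headerLine);
        if (!duplicates.isEmpty()) {
            throw new IllegalArgumentException(
                    "VCF header contains duplicate sample names: " + duplicates);
        }
    }

    public static void main(String[] args) {
        String fixed = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT";
        // Valid header: passes silently.
        assertNoDuplicateSamples(fixed + "\tSAMPLE1\tSAMPLE2\tSAMPLE3");
        // Invalid header: SAMPLE1 is listed twice, so this throws.
        assertNoDuplicateSamples(fixed + "\tSAMPLE1\tSAMPLE2\tSAMPLE1");
    }
}
```

Reporting the offending names in the exception message, rather than only failing, should make it easier for a submitter to correct the file.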
Since EVA is a variation archive, we shall not load into the database those records from a VCF file where one of the following conditions occurs:
I followed the installation instructions:
git clone https://github.com/EBIvariation/opencga.git
cd opencga && mvn clean install -DskipTests
The build failed with the following errors:
[INFO] 11 errors
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] opencga ........................................... SUCCESS [0.149s]
[INFO] opencga-lib ....................................... SUCCESS [3.248s]
[INFO] opencga-storage ................................... SUCCESS [0.004s]
[INFO] opencga-storage-core .............................. FAILURE [1.250s]
[INFO] opencga-catalog ................................... SKIPPED
[INFO] opencga-analysis .................................. SKIPPED
[INFO] opencga-storage-mongodb ........................... SKIPPED
[INFO] opencga-storage-app ............................... SKIPPED
[INFO] opencga-app ....................................... SKIPPED
[INFO] opencga-account ................................... SKIPPED
[INFO] opencga-storage-hbase ............................. SKIPPED
[INFO] opencga-server .................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 5.332s
[INFO] Finished at: Fri Aug 11 07:46:39 UTC 2017
[INFO] Final Memory: 23M/252M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.2:compile (default-compile) on project opencga-storage-core: Compilation failure: Compilation failure:
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/VepVariantAnnotator.java:[7,39] package org.opencb.cellbase.core.client does not exist
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[12,32] cannot find symbol
[ERROR] symbol: class CellBaseConfiguration
[ERROR] location: package org.opencb.cellbase.core
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[13,39] package org.opencb.cellbase.core.client does not exist
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[14,44] cannot find symbol
[ERROR] symbol: class CellbaseConfiguration
[ERROR] location: package org.opencb.cellbase.core.common.core
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[44,13] cannot find symbol
[ERROR] symbol: class CellBaseClient
[ERROR] location: class org.opencb.opencga.storage.core.variant.annotation.CellBaseVariantAnnotator
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[82,37] cannot find symbol
[ERROR] symbol: class CellBaseClient
[ERROR] location: class org.opencb.opencga.storage.core.variant.annotation.CellBaseVariantAnnotator
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[100,13] cannot find symbol
[ERROR] symbol: class CellBaseClient
[ERROR] location: class org.opencb.opencga.storage.core.variant.annotation.CellBaseVariantAnnotator
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[103,38] cannot find symbol
[ERROR] symbol: class CellBaseClient
[ERROR] location: class org.opencb.opencga.storage.core.variant.annotation.CellBaseVariantAnnotator
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[243,35] package CellBaseClient does not exist
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[244,35] package CellBaseClient does not exist
[ERROR] /mnt/speedSeq_data/opencga/opencga-storage/opencga-storage-core/src/main/java/org/opencb/opencga/storage/core/variant/annotation/CellBaseVariantAnnotator.java:[246,35] package CellBaseClient does not exist
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn -rf :opencga-storage-core
What should I do?
Thank you.