Comments (12)
@biogeeker
just to be clear, misc.compareGenomes.py will not compute or build anything... It's just a tool to draw the modern or ancestral karyoptypes according to an ancestral or modern reference (or draw dotplots... see default mode=drawMatrix).
If you want to represent M1 with the color of the blocs of reconstructed A1:
1. plot A1 to draw the reference colors (1 bloc / 1 colour):
./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200 +sortBySize > A1.ps
2. plot M1 according to A1 colors of blocs:
./misc.compareGenomes.py ../example/data/genes/genes.M1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> M1.ps
If you want to represent how it evolves from A1 to A3 to M1, you have to draw A3 according to A1 colours:
./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A3.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> A3.ps
Can [ Agora ] output the occurrence of fusion and fission events for each ancestral node?
you can have an idea of of the fission and or fusion by using the same tool but with the option mode=printOrthologousChrom
as an example:
src/misc.compareGenomes.py ./example/results/ancGenomes/vertebrates-workflow/ancGenome.A1.list.bz2 example/data/genes/genes.M1.list.bz2 example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=printOrthologousChrom +sortBySize -minChrSize=200
will give:
CAR_1 Chr7 (516)
CAR_10 Chr12 (236) Chr25 (89)
CAR_11 Chr24 (194) Chr16 (95)
CAR_12 Chr17 (253)
CAR_13 Chr12 (202) Chr21 (41)
CAR_14 Chr30 (232)
CAR_15 Chr16 (234)
CAR_16 Chr27 (214)
CAR_17 Chr17 (215)
CAR_18 Chr21 (136) Chr12 (70)
CAR_19 Chr18 (155) Chr21 (130) Chr27 (64)
CAR_2 Chr1 (349) Chr17 (120)
CAR_20 Chr10 (185) Chr16 (21)
CAR_21 Chr2 (208) Chr3 (62)
CAR_22 Chr19 (174) Chr3 (34)
with this parameters (taking into account only blocs > 200 genes and with default parameter orthoschr:minHomology=90)
that means that CAR_10 of A1 is split on 2 Chromosomes in M1
the reverse command:
src/misc.compareGenomes.py ./example/results/ancGenomes/vertebrates-workflow/ancGenome.A1.list.bz2 example/data/genes/genes.M1.list.bz2 example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=printOrthologousChrom +sortBySize -minChrSize=200 +reverse
Chr1 CAR_2 (349)
Chr10 CAR_20 (185) CAR_6 (23)
Chr11 CAR_23 (202)
Chr12 CAR_6 (365) CAR_7 (325) CAR_10 (236) CAR_13 (202)
Chr14 CAR_5 (355)
Chr16 CAR_4 (436) CAR_15 (234) CAR_11 (95)
Chr17 CAR_12 (253) CAR_17 (215) CAR_2 (120)
Chr18 CAR_19 (155)
Chr19 CAR_22 (174)
Chr2 CAR_21 (208)
Chr20 CAR_8 (311)
Chr21 CAR_25 (196) CAR_18 (136) CAR_19 (130)
Chr22
Chr23 CAR_9 (294)
Chr24 CAR_11 (194)
Chr25 CAR_10 (89)
Chr27 CAR_16 (214) CAR_19 (64)
Chr29 CAR_15 (6) CAR_16 (3)
Chr3 CAR_21 (62) CAR_22 (34)
Chr30 CAR_14 (232) CAR_24 (205)
Chr4
Chr6 CAR_3 (447)
Chr7 CAR_1 (516)
will informe you that the modern Chr12 is composed of 4 ancestral blocs (complete or partial)
different options are available :
-mode=printOrthologuesCount
will tell you how many genes from which blocs composed the target genome chromosomes.
CAR_1 CAR_10 CAR_11 CAR_12 CAR_13 CAR_14 CAR_15 CAR_16 CAR_17 CAR_18 CAR_19 CAR_2 CAR_20 CAR_21 CAR_22 CAR_23 CAR_24 CAR_25 CAR_3 CAR_4 CAR_5 CAR_6 CAR_7 CAR_8 CAR_9
Chr1 0 0 0 0 0 0 0 0 0 0 0 349 0 0 0 0 0 0 0 0 0 0 0 0 0
Chr10 0 0 0 0 0 0 0 0 0 0 0 0 185 0 0 0 0 0 0 0 0 23 0 0 0
Chr11 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 202 0 0 0 0 0 0 0 0 0
Chr12 0 236 0 0 202 0 0 0 0 70 0 0 0 0 0 0 0 0 0 0 0 365 325 0 0
-mode=printOrthologuesList
will list the orthologous genes.
I hope this would help.
Cheers,
Alex
from agora.
Hello Orestis,
thank you for using AGORA and Genomicus.
At the present time, we do not provide tool to generate genomicus like multikaryoview with command line.
However, you can use src/misc.compareGenomes.py to generate the karyotypes one by one (with the option mode=drawKaryotype).
The first ancestral karyotype (one color per ancestral block) can be generate by comparing the genome to itself, eg:
src/misc.compareGenomes.py ancGenome.A1.list.bz2 ancGenome.A1.list.bz2 ancGenes.A1.list.bz2 -mode=drawKaryotype > A1.ps
The others, by comparing the genome (ancestral or modern) with the first ancestral one:
src/misc.compareGenomes.py genome.M1.list.bz2 ancGenome.A1.list.bz2 ancGenes.A1.list.bz2 -mode=drawKaryotype > M1.ps
etc...
options are available (+sortBySize, -minChrSize etc...)
I hope this allows you to do what you want on your data,
don't hesitate if you have other questions
Best wishes,
Alexandra
from agora.
@nousiaso
I think (tell me if I'm I'm wrong) you provided a species tree without naming the internal nodes, so AGORA named them NAME_1, NAME_2 etc... when parsing the species tree, (it is needed to build the comparisons computing process).
Whether you want de build each ancestral node of the species tree or only the ancestral root, AGORA needs all internal Orthologous groups, because not all genome pairwise comparison are informative to reconstruct the ancestral root. (see figure2 of the article). Also each internal node Orthogroups should only contain genes of descendant species of this ancestor.
I think one top orthologous group will not provide information about where gene duplications nor losses have occured during evolution.
from agora.
@nousiaso
If your speciesTree is like:
((((spec1,spec2),spec3),spec4),spec5)
with no name at inner ancestral node, AGORA will infer them as "NAME_0" "NAME_1" "NAME_2" etc... and will search for data with that infered name (Orthologous.NAME_1.list, etc...)
If you provide a speciestree with name at inner nodes:
(((Spec1,Spec2)A3,Spec3)A1,(Spec4,Spec5)A2)A0 ;
AGORA will expected OrthoGroups.A[0-3].list and will not use NAME_ cause the algorithm doesn't need to infer the name of ancestors.
If you need to show me data more privately you can email me to [email protected]
from agora.
When I say plot, i mean like the nice chromosome plot from genomicus web server for instance
from agora.
Hey Alexandra thanks for the reply. It is in the right direction. I will try it. Wanted to ask you something else related to the analysis.
I have 5 gene lists when i do the analysis, gene.genome1.list ...etc and one ancestral gene list gene.A0.list .
From my understanding i compiled a file with the orthogroup ids for the ancestral genome, space separated as you show in the example. So i have one file called orthologyGroups.A0.list , but when i run the pipeline it needs 4 more files. Plus
there is a strange thing where it needs them having these headers orthologyGroups.NAME_0.list , orthologyGroups.NAME_1.list ...orthologyGroups.NAME_4list . I just copied the initial file i had with all the orthology relationships orthologyGroups.A0.list and gave it all these names. It worked but I looked at the results and they are little baffling.
Then i tried also to get different inputs : i did this : the file orthologyGroups.NAME_0.list had just the orthologies of the A0 and genome 1, the orthologyGroups.NAME_1.list had just the orthologies of A0 and genome 2 etc...
After this try, i did not get ancestral genomes at the end. So I want to ask you what is going on exactly ?
What are these 5 orthologyGroup.lists that are needed, and why not just one orthologyGroup list with all the orthologs between the A0 and the 5 genomes is not good enough ?
Please elaborate
Thanks!!
from agora.
Hi, Alexandra,
AGORA is a wonderful tools for reconstructing ancestor karyotype, I have two questions when I read the manual and NEE paper. As the example data shows:
I have constructed the ancestor A1 karyotype as you mentioned above, then I want to know the karyotype of M1. Should I directly construct from A1 -> M1, or firstly to A3, and then from A3 -> M1? I think two steps is more reasonable, but still not sure. Here is my code :
- Directly construct from A1:
./misc.compareGenomes.py ../example/data/genes/genes.M1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/size-1.0-1.0/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> M1.ps
- A1 -> A3 -> M1
./misc.compareGenomes.py ../example/data/genes/genes.M1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A3.list.bz2 ../example/results/ancGenes/size-1.0-1.0/ancGenes.A3.list.bz2 -mode=drawKaryotype -minChrSize=200> M1_A3.ps
And another question is:
Can [ Agora ] output the occurrence of fusion and fission events for each ancestral node?
Thanks for your reply!
from agora.
@nousiaso I think (tell me if I'm I'm wrong) you provided a species tree without naming the internal nodes, so AGORA named them NAME_1, NAME_2 etc... when parsing the species tree, (it is needed to build the comparisons computing process). Whether you want de build each ancestral node of the species tree or only the ancestral root, AGORA needs all internal Orthologous groups, because not all genome pairwise comparison are informative to reconstruct the ancestral root. (see figure2 of the article). Also each internal node Orthogroups should only contain genes of descendant species of this ancestor. I think one top orthologous group will not provide information about where gene duplications nor losses have occured during evolution.
Hello, Alexandra,
Thanks for the reply
I named the species tree according to the names in gene.*.list.bz2 .
Concerning this : "Whether you want de build each ancestral node of the species tree or only the ancestral root, AGORA needs all internal Orthologous groups"
I thought that would be the case as you describe it.
Let me make it crystal clear
We have the 5 genomes of the cultivars we are investigating. I want to reconstruct the common ancestor of the 5 genomes plus A0. A0 is a closely related wild genome of different genus (but same family), that diverged from the genus we are investigating.
We have the orthogroups of that A0 with the 5 genomes.
What i did is i took the orthologs from A0 with genome 1 , A0 with genome 2 etc, and merged them into a file with
unique A0 gene ids using pandas.
At the end we have a file that has all orthologs of A0 with the five genomes (like the one in the example)
I used that for the orthologyGroup.A0.list as input but it was not enough. It needed files that were named
orthologyGroup.NAME_0.list
orthologyGroup.NAME_1.list
orthologyGroup.NAME_2.list
orthologyGroup.NAME_3.list
orthologyGroup.NAME_4.list
i copied the merged orthologyfile orthologyGroup.A0.list , using the different names
NAME_0, until NAME_4 and it worked.
After that i also tested the following
used the un-merged orthologyGroup files A0 with genome1 ... etc A0 genome 5
and named them orthologyGroup.NAME_0.list,, etc until orthologyGroup.NAME_4.list
( looking at ancgenes.log) i used the same order according to the gene.*.list.bz2 parsed, the first list was NAME_0 ... the fifth list was NAME_4)
i did not get ancestral genome results after that.
So i guess this is why : "Whether you want de build each ancestral node of the species tree or only the ancestral root, AGORA needs all internal Orthologous groups"
this is not a problem. I understand why.
The NAME issue though i did not really understand why it happens.
If you have an insight on that, please share,
thank you much for the repplies!
Orestis
from agora.
Your reply was very clear !
Thanks a lot Alexandra !
Best,
Orestis
from agora.
@biogeeker just to be clear, misc.compareGenomes.py will not compute or build anything... It's just a tool to draw the modern or ancestral karyoptypes according to an ancestral or modern reference (or draw dotplots... see default mode=drawMatrix).
If you want to represent M1 with the color of the blocs of reconstructed A1:
1. plot A1 to draw the reference colors (1 bloc / 1 colour):
./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200 +sortBySize > A1.ps
2. plot M1 according to A1 colors of blocs:
./misc.compareGenomes.py ../example/data/genes/genes.M1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> M1.ps
If you want to represent how it evolves from A1 to A3 to M1, you have to draw A3 according to A1 colours:
./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A3.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> A3.ps
Can [ Agora ] output the occurrence of fusion and fission events for each ancestral node? you can have an idea of of the fission and or fusion by using the same tool but with the option mode=printOrthologousChrom
as an example:
src/misc.compareGenomes.py ./example/results/ancGenomes/vertebrates-workflow/ancGenome.A1.list.bz2 example/data/genes/genes.M1.list.bz2 example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=printOrthologousChrom +sortBySize -minChrSize=200
will give:
CAR_1 Chr7 (516) CAR_10 Chr12 (236) Chr25 (89) CAR_11 Chr24 (194) Chr16 (95) CAR_12 Chr17 (253) CAR_13 Chr12 (202) Chr21 (41) CAR_14 Chr30 (232) CAR_15 Chr16 (234) CAR_16 Chr27 (214) CAR_17 Chr17 (215) CAR_18 Chr21 (136) Chr12 (70) CAR_19 Chr18 (155) Chr21 (130) Chr27 (64) CAR_2 Chr1 (349) Chr17 (120) CAR_20 Chr10 (185) Chr16 (21) CAR_21 Chr2 (208) Chr3 (62) CAR_22 Chr19 (174) Chr3 (34)
with this parameters (taking into account only blocs > 200 genes and with default parameter orthoschr:minHomology=90) that means that CAR_10 of A1 is split on 2 Chromosomes in M1
the reverse command:
src/misc.compareGenomes.py ./example/results/ancGenomes/vertebrates-workflow/ancGenome.A1.list.bz2 example/data/genes/genes.M1.list.bz2 example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=printOrthologousChrom +sortBySize -minChrSize=200 +reverse
Chr1 CAR_2 (349) Chr10 CAR_20 (185) CAR_6 (23) Chr11 CAR_23 (202) Chr12 CAR_6 (365) CAR_7 (325) CAR_10 (236) CAR_13 (202) Chr14 CAR_5 (355) Chr16 CAR_4 (436) CAR_15 (234) CAR_11 (95) Chr17 CAR_12 (253) CAR_17 (215) CAR_2 (120) Chr18 CAR_19 (155) Chr19 CAR_22 (174) Chr2 CAR_21 (208) Chr20 CAR_8 (311) Chr21 CAR_25 (196) CAR_18 (136) CAR_19 (130) Chr22 Chr23 CAR_9 (294) Chr24 CAR_11 (194) Chr25 CAR_10 (89) Chr27 CAR_16 (214) CAR_19 (64) Chr29 CAR_15 (6) CAR_16 (3) Chr3 CAR_21 (62) CAR_22 (34) Chr30 CAR_14 (232) CAR_24 (205) Chr4 Chr6 CAR_3 (447) Chr7 CAR_1 (516)
will informe you that the modern Chr12 is composed of 4 ancestral blocs (complete or partial)
different options are available :
-mode=printOrthologuesCount will tell you how many genes from which blocs composed the target genome chromosomes.
CAR_1 CAR_10 CAR_11 CAR_12 CAR_13 CAR_14 CAR_15 CAR_16 CAR_17 CAR_18 CAR_19 CAR_2 CAR_20 CAR_21 CAR_22 CAR_23 CAR_24 CAR_25 CAR_3 CAR_4 CAR_5 CAR_6 CAR_7 CAR_8 CAR_9 Chr1 0 0 0 0 0 0 0 0 0 0 0 349 0 0 0 0 0 0 0 0 0 0 0 0 0 Chr10 0 0 0 0 0 0 0 0 0 0 0 0 185 0 0 0 0 0 0 0 0 23 0 0 0 Chr11 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 202 0 0 0 0 0 0 0 0 0 Chr12 0 236 0 0 202 0 0 0 0 70 0 0 0 0 0 0 0 0 0 0 0 365 325 0 0
-mode=printOrthologuesList will list the orthologous genes.
I hope this would help. Cheers, Alex
Hi Alexandra,
Thank you for developing such a great tool for karyotype reconstruction. I ran the AGORA, and obtained the ancestral genomes for the target nodes.
Now I'm attempting to plot the karyotype using the AGORA package's script "misc.compareGenomes.py," but I always get the same error when I execute the commandline, even with the example data:
./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200 +sortBySize > A1.ps
.
The error is shown below:
Affichage Traceback (most recent call last):
File "/home/yucao/software/Agora-master/src/misc.compareGenomes.py", line 268, in
eval(str(arguments["mode"]))()
File "/home/yucao/software/Agora-master/src/misc.compareGenomes.py", line 85, in drawMatrix
utils.myPsOutput.printPsHeader()
File "/home/yucao/software/Agora-master/src/utils/myPsOutput.py", line 24, in printPsHeader
initColor()
File "/home/yucao/software/Agora-master/src/utils/myPsOutput.py", line 54, in initColor
location = [f for f in knownLocations if os.path.exists(f)][0]
IndexError: list index out of range
And the same error shows for my own data and results, do you have any idea why this occurs? According to the error message, I located line 54 of the script β/Agora-master/src/utils/myPsOutput.pyβ (please see the picture below for the details), it seems the error occurred because the missing of file "rgb.txt", but I'm still not sure how to get the file "rgb.txt"?
Thank you very much! looking very forward to your reply.
from agora.
Hi Yu,
sorry for the later answer,
this file (rgb.txt), contains the definition of all the colors that will be used in the karyotypes output.
These are usual locations on linux ubuntu or other linux distribution.
On your computer or server, you can probably execute:
locate rgb.txt
to find the good location and change it in the /Agora-master/src/utils/myPsOutput.py
I already update the code with one other location "/usr/share/X11/rgb.txt"
I hope this helps,
best,
Alexandra
from agora.
Dear Alexandra,
Thank you very much for your reply, after update the code with location "/usr/share/X11/rgb.txt", the script "misc.compareGenomes.py" run successfully, cheers!
Thanks again for your help.
Best,
Yu
from agora.
Related Issues (20)
- Error when running agora-plants.py HOT 6
- Are there any tools to build an evolutionary tree in NHX format (from Multiple sequence alignment) ? HOT 5
- divergence times with Agora
- Too many contiguous ancestral regions (CARs) HOT 2
- Lack of chromosomal information HOT 1
- input file issue HOT 2
- Breakpoint regions mapping to reference chromosomes
- ValueError when runnig misc.compareGenomes.py HOT 1
- Error when running agora-basic.py HOT 5
- Agora in living sps HOT 1
- Error Running agora-basic.py: "assert oldName not in seen" HOT 5
- Request for Custom Scripts: Rearrangement Breakpoints filtering and Interchromosomal Rearrangement Identification
- Clarification on Identifying Successive Genomes and Synthetic Blocks between Ancestral Nodes and Extant Genomes in PhylDiag HOT 2
- Creating input for analysis HOT 4
- How to locate the support scores of the AGORA adjacency graph for adjacent genes in ancGenome?
- Orthology groups correspond to HOGs? HOT 2
- Agora-generic pipeline seems to be βstuckβ for several days after adding 1 more genome HOT 4
- AssertionError in ALL.reformatGeneFamilies.py HOT 4
- drawKaryotype trouble HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from agora.