Hello, very nice and useful pipeline. I finished a run using orthogroups, I am very in

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

<a class="user-mention notranslate" data-hovercard-type="user" data-hover

Plotting agora results plus a general question,about dyogenibens/agora

Comments (12)

alouis72 commented on June 15, 2024 3

@biogeeker
just to be clear, misc.compareGenomes.py will not compute or build anything... It's just a tool to draw the modern or ancestral karyoptypes according to an ancestral or modern reference (or draw dotplots... see default mode=drawMatrix).

If you want to represent M1 with the color of the blocs of reconstructed A1:

1. plot A1 to draw the reference colors (1 bloc / 1 colour):

./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200 +sortBySize > A1.ps

2. plot M1 according to A1 colors of blocs:

./misc.compareGenomes.py ../example/data/genes/genes.M1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> M1.ps

If you want to represent how it evolves from A1 to A3 to M1, you have to draw A3 according to A1 colours:

./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A3.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> A3.ps

Can [ Agora ] output the occurrence of fusion and fission events for each ancestral node?
you can have an idea of of the fission and or fusion by using the same tool but with the option mode=printOrthologousChrom

as an example:
src/misc.compareGenomes.py ./example/results/ancGenomes/vertebrates-workflow/ancGenome.A1.list.bz2 example/data/genes/genes.M1.list.bz2 example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=printOrthologousChrom +sortBySize -minChrSize=200

will give:

CAR_1	Chr7 (516)
CAR_10	Chr12 (236)	Chr25 (89)
CAR_11	Chr24 (194)	Chr16 (95)
CAR_12	Chr17 (253)
CAR_13	Chr12 (202)	Chr21 (41)
CAR_14	Chr30 (232)
CAR_15	Chr16 (234)
CAR_16	Chr27 (214)
CAR_17	Chr17 (215)
CAR_18	Chr21 (136)	Chr12 (70)
CAR_19	Chr18 (155)	Chr21 (130)	Chr27 (64)
CAR_2	Chr1 (349)	Chr17 (120)
CAR_20	Chr10 (185)	Chr16 (21)
CAR_21	Chr2 (208)	Chr3 (62)
CAR_22	Chr19 (174)	Chr3 (34)

with this parameters (taking into account only blocs > 200 genes and with default parameter orthoschr:minHomology=90)
that means that CAR_10 of A1 is split on 2 Chromosomes in M1

the reverse command:
src/misc.compareGenomes.py ./example/results/ancGenomes/vertebrates-workflow/ancGenome.A1.list.bz2 example/data/genes/genes.M1.list.bz2 example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=printOrthologousChrom +sortBySize -minChrSize=200 +reverse

Chr1	CAR_2 (349)
Chr10	CAR_20 (185)	CAR_6 (23)
Chr11	CAR_23 (202)
Chr12	CAR_6 (365)	CAR_7 (325)	CAR_10 (236)	CAR_13 (202)
Chr14	CAR_5 (355)
Chr16	CAR_4 (436)	CAR_15 (234)	CAR_11 (95)
Chr17	CAR_12 (253)	CAR_17 (215)	CAR_2 (120)
Chr18	CAR_19 (155)
Chr19	CAR_22 (174)
Chr2	CAR_21 (208)
Chr20	CAR_8 (311)
Chr21	CAR_25 (196)	CAR_18 (136)	CAR_19 (130)
Chr22
Chr23	CAR_9 (294)
Chr24	CAR_11 (194)
Chr25	CAR_10 (89)
Chr27	CAR_16 (214)	CAR_19 (64)
Chr29	CAR_15 (6)	CAR_16 (3)
Chr3	CAR_21 (62)	CAR_22 (34)
Chr30	CAR_14 (232)	CAR_24 (205)
Chr4
Chr6	CAR_3 (447)
Chr7	CAR_1 (516)

will informe you that the modern Chr12 is composed of 4 ancestral blocs (complete or partial)

different options are available :

-mode=printOrthologuesCount
will tell you how many genes from which blocs composed the target genome chromosomes.

	CAR_1	CAR_10	CAR_11	CAR_12	CAR_13	CAR_14	CAR_15	CAR_16	CAR_17	CAR_18	CAR_19	CAR_2	CAR_20	CAR_21	CAR_22	CAR_23	CAR_24	CAR_25	CAR_3	CAR_4	CAR_5	CAR_6	CAR_7	CAR_8	CAR_9
Chr1	0	0	0	0	0	0	0	0	0	0	0	349	0	0	0	0	0	0	0	0	0	0	0	0	0
Chr10	0	0	0	0	0	0	0	0	0	0	0	0	185	0	0	0	0	0	0	0	0	23	0	0	0
Chr11	0	0	0	0	0	0	0	0	3	0	0	0	0	0	0	202	0	0	0	0	0	0	0	0	0
Chr12	0	236	0	0	202	0	0	0	0	70	0	0	0	0	0	0	0	0	0	0	0	365	325	0	0

-mode=printOrthologuesList
will list the orthologous genes.

I hope this would help.
Cheers,
Alex

from agora.

alouis72 commented on June 15, 2024 2

Hello Orestis,
thank you for using AGORA and Genomicus.
At the present time, we do not provide tool to generate genomicus like multikaryoview with command line.

However, you can use src/misc.compareGenomes.py to generate the karyotypes one by one (with the option mode=drawKaryotype).

The first ancestral karyotype (one color per ancestral block) can be generate by comparing the genome to itself, eg:
src/misc.compareGenomes.py ancGenome.A1.list.bz2 ancGenome.A1.list.bz2 ancGenes.A1.list.bz2 -mode=drawKaryotype > A1.ps

The others, by comparing the genome (ancestral or modern) with the first ancestral one:
src/misc.compareGenomes.py genome.M1.list.bz2 ancGenome.A1.list.bz2 ancGenes.A1.list.bz2 -mode=drawKaryotype > M1.ps

etc...
options are available (+sortBySize, -minChrSize etc...)

I hope this allows you to do what you want on your data,
don't hesitate if you have other questions

Best wishes,
Alexandra

from agora.

alouis72 commented on June 15, 2024 1

@nousiaso
I think (tell me if I'm I'm wrong) you provided a species tree without naming the internal nodes, so AGORA named them NAME_1, NAME_2 etc... when parsing the species tree, (it is needed to build the comparisons computing process).
Whether you want de build each ancestral node of the species tree or only the ancestral root, AGORA needs all internal Orthologous groups, because not all genome pairwise comparison are informative to reconstruct the ancestral root. (see figure2 of the article). Also each internal node Orthogroups should only contain genes of descendant species of this ancestor.
I think one top orthologous group will not provide information about where gene duplications nor losses have occured during evolution.

from agora.

alouis72 commented on June 15, 2024 1

@nousiaso
If your speciesTree is like:

((((spec1,spec2),spec3),spec4),spec5)

with no name at inner ancestral node, AGORA will infer them as "NAME_0" "NAME_1" "NAME_2" etc... and will search for data with that infered name (Orthologous.NAME_1.list, etc...)

If you provide a speciestree with name at inner nodes:
(((Spec1,Spec2)A3,Spec3)A1,(Spec4,Spec5)A2)A0 ;

AGORA will expected OrthoGroups.A[0-3].list and will not use NAME_ cause the algorithm doesn't need to infer the name of ancestors.

If you need to show me data more privately you can email me to [email protected]

from agora.

nousiaso commented on June 15, 2024

When I say plot, i mean like the nice chromosome plot from genomicus web server for instance

from agora.

nousiaso commented on June 15, 2024

Hey Alexandra thanks for the reply. It is in the right direction. I will try it. Wanted to ask you something else related to the analysis.
I have 5 gene lists when i do the analysis, gene.genome1.list ...etc and one ancestral gene list gene.A0.list .
From my understanding i compiled a file with the orthogroup ids for the ancestral genome, space separated as you show in the example. So i have one file called orthologyGroups.A0.list , but when i run the pipeline it needs 4 more files. Plus
there is a strange thing where it needs them having these headers orthologyGroups.NAME_0.list , orthologyGroups.NAME_1.list ...orthologyGroups.NAME_4list . I just copied the initial file i had with all the orthology relationships orthologyGroups.A0.list and gave it all these names. It worked but I looked at the results and they are little baffling.

Then i tried also to get different inputs : i did this : the file orthologyGroups.NAME_0.list had just the orthologies of the A0 and genome 1, the orthologyGroups.NAME_1.list had just the orthologies of A0 and genome 2 etc...

After this try, i did not get ancestral genomes at the end. So I want to ask you what is going on exactly ?
What are these 5 orthologyGroup.lists that are needed, and why not just one orthologyGroup list with all the orthologs between the A0 and the 5 genomes is not good enough ?

Please elaborate
Thanks!!

from agora.

biogeeker commented on June 15, 2024

Hi, Alexandra,

AGORA is a wonderful tools for reconstructing ancestor karyotype, I have two questions when I read the manual and NEE paper. As the example data shows:

I have constructed the ancestor A1 karyotype as you mentioned above, then I want to know the karyotype of M1. Should I directly construct from A1 -> M1, or firstly to A3, and then from A3 -> M1? I think two steps is more reasonable, but still not sure. Here is my code :

Directly construct from A1:

./misc.compareGenomes.py ../example/data/genes/genes.M1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/size-1.0-1.0/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> M1.ps

A1 -> A3 -> M1

./misc.compareGenomes.py ../example/data/genes/genes.M1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A3.list.bz2 ../example/results/ancGenes/size-1.0-1.0/ancGenes.A3.list.bz2 -mode=drawKaryotype -minChrSize=200> M1_A3.ps

And another question is:

Can [ Agora ] output the occurrence of fusion and fission events for each ancestral node?

Thanks for your reply!

from agora.

nousiaso commented on June 15, 2024

@nousiaso I think (tell me if I'm I'm wrong) you provided a species tree without naming the internal nodes, so AGORA named them NAME_1, NAME_2 etc... when parsing the species tree, (it is needed to build the comparisons computing process). Whether you want de build each ancestral node of the species tree or only the ancestral root, AGORA needs all internal Orthologous groups, because not all genome pairwise comparison are informative to reconstruct the ancestral root. (see figure2 of the article). Also each internal node Orthogroups should only contain genes of descendant species of this ancestor. I think one top orthologous group will not provide information about where gene duplications nor losses have occured during evolution.

Hello, Alexandra,
Thanks for the reply

I named the species tree according to the names in gene.*.list.bz2 .

Concerning this : "Whether you want de build each ancestral node of the species tree or only the ancestral root, AGORA needs all internal Orthologous groups"

I thought that would be the case as you describe it.

Let me make it crystal clear

We have the 5 genomes of the cultivars we are investigating. I want to reconstruct the common ancestor of the 5 genomes plus A0. A0 is a closely related wild genome of different genus (but same family), that diverged from the genus we are investigating.

We have the orthogroups of that A0 with the 5 genomes.
What i did is i took the orthologs from A0 with genome 1 , A0 with genome 2 etc, and merged them into a file with
unique A0 gene ids using pandas.

At the end we have a file that has all orthologs of A0 with the five genomes (like the one in the example)

I used that for the orthologyGroup.A0.list as input but it was not enough. It needed files that were named
orthologyGroup.NAME_0.list
orthologyGroup.NAME_1.list
orthologyGroup.NAME_2.list
orthologyGroup.NAME_3.list
orthologyGroup.NAME_4.list

i copied the merged orthologyfile orthologyGroup.A0.list , using the different names
NAME_0, until NAME_4 and it worked.

After that i also tested the following
used the un-merged orthologyGroup files A0 with genome1 ... etc A0 genome 5
and named them orthologyGroup.NAME_0.list,, etc until orthologyGroup.NAME_4.list
( looking at ancgenes.log) i used the same order according to the gene.*.list.bz2 parsed, the first list was NAME_0 ... the fifth list was NAME_4)

i did not get ancestral genome results after that.

So i guess this is why : "Whether you want de build each ancestral node of the species tree or only the ancestral root, AGORA needs all internal Orthologous groups"

this is not a problem. I understand why.
The NAME issue though i did not really understand why it happens.

If you have an insight on that, please share,
thank you much for the repplies!
Orestis

from agora.

nousiaso commented on June 15, 2024

Your reply was very clear !

Thanks a lot Alexandra !
Best,
Orestis

from agora.

Caoyu819 commented on June 15, 2024

@biogeeker just to be clear, misc.compareGenomes.py will not compute or build anything... It's just a tool to draw the modern or ancestral karyoptypes according to an ancestral or modern reference (or draw dotplots... see default mode=drawMatrix).

If you want to represent M1 with the color of the blocs of reconstructed A1:

1. plot A1 to draw the reference colors (1 bloc / 1 colour):

./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200 +sortBySize > A1.ps

2. plot M1 according to A1 colors of blocs:

./misc.compareGenomes.py ../example/data/genes/genes.M1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> M1.ps

If you want to represent how it evolves from A1 to A3 to M1, you have to draw A3 according to A1 colours:

./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A3.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200> A3.ps

Can [ Agora ] output the occurrence of fusion and fission events for each ancestral node? you can have an idea of of the fission and or fusion by using the same tool but with the option mode=printOrthologousChrom

as an example: src/misc.compareGenomes.py ./example/results/ancGenomes/vertebrates-workflow/ancGenome.A1.list.bz2 example/data/genes/genes.M1.list.bz2 example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=printOrthologousChrom +sortBySize -minChrSize=200

will give:
CAR_1	Chr7 (516)
CAR_10	Chr12 (236)	Chr25 (89)
CAR_11	Chr24 (194)	Chr16 (95)
CAR_12	Chr17 (253)
CAR_13	Chr12 (202)	Chr21 (41)
CAR_14	Chr30 (232)
CAR_15	Chr16 (234)
CAR_16	Chr27 (214)
CAR_17	Chr17 (215)
CAR_18	Chr21 (136)	Chr12 (70)
CAR_19	Chr18 (155)	Chr21 (130)	Chr27 (64)
CAR_2	Chr1 (349)	Chr17 (120)
CAR_20	Chr10 (185)	Chr16 (21)
CAR_21	Chr2 (208)	Chr3 (62)
CAR_22	Chr19 (174)	Chr3 (34)
with this parameters (taking into account only blocs > 200 genes and with default parameter orthoschr:minHomology=90) that means that CAR_10 of A1 is split on 2 Chromosomes in M1

the reverse command: src/misc.compareGenomes.py ./example/results/ancGenomes/vertebrates-workflow/ancGenome.A1.list.bz2 example/data/genes/genes.M1.list.bz2 example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=printOrthologousChrom +sortBySize -minChrSize=200 +reverse
Chr1	CAR_2 (349)
Chr10	CAR_20 (185)	CAR_6 (23)
Chr11	CAR_23 (202)
Chr12	CAR_6 (365)	CAR_7 (325)	CAR_10 (236)	CAR_13 (202)
Chr14	CAR_5 (355)
Chr16	CAR_4 (436)	CAR_15 (234)	CAR_11 (95)
Chr17	CAR_12 (253)	CAR_17 (215)	CAR_2 (120)
Chr18	CAR_19 (155)
Chr19	CAR_22 (174)
Chr2	CAR_21 (208)
Chr20	CAR_8 (311)
Chr21	CAR_25 (196)	CAR_18 (136)	CAR_19 (130)
Chr22
Chr23	CAR_9 (294)
Chr24	CAR_11 (194)
Chr25	CAR_10 (89)
Chr27	CAR_16 (214)	CAR_19 (64)
Chr29	CAR_15 (6)	CAR_16 (3)
Chr3	CAR_21 (62)	CAR_22 (34)
Chr30	CAR_14 (232)	CAR_24 (205)
Chr4
Chr6	CAR_3 (447)
Chr7	CAR_1 (516)
will informe you that the modern Chr12 is composed of 4 ancestral blocs (complete or partial)

different options are available :

-mode=printOrthologuesCount will tell you how many genes from which blocs composed the target genome chromosomes.
	CAR_1	CAR_10	CAR_11	CAR_12	CAR_13	CAR_14	CAR_15	CAR_16	CAR_17	CAR_18	CAR_19	CAR_2	CAR_20	CAR_21	CAR_22	CAR_23	CAR_24	CAR_25	CAR_3	CAR_4	CAR_5	CAR_6	CAR_7	CAR_8	CAR_9
Chr1	0	0	0	0	0	0	0	0	0	0	0	349	0	0	0	0	0	0	0	0	0	0	0	0	0
Chr10	0	0	0	0	0	0	0	0	0	0	0	0	185	0	0	0	0	0	0	0	0	23	0	0	0
Chr11	0	0	0	0	0	0	0	0	3	0	0	0	0	0	0	202	0	0	0	0	0	0	0	0	0
Chr12	0	236	0	0	202	0	0	0	0	70	0	0	0	0	0	0	0	0	0	0	0	365	325	0	0
-mode=printOrthologuesList will list the orthologous genes.

I hope this would help. Cheers, Alex

Hi Alexandra,

Thank you for developing such a great tool for karyotype reconstruction. I ran the AGORA, and obtained the ancestral genomes for the target nodes.

Now I'm attempting to plot the karyotype using the AGORA package's script "misc.compareGenomes.py," but I always get the same error when I execute the commandline, even with the example data:
./misc.compareGenomes.py ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenomes/basic-workflow/ancGenome.A1.list.bz2 ../example/results/ancGenes/all/ancGenes.A1.list.bz2 -mode=drawKaryotype -minChrSize=200 +sortBySize > A1.ps.

The error is shown below:
Affichage Traceback (most recent call last):
File "/home/yucao/software/Agora-master/src/misc.compareGenomes.py", line 268, in
eval(str(arguments["mode"]))()
File "/home/yucao/software/Agora-master/src/misc.compareGenomes.py", line 85, in drawMatrix
utils.myPsOutput.printPsHeader()
File "/home/yucao/software/Agora-master/src/utils/myPsOutput.py", line 24, in printPsHeader
initColor()
File "/home/yucao/software/Agora-master/src/utils/myPsOutput.py", line 54, in initColor
location = [f for f in knownLocations if os.path.exists(f)][0]
IndexError: list index out of range

And the same error shows for my own data and results, do you have any idea why this occurs? According to the error message, I located line 54 of the script “/Agora-master/src/utils/myPsOutput.py” (please see the picture below for the details), it seems the error occurred because the missing of file "rgb.txt", but I'm still not sure how to get the file "rgb.txt"?

Thank you very much! looking very forward to your reply.

Best,
Yu

from agora.

DyogenIBENS commented on June 15, 2024

Hi Yu,
sorry for the later answer,
this file (rgb.txt), contains the definition of all the colors that will be used in the karyotypes output.
These are usual locations on linux ubuntu or other linux distribution.
On your computer or server, you can probably execute:

locate rgb.txt
to find the good location and change it in the /Agora-master/src/utils/myPsOutput.py

I already update the code with one other location "/usr/share/X11/rgb.txt"

I hope this helps,
best,
Alexandra

from agora.

Caoyu819 commented on June 15, 2024

Dear Alexandra,

Thank you very much for your reply, after update the code with location "/usr/share/X11/rgb.txt", the script "misc.compareGenomes.py" run successfully, cheers!

Thanks again for your help.
Best,
Yu

from agora.

Plotting agora results plus a general question about agora HOT 12 CLOSED

Comments (12)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent