A Python program which applies graph theory to model genealogical structured data (pedigrees).
Authors:
- Bruno C. Perez
- Ricardo V. Ventura
- Julio C. C. Balieiro
- Juliana D. Machado
Latest version: 0.2.1 - (April 4th, 2017)
Required modules:
- NetworkX
- Numpy
- Scipy
- Pandas
- Matplotlib
- Collections
- Louvain
- Open the terminal.
- Go to the directory containing pedWorks.py and run one of:
    python2 pedWorks.py setup.ini    (on systems where Python 3.X is the default)
    python pedWorks.py setup.ini     (on systems where Python 2.7 is the default)
Performs a network analysis after applying network theory to a pedigree.
- Calculates basic measures: Number of nodes/edges, Density, Average (out)degree
- Calculates the following centralities for all nodes:
* (out)Degree centrality
* Closeness Centrality
* Betweenness Centrality
* Eigenvector Centrality
* Katz Centrality
* If ctplot = TRUE, creates a plot for each centrality.
- Creates the (out)degree rank plot for the pedigree.
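The basic measures listed above can be illustrated without any library. The following is a minimal sketch, not PedWorks' own code: it assumes edges point from parent to offspring (so out-degree counts offspring), and the toy pedigree and the `basic_measures` helper are hypothetical names introduced for this example. The formulas are the standard ones for a directed graph: density is m / (n(n - 1)) and out-degree centrality is out-degree / (n - 1).

```python
from collections import defaultdict

# Hypothetical toy pedigree: one (parent, offspring) pair per edge.
edges = [
    ("sire1", "calf1"), ("dam1", "calf1"),
    ("sire1", "calf2"), ("dam2", "calf2"),
    ("calf1", "calf3"), ("dam2", "calf3"),
]


def basic_measures(edges):
    """Return counts, density, average out-degree and out-degree centrality."""
    nodes = set()
    out_degree = defaultdict(int)
    for parent, child in edges:
        nodes.update((parent, child))
        out_degree[parent] += 1  # assumption: edges run parent -> offspring
    n, m = len(nodes), len(edges)
    density = m / float(n * (n - 1)) if n > 1 else 0.0
    avg_out_degree = m / float(n) if n else 0.0
    # Out-degree normalized by the n - 1 other nodes a node could point to.
    centrality = {}
    if n > 1:
        centrality = {node: out_degree[node] / float(n - 1) for node in nodes}
    return {
        "nodes": n,
        "edges": m,
        "density": density,
        "avg_out_degree": avg_out_degree,
        "out_degree_centrality": centrality,
    }


measures = basic_measures(edges)
print(measures["nodes"], measures["edges"], measures["density"])
```

In this toy pedigree the animals with the most offspring (sire1 and dam2) receive the highest out-degree centrality, which is the kind of ranking the degree rank plot summarizes.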
Uses the Fruchterman & Reingold force-directed drawing algorithm for a pedigree modeled as a directed acyclic network.
- draw_simple - Draws a simple one-colored network.
- draw_group - Draws distinct groups of animals in different colors.
- draw_multigroup - Draws only pre-determined nodes (individuals) from the pedigree, maintaining their relationships as calculated for the complete pedigree.
- draw_cluster - Uses the Louvain method for community detection to find the underlying structure of a pedigree.
Makes use of a network framework to calculate the breed composition of all animals in a multi-breed population pedigree.
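As an illustration of what a breed composition calculation over a pedigree can look like, here is a minimal sketch. It is not taken from PedWorks: it assumes the common rule that founders are purebred and that each offspring's composition is the average of its parents' compositions, that both parents of a non-founder are known, and that the pedigree is already ordered with parents before offspring. The function name, the `(animal, sire, dam)` tuple layout, and the `founder_breeds` mapping are hypothetical.

```python
def breed_composition(pedigree, founder_breeds, nbreed):
    """Return {animal: [fraction of breed 0, ..., fraction of breed nbreed-1]}.

    pedigree: list of (animal, sire, dam), parents listed before offspring;
              founders have sire and dam set to None.
    founder_breeds: {founder: breed index in range(nbreed)}.
    """
    comp = {}
    for animal, sire, dam in pedigree:
        if sire is None and dam is None:
            # Assumption: founders are 100% of a single known breed.
            fractions = [0.0] * nbreed
            fractions[founder_breeds[animal]] = 1.0
        else:
            # Assumption: offspring receive half of each parent's composition.
            fractions = [(s + d) / 2.0 for s, d in zip(comp[sire], comp[dam])]
        comp[animal] = fractions
    return comp


pedigree = [
    ("A", None, None),
    ("B", None, None),
    ("C", None, None),
    ("D", "A", "B"),  # cross between breed 0 and breed 1
    ("E", "D", "C"),  # backcross to breed 0
]
founders = {"A": 0, "B": 1, "C": 0}
composition = breed_composition(pedigree, founders, nbreed=2)
print(composition["D"], composition["E"])
```

Processing animals in parents-first order is what makes a single pass sufficient, which is one reason a reordered pedigree is a convenient input for this kind of calculation.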
A brief description of the available options.
- Reorder function arguments:
  - outfile = specifies the name of the reordered pedigree file. ex: pedOrd.txt
  - outheader = TRUE/FALSE. If TRUE, prints a header on the ordered pedigree file.
  - format = specifies the format for the ordered pedigree output:
    - fwf - fixed width format
    - csv - comma separated format
    - txt - space separated format
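To make the difference between the three output formats concrete, the sketch below renders the same pedigree rows as fixed width, comma separated, and space separated text. It is only an illustration: the `format_rows` helper, the column names, and the column width are hypothetical and not taken from PedWorks.

```python
def format_rows(rows, fmt, header=None, width=12):
    """Return pedigree rows as text lines in 'fwf', 'csv' or 'txt' layout."""
    if fmt == "fwf":
        # Pad every field to the same width; trim trailing padding per line.
        def join(fields):
            return "".join(str(f).ljust(width) for f in fields).rstrip()
    elif fmt == "csv":
        def join(fields):
            return ",".join(str(f) for f in fields)
    elif fmt == "txt":
        def join(fields):
            return " ".join(str(f) for f in fields)
    else:
        raise ValueError("unknown format: %r" % fmt)
    lines = [join(header)] if header else []
    lines.extend(join(row) for row in rows)
    return lines


# Hypothetical rows: (animal, sire, dam).
rows = [("calf1", "sire1", "dam1"), ("calf3", "calf1", "dam2")]
for fmt in ("fwf", "csv", "txt"):
    print("\n".join(format_rows(rows, fmt, header=("id", "sire", "dam"), width=8)))
```

Passing `header=None` corresponds to outheader = FALSE in this sketch.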
- Analysis function arguments:
  - ctplot = TRUE/FALSE. If TRUE, creates the centrality plots for the pedigree: (Out)Degree, Closeness, Betweenness, Eigenvector and Katz centralities.
- Draw function arguments:
  - niter = number of iterations for the Fruchterman & Reingold (F&R) algorithm. ex: 1000-20000 or more
  - kpar = value of the K parameter, which controls attractive and repulsive forces in the F&R algorithm. ex: 0.001-0.05
  - nscale = specifies the scale in which the network will be plotted (the x and y axis range).
  - nalpha = specifies the transparency of the nodes. ex: 0.1-1.0
  - nsize = specifies the size of the nodes. ex: 10-100
  - ncolor = specifies the RGB code for the color of the nodes (see an RGB color chart).
  - ealpha = specifies the transparency of the edges. ex: 0.1-1.0
  - ewidth = specifies the width of the edges. ex: 0.1-1.0
  - ecolor = specifies the RGB code for the color of the edges (see an RGB color chart).
  - initpos = TRUE/FALSE. If TRUE, node positioning starts as defined in a .txt file.
  - posfile = specifies the file (.txt) containing the node positions (format: node x-axis y-axis).
  - savepos = TRUE/FALSE. If TRUE, saves the final node positions in the nodepos.txt file.
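The position file format described for posfile and savepos (one node per line: node, x, y) can be read and written with a few lines of Python. This is a hedged sketch rather than the program's own code: it assumes whitespace-separated fields, skips lines that do not parse (such as a header), and the `read_positions` and `write_positions` names are hypothetical.

```python
import io


def read_positions(handle):
    """Parse 'node x y' lines into {node: (x, y)}, skipping unparsable lines."""
    positions = {}
    for line in handle:
        parts = line.split()
        if len(parts) != 3:
            continue  # blank or malformed line
        node, x, y = parts
        try:
            positions[node] = (float(x), float(y))
        except ValueError:
            continue  # e.g. a header line such as 'node x-axis y-axis'
    return positions


def write_positions(positions, handle):
    """Write {node: (x, y)} as one 'node x y' line per node, sorted by node."""
    for node in sorted(positions):
        x, y = positions[node]
        handle.write("%s %r %r\n" % (node, x, y))


# Round trip through an in-memory file instead of nodepos.txt.
buffer = io.StringIO()
write_positions({"calf1": (0.5, -1.25), "sire1": (2.0, 3.0)}, buffer)
restored = read_positions(io.StringIO(buffer.getvalue()))
print(restored)
```

A dictionary of this shape is also what NetworkX layout functions accept as starting positions, so saved coordinates can seed a later drawing.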
- draw_group and draw_multigroup function specific arguments:
  - group_list = specifies the name(s) of the file(s) containing a list of individuals to be highlighted. It may receive multiple groups. ex: [group1.txt, group2.txt, group3.txt]
  - color_list = specifies the RGB code for the color of each highlighted group. May receive multiple colors, which must be given in the same number and order as the groups in group_list. ex: [#008000, #4682B4, #DC143C]
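The requirement that colors match the groups in number and order can be sketched as follows. This is an illustrative example only: the `node_colors` helper and the default color for unhighlighted animals are hypothetical, and the groups are passed as lists of IDs that would, in practice, have been read from the files named in group_list.

```python
def node_colors(nodes, groups, color_list, default="#C0C0C0"):
    """Return one color per node, in node order, for highlighted groups."""
    if len(groups) != len(color_list):
        raise ValueError("group_list and color_list must have the same length")
    colors = {node: default for node in nodes}
    # Pair the i-th group with the i-th color.
    for members, color in zip(groups, color_list):
        for node in members:
            if node in colors:  # ignore IDs that are not in the pedigree
                colors[node] = color
    return [colors[node] for node in nodes]


nodes = ["sire1", "dam1", "calf1", "calf2"]
groups = [["sire1", "dam1"], ["calf1"]]
palette = node_colors(nodes, groups, ["#008000", "#4682B4"])
print(palette)
```

In this sketch an animal listed in more than one group takes the color of the last group that contains it.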
- draw_cluster function specific argument:
  - cSize = specifies the minimum community size threshold. Communities containing fewer than cSize animals will not be colored in the network drawing. ex: 5-20
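The effect of the cSize threshold can be shown on a small, made-up community assignment. The sketch below assumes the detection step has already produced a node-to-community dictionary; the `communities_to_color` helper and the toy partition are hypothetical and not part of PedWorks.

```python
from collections import Counter


def communities_to_color(partition, csize):
    """Map each node to its community, or None if the community is too small.

    partition: {node: community id}; communities with fewer than csize
    members are left uncolored (None).
    """
    sizes = Counter(partition.values())
    return {
        node: (community if sizes[community] >= csize else None)
        for node, community in partition.items()
    }


# Hypothetical partition: community 0 has 3 animals, 1 has 2, 2 has 1.
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 2}
colored = communities_to_color(partition, csize=2)
print(colored)
```

With csize=2 only the singleton community is dropped from coloring; raising the threshold to 3 would also leave community 1 uncolored.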
- Breed Composition function arguments:
  - infile = specifies the input pedigree file. ex: pedigree.txt
  - nbreed = specifies the number of different breeds in the population. ex: 2-4
The following examples were obtained by analyzing real genealogical data with PedWorks.
- Topological sorting by Fruchterman & Reingold force-directed method.
- Community detection algorithm by Louvain method.
- Nodes (individuals) with same colors belong to the same community.