Pedigree analysis using the Python programming language

J.B. Cole and D.E. Franke

Department of Animal Science, Louisiana State University, Baton Rouge, LA 70803


The utility of the programming language Python as a tool for rapid application development is demonstrated with PyPedal, a package for pedigree analysis. Python is an interpreted, object-oriented programming language. It is a full-featured language that supports modern design paradigms, is available free of charge, and is well suited to rapid application development. Animal breeding applications are typically complex and computationally demanding. For the sake of efficiency, such applications are usually written in a compiled language such as Fortran 90. The gain in efficiency from such languages comes at the cost of complex syntax and primitive libraries for tasks such as I/O. This often makes the implementation of new algorithms non-trivial and results in long development cycles. While Python is not well suited to applications such as the quarterly USDA dairy cattle genetic evaluations, it is ideal for exploring new methodologies or writing tools to perform common tasks. PyPedal supports many operations on pedigrees, including error-checking, construction of the numerator relationship matrix (A) and its inverse (A^-1), calculation of average coefficients of inbreeding and relationship, and calculation of effective founder number using direct and approximate methods. Diagnostic and error messages are written to the standard output device. Output is stored in text files. A pedigree containing records for 304 Brahman cattle was used to demonstrate PyPedal. A and its inverse were calculated and stored using one direct and two indirect methods. A was very sparse and contained 92,416 elements (304 x 304). Population average coefficients of inbreeding and relationship were 0.001 and 0.004, respectively. There were 152 actual founders in the pedigree. Effective founder numbers were 95.86 and 132.57 by the direct and approximate methods, respectively. The difference between the two estimates is attributed to the lack of precise generation information, which the approximate algorithm needs for accurate results. The lack of useful generation information also prevented the estimation of effective ancestor number. Total processing time was 68 s on a 450 MHz Pentium II computer with 128 MB of RAM. PyPedal is available upon request from

(Key words: pedigree analysis, programming languages)
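
The operations named in the abstract can be illustrated with a short Python sketch. This is not PyPedal code and does not reproduce its interface; the function names (relationship_matrix, effective_founders), the (animal, sire, dam) tuple layout, and the five-animal example pedigree are assumptions made for illustration only. The sketch builds A by the standard tabular method, averages inbreeding and relationship coefficients from it, and computes an effective founder number as 1 / sum(p_i^2), where p_i is the expected proportional contribution of founder i to a chosen group of animals. It assumes animals are numbered 1..n with parents listed before offspring, that 0 denotes an unknown parent, and that each animal has either both parents known or both unknown.

```python
def relationship_matrix(pedigree):
    """Build the numerator relationship matrix A by the tabular method.

    pedigree: list of (animal, sire, dam) tuples, IDs 1..n, parents listed
    before offspring, 0 = unknown parent (illustrative layout, an assumption).
    """
    n = len(pedigree)
    a = [[0.0] * n for _ in range(n)]
    for animal, sire, dam in pedigree:
        i, s, d = animal - 1, sire - 1, dam - 1
        # Diagonal: 1 + F_i, where F_i is half the relationship between parents.
        a[i][i] = 1.0 + (0.5 * a[s][d] if sire and dam else 0.0)
        # Off-diagonal: average of the relationships of j with i's parents.
        for j in range(i):
            value = 0.0
            if sire:
                value += 0.5 * a[j][s]
            if dam:
                value += 0.5 * a[j][d]
            a[i][j] = a[j][i] = value
    return a


def average_inbreeding(a):
    """Mean inbreeding coefficient: average of (diagonal - 1)."""
    return sum(a[i][i] - 1.0 for i in range(len(a))) / len(a)


def average_relationship(a):
    """Mean of the off-diagonal elements of A (all distinct pairs)."""
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(i)]
    return sum(a[i][j] for i, j in pairs) / len(pairs)


def effective_founders(pedigree, cohort):
    """Effective founder number, 1 / sum(p_i^2), for the animals in cohort."""
    contrib = {}  # animal -> {founder: expected proportional contribution}
    for animal, sire, dam in pedigree:
        if not sire and not dam:
            contrib[animal] = {animal: 1.0}  # a founder contributes itself
            continue
        shares = {}
        for parent in (sire, dam):
            # An unknown parent (0) has no entry and contributes nothing.
            for founder, share in contrib.get(parent, {}).items():
                shares[founder] = shares.get(founder, 0.0) + 0.5 * share
        contrib[animal] = shares
    totals = {}
    for animal in cohort:
        for founder, share in contrib[animal].items():
            totals[founder] = totals.get(founder, 0.0) + share / len(cohort)
    return 1.0 / sum(p * p for p in totals.values())


# Hypothetical pedigree: two founders, two full sibs, one full-sib mating.
example = [(1, 0, 0), (2, 0, 0), (3, 1, 2), (4, 1, 2), (5, 3, 4)]
A = relationship_matrix(example)
print(average_inbreeding(A), average_relationship(A))
print(effective_founders(example, cohort=[5]))
```

Dense nested lists are used only to keep the sketch short. For a pedigree such as the 304-animal example, where A is very sparse, a sparse representation would be the more natural choice.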