Format of input data and output results
GenAPoPop was intentionally designed to accept different
genotyping text-file formats, as long as each line codes for one
individual genotype and each allele is reported in its own column, with
columns separated by tabs. It also manages files with multiple
header lines. The advantage of this GenAlEx-like text format
(Peakall & Smouse 2012) is that it is universally handled by
spreadsheets and text editors, and it matches the most commonly used output
format of many SNP-set callers. The GenAPoPop workflow requires
first uploading such a data file, and then labelling the four columns that must be
present in it: three columns indicating population name,
generation or date of sampling and individual identifier (Table 1). Any
character can be used in these columns except tabs and spaces. The
fourth label marks the column holding the first allele of the first
locus, and implies that all following columns, up to the last one,
contain only alleles coding for the individual genotype. Alleles can be
SNPs, in which case they are expected to be coded either as upper- or lower-case a, c, g, t, with n for a missing allele, or as the numbers 1, 2, 3, 4, with 0 for a missing allele.
Alleles can also be sequence repeat markers (such as micro-, mini- and
macro-satellites) or sequence length-based markers, hereafter named
SSR-like markers in the GenAPoPop software and documentation. For
SSR-like markers, each allele is expected to be coded as an
integer number of repeats or a sequence size, and any
missing allele should be coded as zero. For the moment,
GenAPoPop assumes that genotypes evolve following a K-allele
mutation model (KAM), in which any allele can mutate into any other allele
with the same probability. This model has the advantage of aptly describing
the mutation of both microsatellites and SNPs (Weir & Cockerham, 1984),
but it does not make it possible to exploit the number of repeated DNA
segments or the sequence sizes when computing population and individual
genetic indices and distances.
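
To make the K-allele assumption concrete, the following Python sketch shows one common way of writing the KAM. It is an illustration only and not GenAPoPop code: the function names and the per-generation mutation rate `mu` are hypothetical, and the parameterisation (an allele mutates with probability `mu`, then moves to each of the K - 1 other alleles with equal probability) is an assumption that the text above does not spell out.

```python
import random


def kam_transition_prob(i, j, k, mu):
    """Probability that allele i becomes allele j in one generation.

    Assumed parameterisation: the allele stays unchanged with
    probability 1 - mu, and moves to each of the k - 1 other alleles
    with probability mu / (k - 1). Requires k >= 2.
    """
    if k < 2:
        raise ValueError("the K-allele model needs at least two alleles")
    return 1.0 - mu if i == j else mu / (k - 1)


def mutate_kam(allele, alleles, mu, rng=random):
    """Draw the state of `allele` after one generation under the KAM.

    `alleles` lists the K possible states. They are treated as plain
    labels, so a repeat count such as 12 is no closer to 13 than to 40.
    """
    others = [a for a in alleles if a != allele]
    if not others or rng.random() >= mu:
        return allele  # no mutation this generation
    return rng.choice(others)  # every other allele is equally likely


# The same function serves SNP states and SSR-like repeat counts.
snp_states = ["A", "C", "G", "T"]
ssr_states = [10, 11, 12, 13, 14]
rng = random.Random(0)
new_snp = mutate_kam("A", snp_states, mu=1.0, rng=rng)
new_ssr = mutate_kam(12, ssr_states, mu=1.0, rng=rng)
```

Because the states are handled as unordered labels, the sketch also shows why repeat numbers or fragment sizes carry no extra information under this model.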
GenAPoPop can work on input files containing genotypes of one or
multiple populations, provided they share the same ploidy and were genotyped with a common
marker set, so that they can be analysed in bulk. GenAPoPop sets no limit on
the number of populations, time steps or genotypes it can analyse,
other than the usual hardware and operating system limitations, i.e., the quantity of random-access memory (RAM) available to load the
data files and the outputs, and the central processing unit clock speed
and the advancement of its instruction sets.
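
The input layout described in this section can be sketched as a small Python reader for SNP-coded files, which groups individuals by population and sampling date. It is not GenAPoPop's own parser: the function `read_genotypes`, its parameters and the example data are hypothetical. The sketch further assumes that the alleles of one locus occupy `ploidy` consecutive columns, and that the column indices, which the user labels interactively in GenAPoPop, are passed here as zero-based integers.

```python
import csv
import io
from collections import defaultdict

# Accepted SNP codes; n and 0 stand for a missing allele (stored as None).
SNP_CODES = {"a": "A", "c": "C", "g": "G", "t": "T", "n": None,
             "1": "1", "2": "2", "3": "3", "4": "4", "0": None}


def read_genotypes(handle, n_header_lines, pop_col, time_col, id_col,
                   first_allele_col, ploidy):
    """Read a tab-separated genotype file, one individual per line.

    Returns a dict mapping (population, time step) to a list of
    (individual id, list of loci), each locus being a tuple of alleles.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    populations = defaultdict(list)
    for line_no, row in enumerate(reader):
        if line_no < n_header_lines or not row:
            continue  # skip header lines and empty lines
        calls = row[first_allele_col:]
        if len(calls) % ploidy != 0:
            raise ValueError(f"line {line_no + 1}: allele count is not "
                             f"a multiple of the ploidy ({ploidy})")
        alleles = []
        for call in calls:
            code = call.strip().lower()
            if code not in SNP_CODES:
                raise ValueError(f"line {line_no + 1}: bad code {call!r}")
            alleles.append(SNP_CODES[code])
        # Assumption: alleles of one locus sit in consecutive columns.
        loci = [tuple(alleles[i:i + ploidy])
                for i in range(0, len(alleles), ploidy)]
        populations[(row[pop_col], row[time_col])].append((row[id_col], loci))
    return dict(populations)


# Invented example: two header lines, two populations, diploid, two loci.
example = (
    "my SNP panel\n"
    "pop\tdate\tid\tL1_a\tL1_b\tL2_a\tL2_b\n"
    "popA\t2020\tind1\ta\tG\tn\tt\n"
    "popA\t2020\tind2\tc\tc\tT\tt\n"
    "popB\t2021\tind3\tA\ta\tg\tn\n"
)
data = read_genotypes(io.StringIO(example), n_header_lines=2, pop_col=0,
                      time_col=1, id_col=2, first_allele_col=3, ploidy=2)
```

Reading SSR-like markers would only require replacing the `SNP_CODES` lookup with an integer conversion in which zero denotes a missing allele.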