Simulate complex traits with varying genetic architectures (magenpy_simulate)
The magenpy_simulate script simulates complex traits with a variety of genetic architectures, given a set of genotypes stored in plink's BED file format. The script takes as input the path to the genotype data, the type of trait to simulate (quantitative or case-control), the parameters of the genetic architecture (e.g. polygenicity, heritability, effect sizes), and the output file path where the simulated phenotypes will be stored.
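As a rough illustration of how these inputs map onto command-line flags, the sketch below assembles two invocations in Python using only the standard library. The flag names are taken from the script's help message. The helper name build_simulate_command, the file paths, and the parameter values are hypothetical and chosen only for illustration.

```python
import shlex


def build_simulate_command(bfile, output_file, h2, prop_causal=None,
                           likelihood="gaussian", prevalence=None, seed=None):
    """Assemble an argument list for magenpy_simulate (hypothetical helper)."""
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("h2 must lie between 0 and 1, inclusive")
    # --bfile, --output-file and --h2 are the required flags in the usage line.
    cmd = ["magenpy_simulate", "--bfile", bfile,
           "--output-file", output_file, "--h2", str(h2)]
    if prop_causal is not None:
        cmd += ["--prop-causal", str(prop_causal)]
    if likelihood != "gaussian":
        cmd += ["--phenotype-likelihood", likelihood]
    if prevalence is not None:
        cmd += ["--prevalence", str(prevalence)]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    return cmd


# Quantitative trait; the paths below are placeholders, not real files.
cmd = build_simulate_command("data/chr_22.bed", "output/simulated_trait",
                             h2=0.5, prop_causal=0.1, seed=42)
print(shlex.join(cmd))

# Case-control trait with an assumed prevalence of 15%.
cc_cmd = build_simulate_command("data/chr_22.bed", "output/simulated_cc_trait",
                                h2=0.3, prop_causal=0.05,
                                likelihood="binomial", prevalence=0.15)
print(shlex.join(cc_cmd))
```

The printed strings can be pasted into a terminal, or the argument lists can be passed to subprocess.run(cmd, check=True) once magenpy is installed and the genotype files exist.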
A full listing of the options available for the magenpy_simulate script can be found by running the following command in your terminal:

    magenpy_simulate -h

This prints the following help message:
********************************************************
 _ __ ___   __ _  __ _  ___ _ __  _ __  _   _
| '_ ` _ \ / _` |/ _` |/ _ \ '_ \| '_ \| | | |
| | | | | | (_| | (_| |  __/ | | | |_) | |_| |
|_| |_| |_|\__,_|\__, |\___|_| |_| .__/ \__, |
                 |___/           |_|    |___/
Modeling and Analysis of Genetics data in python
Version: 0.1.4 | Release date: June 2024
Author: Shadi Zabad, McGill University
********************************************************
< Simulate complex quantitative or case-control traits >

usage: magenpy_simulate [-h] --bfile BED_FILE [--keep KEEP_FILE] [--extract EXTRACT_FILE] [--backend {plink,xarray}] [--temp-dir TEMP_DIR]
                        --output-file OUTPUT_FILE [--output-simulated-beta] [--min-maf MIN_MAF] [--min-mac MIN_MAC] --h2 H2
                        [--mix-prop MIX_PROP] [--prop-causal PROP_CAUSAL] [--var-mult VAR_MULT]
                        [--phenotype-likelihood {gaussian,binomial}] [--prevalence PREVALENCE] [--seed SEED]

Commandline arguments for the complex trait simulator

options:
  -h, --help            show this help message and exit
  --bfile BED_FILE      The BED files containing the genotype data. You may use a wildcard here (e.g. "data/chr_*.bed")
  --keep KEEP_FILE      A plink-style keep file to select a subset of individuals for simulation.
  --extract EXTRACT_FILE
                        A plink-style extract file to select a subset of SNPs for simulation.
  --backend {plink,xarray}
                        The backend software used for the computation.
  --temp-dir TEMP_DIR   The temporary directory where we store intermediate files.
  --output-file OUTPUT_FILE
                        The path where the simulated phenotype will be stored (no extension needed).
  --output-simulated-beta
                        Output a table with the true simulated effect size for each variant.
  --min-maf MIN_MAF     The minimum minor allele frequency for variants included in the simulation.
  --min-mac MIN_MAC     The minimum minor allele count for variants included in the simulation.
  --h2 H2               Trait heritability. Ranges between 0. and 1., inclusive.
  --mix-prop MIX_PROP   Mixing proportions for the mixture density (comma separated). For example, for the spike-and-slab mixture density,
                        with the proportion of causal variants set to 0.1, you can specify: "--mix-prop 0.9,0.1 --var-mult 0,1".
  --prop-causal PROP_CAUSAL, -p PROP_CAUSAL
                        The proportion of causal variants in the simulation. See --mix-prop for more complex architectures specification.
  --var-mult VAR_MULT, -d VAR_MULT
                        Multipliers on the variance for each mixture component.
  --phenotype-likelihood {gaussian,binomial}
                        The likelihood for the simulated trait: gaussian (e.g. quantitative) or binomial (e.g. case-control).
  --prevalence PREVALENCE
                        The prevalence of cases (or proportion of positives) for binary traits. Ranges between 0. and 1.
  --seed SEED           The random seed to use for the random number generator.
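
To make the meaning of --mix-prop and --var-mult more concrete, the sketch below draws effect sizes from a Gaussian mixture using the spike-and-slab example from the help message ("--mix-prop 0.9,0.1 --var-mult 0,1"). This is not magenpy's implementation. It assumes that each variant is assigned to a component with the given mixing probability, and that its effect is drawn from a zero-mean Gaussian whose variance equals that component's multiplier. Any rescaling needed to reach the target --h2 is left out. The function names parse_csv_floats and draw_effects are hypothetical.

```python
import random


def parse_csv_floats(text):
    """Parse a comma-separated string such as "0.9,0.1" into a list of floats."""
    return [float(x) for x in text.split(",")]


def draw_effects(n_variants, mix_prop, var_mult, seed=0):
    """Draw one effect size per variant from a zero-mean Gaussian mixture.

    Illustrative only: component k is chosen with probability mix_prop[k]
    and contributes an effect with variance var_mult[k].
    """
    if len(mix_prop) != len(var_mult):
        raise ValueError("mix_prop and var_mult need one entry per component")
    if abs(sum(mix_prop) - 1.0) > 1e-8:
        raise ValueError("mixing proportions are assumed to sum to 1")
    rng = random.Random(seed)
    components = rng.choices(range(len(mix_prop)), weights=mix_prop, k=n_variants)
    # A variance multiplier of 0 is the "spike": the effect is exactly zero.
    return [rng.gauss(0.0, var_mult[k] ** 0.5) if var_mult[k] > 0 else 0.0
            for k in components]


mix_prop = parse_csv_floats("0.9,0.1")
var_mult = parse_csv_floats("0,1")
betas = draw_effects(10000, mix_prop, var_mult, seed=1)

frac_causal = sum(b != 0.0 for b in betas) / len(betas)
print(f"fraction of variants with non-zero effect: {frac_causal:.3f}")
```

With these settings, roughly one variant in ten receives a non-zero effect, which is the architecture the help message describes for a proportion of causal variants of 0.1. Adding further components (for example three proportions with three multipliers) would follow the same pattern.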