Sumstats parsers
COJOSSParser
Bases: SumstatsParser
A specialized class for parsing GWAS summary statistics files generated by the COJO software.
Attributes:
Name | Type | Description |
---|---|---|
col_name_converter | | A dictionary mapping column names in the original table to magenpy's column names. |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv. |
Source code in magenpy/parsers/sumstats_parsers.py
__init__(col_name_converter=None, **read_csv_kwargs)
Initialize the COJO summary statistics parser.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col_name_converter | | A dictionary/string mapping column names in the original table to magenpy's column names for the various summary statistics. If a string, it should be a comma-separated list of key-value pairs (e.g. 'rsid=SNP,pos=POS'). | None |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv | {} |
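The string form of col_name_converter ('rsid=SNP,pos=POS') is equivalent to a small dictionary. The sketch below shows one way such a string could be converted using only the standard library. It is an illustration under stated assumptions, not magenpy's actual implementation; the helper name parse_col_name_converter is hypothetical.

```python
def parse_col_name_converter(spec):
    """Turn 'rsid=SNP,pos=POS' into {'rsid': 'SNP', 'pos': 'POS'}.

    Hypothetical helper for illustration; not part of magenpy.
    """
    if spec is None:
        return {}
    if isinstance(spec, dict):
        return dict(spec)  # already a mapping; return a copy

    mapping = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate stray or trailing commas
        original, sep, target = pair.partition("=")
        if not sep or not original.strip() or not target.strip():
            raise ValueError(f"Expected 'original=target', got {pair!r}")
        # Key: column name in the input file; value: the name to rename it to.
        mapping[original.strip()] = target.strip()
    return mapping


converter = parse_col_name_converter("rsid=SNP,pos=POS")
print(converter)  # {'rsid': 'SNP', 'pos': 'POS'}
```

Accepting both forms in one helper keeps the calling code simple: whatever the user passes, the rest of the parser only ever sees a dictionary.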
FastGWASSParser
Bases: SumstatsParser
A specialized class for parsing GWAS summary statistics files generated by the FastGWA software.
Attributes:
Name | Type | Description |
---|---|---|
col_name_converter | | A dictionary mapping column names in the original table to magenpy's column names. |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv. |
__init__(col_name_converter=None, **read_csv_kwargs)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col_name_converter | | A dictionary/string mapping column names in the original table to magenpy's column names for the various summary statistics. If a string, it should be a comma-separated list of key-value pairs (e.g. 'rsid=SNP,pos=POS'). | None |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv | {} |
Plink1SSParser
Bases: SumstatsParser
A specialized class for parsing GWAS summary statistics files generated by plink1.9.
Attributes:
Name | Type | Description |
---|---|---|
col_name_converter | | A dictionary mapping column names in the original table to magenpy's column names. |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv. |
__init__(col_name_converter=None, **read_csv_kwargs)
Initialize the plink1.9 summary statistics parser.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col_name_converter | | A dictionary/string mapping column names in the original table to magenpy's column names for the various summary statistics. If a string, it should be a comma-separated list of key-value pairs (e.g. 'rsid=SNP,pos=POS'). | None |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv | {} |
Plink2SSParser
Bases: SumstatsParser
A specialized class for parsing GWAS summary statistics files generated by plink2.
Attributes:
Name | Type | Description |
---|---|---|
col_name_converter | | A dictionary mapping column names in the original table to magenpy's column names. |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv. |
__init__(col_name_converter=None, **read_csv_kwargs)
Initialize the plink2 summary statistics parser.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col_name_converter | | A dictionary/string mapping column names in the original table to magenpy's column names for the various summary statistics. If a string, it should be a comma-separated list of key-value pairs (e.g. 'rsid=SNP,pos=POS'). | None |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv | {} |
parse(file_name, drop_na=True)
Parse a summary statistics file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_name | | The path to the summary statistics file. | required |
drop_na | | Drop any entries with missing values. | True |
Returns:
Type | Description |
---|---|
| A pandas DataFrame containing the parsed summary statistics. |
SSFParser
Bases: SumstatsParser
A specialized class for parsing GWAS summary statistics that are formatted according to the standardized summary statistics format adopted by the GWAS Catalog. This format is sometimes denoted as GWAS-SSF.
Reference and details: https://github.com/EBISPOT/gwas-summary-statistics-standard
Attributes:
Name | Type | Description |
---|---|---|
col_name_converter | | A dictionary mapping column names in the original table to magenpy's column names. |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv. |
__init__(col_name_converter=None, **read_csv_kwargs)
Initialize the standardized summary statistics parser.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col_name_converter | | A dictionary/string mapping column names in the original table to magenpy's column names for the various summary statistics. If a string, it should be a comma-separated list of key-value pairs (e.g. 'rsid=SNP,pos=POS'). | None |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv | {} |
SaigeSSParser
Bases: SumstatsParser
A specialized class for parsing GWAS summary statistics files generated by the SAIGE software.
Reference and details: https://saigegit.github.io/SAIGE-doc/docs/single_step2.html
TODO: Ensure that the column names are correct across different trait types and the inference of the sample size is correct.
Attributes:
Name | Type | Description |
---|---|---|
col_name_converter | | A dictionary mapping column names in the original table to magenpy's column names. |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv. |
__init__(col_name_converter=None, **read_csv_kwargs)
Initialize the SAIGE summary statistics parser.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col_name_converter | | A dictionary/string mapping column names in the original table to magenpy's column names for the various summary statistics. If a string, it should be a comma-separated list of key-value pairs (e.g. 'rsid=SNP,pos=POS'). | None |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv | {} |
parse(file_name, drop_na=True)
Parse the summary statistics file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_name | | The path to the summary statistics file. | required |
drop_na | | Drop any entries with missing values. | True |
Returns:
Type | Description |
---|---|
| A pandas DataFrame containing the parsed summary statistics. |
SumstatsParser
Bases: object
A wrapper class for parsing summary statistics files written by statistical genetics software for genome-wide association testing. Different software tools output summary statistics in different formats and with different column names, so this class provides a common interface for parsing these files and aims to make the process as seamless as possible.
The class is designed to be extensible, so that users can easily add new parsers for different software tools.
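The extension pattern can be sketched as a base class that holds the column mapping and a subclass that supplies defaults for one tool. The example below is a self-contained illustration that uses the standard-library csv module in place of pandas. SketchParser, ToolXParser, and the "ToolX" column names are hypothetical and do not reproduce magenpy's code.

```python
import csv
import io


class SketchParser:
    """Hypothetical stand-in for a base parser; not magenpy's SumstatsParser."""

    def __init__(self, col_name_converter=None, **reader_kwargs):
        self.col_name_converter = dict(col_name_converter or {})
        self.reader_kwargs = reader_kwargs  # forwarded to csv.DictReader

    def parse(self, handle):
        rows = csv.DictReader(handle, **self.reader_kwargs)
        # Rename mapped columns; unmapped columns keep their original names.
        return [
            {self.col_name_converter.get(k, k): v for k, v in row.items()}
            for row in rows
        ]


class ToolXParser(SketchParser):
    """Parser for an imaginary tool ('ToolX') with its own column names."""

    def __init__(self, col_name_converter=None, **reader_kwargs):
        defaults = {"rsid": "SNP", "pos": "POS"}
        defaults.update(col_name_converter or {})  # user-supplied entries win
        reader_kwargs.setdefault("delimiter", "\t")  # ToolX files are assumed tab-separated
        super().__init__(defaults, **reader_kwargs)


table = io.StringIO("rsid\tpos\tbeta\nrs1\t100\t0.02\n")
records = ToolXParser().parse(table)
print(records)  # [{'SNP': 'rs1', 'POS': '100', 'beta': '0.02'}]
```

Putting the tool-specific defaults in the subclass constructor means callers can still override individual column names or reader options without touching the shared parsing logic.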
Attributes:
Name | Type | Description |
---|---|---|
col_name_converter | | A dictionary mapping column names in the original table to magenpy's column names. |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv. |
__init__(col_name_converter=None, **read_csv_kwargs)
Initialize the summary statistics parser.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
col_name_converter | | A dictionary/string mapping column names in the original table to magenpy's column names for the various summary statistics. If a string, it should be a comma-separated list of key-value pairs (e.g. 'rsid=SNP,pos=POS'). | None |
read_csv_kwargs | | Keyword arguments to pass to pandas' read_csv | {} |
parse(file_name, drop_na=True)
Parse a summary statistics file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_name | | The path to the summary statistics file. | required |
drop_na | | If True, drop any entries with missing values. | True |
Returns:
Type | Description |
---|---|
| A pandas DataFrame containing the parsed summary statistics. |
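To make the effect of drop_na concrete, the sketch below mimics the two steps described for parse: drop rows with missing entries, then rename columns. It is a simplified illustration built on the standard-library csv module and returns a list of dictionaries rather than a pandas DataFrame. The function parse_rows and the set of missing-value markers are assumptions, not magenpy's behaviour.

```python
import csv
import io

# Assumed markers for missing values; real files may use others.
MISSING = {"", "NA", "NaN", "nan"}


def is_missing(value):
    """Return True for absent cells or cells holding a missing-value marker."""
    return value is None or (isinstance(value, str) and value.strip() in MISSING)


def parse_rows(handle, col_name_converter=None, drop_na=True, delimiter="\t"):
    """Hypothetical parse routine: filter incomplete rows, then rename columns."""
    converter = col_name_converter or {}
    parsed = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        if drop_na and any(is_missing(v) for v in row.values()):
            continue  # skip rows with at least one missing entry
        parsed.append({converter.get(k, k): v for k, v in row.items()})
    return parsed


text = "rsid\tpos\tbeta\nrs1\t100\t0.02\nrs2\t200\tNA\n"
converter = {"rsid": "SNP", "pos": "POS"}
kept = parse_rows(io.StringIO(text), converter)
everything = parse_rows(io.StringIO(text), converter, drop_na=False)
print(len(kept), len(everything))  # 1 2
```

With drop_na=False the second row is retained with its "NA" value, which leaves the decision about incomplete entries to downstream code.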