kg_covid_19.transform_utils.string_ppi package

Submodules

kg_covid_19.transform_utils.string_ppi.string_ppi module

class kg_covid_19.transform_utils.string_ppi.string_ppi.StringTransform(input_dir: str = None, output_dir: str = None)

Bases: kg_covid_19.transform_utils.transform.Transform

StringTransform parses interactions from STRING DB into nodes and edges.

load_gene_info(input_dir: str, output_dir: str, species_id: List = None) → None

Load mappings from NCBI gene_info (gene_info.gz).

Args:

input_dir: A string pointing to the directory to import data from.

output_dir: A string pointing to the directory to output data to.

species_id: A list with the species IDs.

Returns:

None.

load_mapping(input_dir: str, output_dir: str, species_id: List = None) → None

Load Ensembl Gene to Protein mapping from NCBI gene2ensembl (gene2ensembl.gz).

Args:

input_dir: A string pointing to the directory to import data from.

output_dir: A string pointing to the directory to output data to.

species_id: A list with the species IDs.

Returns:

None.
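As a rough illustration of the kind of filtering this method describes, the sketch below reads a gene2ensembl-style tab-delimited file and keeps only rows for the requested species. The helper name read_gene2ensembl, the protein-to-gene direction of the dictionary, and the sample rows are all assumptions made for this example; they are not taken from the StringTransform source, and the identifiers are placeholders rather than real records.

```python
import gzip
import io
from typing import Dict, Iterable, List, Optional


def read_gene2ensembl(handle: Iterable[str],
                      species_id: Optional[List] = None) -> Dict[str, str]:
    """Hypothetical helper: map Ensembl protein IDs to Ensembl gene IDs."""
    wanted = {str(s) for s in species_id} if species_id else None
    protein_to_gene: Dict[str, str] = {}
    for line in handle:
        if line.startswith("#"):
            continue  # skip the header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # skip malformed rows
        tax_id, ensembl_gene, ensembl_protein = fields[0], fields[2], fields[6]
        if wanted is not None and tax_id not in wanted:
            continue  # row belongs to a species we did not ask for
        if ensembl_protein == "-":
            continue  # NCBI uses "-" for a missing value
        protein_to_gene[ensembl_protein] = ensembl_gene
    return protein_to_gene


# In-memory stand-in for gene2ensembl.gz; the rows are made-up placeholders.
sample = (
    "#tax_id\tGeneID\tEnsembl_gene_identifier\tRNA_nucleotide_accession.version\t"
    "Ensembl_rna_identifier\tprotein_accession.version\tEnsembl_protein_identifier\n"
    "9606\t1\tENSG_A\tNM_A.1\tENST_A.1\tNP_A.1\tENSP_A.1\n"
    "10090\t2\tENSMUSG_B\tNM_B.1\tENSMUST_B.1\tNP_B.1\tENSMUSP_B.1\n"
)
compressed = gzip.compress(sample.encode("utf-8"))
with gzip.open(io.BytesIO(compressed), "rt") as handle:
    mapping = read_gene2ensembl(handle, species_id=[9606])
print(mapping)
```

Passing species_id=None in this sketch keeps every species, which mirrors the None default in the documented signature; whether the real method behaves that way is not stated here.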

run(data_file: Optional[str] = None) → None

Run the transformations needed to process protein-protein interactions from the STRING DB data.

Args:

data_file: The data file to parse.

Returns:

None.

kg_covid_19.transform_utils.string_ppi.string_ppi.parse_header(header_string: str, sep: str = ' ') → List

Parse a header string into a list of header items.

Args:

header_string: A string containing header items.

sep: A string containing a delimiter.

Returns:

A list of header items.
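A minimal sketch of what such a header parser might look like, assuming it simply trims the line and splits on the delimiter. The function name parse_header_sketch is hypothetical, and the real parse_header may do additional cleanup that is not documented here.

```python
from typing import List


def parse_header_sketch(header_string: str, sep: str = " ") -> List[str]:
    """Hypothetical stand-in for parse_header: strip, then split on sep."""
    # Remove the trailing newline and surrounding whitespace before splitting.
    return header_string.strip().split(sep)


header_items = parse_header_sketch("protein1 protein2 combined_score\n")
print(header_items)
```

The sample header uses the column names found in STRING protein links files; other STRING files carry more columns, which this sketch handles the same way.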

kg_covid_19.transform_utils.string_ppi.string_ppi.parse_stringdb_interactions(this_line: str, header_items: List) → Dict

Processes a line of text from STRING DB.

Args:

this_line: A string containing a line of text.

header_items: A list of header items.

Returns:

item_dict: A dictionary mapping header items to the corresponding values from the processed STRING DB line.
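One plausible way to build such a dictionary is to split the line and pair each value with its header item. The sketch below assumes exactly that; the name parse_interaction_sketch and its sep parameter are hypothetical (the documented signature takes no delimiter), and the protein identifiers are placeholders rather than real STRING entries.

```python
from typing import Dict, List


def parse_interaction_sketch(this_line: str,
                             header_items: List[str],
                             sep: str = " ") -> Dict[str, str]:
    """Hypothetical stand-in: pair each header item with its value."""
    items = this_line.strip().split(sep)
    # zip stops at the shorter sequence, so surplus fields are ignored.
    return dict(zip(header_items, items))


header_items = ["protein1", "protein2", "combined_score"]
row = parse_interaction_sketch("9606.ENSP_A 9606.ENSP_B 900\n", header_items)
print(row)
```

All values stay strings in this sketch, so a caller that needs a numeric score would convert row["combined_score"] itself.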

Module contents

class kg_covid_19.transform_utils.string_ppi.StringTransform(input_dir: str = None, output_dir: str = None)

Bases: kg_covid_19.transform_utils.transform.Transform

StringTransform parses interactions from STRING DB into nodes and edges.

load_gene_info(input_dir: str, output_dir: str, species_id: List = None) → None

Load mappings from NCBI gene_info (gene_info.gz).

Args:

input_dir: A string pointing to the directory to import data from.

output_dir: A string pointing to the directory to output data to.

species_id: A list with the species IDs.

Returns:

None.

load_mapping(input_dir: str, output_dir: str, species_id: List = None) → None

Load Ensembl Gene to Protein mapping from NCBI gene2ensembl (gene2ensembl.gz).

Args:

input_dir: A string pointing to the directory to import data from.

output_dir: A string pointing to the directory to output data to.

species_id: A list with the species IDs.

Returns:

None.

run(data_file: Optional[str] = None) → None

Run the transformations needed to process protein-protein interactions from the STRING DB data.

Args:

data_file: The data file to parse.

Returns:

None.