Input and output format
The input files for GDAP are tabular text files (.csv), as described in the Data Collection section. They are loaded with pandas as DataFrames.
File | Description | File Format |
---|---|---|
Target Data | The target disease data table, consisting of symbol (gene name of the protein) and score type (direct, indirect, global). | .csv, loaded as pd.DataFrame |
String Database | A protein-protein interaction dataset consisting of GeneName1 (protein), GeneName2 (protein), and the combined score. | .csv, loaded as pd.DataFrame |
GraphML Format | An XML-based file format for storing graphs with nodes and edges. It supports metadata and complex attributes. | .graphml (XML), loaded as a networkx.Graph object |
Edge List | A simple text-based format in which each line is a pair of nodes representing one edge of the graph. | .edgelist (Plain Text) |
Edges CSV | A CSV file representing the edges of a graph. It contains two columns: source and destination nodes. | .csv (Comma-Separated Values) |
Positive Edges | File storing the positive edges of a graph, typically used in graph learning to represent existing relationships. | .npy (NumPy Array) |
Negative Edges | File storing the negative edges of a graph, representing negative relationships or the absence of a connection. | .npy (NumPy Array) |
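
As a rough illustration of how the formats in the table relate to each other, the sketch below writes a tiny interaction table to a temporary directory and round-trips it through each format. It is not GDAP's own loading code: the file names, the sample genes and scores, and the column names `GeneName1`, `GeneName2`, `combined_score`, `source`, and `destination` are assumptions made for the example, and taking every non-adjacent pair as a negative edge is only a stand-in for whatever sampling GDAP actually uses.

```python
import tempfile
from pathlib import Path

import networkx as nx
import numpy as np
import pandas as pd

# Hypothetical sample data; real GDAP column names and values may differ.
ppi = pd.DataFrame(
    {
        "GeneName1": ["TP53", "TP53", "BRCA1"],
        "GeneName2": ["MDM2", "BRCA1", "BARD1"],
        "combined_score": [999, 850, 990],
    }
)

with tempfile.TemporaryDirectory() as tmp_name:
    tmp = Path(tmp_name)

    # String Database: a CSV file read back into a pandas DataFrame.
    ppi.to_csv(tmp / "string.csv", index=False)
    ppi_df = pd.read_csv(tmp / "string.csv")

    # Build an undirected graph, keeping the score as an edge attribute.
    graph = nx.Graph()
    for gene1, gene2, score in ppi_df.itertuples(index=False):
        graph.add_edge(gene1, gene2, combined_score=int(score))

    # GraphML: XML-based, preserves node and edge attributes.
    nx.write_graphml(graph, str(tmp / "graph.graphml"))
    graph_ml = nx.read_graphml(str(tmp / "graph.graphml"))

    # Edge list: plain text, one "node1 node2" pair per line.
    nx.write_edgelist(graph, str(tmp / "graph.edgelist"), data=False)
    graph_el = nx.read_edgelist(str(tmp / "graph.edgelist"))

    # Edges CSV: two columns, source and destination.
    edges_df = pd.DataFrame(list(graph.edges()), columns=["source", "destination"])
    edges_df.to_csv(tmp / "edges.csv", index=False)
    edges_csv = pd.read_csv(tmp / "edges.csv")

    # Positive edges: pairs that are connected in the graph.
    np.save(tmp / "pos_edges.npy", np.array(list(graph.edges())))
    pos_loaded = np.load(tmp / "pos_edges.npy")

    # Negative edges: here simply every pair with no edge (an assumption).
    np.save(tmp / "neg_edges.npy", np.array(list(nx.non_edges(graph))))
    neg_loaded = np.load(tmp / "neg_edges.npy")

print(graph_ml.number_of_nodes(), graph_ml.number_of_edges())
print(pos_loaded.shape, neg_loaded.shape)
```

The edge arrays here hold node names as strings, so they can be saved and loaded without pickling. A pipeline that maps genes to integer indices would store integer arrays of the same shape instead.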