Input and output format
The input files for GDAP are tabular text files (.csv), as described in the Data Collection section, and are loaded with pandas as DataFrames. The table below lists these inputs together with the graph and edge files used by GDAP.
| File | Description | File Format |
|---|---|---|
| Target Data | The target disease data table, consisting of the gene symbol (Gene Name, i.e. the protein) and the score type (direct, indirect, global). | .csv (loaded as pd.DataFrame) |
| String Database | A protein-protein interaction dataset consisting of GeneName1 (Protein), GeneName2 (Protein), and the combined score. | .csv (loaded as pd.DataFrame) |
| GraphML Format | An XML-based file format for storing graphs with nodes and edges. It supports metadata and complex attributes. | .graphml (XML, loaded as a networkx.Graph object) |
| Edge List | A simple text-based format in which each line is a pair of nodes representing one edge of the graph. | .edgelist (Plain Text) |
| Edges CSV | A CSV file representing the edges of a graph. It contains two columns: source and destination nodes. | .csv (Comma-Separated Values) |
| Positive Edges | A file storing the positive edges of a graph, i.e. node pairs that are connected. Typically used in graph learning as positive examples. | .npy (NumPy Array) |
| Negative Edges | A file storing the negative edges of a graph, i.e. node pairs with no connection between them. | .npy (NumPy Array) |
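
To make the formats above concrete, here is a minimal sketch of how a protein-protein interaction table could be read and written out in each of them. It is an illustration under stated assumptions, not GDAP's actual pipeline: the column names (`GeneName1`, `GeneName2`, `combined_score`), the sample rows, and the output file names are hypothetical, and the negative edges are simply taken to be all unconnected node pairs, whereas GDAP may sample them differently.

```python
import io
import os
import tempfile

import networkx as nx
import numpy as np
import pandas as pd

# Hypothetical stand-in for the String Database CSV (column names are assumed).
ppi_csv = io.StringIO(
    "GeneName1,GeneName2,combined_score\n"
    "TP53,MDM2,0.99\n"
    "TP53,EGFR,0.72\n"
    "BRCA1,BARD1,0.95\n"
)
ppi = pd.read_csv(ppi_csv)  # loaded as a pandas DataFrame

# Build an undirected graph, keeping the combined score as an edge attribute.
graph = nx.from_pandas_edgelist(
    ppi, source="GeneName1", target="GeneName2", edge_attr="combined_score"
)

# Hypothetical output locations, written to a temporary directory.
out_dir = tempfile.mkdtemp()
graphml_path = os.path.join(out_dir, "ppi.graphml")
edgelist_path = os.path.join(out_dir, "ppi.edgelist")
edges_csv_path = os.path.join(out_dir, "edges.csv")
pos_path = os.path.join(out_dir, "positive_edges.npy")
neg_path = os.path.join(out_dir, "negative_edges.npy")

# GraphML keeps node and edge attributes; the edge list keeps only node pairs.
nx.write_graphml(graph, graphml_path)
nx.write_edgelist(graph, edgelist_path, data=False)

# Edges CSV: two columns, source and destination.
ppi[["GeneName1", "GeneName2"]].to_csv(edges_csv_path, index=False)

# Positive edges are the existing edges. Negative edges are assumed here to be
# every node pair without an edge (an assumption, not GDAP's documented method).
positive_edges = np.array(list(graph.edges()))
negative_edges = np.array(list(nx.non_edges(graph)))
np.save(pos_path, positive_edges)
np.save(neg_path, negative_edges)

# Read the files back to confirm the round trip.
reloaded_graph = nx.read_graphml(graphml_path)
reloaded_positive = np.load(pos_path)
reloaded_negative = np.load(neg_path)
```

Each `.npy` array has one row per edge and two columns (the two node names), so it can be indexed directly when building training pairs.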