deepnog.io
Author: Roman Feldbauer
Date: 2020-02-19
Description:
Input/output helper functions
- deepnog.io.create_df(class_labels, preds, confs, ids, indices, threshold=None, verbose=3)
Creates one dataframe storing all relevant prediction information.
The rows in the returned dataframe have the same order as the original sequences in the data file. The first column of the dataframe gives the position of each sequence in the data file.
- Parameters
class_labels (list) – Stores the class name corresponding to each output node of the network.
preds (torch.Tensor, shape (n_samples,)) – Stores the index of the output node with the highest activation.
confs (torch.Tensor, shape (n_samples,)) – Stores the confidence in the prediction.
ids (list[str]) – Stores the (possibly empty) protein labels extracted from the data file.
indices (list[int]) – Stores the unique indices of the sequences, mapping to their position in the file.
threshold (int) – If given, prediction labels and confidences are set to ‘’ (the empty string) when the confidence in the prediction is below threshold.
verbose (int) – If greater than 0, outputs a warning if duplicates are detected.
- Returns
df – Stores prediction information about the input protein sequences. Duplicates (defined by their sequence_id) have been removed from df.
- Return type
pandas.DataFrame
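The ordering, thresholding, and de-duplication rules described above can be sketched in plain Python. This is a minimal illustration, not deepnog's implementation: the function name `build_prediction_rows`, the dictionary keys, and the choice to keep the first occurrence of a duplicated sequence ID are all assumptions made for the example, and plain lists stand in for the tensors.

```python
def build_prediction_rows(class_labels, preds, confs, ids, indices, threshold=None):
    """Hypothetical stand-in for the documented behaviour of create_df."""
    rows = []
    for pred, conf, seq_id, index in zip(preds, confs, ids, indices):
        label = class_labels[pred]
        # Blank out label and confidence when the confidence is below threshold.
        if threshold is not None and conf < threshold:
            label, conf = "", ""
        rows.append({"index": index, "sequence_id": seq_id,
                     "prediction": label, "confidence": conf})

    # Restore the order of the sequences in the data file.
    rows.sort(key=lambda row: row["index"])

    # Drop duplicates by sequence ID (assumption: the first occurrence wins).
    seen, unique_rows = set(), []
    for row in rows:
        if row["sequence_id"] not in seen:
            seen.add(row["sequence_id"])
            unique_rows.append(row)
    return unique_rows


example_rows = build_prediction_rows(
    class_labels=["COG0001", "COG0002"],
    preds=[1, 0, 1],
    confs=[0.9, 0.4, 0.8],
    ids=["p2", "p1", "p2"],
    indices=[1, 0, 2],
    threshold=0.5,
)
```

Here `p1` falls below the threshold, so its label and confidence are blanked, and the second `p2` entry is dropped as a duplicate.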
- deepnog.io.get_data_home(data_home: str = None) → pathlib.Path
Return the path of the deepnog data dir.
This folder is used for large files that cannot be shipped in the Python package on PyPI. For example, the network parameter (weights) files may be larger than 100 MiB. By default, the data dir is a folder named ‘deepnog_data’ in the user home folder. Alternatively, it can be set by the ‘DEEPNOG_DATA’ environment variable, or programmatically by giving an explicit folder path. If the folder does not already exist, it is created automatically.
- Parameters
data_home (str | None) – The path to the deepnog data dir. If None, the default location described above is used.
Notes
Adapted from SKLEARN_DATAHOME.
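The lookup described for get_data_home can be sketched with the standard library alone. This is an illustrative sketch rather than deepnog's code: the name `resolve_data_home` is hypothetical, and giving an explicit argument precedence over the ‘DEEPNOG_DATA’ environment variable is an assumption, since the text above does not state which one wins.

```python
import os
import tempfile
from pathlib import Path


def resolve_data_home(data_home=None):
    """Hypothetical sketch: explicit path, else DEEPNOG_DATA, else ~/deepnog_data."""
    if data_home is None:
        # Assumption: the environment variable is only consulted without an explicit path.
        data_home = os.environ.get("DEEPNOG_DATA", Path.home() / "deepnog_data")
    path = Path(data_home).expanduser()
    # Create the folder (and any missing parents) if it does not exist yet.
    path.mkdir(parents=True, exist_ok=True)
    return path


# Usage with a throwaway directory, so nothing is created in the home folder.
with tempfile.TemporaryDirectory() as tmp:
    created = resolve_data_home(Path(tmp) / "deepnog_data")
    created_exists = created.is_dir()
```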
- deepnog.io.get_weights_path(database: str, level: str, architecture: str, data_home=None, download_if_missing=True) → pathlib.Path
Get path to neural network weights.
This is a path on local storage. If the corresponding files are not present, download from remote storage. The default remote URL can be overridden by setting the environment variable DEEPNOG_REMOTE.
- Parameters
database (str) – The orthologous groups database. Example: eggNOG5
level (str) – The taxonomic level within the database. Example: 2 (for bacteria)
architecture (str) – Network architecture. Example: deepencoding
data_home (string, optional) – Specify another download and cache folder for the weights. By default all deepnog data is stored in ‘~/deepnog_data’ subfolders.
download_if_missing (boolean, default=True) – If False, raise an IOError if the data is not locally available instead of trying to download it from the source site.
- Returns
weights_path – Path to file of network weights
- Return type
pathlib.Path
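The local-first lookup with an optional download can be sketched as follows. Everything specific in this example is an assumption made for illustration: the function name `resolve_weights_path`, the `database/level/architecture.pth` folder layout, the placeholder default URL, and the injectable `fetch` callable are not taken from deepnog. Only the ‘DEEPNOG_REMOTE’ override and the IOError on a missing file follow the description above.

```python
import os
import tempfile
import urllib.request
from pathlib import Path

# Placeholder only; the real default remote URL is not given in this document.
HYPOTHETICAL_DEFAULT_REMOTE = "https://example.invalid/deepnog"


def resolve_weights_path(database, level, architecture, data_home,
                         download_if_missing=True,
                         fetch=urllib.request.urlretrieve):
    """Hypothetical sketch: return the local weights file, downloading if allowed."""
    # Assumed layout: <data_home>/<database>/<level>/<architecture>.pth
    relative = f"{database}/{level}/{architecture}.pth"
    weights_path = Path(data_home) / relative
    if weights_path.is_file():
        return weights_path
    if not download_if_missing:
        raise IOError(f"Weights not available locally: {weights_path}")

    # The remote base URL can be overridden through the environment.
    remote = os.environ.get("DEEPNOG_REMOTE", HYPOTHETICAL_DEFAULT_REMOTE)
    weights_path.parent.mkdir(parents=True, exist_ok=True)
    fetch(f"{remote.rstrip('/')}/{relative}", weights_path)
    return weights_path


def fake_fetch(url, destination):
    """Stand-in downloader so the example runs without network access."""
    Path(destination).write_bytes(b"weights")


with tempfile.TemporaryDirectory() as tmp:
    path = resolve_weights_path("eggNOG5", "2", "deepencoding", tmp, fetch=fake_fetch)
    downloaded = path.is_file()
```

Passing the downloader as a parameter keeps the sketch testable offline; a real implementation might also verify the downloaded file.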