Usage
To use MetaNetMap in a project:
import metanetmap
Command-line usage
Depending on the inputs listed in Inputs and outputs: Mapping mode, MetaNetMap can be run in two main modes:
- Database building mode: to create a third-party conversion datatable from MetaCyc or MetaNetX data.
- Mapping mode: to map metabolomic data against metabolic networks using the conversion datatable. The mapping mode can be run in two different ways:
  - classic mode: one or multiple metabolomic data files against a single metabolic network.
  - community mode: one or multiple metabolomic data files against multiple metabolic networks.
Additionally, the Test mode runs predefined tests using toy data included in the package.
Note
Before running the different modes, you must first build your own conversion datatable, as described below in the section Custom third party database.
Custom third party database
Non-trivial mapping between metabolomic data and metabolic networks requires a comprehensive knowledge base that links various identifiers from both sources. This is achieved through a third-party conversion datatable that acts as a bridge between the two datasets. It can currently be built using two different knowledge bases:
- Using MetaCyc files (not provided with this package). You need a license to use MetaCyc data. The conversion datatable is built from the information stored in the compounds.dat (or compounds_version.dat) file.
- Using MetaNetX reference files, which can be downloaded from MetaNetX Reference Data.
You can also provide your own custom conversion datatable, as long as it follows the required column naming convention, which the mapping mode relies on to run correctly. See Advanced usage for more details.
Note
The list and description of the required column names are available in the Inputs and outputs: Third-party database building mode section.
Running database building mode for MetaCyc:
metanetmap build_db --db metacyc
    -f file/path/to/compounds.dat
    --compfiles file/path/to/complementary_datatable.tsv    # Optional
    --out_db file/path/to/output_conversion_datatable.tsv   # Optional
    -q quiet_mode (True/False)                              # Optional: False by default
Running database building mode for MetaNetX:
metanetmap build_db --db metanetx
    -f file/path/to/MetaNetX_chem_prop.tsv file/path/to/MetaNetX_chem_xref.tsv   # Optional
    --compfiles file/path/to/complementary_datatable.tsv                         # Optional
    --out_db file/path/to/output_conversion_datatable.tsv                        # Optional
    -q quiet_mode (True/False)                                                   # Optional: False by default
Note
The parameters file/path/to/output_conversion_datatable.tsv (--out_db) and file/path/to/complementary_datatable.tsv (--compfiles) are optional.
- If the output argument --out_db is not provided, the output conversion datatable is created by default in the current working directory.
- If the argument --compfiles is not provided, the step that completes the conversion datatable with the user's additional mapping data (file/path/to/complementary_datatable.tsv) is skipped.
- For the metanetx option, the -f argument specifies the input files. If it is not provided by the user, the default chem_prop and chem_xref files are downloaded automatically.
- The file file/path/to/complementary_datatable.tsv can also be a manually curated file created by users to include specific or custom IDs.
Depending on the selected knowledge base (metanetx or metacyc), the output file name will include the database as a prefix.
For more details on input/output data and directory structure, see Inputs and outputs: Third-party database building mode.
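When the database build is driven from a script, the command line above can be assembled programmatically. The sketch below is only an illustration: build_db_command is a hypothetical helper, not part of MetaNetMap, and it uses only the flags shown in this section. It assumes that -f is required for metacyc (since MetaCyc files are not shipped with the package) and leaves out -q.

```python
from typing import List, Optional, Sequence


def build_db_command(
    db: str,
    input_files: Sequence[str] = (),
    compfiles: Optional[str] = None,
    out_db: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for `metanetmap build_db`.

    Hypothetical helper: it only mirrors the flags documented above.
    """
    if db not in ("metacyc", "metanetx"):
        raise ValueError(f"unsupported knowledge base: {db!r}")
    # Assumption: MetaCyc data is not bundled, so an input file must be given.
    if db == "metacyc" and not input_files:
        raise ValueError("metacyc requires a compounds.dat file passed with -f")

    cmd = ["metanetmap", "build_db", "--db", db]
    if input_files:
        # For metanetx, omitting -f lets the tool download the default files.
        cmd += ["-f", *input_files]
    if compfiles is not None:
        cmd += ["--compfiles", compfiles]
    if out_db is not None:
        cmd += ["--out_db", out_db]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) once MetaNetMap is installed; building the list separately makes it easy to log or inspect the exact command before running it.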
Mapping mode
Once a conversion datatable is built, you can run MetaNetMap in two different sub-modes, each with an optional partial match step:
Classic mode:
The classic mode allows you to input a single metabolomic annotation profile (tabulated file, .maf or .tsv) or a directory containing multiple metabolomic annotation profiles, and a single metabolic network (.sbml or .xml) to which metabolites will be mapped.
metanetmap classic
    -s path/to/metabolic_networks.sbml    # Single SBML file
    -a path/to/metabolomic_data/          # Single file or directory
    -d path/to/conversion_datatable.tsv
    -o path/to/output/directory/          # Optional
    -p partial_match (True/False)         # Optional, explanation below: False by default
    -q quiet_mode (True/False)            # Optional: False by default
Community mode:
The “community” mode allows you to input a directory containing multiple metabolomic annotation profiles (tabulated files, .maf or .tsv), as well as a directory containing multiple metabolic networks (.sbml or .xml).
It will map each metabolomic data file against each metabolic network file, resulting in a comprehensive mapping across all combinations.
This mode is useful for large-scale analyses involving a microbial community where multiple organisms and their associated networks are considered in the metabolomic study.
metanetmap community
    -s path/to/metabolic_networks_directory/   # Directory containing multiple SBML files
    -a path/to/metabolomic_data/               # Single file or directory
    -d path/to/conversion_datatable.tsv
    -o path/to/output/directory/               # Optional
    -p partial_match (True/False)              # Optional, explanation below: False by default
    -q quiet_mode (True/False)                 # Optional: False by default
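To get a feel for how many mappings the community mode implies, the "each file against each network" rule can be sketched as a Cartesian product. This is not MetaNetMap's own code: community_pairs and keep_suffixes are hypothetical names, and the suffix filters simply reuse the extensions mentioned above.

```python
from itertools import product
from pathlib import PurePath
from typing import Iterable, List, Set, Tuple

ANNOTATION_SUFFIXES = {".maf", ".tsv"}  # metabolomic annotation profiles
NETWORK_SUFFIXES = {".sbml", ".xml"}    # metabolic networks


def keep_suffixes(paths: Iterable[str], suffixes: Set[str]) -> List[str]:
    """Return the paths whose extension is in `suffixes`, sorted for stable output."""
    return sorted(p for p in paths if PurePath(p).suffix.lower() in suffixes)


def community_pairs(
    annotation_files: Iterable[str], network_files: Iterable[str]
) -> List[Tuple[str, str]]:
    """List every (annotation file, network file) combination."""
    annotations = keep_suffixes(annotation_files, ANNOTATION_SUFFIXES)
    networks = keep_suffixes(network_files, NETWORK_SUFFIXES)
    return list(product(annotations, networks))
```

With m annotation files and n networks this yields m × n pairs, which is worth keeping in mind when estimating run time for large communities.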
Note
The partial match option aims at increasing the chances of finding a match for metabolites that were not mapped during the initial run.
This step is optional, as it can be time-consuming depending on the number of unmatched entries. To rescue those unmatched entries, specific strategies are applied, such as searching via ChEBI, InChIKey, or enantiomer simplification.
For more details on input/output data and directory structure, see Inputs and outputs: Mapping mode. For more details on advanced methods (partial match, ambiguities, …), see Advanced usage.
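The note above does not say how the InChIKey or enantiomer strategies are implemented. One plausible reading, shown here purely as an assumption, relies on the structure of a standard InChIKey: its first 14-character block encodes connectivity only, so two stereoisomers share that block. The function names below are hypothetical and do not come from MetaNetMap.

```python
import re
from typing import Dict, List

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, separated by hyphens.
INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def connectivity_block(inchikey: str) -> str:
    """Return the first block of an InChIKey, which ignores stereochemistry."""
    key = inchikey.strip().upper()
    if not INCHIKEY_RE.match(key):
        raise ValueError(f"not a standard InChIKey: {inchikey!r}")
    return key.split("-")[0]


def rescue_by_connectivity(
    unmatched: Dict[str, str], network_keys: Dict[str, str]
) -> Dict[str, List[str]]:
    """Map each unmatched name to the network metabolite IDs sharing its first block.

    Both arguments map an identifier to a full InChIKey. This illustrates
    a relaxed match; it is not the tool's actual partial match procedure.
    """
    index: Dict[str, List[str]] = {}
    for metabolite_id, key in network_keys.items():
        index.setdefault(connectivity_block(key), []).append(metabolite_id)
    return {
        name: index.get(connectivity_block(key), [])
        for name, key in unmatched.items()
    }
```

A relaxed match of this kind can return several candidates for one metabolite, which is one reason such results are best treated as partial matches to review rather than as exact hits.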