The visualization pipeline is only run for datasets with at least 3 samples.
More information about the individual tools that make up the pipelines can be found in the bioBakery
repository.
User Options
StrainPhlAn: Should strain profiling with StrainPhlAn be run? Strain profiling can greatly increase the runtime of your
job depending on the size and diversity of your samples. (Logical. Default: False). Note: at least 4 samples
must be provided in order to run StrainPhlAn.
Project name: A project name to display at the top of the HTML graphical output.
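The two sample-count thresholds mentioned above (3 for the visualization pipeline, 4 for StrainPhlAn) can be summarized in a small sketch. This is an illustration only: the function and constant names are hypothetical and are not part of Nephele or bioBakery.

```python
# Illustrative sketch only; these names are hypothetical, not Nephele's API.
MIN_SAMPLES_VISUALIZATION = 3  # threshold stated for the visualization pipeline
MIN_SAMPLES_STRAINPHLAN = 4    # threshold stated for StrainPhlAn


def planned_steps(n_samples: int, run_strainphlan: bool = False) -> dict:
    """Report which optional steps would run for a given number of samples."""
    return {
        "visualization": n_samples >= MIN_SAMPLES_VISUALIZATION,
        # StrainPhlAn needs both the user option and enough samples.
        "strainphlan": run_strainphlan and n_samples >= MIN_SAMPLES_STRAINPHLAN,
    }


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, planned_steps(n, run_strainphlan=True))
```

With 3 samples and the StrainPhlAn option selected, this sketch reports that the visualization step would run and strain profiling would not.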
Output Files
The workflows tutorial goes through the pipelines step-by-step, including
information about all intermediate and final output files. Here we list some of the
output files that may be of interest to users, as well as any output files made or removed by Nephele.
log files:
logfile.txt: contains the messages associated with the Nephele backend, such as transferring
files
main: Nephele removes all bowtie2 SAM files produced by MetaPhlAn2, so this folder contains
only sample_name_taxonomic_profile.tsv, the taxonomic profile for each sample
merged/metaphlan_taxonomic_profiles.tsv: merged taxonomic profiles for all samples
merged/metaphlan_species_counts_table.tsv: total number of species identified for each sample
metaphan_forMicrobiomeDB.biom: merged taxonomic profiles for all samples provided as a
BIOM-formatted file for import into MicrobiomeDB.
main: for each sample, a file of gene family and pathway abundances, pathway coverage, and a
log
merged/*.tsv: gene families, ecs, and pathways files for all samples merged into single files
merged/*_relab.tsv: data sets normalized to relative abundance
merged/humann_pathabundance.biom: pathway abundances for all samples in a BIOM-formatted file,
generated by Nephele
merged/humann_ecs.biom: ec abundances for all samples in a BIOM-formatted file, generated by
Nephele
counts/humann_feature_counts.tsv: feature counts of gene families, ecs, and pathways for all
samples
counts/humann_read_and_species_count_table.tsv: total species identified after filtering and
total reads aligning (for nucleotide and translated search) for each sample
strainphlan: output of StrainPhlAn strain-level profiling. This folder is only created if the
StrainPhlAn option is selected and the dataset has at least 4 samples.
RAxML.*: trees generated for each species, may not exist if species are not found (more
information can be found in the MetaPhlAn2/StrainPhlAn manual)
clade_name.fasta: the alignment file of all metagenomic strains
*.info: general information like the total length of the concatenated markers (full sequence
length), number of used markers, etc.
*.polymorphic: statistics on the polymorphic sites (see the MetaPhlAn2/StrainPhlAn manual for details)
*.marker_pos: this file shows the starting position of each marker in the strains.
wmgx_vis: If you submit at least 3 samples, the HTML report from the visualization pipeline is created here. It includes
the software versions as well as the individual commands used.
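As a rough illustration of what "normalized to relative abundance" means for the merged/*_relab.tsv files listed above, the sketch below divides each sample column by its total so that every column sums to 1. The table layout (first column is the feature name, one column per sample) and the toy values are assumptions made for this example. The pipeline's own normalization may handle special rows differently.

```python
import csv
import io


def to_relative_abundance(tsv_text: str) -> dict:
    """Return {sample: {feature: fraction}}; each sample's fractions sum to 1."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]  # assumed layout: feature name, then one column per sample
    rows = [(r[0], [float(x) for x in r[1:]]) for r in reader if r]
    # Column totals, one per sample.
    totals = [sum(vals[i] for _, vals in rows) for i in range(len(samples))]
    result = {s: {} for s in samples}
    for feature, vals in rows:
        for i, sample in enumerate(samples):
            # Guard against an all-zero sample column.
            result[sample][feature] = vals[i] / totals[i] if totals[i] else 0.0
    return result


# Toy data invented for this example; not real pipeline output.
TOY_TABLE = (
    "feature\tsampleA\tsampleB\n"
    "pathway_1\t30\t10\n"
    "pathway_2\t10\t10\n"
)

relab = to_relative_abundance(TOY_TABLE)
```

A real merged file can be passed in the same way by reading its contents into a string, provided it follows the layout assumed here.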
Tools and References
McIver, L. J., Abu-Ali, G., Franzosa, E. A., Schwager, R., Morgan, X. C., Waldron, L., ... Huttenhower, C.
(2018). "bioBakery: a meta'omic analysis environment." Bioinformatics, 34(7), 1235-1237. https://doi.org/10.1093/bioinformatics/btx754