Thank you for choosing Nephele. If you would like to try submitting with a small dataset, below are the sample input files for each pipeline type.
Note: please unzip the file and upload the individual *.fastq.gz files when submitting.
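The unzip step in the note above can be sketched in Python as follows. This is a minimal illustration, not part of Nephele: the function name `extract_fastq_files` and the archive name in the usage comment are hypothetical, and the sketch assumes the download is a standard zip archive. The folder layout inside the archive is not described here, so the sketch searches all subfolders.

```python
import zipfile
from pathlib import Path


def extract_fastq_files(archive_path, dest_dir):
    """Unzip archive_path into dest_dir and return the *.fastq.gz files found.

    Hypothetical helper for illustration; the archive layout is an assumption.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)

    # Search recursively because the files may sit inside a subfolder.
    # Names starting with "." are skipped (e.g. macOS "._" resource files).
    return sorted(
        p for p in dest.rglob("*.fastq.gz")
        if p.is_file() and not p.name.startswith(".")
    )


# Example usage (the archive name below is a placeholder):
# for path in extract_fastq_files("sample_sequences.zip", "sample_sequences"):
#     print(path)
```

The returned paths are the individual files to select in the upload step; the zip archive itself is not uploaded.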
Quality Check (Short Read) and 16S Amplicon Data (Mothur, QIIME, DADA2) | |||
---|---|---|---|
File (Type) | Size | Description | Reference |
Sequences | 72MB | Contains paired-end data (forward and reverse) from 10 samples sequenced on an Illumina MiSeq | Experimental Microbial Dysbiosis Does Not Promote Disease Progression in SIV-Infected Macaques. NCBI BioProject: PRJNA417022 |
Mapping File (Excel) | 9KB | Metadata file used in Nephele job submissions that describes samples, groups, etc. for analysis |
Quality Check Nanopore Data | |||
---|---|---|---|
File (Type) | Size | Description | Reference |
Sequences | 142.4MB | Contains subsampled fastq files from 2 samples sequenced using MinION | Genome-centric analysis of short and long read metagenomes reveals uncharacterized microbiome diversity in Southeast Asians. NCBI BioProject: PRJEB49168 |
Mapping File (Excel) | 17KB | Metadata file used in Nephele job submissions that describes samples for analysis |
ITS Amplicon Data (QIIME, DADA2 ITS) | |||
---|---|---|---|
File (Type) | Size | Description | Reference |
Sequences | 24MB | Contains paired-end data (forward and reverse) from 3 samples sequenced on an Illumina MiSeq | A fungal mock community control for amplicon sequencing experiments. NCBI BioProject: PRJNA377530 |
Mapping File (Excel) | 10KB | Metadata file used in Nephele job submissions that describes samples, groups, etc. for analysis |
Downstream Analysis: Diversity | |||
---|---|---|---|
File (Type) | Size | Description | Reference |
Biom | 252KB | Contains abundance and taxonomy assignments; generated by the analysis pipelines | Abundance table from a DADA2 analysis of a dataset from 2017. NCBI BioProject: PRJNA417022 |
Tree | 25KB | Rooted phylogenetic tree in Newick format |
Metagenome Inference: PICRUSt2 | |||
---|---|---|---|
File (Type) | Size | Description | Reference |
Biom | 99KB | Abundance table generated by the DADA2 pipeline | Peluso et al. (2020) |
Fasta | 130KB | Sequences corresponding to sequence variants identified by the DADA2 pipeline in Nephele | |
Mapping File (Excel) | 40KB | Metadata file used in Nephele job submissions that describes samples, groups, etc. for analysis |
WGSA and bioBakery | |||
---|---|---|---|
File (Type) | Size | Description | Reference |
Sequences | 856MB | Sequence files (fastq.gz) derived from a sequencing run using the Illumina HiSeq platform | The example dataset is a subsampled version of HiSeq sample data collected from the 2nd CAMI Toy Human Microbiome Project Dataset. Sczyrba et al. (2017) |
Mapping File (Excel) | 9KB | Metadata file used in Nephele submissions that describes samples for analysis |
SARS-CoV-2 SGS | |||
---|---|---|---|
File (Type) | Size | Description | Reference |
Sequences | 730MB | Directory with four fastq.gz files (pairs) corresponding to sequencing with Pool A and B primers. | Elodie Ghedin's Lab |
Primers | 1KB | Directory with two fasta files (new_A.fa and new_B.fa) containing all primers for Pools A and B, respectively | Elodie Ghedin's Lab |
Mapping File (Excel) | 9KB |
SARS-CoV-2 ARTICplus | |||
---|---|---|---|
File (Type) | Size | Description | Reference |
Sequences | 137MB | Directory with four sample fastq.gz files (pairs) | |
Mapping File (Excel) | 11KB | ||
Primer File | 8KB |
DiscoVir | |||
---|---|---|---|
File (Type) | Size | Description | Reference |
.fasta and .bam files | 632.9MB | Contains assemblies and bam files, subsampled after being generated using WGSA2 on paired reads from 10 samples | Shkoporov, A. N. et al. (2019). NCBI BioProject: PRJNA545408 |
Mapping File (Excel) | 10KB | Metadata file used for the DiscoVir job that describes samples for analysis |