We do a lot of sequencing as we're developing products, and we always like to see it get used for science! Some of this ends up getting published. Other data we'll be sharing here and/or posting in public repositories. Feel free to use it, but please give us and our collaborators a shout-out.
Ultra Long Nanopore Datasets
This data was generated using Nanobind-Enhanced Ultra Long Nanopore Sequencing.
HG02723 Ultra Long Nanopore Sequencing Dataset - R9.4
This dataset contains 468 Gb (>140X total coverage) of ultra long nanopore sequencing data acquired over the course of developing the Nanobind Ultra Long Sequencing method. The HG02723 human cell line was acquired from Coriell and used as a control sample. This data was generated in collaboration with UCSC Genomics Institute as part of the Human Pangenome Reference Project and T2T Consortium work.
The read length N50 of the entire dataset is 67 kb, with 161 Gb in reads >100 kb, 57 Gb in reads >200 kb, and 5 Gb in reads >500 kb. The longest read in this dataset is 2.5 Mb.
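For readers less familiar with the metric, read length N50 is the length L at which reads of length L or longer contain at least half of all sequenced bases. A minimal Python sketch of that calculation, together with the "bases in reads longer than X" tallies quoted above, might look like the following. The function names and the toy read lengths are illustrative assumptions; they are not part of the dataset or of the pipeline used to produce the numbers above.

```python
from typing import Iterable, List


def read_length_n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of read lengths.

    Reads are walked from longest to shortest; the N50 is the length of
    the read at which the running total first reaches half of all bases.
    """
    ordered: List[int] = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("cannot compute N50 without any bases")
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: running total always reaches half")


def bases_in_reads_longer_than(lengths: Iterable[int], threshold: int) -> int:
    """Sum the bases contained in reads strictly longer than `threshold`."""
    return sum(length for length in lengths if length > threshold)


# Toy read lengths (hypothetical values, not taken from the dataset).
toy_lengths = [2, 3, 4, 5, 6, 10]
toy_n50 = read_length_n50(toy_lengths)
toy_long_bases = bases_in_reads_longer_than(toy_lengths, 5)
print(toy_n50, toy_long_bases)
```

On real data, the read lengths would come from the FastQ records or from the sequencing summary file rather than from a hand-written list.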
The data is provided in the Fast5 format and FastQ format. Basecalling was performed using Guppy 4.2.2.
Frozen HG02723 cell pellets cultured at Circulomics
UHMW DNA extraction using Nanobind CBB Big DNA Kit
Nanobind Ultra Long Sequencing using Nanobind UL Library Prep Kit and Oxford Nanopore Ultra-Long DNA Sequencing Kit. This data was generated using multiple developmental versions of the protocol.
Sequenced on Oxford Nanopore MinION, GridION, and PromethION using R9.4.1 flow cells.
Circulomics in collaboration with UCSC Genomics Institute
Sequencing data is available on Amazon S3. For convenience, the data has been split into MinION and PromethION data.
Ultra long sequencing data is available as:
HG02723_Circulomics_MinION/PromethION_R941_partXX.fast5.tar - signal data
HG02723_Circulomics_MinION/PromethION_Guppy_4.2.2_fastq.gz - Fastq file basecalled using Guppy v4.2.2
HG02723_Circulomics_MinION/PromethION_Guppy_4.2.2_summary.txt.gz - Sequencing summary file from Guppy basecalling that can be used for indexing signal data
Standard nanopore sequencing data from Shafin et al. (https://doi.org/10.1038/s41587-020-0503-6) is also available:
HG02723_X.fast5.tar - signal data
HG02723_X_Guppy_4.2.2_prom_fastq.gz - Fastq file basecalled using Guppy v4.2.2
HG02723_X_Guppy_4.2.2_summary.txt.gz - Sequencing summary file from Guppy basecalling that can be used for indexing signal data
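As a rough illustration of how a sequencing summary file can be used to index signal data, the sketch below builds a lookup from read ID to signal file name, keeping only reads at or above a chosen length. It assumes the summary is tab-separated with a header row and contains columns named read_id, filename, and sequence_length_template. Those column names are assumptions and may differ in the files you download, so check the header first and pass the actual names as arguments. The function names and the toy rows are hypothetical.

```python
import csv
import gzip
import io
from typing import Dict, TextIO


def index_long_reads(handle: TextIO,
                     min_length: int = 100_000,
                     read_col: str = "read_id",
                     file_col: str = "filename",
                     length_col: str = "sequence_length_template") -> Dict[str, str]:
    """Map read ID -> signal file name for reads of at least `min_length` bases.

    `handle` is an open text stream over a tab-separated summary table.
    The default column names are assumptions; adjust them to the real header.
    """
    index: Dict[str, str] = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        if int(row[length_col]) >= min_length:
            index[row[read_col]] = row[file_col]
    return index


def index_long_reads_from_gz(path: str, min_length: int = 100_000) -> Dict[str, str]:
    """Same as index_long_reads, reading from a gzip-compressed summary file."""
    with gzip.open(path, "rt", newline="") as handle:
        return index_long_reads(handle, min_length)


# Made-up rows standing in for a real summary file (hypothetical values).
toy_summary = io.StringIO(
    "filename\tread_id\tsequence_length_template\n"
    "batch_0.fast5\tread-a\t250000\n"
    "batch_0.fast5\tread-b\t8000\n"
    "batch_1.fast5\tread-c\t120000\n"
)
toy_index = index_long_reads(toy_summary)
print(toy_index)
```

With an index like this, only the signal files that contain reads of interest need to be extracted from the tar archives.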
This is data we generated during the course of development. It wasn't originally intended for external consumption, but we didn't want it to go to waste, so we're sharing it here. We're still in the process of analyzing it ourselves, so please take care during your own analysis.