Datasets

We do a lot of sequencing as we're developing products, and we always like to see it get used for science! Some of this data ends up getting published; the rest we'll be sharing here and/or posting in public repositories. Feel free to use it, but please give us and our collaborators a shout-out.

Ultra Long Nanopore Datasets

This data was generated using Nanobind-Enhanced Ultra Long Nanopore Sequencing.

HG02723 Ultra Long Nanopore Sequencing Dataset - R9.4

Overview

This dataset contains 468 Gb (>140X total coverage) of ultra long nanopore sequencing data acquired over the course of developing the Nanobind Ultra Long Sequencing method. The HG02723 human cell line was acquired from Coriell and used as a control sample. This data was generated in collaboration with UCSC Genomics Institute as part of the Human Pangenome Reference Project and T2T Consortium work. 
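As a rough sanity check on the coverage figure, fold coverage is total sequenced bases divided by genome size. The sketch below assumes a haploid human genome size of about 3.1 Gb; that value is our assumption for illustration and is not stated on this page.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Return fold coverage: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


TOTAL_BASES = 468e9          # 468 Gb, from the overview above
ASSUMED_GENOME_SIZE = 3.1e9  # assumption: approximate haploid human genome size

coverage = fold_coverage(TOTAL_BASES, ASSUMED_GENOME_SIZE)
print(f"Estimated coverage: {coverage:.0f}X")  # prints "Estimated coverage: 151X"
```

Under that assumption the estimate lands at roughly 151X, which is consistent with the >140X quoted above.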

The read length N50 of the entire dataset is 67 kb, with 161 Gb in reads >100 kb, 57 Gb in reads >200 kb, and 5 Gb in reads >500 kb. The longest read in this dataset is 2.5 Mb.
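For anyone who wants to recompute these statistics on a subset of the data, here is a minimal Python sketch of the usual read length N50 definition (the length at which reads of that length or longer hold at least half of all bases) plus a helper that totals bases in reads above a threshold. The function names are ours, and this is not necessarily the exact tooling used to produce the numbers above.

```python
from typing import Iterable


def read_n50(lengths: Iterable[int]) -> int:
    """Return the read length N50 for a collection of read lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one read length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # First length at which the longest reads cover half of all bases.
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths


def bases_above(lengths: Iterable[int], threshold: int) -> int:
    """Return the total bases contained in reads strictly longer than threshold."""
    return sum(length for length in lengths if length > threshold)


# Toy example with made-up lengths (not taken from the dataset).
toy_lengths = [2, 3, 4, 5, 6]
print(read_n50(toy_lengths))        # 5
print(bases_above(toy_lengths, 4))  # 11
```

On real data, the lengths would typically come from the FastQ records or from the sequencing summary file.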

The data is provided in both Fast5 and FastQ formats. Basecalling was performed using Guppy 4.2.2.

Sample Type

Frozen HG02723 cell pellets cultured at Circulomics

DNA Extraction

UHMW DNA extraction using Nanobind CBB Big DNA Kit

Library Preparation

Nanobind Ultra Long Sequencing using Nanobind UL Library Prep Kit and Oxford Nanopore Ultra-Long DNA Sequencing Kit. This data was generated using multiple developmental versions of the protocol.

Sequencing

Sequenced on Oxford Nanopore MinION, GridION, and PromethION using R9.4.1 flow cells.

Generated By

Circulomics in collaboration with UCSC Genomics Institute

Release Date

03/24/2021

Files

Sequencing data is available on Amazon S3. For convenience, the data has been split into MinION and PromethION sets.

Ultra long sequencing data is available as:

  • HG02723_Circulomics_MinION/PromethION_R941_partXX.fast5.tar - signal data

  • HG02723_Circulomics_MinION/PromethION_Guppy_4.2.2_fastq.gz - Fastq file basecalled using Guppy v4.2.2

  • HG02723_Circulomics_MinION/PromethION_Guppy_4.2.2_summary.txt.gz - Sequencing summary file from Guppy basecalling that can be used for indexing signal data
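As one way to use the summary file for indexing, the sketch below groups the IDs of long reads by the Fast5 file that holds their signal data. It assumes the summary is tab-separated with `filename`, `read_id`, and `sequence_length_template` columns, which is the layout we would expect from Guppy; please check the header of the actual file before relying on it. The function names are hypothetical.

```python
import csv
import gzip
from collections import defaultdict
from typing import Dict, Iterable, List


def index_long_reads(lines: Iterable[str], min_length: int = 100_000) -> Dict[str, List[str]]:
    """Map each Fast5 filename to the IDs of its reads longer than min_length.

    `lines` is any iterable of tab-separated text lines, header first.
    Column names are assumed, not confirmed against the released files.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    index: Dict[str, List[str]] = defaultdict(list)
    for row in reader:
        if int(row["sequence_length_template"]) > min_length:
            index[row["filename"]].append(row["read_id"])
    return dict(index)


def index_summary_file(path: str, min_length: int = 100_000) -> Dict[str, List[str]]:
    """Open a gzipped summary file and index its long reads."""
    with gzip.open(path, "rt", newline="") as handle:
        return index_long_reads(handle, min_length)
```

Calling `index_summary_file` on a downloaded `..._summary.txt.gz` file would return a dictionary that tells you which Fast5 files to pull from the tar archives for reads above the chosen length.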

Standard nanopore sequencing data from Shafin et al. (https://doi.org/10.1038/s41587-020-0503-6) is also available:

  • HG02723_X.fast5.tar - signal data

  • HG02723_X_Guppy_4.2.2_prom_fastq.gz - Fastq file basecalled using Guppy v4.2.2

  • HG02723_X_Guppy_4.2.2_summary.txt.gz - Sequencing summary file from Guppy basecalling that can be used for indexing signal data

Friendly Disclaimer

This is data we generated during the course of development. It wasn't originally intended for external consumption, but we didn't want it to go to waste, so we're sharing it here. We're still in the process of analyzing it ourselves, so please take care during your own analysis.

701 E Pratt St, Baltimore, MD 21202

INFO@CIRCULOMICS.COM