Tim Dunn, David Blaauw, Reetuparna Das, Satish Narayanasamy, "nPoRe: n-Polymer Realigner for improved pileup variant calling". bioRxiv, 2022.
PDF Code

  title={nPoRe: n-Polymer Realigner for improved pileup variant calling},
  author={Dunn, Tim and Blaauw, David and Das, Reetuparna and Narayanasamy, Satish},
  publisher={Cold Spring Harbor Laboratory}

Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now regularly exceed 99.5%, INDEL recall at relatively high coverages (85x) remains below 80% for standard R9.4.1 flow cells. Current nanopore variant callers work in two stages: an efficient pileup-based method identifies candidates of interest, and then a more expensive full-alignment model provides the final variant calls. Most false negative INDELs are lost during the first (pileup-based) step, particularly in low-complexity repeated regions. We show that read phasing and realignment can recover a significant portion of INDELs lost during this stage. In particular, we extend Needleman-Wunsch affine gap alignment by introducing new gap penalties for more accurately aligning repeated n-polymer sequences such as homopolymers (n = 1) and tandem repeats (2 <= n <= 6). On our dataset with 60.6x coverage, haplotype phasing improves INDEL recall in all evaluated high confidence regions from 63.76% to 70.66%, and nPoRe realignment then improves it further to 73.04%, with no loss of precision.
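
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of standard Needleman-Wunsch global alignment scoring with affine gap penalties (the three-matrix Gotoh formulation). It does not reproduce nPoRe's n-polymer gap penalties. The function name `affine_gap_score` and the scoring values are illustrative assumptions, not taken from the paper or its code.

```python
NEG_INF = float("-inf")


def affine_gap_score(a, b, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Global alignment score with affine gaps (hypothetical scoring values).

    A gap of length k contributes gap_open + k * gap_extend to the score.
    """
    n, m = len(a), len(b)
    # M: a[i-1] is aligned to b[j-1]
    # X: a[i-1] is aligned to a gap (deletion relative to a)
    # Y: b[j-1] is aligned to a gap (insertion relative to a)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Either open a new gap (pay gap_open once) or extend an existing one.
            X[i][j] = max(
                M[i - 1][j] + gap_open + gap_extend,
                Y[i - 1][j] + gap_open + gap_extend,
                X[i - 1][j] + gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] + gap_open + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
            )

    return max(M[n][m], X[n][m], Y[n][m])
```

With these assumed scores, aligning the homopolymer `AAAA` against `AA` prefers one gap of length two over two separate gaps. Length-dependent gap costs of this kind are what an n-polymer-aware penalty scheme would adjust.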

Conference Papers

Tim Dunn*, Harisankar Sadasivan*, Jack Wadden, Kush Goliya, Kuan-Yu Chen, David Blaauw, Reetuparna Das, Satish Narayanasamy, "SquiggleFilter: An Accelerator for Portable Virus Detection". 54th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). Virtual Event, Athens, Greece, 2021. IEEE MICRO 2022 Top Picks Honorable Mention
PDF Code Short Talk Full Talk

  title={SquiggleFilter: An Accelerator for Portable Virus Detection},
  author={Dunn, Tim and Sadasivan, Harisankar and Wadden, Jack and Goliya, Kush and Chen, Kuan-Yu and Blaauw, David and Das, Reetuparna and Narayanasamy, Satish},
  booktitle={MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture},

The MinION is a recent-to-market handheld nanopore sequencer. It can be used to determine the whole genome of a target virus in a biological sample. Its Read Until feature allows us to skip sequencing a majority of non-target reads (DNA/RNA fragments), which constitute more than 99% of all reads in a typical sample. However, it does not have any on-board computing, which significantly limits its portability.

We analyze the performance of a Read Until metagenomic pipeline for detecting target viruses and identifying strain-specific mutations. We find new sources of performance bottlenecks (the basecaller used to classify a read) that are not addressed by past genomics accelerators.

We present SquiggleFilter, a novel hardware-accelerated filter based on dynamic time warping (DTW) that directly analyzes MinION's raw squiggles and filters everything except target viral reads, thereby avoiding the expensive basecalling step. We show that our 14.3 W, 13.25 mm² accelerator has 274× greater throughput and 3481× lower latency than existing GPU-based solutions while consuming half the power, enabling Read Until for the next generation of nanopore sequencers.
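
As a rough software illustration of the DTW idea mentioned above, the sketch below computes the classic dynamic time warping distance between two numeric signals and applies a threshold to accept or reject a read. It is not the SquiggleFilter hardware algorithm. The names `dtw_distance` and `keep_read`, the absolute-difference cost, and the threshold are all assumptions made for illustration.

```python
def dtw_distance(query, reference):
    """Classic DTW distance between two sequences of numbers.

    Uses absolute difference as the local cost and keeps only two rows
    of the dynamic programming table in memory.
    """
    n, m = len(query), len(reference)
    inf = float("inf")
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        curr = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - reference[j - 1])
            # Extend the cheapest of: vertical step, horizontal step, diagonal step.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def keep_read(query, reference, threshold):
    """Hypothetical filter: keep a read if its DTW distance is within threshold."""
    return dtw_distance(query, reference) <= threshold
```

Because DTW allows samples to be stretched or compressed in time, a signal such as `[1, 2, 3]` matches `[1, 1, 2, 2, 3, 3]` at zero cost, which is why the technique suits raw signals whose sampling rate per base varies.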

Arun Subramaniyan, Yufeng Gu, Tim Dunn, Somnath Paul, Md Vasimuddin, Sanchit Misra, David Blaauw, Satish Narayanasamy, Reetuparna Das, "GenomicsBench: A Benchmark Suite for Genomics". IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). Virtual Event, 2021.
PDF Code

  title={GenomicsBench: A Benchmark Suite for Genomics},
  author={Subramaniyan, Arun and Gu, Yufeng and Dunn, Timothy and Paul, Somnath and Vasimuddin, Md and Misra, Sanchit and Blaauw, David and Narayanasamy, Satish and Das, Reetuparna},
  booktitle={2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)},

Over the last decade, advances in high-throughput sequencing and the availability of portable sequencers have enabled fast and cheap access to genetic data. For a given sample, sequencers typically output fragments of the DNA in the sample. Depending on the sequencing technology, the fragments range from a length of 150-250 at high accuracy to lengths of a few tens of thousands at much lower accuracy. Sequencing data is now being produced at a rate that far outpaces Moore's law and poses significant computational challenges on commodity hardware. To meet this demand, software tools have been extensively redesigned and new algorithms and custom hardware have been developed to deal with the diversity in sequencing data. However, a standard set of benchmarks that captures the diverse behaviors of these recent algorithms and can facilitate future architectural exploration is lacking.

To that end, we present the GenomicsBench benchmark suite, which contains 12 computationally intensive data-parallel kernels drawn from popular bioinformatics software tools. It covers the major steps in short- and long-read genome sequence analysis pipelines, such as basecalling, sequence mapping, de novo assembly, variant calling and polishing. We observe that while these genomics kernels have abundant data-level parallelism, it is often hard to exploit on commodity processors because of input-dependent irregularities. We also perform a detailed microarchitectural characterization of these kernels and identify their bottlenecks. GenomicsBench includes parallel versions of the source code with CPU and GPU implementations as applicable, along with representative input datasets of two sizes: small and large.

Workshop Papers

Tim Dunn, Sean Banerjee, Natasha Banerjee, "User-Independent Detection of Swipe Pressure using a Thermal Camera for Natural Surface Interaction". IEEE 20th International Workshop on Multimedia and Signal Processing (MMSP). Vancouver, Canada, 2018. Top 5% Paper, MMSP 2018
PDF Code Slides

  title={User-independent detection of swipe pressure using a thermal camera for natural surface interaction},
  author={Dunn, Tim and Banerjee, Sean and Banerjee, Natasha Kholgade},
  booktitle={2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)},

In this paper, we use a thermal camera to distinguish hard and soft swipes performed by a user interacting with a natural surface by detecting differences in the thermal signature of the surface due to heat transferred by the user. Unlike prior work, our approach provides swipe pressure classifiers that are user-agnostic, i.e., that recognize the swipe pressure of a novel user not present in the training set, enabling our work to be ported into natural user interfaces without user-specific calibration. Our approach achieves an average classification accuracy of 76% using random forest classifiers trained on a test set of 9 subjects interacting with paper and wood, with 8 hard and 8 soft test swipes per user. We compare results of the user-agnostic classification to user-aware classification with classifiers trained by including training samples from the user. We obtain an average user-aware classification accuracy of 82% by adding up to 8 hard and 8 soft training swipes for each test user. Our approach enables seamless adaptation of generic pressure classification systems based on thermal data to the specific behavior of users interacting with natural user interfaces.

Tim Dunn, Natasha Banerjee, Sean Banerjee, "GPU Acceleration of Document Similarity Measures for Automated Bug Triaging". 1st International Workshop on Software Faults (IWSF). Ottawa, Canada, 2016.
PDF Code

  title={GPU acceleration of document similarity measures for automated bug triaging},
  author={Dunn, Tim and Banerjee, Natasha Kholgade and Banerjee, Sean},
  booktitle={2016 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)},

Large-scale open source software bug repositories from companies such as Mozilla, RedHat, Novell and Eclipse have enabled researchers to develop automated solutions to bug triaging problems such as bug classification, duplicate classification and developer assignment. However, despite the repositories containing millions of usable reports, researchers utilize only a small fraction of the data. A major reason for this is the polynomial time and cost associated with making comparisons to all prior reports. Graphics processing units (GPUs) with several thousand cores have been used to accelerate algorithms in several domains, such as computer graphics, computer vision and linguistics. However, they have remained unexplored in the area of bug triaging.

In this paper, we demonstrate that the problem of comparing a bug report to all prior reports is an embarrassingly parallel problem that can be accelerated using graphics processing units (GPUs). Comparing the similarity of two bug reports can be performed using frequency-based methods (e.g. cosine similarity and BM25F), sequence-based methods (e.g. longest common substring and longest common subsequence) or topic modeling. For the purpose of this paper we focus on cosine similarity, longest common substring and longest common subsequence. Using an NVIDIA Tesla K40 GPU, we show that frequency- and sequence-based similarity measures are accelerated by 89 and 85 times respectively when compared to a pure CPU-based implementation. This allows us to generate similarity scores for the entire Eclipse repository, consisting of 498,161 reports, in under a day, as opposed to 83.4 days using a CPU-based approach.
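
To make two of the similarity measures named above concrete, here is a small CPU-only sketch of term-frequency cosine similarity and longest common subsequence length. It is not the paper's GPU implementation. The whitespace tokenization and the function names `cosine_similarity` and `lcs_length` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity of raw term-frequency vectors (whitespace tokens)."""
    tf_a = Counter(doc_a.lower().split())
    tf_b = Counter(doc_b.lower().split())
    # Only terms present in both documents contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

Each comparison of a new report against one prior report is independent of every other comparison, which is the property that makes the all-pairs workload embarrassingly parallel.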

Undergraduate Honors Thesis

Tim Dunn, "Detection of Swipe Pressure using a Thermal Camera and ConvNets for Natural Surface Interaction". Clarkson University Honors Program. Potsdam, New York, 2019.
PDF Code

In this paper, I present a system for reliably distinguishing between two levels of applied finger pressure on planar surfaces using a thermal camera. This work is the first to do so without requiring prior per-user calibration, and will enable arbitrary natural materials to be used as touchscreen surfaces in augmented reality applications. Two approaches were explored during this research for swipe pressure identification, which took place over the Spring and Fall Semesters of 2018. The first approach used morphological filters and supplied handcrafted features as input to a random forest classifier. The second approach used convolutional neural networks to classify both raw and filtered video data using several approaches. It was found that convolutional neural networks could only consistently outperform the random forest classifier when the same morphological filtering had been applied on the input videos.


Tim Dunn, Daniel Thuerck, Michael Goesele, "SemiDefinite Program (SDP) Solver". Deutscher Akademischer Austauschdienst (DAAD) Research Internships in Science and Engineering (RISE) Student Conference. Heidelberg, Germany, 2018.

Tim Dunn, Sean Banerjee, "Using GPUs to Mine Large Scale Software Problem Repositories". Clarkson University Symposium on Undergraduate Research Experiences (SURE). 2016. Best Poster Presentation in EE & CS, Honorable Mention Oral Presentation
Poster Slides