Education
Carnegie Mellon University
Aug 2018 - Dec 2019
School of Computer Science
·
Master of Information and Technology Strategy
Fudan University
Sep 2014 - Jun 2018
School of Computer Science
·
Bachelor of Science
Skills
Programming Languages
Python
·
Java
·
C
·
JavaScript
·
C++
·
Bash
·
Assembly
·
Scala
·
HTML
·
CSS
·
R
·
PHP
Tools & Technologies
Git
·
AWS
·
Node.js
·
Express
·
Django
·
TensorFlow
·
Scikit-Learn
·
Java RMI
·
RESTful
·
Elastic Stack
·
Splunk
·
MySQL
·
NoSQL
·
MATLAB
·
Linux
·
LaTeX
Industry Knowledge
Machine Learning
·
Deep Learning
·
Distributed Systems
·
Cloud Computing
·
Algorithms
Languages
English (Professional Working Proficiency)
·
Chinese (Native)
·
Japanese (Elementary Proficiency)
Music Tech
FL Studio
·
Ableton Live
·
Analog Lab Lite
Experiences
Highmark Inc.
Mar 2019 - Present
Pittsburgh, PA
·
Software Development Intern
·
Advisor: Hasan Yasar
YITU Technology
Apr 2018 - Jul 2018
Shanghai
·
Research and Development Intern
eBay Inc.
Nov 2017 - Mar 2018
Shanghai
·
DevOps Research and Development Intern
University of Waterloo
Sep 2017 - Oct 2017
Waterloo, ON
·
Research Intern
·
Advisor: Ming Li
University of California, Irvine
Jul 2017 – Sep 2017
Irvine, CA
·
Research Intern
·
Advisor: Chen Li
Harvard Medical School
Oct 2017 - Jun 2018
Boston, MA
·
Research Assistant
·
Advisor: Li Zhou
Working Experience
YITU Technology
Shanghai
·
Research and Development Intern
·
Apr 2018 - Jul 2018
Applied Mixup as a data augmentation method for image classification tasks, increasing accuracy by 0.5-0.7% on
28 datasets and reaching 96.2% accuracy on the CIFAR-10 dataset.
Added Inception-series algorithms to TF Autobot.
Completed regression tests on TF Autobot.
Built a C++ speed-testing tool for Caffe and TensorRT models on CPU and GPU, respectively.
eBay Engineering and Research Center Co., Ltd
Shanghai
·
DevOps Research and Development Intern
·
Nov 2017 - Mar 2018
Revamped an online intelligent operations platform for the DevOps department, using Django as the Python web
framework and Vue as the JavaScript framework.
Built a Scala project of back-end algorithms, including unsupervised learning models (k-means, hierarchical
clustering) and anomaly detection on features extracted from each pool's (server's) parameters; validated the risk level of each pool against classified historical data, using Apache Spark for distributed computing.
Projects
Carnegie Mellon University
·
Capstone Project
·
Sponsored by Highmark Inc.
·
Mar 2019 – Present
Developed a customized Splunk application for medical device vulnerability search that aggregates Rapid7
Nexpose records, hospital records, and the MITRE and NVD databases, using JavaScript, Node.js, and XML.
Built the Splunk backend and a Node server using the Splunk API; applied unit tests with Artillery, performance tests with
the Splunk REST API, and workflow tests with Selenium.
Carnegie Mellon University ·
Jun 2019 – Present
Designed and built a 4-tier, auto-scaling e-commerce web service using Node.js with Express and an indexed MySQL
database on AWS RDS.
Deployed on AWS EC2 with ELB (Elastic Load Balancer). Implemented central queueing and optimized across multiple
load-balancing methods.
Carnegie Mellon University ·
Jan 2019 – Apr 2019
Project for CMU course 11-642 Search Engines.
Built a text-search engine with Apache Lucene in Java over a corpus of 500,000+ documents.
Implemented multiple retrieval models including Ranked/Unranked Boolean, Okapi BM25, Indri, etc.
Added SVM re-ranking, query expansion, intent-aware diversified search, and forward indexing as additional
features.
Carnegie Mellon University ·
Mar 2019 – Apr 2019
Project for CMU course 15-640 Distributed Systems.
Implemented a distributed transaction system with two-phase commit in Java.
Dealt with lost and delayed messages.
Handled node crashes and node recovery.
Used thread-safe data structures and locks to process concurrent requests, and logged to persistent storage for failure recovery.
Carnegie Mellon University ·
Jan 2019 – Mar 2019
Projects for CMU course 15-640 Distributed Systems.
Built an RPC system to allow concurrent remote file operations in C.
Designed and implemented a distributed system for file caching using Java RMI and Java threading.
Designed and built a caching proxy that robustly handles multiple concurrent clients and ensures open-close session
semantics on concurrent file access.
Carnegie Mellon University ·
Sep 2018 - Dec 2018
Project for CMU course 10-805 Machine Learning for Large Datasets (PhD Level).
Combined the merits of Dropout and Batch Normalization in each building block by keeping variance consistent, improving state-of-the-art CNNs by efficiently mitigating overfitting.
Achieved 4.74% error rate on CIFAR-10 using ResNet50 and 4.56% error rate using DenseNet.
Carnegie Mellon University ·
Nov 2018 - Dec 2018
Project for CMU course 15-513 Introduction to Computer Systems.
Designed and created a sequential caching web proxy in C
that serves both a local server and public websites.
Upgraded the proxy to handle multiple concurrent connections
using POSIX Threads,
cached both static and dynamic content under an LRU eviction policy,
and handled race conditions and synchronization using blocking.
Carnegie Mellon University ·
Oct 2018 - Dec 2018
Project for CMU course 15-513 Introduction to Computer Systems.
Created a general-purpose dynamic storage allocator for C programs
that supports calls to the malloc, free, realloc, and calloc functions.
Used doubly linked segregated lists to maintain the free blocks,
with LIFO as the insertion policy.
Carnegie Mellon University ·
Sep 2018 - Oct 2018
Project for CMU course 15-513 Introduction to Computer Systems.
Implemented a Linux shell program, tsh (tiny shell),
that supports a simple form of job control and I/O redirection
using process control and signalling.
Fudan University ·
Mar 2017 - May 2017
Project for Introduction to Database Systems (COMP130010.03) @ Fudan University,
a full-featured web application for lazy gluttons.
Built with Python Flask framework, SQLite, jQuery, HTML5, CSS.
Research
University of California, Irvine ·
Advisor: Chen Li
·
Jul 2017 – Sep 2017
Built part of a middleware layer on top of Apache AsterixDB, a big data management system, mapping front-end queries to
MySQL and PostgreSQL queries in Scala.
Wrote the SQL translator and the middleware's database connector, which sent and received queries and converted data types for
semi-structured data models.
Developed a generalized data structure for back-end translators to support efficient real-time analytics.
University of Waterloo ·
Advisor: Ming Li
·
Sep 2017 - Oct 2017
Contributed to the development of DeepNovo, a deep neural network model for de novo peptide sequencing that uses an ion-CNN, a spectrum-CNN, and a long short-term memory (LSTM) recurrent neural network (RNN) in TensorFlow.
Improved recall accuracy by 6.2% through parameter tuning for a model that infers sequences from tandem mass spectra in data-independent acquisition analysis.
Used Cython to speed up the Python data-feeding process by 50% and decreased training perplexity by 59.4%.
Publications
Haohan Zhang,
Chunlei Tang, Joseph M Plasek, Yun Xiong, Jing Ma, Li Zhou, David W Bates
2018 American Medical Informatics Association Annual Symposium (AMIA)
We previously developed a regional classifier based on a spiral timeline
for visualizing literature data,
which presents research topic words under different themes in a spiral map
to show the chronological development of focused research topics.
When timelines are combined with a geographical map,
they depict temporal patterns of events with respect to their spatial attributes.
We call this a “spiral atlas of temporal variation” (abbreviated as “atlas”).
Chunlei Tang, Joseph M Plasek,
Haohan Zhang,
Yun Xiong, David W Bates, Li Zhou
2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Chronic Obstructive Pulmonary Disease (COPD) is a leading cause of mortality in the United States.
Representing COPD progression using temporal graphs may offer critical clinical insights.
LSTM (Long Short-Term Memory) units in recurrent neural networks can process data with constant elapsed times between consecutive elements of a sequence
but cannot handle irregular time intervals
(i.e., segments of unequal duration).
In this study, we propose a four-layer deep learning model that utilizes a specially configured recurrent neural network to capture irregular time lapse segments.
Experiments on a corpus of COPD patients' clinical notes compared to baseline algorithms showed that our model improved interpretability as well as the accuracy of estimating COPD progression.
Chunlei Tang,
Haohan Zhang,
Kenneth H Lai, Yuxuan She, Yun Xiong, Li Zhou
2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
In this paper, we propose a linear classifier based on a spiral,
which we call a regional classifier.
The study emphasizes the development of visualization
methods and the process of finding a
specific research clue to track patient needs reported in
medical literature.
When timelines are combined with a spiral geographical map,
they show a geometric shape that helps to reveal the clues from different
spatial viewpoints and periodical constraints.
Our evaluation showed that the regional classifier produces better
visual effects than support vector machine classifiers.