Academic Background
Northwestern University, Evanston, IL, USA
Ph.D. in Computer Engineering | December 2005 |
Northwestern University, Evanston, IL, USA
M.S. in Computer Engineering | December 2001 |
Peking University, Beijing, China
B.S. (Major) in Computer Science | July 1999 |
Associate Professor | January 2006 - present |
Individual credit scoring: Conducting research on credit scoring algorithms and implementing a credit scoring system. Project funded by the People's Bank of China.
Optimization-based utility mining: Research funded by the Natural Science Foundation of China.
Research Assistant | September 2001 - December 2005 |
Distributed mining on data streams: Conducting research on distributed data stream mining. Because of security concerns and communication, memory, and time constraints, stream mining algorithms that require centralized data are not feasible. A distributed stream mining algorithm is being explored that offloads simple mining to the sensors and discovers global patterns from the local mining results at minimum communication cost. Such an algorithm is highly desirable for applications with continuous data streams from distributed servers or sensor networks, such as roadway traffic systems, stock market transactions, ATM transactions, and clickstreams.
Utility mining: Utility mining identifies high-utility itemsets (measured by profit or total value, rather than by the frequency of occurrence used in association rule mining) or the most valuable customers in very large transaction databases. Designed and implemented the Two-Phase utility mining algorithm, which outperforms existing algorithms in accuracy, memory cost, and speed by several orders of magnitude. Experiments on a grocery store's real data efficiently discovered a number of high-profit item combinations and valuable customers, which guided the store's business strategy.
Performance evaluation and characterization of scalable data mining algorithms: Developed a data mining benchmark suite named MineBench, consisting of several representative data mining algorithms, and parallelized the programs on shared-memory parallel (SMP) machines using OpenMP. Evaluated performance, characterized the computation kernels of the MineBench programs, and investigated their use of the memory hierarchy (data access patterns and data locality) on both a real machine and a simulation tool. The kernels and characteristics identified in this research helped guide new architecture and algorithm optimizations. Research funded by Intel Corporation.
High-performance data mining in large-scale scientific simulations: Because data mining computation in large-scale scientific applications is time-consuming, a parallel clustering algorithm, called HOP, was designed and implemented. The algorithm achieves good speedups by partitioning data into a balanced k-d tree and minimizing inter-processor communication through batch data transfer. An online processing model was proposed to integrate parallel data mining algorithms into scientific simulations, so that entire simulation cycles can execute automatically, without human intervention or data input/output, when data mining is performed repeatedly and continuously. Research funded by the Department of Energy (DOE).
Research Assistant | September 1999 - August 2001 |
Text classification: Because each word in a text document may have multiple meanings (senses), it is difficult to assign a document to its appropriate class. Designed and implemented a WordNet-based algorithm to disambiguate word senses in a text document. Each keyword is automatically mapped to its synsets, and a counter is maintained for each synset. The sense of a keyword is taken to be the synset containing it that has the highest count. A document is then classified based on the senses of its keywords. (using Java, CGI, JDBC, SQL, Oracle 8i)
Data warehousing: Designed a data warehouse for the Mass Storage Performance Information System and implemented the user interface and query system to facilitate knowledge discovery. (using Java, CGI, JDBC, SQL, Oracle 8i) Research funded by the National Aeronautics and Space Administration (NASA).
Development Intern | February 1999 - June 1999 |
Designed and implemented statistical tools for the Qingniao object-oriented online database system. (using C++, HTML, CGI, Oracle Database)
Kellogg School of Management, Northwestern University
Teaching Assistant | September 2005 - December 2005 |
Electrical and Computer Engineering Department, Northwestern University
Teaching Assistant | January 2005 - March 2005 |
Kellogg School of Management, Northwestern University
Teaching Assistant | March 2004 - June 2004 |
Electrical and Computer Engineering Department, Northwestern University
Teaching Associate | January 2002 - March 2002 |
Electrical and Computer Engineering Department, Northwestern University
Teaching Associate | January 2001 - March 2001 |
Jianwei Li, Ying Liu, Wei-keng Liao, Alok Choudhary, "Parallel Data Mining Algorithms for Association Rules and Clustering," in Handbook of Parallel Computing: Models, Algorithms and Applications, Sanguthevar Rajasekaran and John Reif, eds., CRC Press, 2006.
Contact Information
Email: yingliu@ece.northwestern.edu
Last updated November 2007 by Ying Liu