BUS 535 Data Mining and Warehousing
Data mining is a relatively new term in the academic and business worlds, often associated with the development and quantitative analysis of very large databases. Its definition covers a wide spectrum of analytic and information technology topics, including a set of techniques that have been designed to efficiently find interesting pieces of information or knowledge in large amounts of data. Association rules, for instance, are a class of patterns that tell which products tend to be purchased together. There is currently a large commercial interest in the area, both for the development of data mining software and for the offering of consulting services on data mining, with a market for the former estimated in the billions of U.S. dollars. In this course we explore how this interdisciplinary field brings together techniques from databases, statistics, machine learning, and information retrieval. We discuss the main data mining methods currently used, including data warehousing, denormalization, data cleaning, clustering, classification, association rule mining, text indexing and searching algorithms, how search engines rank pages, and recent techniques for web mining.
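To make the idea of association rules concrete, the following is a minimal sketch in Python, not course material. The transactions, the function name pair_rules, and the thresholds min_support and min_confidence are all hypothetical, chosen only for illustration. It counts how often pairs of products appear in the same purchase and reports rules of the form "customers who buy A also tend to buy B", using the standard measures of support (the fraction of transactions containing both items) and confidence (the fraction of transactions containing A that also contain B).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for item pairs passing both thresholds."""
    n = len(transactions)
    # How many transactions contain each single item.
    item_counts = Counter(item for t in transactions for item in t)
    # How many transactions contain each unordered pair of items.
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = pair_rules(transactions)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

This brute-force version enumerates every pair and is only practical for tiny examples. Algorithms designed for large databases, such as Apriori, prune the search by discarding itemsets whose subsets are already known to be infrequent.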