$\forall i: t_k[A_i] \in D_i$. For discretized attributes, this means that any continuous value can be related to one of the intervals. Null (missing) values may be encoded by a special value.

To discover informative patterns (rules) between relation attributes, we make the following partition of the relation schema:

- A subset $C \subset R$ of candidate input attributes ($|C| \geq 1$). These attributes are always known (given in the database) and they can be used to predict the values of target attributes (see next).
- A subset $O \subset R$ of target ("output") attributes ($|O| \geq 1$).
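
The interval mapping, the encoding of missing values, and the schema partition described above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the IFN implementation: the cut points, the `MISSING` code, the attribute names, and the helper functions `discretize` and `check_partition` are all hypothetical. The sketch also assumes that intervals are closed on the left and that the input and target subsets do not overlap.

```python
from bisect import bisect_right

# Hypothetical special code used to encode null (missing) values.
MISSING = -1


def discretize(value, boundaries):
    """Map a continuous value to the index of the interval containing it.

    `boundaries` is a sorted list of interior cut points. With k cut points
    there are k + 1 intervals, so every real value falls into exactly one.
    Assumption: a value equal to a cut point belongs to the interval on
    its right (intervals are closed on the left).
    """
    if value is None:
        return MISSING
    return bisect_right(boundaries, value)


def check_partition(schema, candidate_inputs, targets):
    """Check that C and O are non-empty, disjoint subsets of the schema R.

    Disjointness is an assumption of this sketch; C and O together are
    not required to cover all of R.
    """
    r = set(schema)
    return (
        len(candidate_inputs) >= 1
        and len(targets) >= 1
        and candidate_inputs <= r
        and targets <= r
        and not (candidate_inputs & targets)
    )


# Hypothetical relation schema R and its partition into C and O.
schema = ["age", "income", "region", "credit_risk"]
candidate_inputs = {"age", "income", "region"}  # C, |C| >= 1
targets = {"credit_risk"}                       # O, |O| >= 1

age_cut_points = [30, 50]  # hypothetical boundaries: three intervals
codes = [discretize(v, age_cut_points) for v in (25, 30, 75, None)]
partition_ok = check_partition(schema, candidate_inputs, targets)
```

With the hypothetical cut points 30 and 50, the ages 25, 30 and 75 map to intervals 0, 1 and 2, while a missing age maps to the special code.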

These include pre-processing, data mining and post-processing in the following three chapters. Chapter 2 discusses the information-theoretic method of static discretization. Chapter 3 presents the core of the IFN methodology. Chapter 4 focuses on rule extraction and reduction. When finished with Part I, the reader will be familiar with the details of the IFN unified methodology. Part II then proceeds to explain the art and science of KDD implementation, both in general and as it especially pertains to the IFN system.

In the appendices, we present the necessary equations from information theory (Appendix A1) and detailed results of the comparative study (Appendix A2). 1Ilaimoniifn-kdQD. The reader can start with the project data given there and then continue to explore additional datasets.

Chapter 2
Automated Data Pre-Processing
Static Discretization of Quantitative Attributes

1. DISCRETIZATION OF ORDINAL FEATURES

As indicated in Chapter 1 above, many learning methods require the partition of continuous attributes (features) into discrete intervals.

