Posts

Confusion Matrix

A confusion matrix contains information about the actual and predicted classifications made by a classification model. The performance of such a model is commonly evaluated using the data in the matrix. The following table shows the confusion matrix for a two-class classifier. Different accuracy measures can be derived from the confusion matrix.
i. Accuracy: The accuracy is the proportion of the total number of predictions that were correct. The overall accuracy of a classifier is estimated by dividing the number of correct predictions by the total number of predictions.
ii. Misclassification rate: The main accuracy measure is the estimated misclassification rate, also called the overall error rate. It is given by one minus the accuracy, that is, the number of incorrect predictions divided by the total number of predictions.
iii. Sensitivity: The sensitivity of a classifier is its ability to detect important class members correctly. This is measur...
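The measures listed in this excerpt can be sketched for a two-class classifier as follows. This is a minimal illustration, not the post's own code: the function name `confusion_metrics` and the four counts are hypothetical, and because the excerpt is cut off before it defines sensitivity, the sketch assumes the standard definition (true positives divided by all actual positives).

```python
def confusion_metrics(tp, fn, fp, tn):
    """Derive basic measures from the four cells of a 2x2 confusion matrix.

    tp, fn: actual positives predicted as positive / negative
    fp, tn: actual negatives predicted as positive / negative
    """
    total = tp + fn + fp + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Proportion of all predictions that were correct.
    accuracy = (tp + tn) / total
    # Overall error rate: proportion of all predictions that were wrong.
    misclassification_rate = (fp + fn) / total
    # Assumed standard definition: share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else float("nan")

    return {
        "accuracy": accuracy,
        "misclassification_rate": misclassification_rate,
        "sensitivity": sensitivity,
    }


# Hypothetical counts, chosen only for illustration.
metrics = confusion_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
```

With these made-up counts, 85 of the 100 predictions are correct, so the accuracy and the misclassification rate sum to one.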

Data Mining Process

Data mining should be undertaken in a guided way so that it remains manageable and traceable. Several standard processes have been proposed for this purpose, such as CRISP-DM, SEMMA and KDD. It is important to understand the whole approach to conducting data mining before one starts running algorithms to discover interesting patterns in data. Blindly applying a data mining model to input data without a well-organized process, also called “data dredging”, can easily produce meaningless or unintelligible patterns, which may eventually cause a project to fail. In comparison, a formalized, well-defined data mining process is far more likely to discover valid, understandable and novel patterns. Moreover, a common data mining framework can also serve as a measurable roadmap for people to follow during project planning and implementation.

CRISP-DM
The CRISP-DM process was initially conceived in late 1996 by DaimlerChrysler, SPSS and NCR, three “veterans...

Moses Test of Extreme Reaction

An extreme reactions test is employed when researchers anticipate observing significant treatment effects in subjects displaying extreme behaviors. For instance, a treatment might lead certain subjects to exhibit calmness, while others under the same experimental conditions might display active behaviors. In studies related to anxiety, the impact of anxiety could enhance performance for some subjects while impairing it for others. Conversely, the treatment's effect might cause the experimental subjects' scores to cluster around a central point, similar to some perception studies. Moses (1952) addressed this issue and highlighted that location-based tests were insufficient for extreme reactions, sometimes concealing differences. This masking effect arose from averaging the extreme scores of the experimental group and comparing them with control group scores, which could all be "average." Given that these averages often exhibit only minor differences, they frequently fa...
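The masking effect described in this excerpt can be illustrated with a small sketch. The scores below are hypothetical values invented for illustration, and the code is not the Moses test itself: it only shows how comparing group averages can hide the fact that one group reacts in both extreme directions.

```python
from statistics import mean

# Hypothetical scores, invented for illustration only.
control = [48, 49, 50, 51, 52]        # every subject is close to "average"
experimental = [20, 30, 50, 70, 80]   # subjects pushed toward both extremes


def span(scores):
    """Distance between the highest and lowest score in a group."""
    return max(scores) - min(scores)


# A location-based comparison looks only at the averages.
mean_difference = mean(experimental) - mean(control)

# The spread of each group reveals what the averages conceal.
control_span = span(control)
experimental_span = span(experimental)

print("difference in means:", mean_difference)
print("control span:", control_span)
print("experimental span:", experimental_span)
```

Here the two groups have identical means, so a comparison of averages finds nothing, while the experimental scores are spread far more widely than the control scores.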

Ordinal Logistic Regression

Ordinal logistic regression is a method for modelling the relationship between an ordinal response variable and one or more explanatory variables. Let Y be an ordinal outcome with J categories and let X be a binary predictor. The ordinal logistic regression model is then parameterized as:

$$\operatorname{logit}[P(Y \le j)] = \beta_{0j} + \beta x, \qquad j = 1, \dots, J - 1$$

The two equations for x = 1 and x = 0 are

$$\operatorname{logit}[P(Y \le j \mid x = 1)] = \beta_{0j} + \beta$$

$$\operatorname{logit}[P(Y \le j \mid x = 0)] = \beta_{0j}$$

Then

$$\operatorname{logit}[P(Y \le j \mid x = 1)] - \operatorname{logit}[P(Y \le j \mid x = 0)] = \beta$$

Then the odds ratio is defined as

$$\text{Odds Ratio} = \frac{P(Y \le j \mid x = 1) / P(Y > j \mid x = 1)}{P(Y \le j \mid x = 0) / P(Y > j \mid x = 0)} = e^{\beta}$$

Interpretation: For categorical predictors, the odds ratio compares the odds of the event occurring at two different levels of the predictor. Odds ratios that are greater than 1 indicate that the first event and the events closer to the first event are more likely at the level of the predictor in the logistic regression table than at the reference level of the predictor. Odds ratios that are less than 1 indicate that the last event and the events that are closer to it are more likely at the level of the predictor in the logistic regression table than at t...
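The model logit[P(Y ≤ j)] = β0j + βx implies that the cumulative odds ratio between x = 1 and x = 0 equals exp(β) at every cut point j. The sketch below checks this numerically. The intercepts and the slope are hypothetical values chosen only for illustration (J = 4 categories, so three cut points), and the helper names are not from the post.

```python
import math


def cumulative_probs(intercepts, beta, x):
    """Return P(Y <= j | x) for each cut point j under
    logit[P(Y <= j)] = beta_0j + beta * x."""
    return [1.0 / (1.0 + math.exp(-(b0 + beta * x))) for b0 in intercepts]


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


# Hypothetical parameters: increasing intercepts, one per cut point.
intercepts = [-1.0, 0.5, 2.0]
beta = 0.7

p_x1 = cumulative_probs(intercepts, beta, x=1)
p_x0 = cumulative_probs(intercepts, beta, x=0)

# Cumulative odds ratio at each cut point j.
odds_ratios = [odds(p1) / odds(p0) for p1, p0 in zip(p_x1, p_x0)]

print("odds ratios per cut point:", odds_ratios)
print("exp(beta):", math.exp(beta))
```

Because β here is positive, the odds ratio exceeds 1, which matches the interpretation in the excerpt: the lower categories (the first event and those close to it) are more likely at x = 1 than at the reference level x = 0.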