Saturday, March 27, 2010

Data Mining Competition!

If you are interested in knowledge discovery and data mining, the rest of this post will excite you. The 14th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2010) will be held in Hyderabad, India, during June 21-24, 2010. A data mining competition will also be hosted during the conference. It is open to both academia and industry. More details are available here.

Problem Summary - Re-Calibration of a Credit Risk Assessment System Based on Biased Data

The most fundamental and most frequently encountered type of decision is the binary decision. It appears in any business activity where the outcome is either to "do that" or to "do something else". In general, the decision is made by applying a simple threshold, which serves as the control parameter, to a propensity score. In principle, binary decisions could be assessed as "successful" or "unsuccessful" for either outcome, via type I and type II errors, but in general only the "do that" outcome is monitored for assessment.
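To make the threshold idea concrete, here is a minimal Python sketch. It is only an illustration, not part of the competition material: the function names, the 0.5 default threshold, and the toy scores are all made up, and it assumes the convention that a type I error is acting on a case that turns out bad while a type II error is passing on a case that would have been good.

```python
def decide(score, threshold=0.5):
    """Binary decision: act when the propensity score reaches the threshold."""
    return "do that" if score >= threshold else "do something else"


def count_errors(scores, was_good, threshold=0.5):
    """Count (type I, type II) errors, assuming every true outcome is known.

    Assumed convention for this sketch:
      type I  = acted ("do that") on a case that turned out bad
      type II = did not act on a case that would have been good
    """
    type_one = 0
    type_two = 0
    for score, good in zip(scores, was_good):
        acted = score >= threshold
        if acted and not good:
            type_one += 1
        elif not acted and good:
            type_two += 1
    return type_one, type_two


# Toy data, invented for illustration only.
scores = [0.9, 0.7, 0.4, 0.2]
was_good = [True, False, True, False]
print(count_errors(scores, was_good))
```

Note that counting type II errors requires knowing the outcome of the cases that were not acted upon, which is exactly what is usually missing in practice.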

As a consequence, only a part of the "market" is monitored and has its decisions assessed as "successful" or "unsuccessful". This forms a very biased sample for system re-calibration/re-training, because it has been extracted from the market by a process focused on the decision objective. This competition focuses on how to build a model for a binary decision support system based on this type of biased sample in a credit scoring application. The modeling data cover only the company's clients, not the rejected applicants. This is the context of the PAKDD 2010 Competition.
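The bias described above can be seen in a small simulation. This is a rough sketch on synthetic data and has nothing to do with the actual competition dataset: the uniform "true repayment probability", the 0.5 cut-off, and the assumption that the old system knows that probability exactly are all simplifications chosen for illustration.

```python
import random


def simulate_market(n=10000, threshold=0.5, seed=0):
    """Generate a synthetic applicant population (all values hypothetical)."""
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        p_good = rng.random()            # assumed true repayment probability
        is_good = rng.random() < p_good  # outcome the applicant would have
        approved = p_good >= threshold   # decision of the old scoring system
        population.append((p_good, is_good, approved))
    return population


def bad_rate(rows):
    """Fraction of applicants in rows whose outcome is bad."""
    rows = list(rows)
    return sum(1 for _, is_good, _ in rows if not is_good) / len(rows)


population = simulate_market()
# Only approved applicants become clients, so only their outcomes are observed.
monitored = [row for row in population if row[2]]

print("whole market bad rate:  ", round(bad_rate(population), 3))
print("monitored-only bad rate:", round(bad_rate(monitored), 3))
```

In this toy setup the monitored clients look much safer than the market as a whole, so a model re-trained only on them would learn from a sample that the previous decision process has already filtered.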

-- Varun
