Training sets, Test set, Measurements, Classifier, Models and how to get a clue out of it …
Currently, I’m attending a Data Mining course at the University of Maastricht. I’m quite surprised by how much the techniques applied in Data Mining overlap with those used in Recommender System research. For instance, data mining classifiers are also evaluated on measures like accuracy, precision, and recall, which are computed from a confusion matrix, and the results are finally drawn as a ROC curve. During the course we are using the open source software Weka to train various classifiers on different training sets. In the next step, these trained classifiers are applied to a test set to compare their accuracy with each other. The most reliable classifier can then be further optimized.
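To make these measures a bit more concrete, here is a minimal sketch in plain Python of how accuracy, precision, and recall fall out of a confusion matrix for a two-class problem. It is not how Weka does it internally; the function names `confusion_counts` and `evaluate` and the "yes"/"no" labels are made up for illustration only.

```python
def confusion_counts(actual, predicted, positive):
    """Return (tp, fp, fn, tn) for the given positive class label."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, really positive
            else:
                fp += 1  # predicted positive, really negative
        else:
            if a == positive:
                fn += 1  # predicted negative, really positive
            else:
                tn += 1  # predicted negative, really negative
    return tp, fp, fn, tn


def evaluate(actual, predicted, positive):
    """Compute accuracy, precision and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical test-set labels and classifier predictions (invented data).
actual = ["yes", "yes", "no", "no", "no", "yes", "no", "no"]
predicted = ["yes", "no", "no", "yes", "no", "yes", "no", "no"]

counts = confusion_counts(actual, predicted, positive="yes")
metrics = evaluate(actual, predicted, positive="yes")
print(counts)
print(metrics)
```

Running the same `evaluate` call on the predictions of several classifiers for the same test set gives a simple way to compare them side by side.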
Data mining techniques have become more and more popular in Technology Enhanced Learning since 2004. This field, also called Educational Data Mining, can be applied to student grouping, task analysis, and supporting teachers or learners. Marco and I thought about training a classifier on behavioral data from Moodle courses from 2007, with a specific focus on students who dropped out. The most accurate classifiers could then be used to monitor courses in 2009. If the classifier identifies similar patterns of behavior among students in the present course, it could inform the teacher. Maybe the students' dropout can be prevented by giving them special attention or by explicitly connecting them with students who perform more successfully.
The following figure describes the typical Knowledge Discovery Process for data mining as defined by Usama Fayyad in his 1996 article “From Data Mining to Knowledge Discovery in Databases”.

