International Journal of Applied Information Systems (IJAIS) – ISSN : 2249-0868
Foundation of Computer Science FCS, New York, USA
Volume 6 – No. 3, September 2013 – www.ijais.org
Classification of Students in a Web based Learning Environment
Mohit Shroff
Dept. of Computer Engineering, VESIT, Mumbai, India.
Prashant Kanade
Asst. Professor, Dept. of Computer Engineering, VESIT, Mumbai, India.
Prashant Zaware
Dept. of Computer Engineering, VESIT, Mumbai, India.
ABSTRACT
Predicting academic performance and monitoring student progress in a web based learning environment is a critical issue. In this paper, the K-means clustering algorithm is implemented to predict student performance at the end of the semester. The results can help the course instructor understand where to reform the syllabus, thereby increasing the chances of a higher score for lagging students. Higher education institutes offering distance learning courses through the web can use this data mining model to identify which areas of their courses can be improved to achieve higher student marks.
General Terms
Pattern Recognition, Data Mining, Algorithms.
Keywords
Web based learning, performance measures, K-means.
1. INTRODUCTION
The proliferation of data use in application areas such as banking, fraud detection, insurance and medicine is the result of powerful, affordable and sustainable database systems that can easily be scaled to collect and generate millions of datasets over a period of time.
Nowadays, a promising frontier of database applications [1] and analytics is Data Mining. Data Mining is the process of extracting useful knowledge and information, including patterns, associations, clusters and anomalies, from large volumes of data stored in data warehouses or other information repositories. The capability of data mining to predict behavior [2], and thus to classify students in a web based learning environment, is the crux of this paper.
In this paper, data mining techniques are used to classify students [3] in a web based environment and to predict their final scores from their previous academic history. The prediction will help course instructors enhance, modify or remodel their syllabus to improve the students' scores in the final exams.
A clustering algorithm [4,7] has been adapted to predict student performance. One of the main goals in applying the clustering method was to group students so that students within the same cluster exhibit the most similar behavior and students in different clusters exhibit the most dissimilar behavior.
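The grouping idea described above can be sketched with a plain K-means loop. This is a minimal illustration and not the implementation used in this work: the function name `kmeans`, the seeding of centroids with the first k points, the Euclidean distance and the sample feature vectors are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], k: int,
           max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """Basic K-means (Lloyd's algorithm); returns (centroids, labels)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: seed the centroids with the first k points so the
    # sketch is deterministic; real runs usually use random seeding.
    centroids = [tuple(p) for p in points[:k]]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the result is stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, labels


# Hypothetical two-feature vectors (e.g. two normalized scores per student).
sample = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
centroids, labels = kmeans(sample, k=2)
```

Students with the same label fall in the same cluster; the centroid of each cluster summarizes the typical behavior of its members.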
Five performance measures unique to this model were devised, and a normalized score was obtained for each performance measure by assigning it a weightage.
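One plausible way to combine weightages with normalization is sketched below. It is an assumption-laden illustration, not the scoring scheme of this model: scaling each raw value by a per-measure maximum, requiring the weightages to sum to 1, the function name `weighted_scores` and the placeholder measure names are all hypothetical.

```python
from typing import Dict


def weighted_scores(raw: Dict[str, float],
                    maxima: Dict[str, float],
                    weights: Dict[str, float]) -> Dict[str, float]:
    """Scale each measure to [0, 1] by its maximum, then apply its weightage."""
    # Assumption: weightages are fractions that sum to 1.
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weightages must sum to 1")
    return {
        name: weights[name] * raw[name] / maxima[name]
        for name in weights
    }


# Placeholder measures and values, invented for the example only.
raw = {"measure_a": 5.0, "measure_b": 50.0}
maxima = {"measure_a": 10.0, "measure_b": 100.0}
weights = {"measure_a": 0.4, "measure_b": 0.6}

scores = weighted_scores(raw, maxima, weights)
overall = sum(scores.values())  # single combined score per student
```

Under this scheme every weighted score is bounded by its weightage, so the combined score stays between 0 and 1 and can be compared across students.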
2. PERFORMANCE INDICATORS
The ideal course time prescribed by the instructor was 45 hours, or 2700 minutes, over the semester.
Cluster analysis was used in our model to identify patterns in the data warehouse and to make predictions from them. The data warehouse consists of 100 records, i.e. the data of 100 students were used to extract definitive patterns.
The different entities and the attributes of each entity are listed below:
Student (Sid, sex, age, disabled, birthplace, curr_city, total_time, courses_count)
where total_time indicates the time spent by the student in one particular course.
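As an illustration, the Student entity can be held in a small record type, and total_time can be compared against the ideal course time of 2700 minutes stated at the start of this section. The field types, the assumption that total_time is recorded in minutes, the lowercase `sid`, and the helper `time_ratio` are hypothetical choices for this sketch and are not taken from the model itself.

```python
from dataclasses import dataclass

# Ideal course time from the text: 45 hours = 2700 minutes.
IDEAL_COURSE_MINUTES = 45 * 60


@dataclass
class Student:
    sid: int
    sex: str
    age: int
    disabled: bool
    birthplace: str
    curr_city: str
    total_time: float   # assumption: minutes spent in one course
    courses_count: int

    def time_ratio(self) -> float:
        """Fraction of the ideal course time the student actually spent."""
        return self.total_time / IDEAL_COURSE_MINUTES


# Invented sample record, for demonstration only.
student = Student(sid=1, sex="F", age=20, disabled=False,
                  birthplace="Pune", curr_city="Mumbai",
                  total_time=1350.0, courses_count=3)
ratio = student.time_ratio()
```

A ratio below 1 means the student spent less than the prescribed time on the course.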
Assignments (sid, assgn_given, assgn_completed, total_time_allocated, total_time_taken)
where assgn_given indicates the total number of assignments the student was expected to complete and assgn_completed is the number of assignments the student actually managed to submit. total_time_allocated is based on the number of assignments the student submitted. Students were encouraged to utilize all the days allocated for completion of assignments. Each assignment was to be completed over a period of three