Invitation to Educational Data Mining Workshop with Dr. Joseph E. Beck

The Department of Information Systems and Computer Science cordially invites faculty and students to a workshop with Dr. Joseph E. Beck of Worcester Polytechnic Institute.

This workshop consists of a set of three talks:
   1. Thrashing: Failure of Students to Learn Material in a Timely Manner
   2. WEBsistments: Integrating Procedural Practice and Web-based Content
   3. Too many results! Focusing on Strong, Causal Relations in the Data

28 May 2012, 9:00am to 5:00pm
29 May 2012, 9:00am to 12:00nn
Registration starts at 8:00am. Workshop begins promptly at 9:00am.

Ching Tan Room (SOM 111), John Gokongwei School of Management,
Ateneo de Manila University, Quezon City

Participants must register online at
Strictly no walk-ins.
The limited slots will be allotted on a first-come first-served basis.
Confirmation of registration will be sent via e-mail.
Registration closes on Friday, 25 May 2012.

Php 500.00 for the entire workshop
This fee covers three snacks, certificates, and some miscellaneous expenses.
Lunch is NOT included.
Payments must be made in cash by either of the following methods:
   a. Metrobank deposit (please see bank deposit instructions)
   b. upon arrival on the first day of the workshop

For further inquiries, please contact the Workshop Secretariat at


Thrashing: Failure of Students to Learn Material in a Timely Manner
Many intelligent tutoring systems operate under the mantra of “practice makes perfect.” That is, students acquire mastery by practicing problem solving. Although this idea holds intuitive appeal, many students using such software do not in fact acquire the knowledge one would expect. Depending on the exact criteria used, between 10% and 35% of students fail to master skills in a timely manner. For these students, practice does not make perfect. Since problem-solving practice fails for these students, our goal is to identify them as quickly as possible and find an alternative intervention for them. We constructed a detector that, after a student has completed two problems, is able to achieve an R² of 0.39. This detector has a low false positive rate (1.6%), so it is appropriate for triggering an intervention, but it has a high false negative rate (73.9%), so it misses many students who will in fact thrash. In addition, we found that thrashing is connected with student gaming behaviors, as students who would game also registered 8.6 times as high on our gaming detector. The causal path between thrashing and gaming is a subject of future work, and this talk will discuss future and ongoing experiments to tease apart these issues.
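As a reading aid for the two error rates quoted in this abstract, the following minimal Python sketch shows how a false positive rate and a false negative rate are computed from a detector's confusion counts. The function names and every count in it are hypothetical, chosen only for illustration; they are not data from Dr. Beck's study.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Share of students who do NOT thrash but are flagged by the detector."""
    return false_positives / (false_positives + true_negatives)


def false_negative_rate(false_negatives: int, true_positives: int) -> float:
    """Share of students who DO thrash but are missed by the detector."""
    return false_negatives / (false_negatives + true_positives)


# Hypothetical counts, for illustration only (not results from the talk).
fpr = false_positive_rate(false_positives=16, true_negatives=984)
fnr = false_negative_rate(false_negatives=74, true_positives=26)

print(f"false positive rate: {fpr:.1%}")
print(f"false negative rate: {fnr:.1%}")
```

A low false positive rate means an intervention is rarely triggered for a student who did not need it, while a high false negative rate means many students who needed one are never flagged.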

WEBsistments: Integrating Procedural Practice and Web-based Content
Many computer tutors provide very strong environments for computer-assisted problem-solving practice. These tutors coach students by giving performance feedback or by enabling the student to solve complex problems step by step. However, many tutors do not provide declarative instruction on the subject matter. This lack is understandable, as the skills needed to create instructional content differ from those needed to model student performance and provide coaching during problem solving. In addition, computer tutors are expensive environments to construct, and spending even more resources on them can be prohibitive. This paper introduces an approach to cheaply integrate problem-solving practice and declarative content. We extend our tutor to include links to existing web pages, created by external parties, that teach the 150 skills covered by our tutor. When students are stuck on a question, they have the option of requesting a web page that teaches them how to solve it. We found that students were generally reluctant to use this functionality, but those who did appeared to learn from using the web resources. Since we, the tutor designers, simply had to find effective resources on the web rather than create our own, construction costs are greatly reduced.

Too many results! Focusing on Strong, Causal Relations in the Data
Advances in storage and networking have led to an explosion of data available for analysis. This trend has had many positive impacts, and has greatly extended the scope and quality of analyses performed on data collected by intelligent tutoring systems. As we collect more types of data, researchers can test many more hypotheses than was previously possible. In addition, an increase in sample sizes results in greater statistical power, enabling greater sensitivity to detect small effects in the data. Although these advances have brought great benefits, there is also a definite cost: an increase in the number of analyses that are reportable due to statistical “significance,” but are of marginal utility and may even be false. The reason for concern is arithmetic. First, as the number of variables collected grows, the number of testable relationships increases as variables², since each new variable can be tested against all of the existing variables in the database. Second, the ability to detect statistically significant effects increases according to sqrt(rows). These two effects are additive, and result in a vast increase in the number of significant relationships one can discover from the collected data. The problem arises when one considers the number of useful relationships in the data. Many, many variables will correlate with each other just due to random chance, or merely because they share a common cause. Discovering all of these chance associations is not exciting from a research standpoint, but by community standards, such results would be publishable, and it is not always immediately obvious from statistical hypothesis testing which results are of interest and which are not. Simply put, we do not want to be in a community where researchers are reporting every effect they discover that has a small p-value.
In this talk we discuss two better methods for finding relationships in the data that are of broader use to the community: those that are of high magnitude, and those that are causal rather than merely relational.
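The arithmetic behind this concern can be illustrated with a short Python sketch. It is not taken from the talk: the function names, the 0.05 significance threshold, and the example variable counts are all assumptions made here for illustration. It counts the distinct variable pairs available for testing, the number of pairs expected to look "significant" by chance alone if no real relationship existed, and how the standard error shrinks with the square root of the number of rows.

```python
from math import comb, sqrt


def pairwise_tests(num_variables: int) -> int:
    """Number of distinct variable pairs that could be tested against each other."""
    return comb(num_variables, 2)


def expected_chance_hits(num_variables: int, alpha: float = 0.05) -> float:
    """Expected number of 'significant' pairs if every null hypothesis were true.

    alpha is an assumed significance threshold, not a value from the talk.
    """
    return alpha * pairwise_tests(num_variables)


def relative_standard_error(rows: int, baseline_rows: int) -> float:
    """Standard error relative to a baseline sample, using 1/sqrt(rows) scaling."""
    return sqrt(baseline_rows / rows)


# Hypothetical database sizes: pairs grow roughly with the square of the variables.
for num_variables in (10, 100, 1000):
    print(num_variables, pairwise_tests(num_variables), expected_chance_hits(num_variables))

# Quadrupling the rows halves the standard error, so smaller effects become detectable.
print(relative_standard_error(rows=400, baseline_rows=100))
```

Under these assumptions, a tenfold increase in variables yields roughly a hundredfold increase in testable pairs, which is why a small p-value alone says little about whether a relationship is useful.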


Dr. Beck received his PhD in Computer Science from the University of Massachusetts Amherst in May 2001. He has worked with the Computer Science Department of Worcester Polytechnic Institute since September 2007, first as a Research Scientist and then, since July 2009, as an Assistant Professor. His areas of specialization include intelligent tutoring systems, educational data mining, and artificial intelligence.
