The Workbench will allow learning scientists to:
1) define and modify behavior categories of interest (e.g., gaming, unresponsiveness, off-task conversation, help avoidance). The Workbench will also support researchers in automatically re-labeling data when labeling schemes change.
2) label previously collected educational log data with the categories of interest, considerably faster than is possible through previous live observation or existing data labeling methods, through a “Customized Log Action Viewer” (CLAV) (see Figure 1).
3) collaborate with others in labeling data by providing tools to communicate and document labeling guidelines and standards.
4) validate inter-rater reliability between multiple labelers of the same educational log data corpus.
5) analyze textual data (e.g., chat) in collaborative learning situations by integrating a text categorization tool such as TagHelper (e.g., Rosé et al., 2008).
6) automatically distill additional information from log files for use in machine learning, such as estimates of student knowledge and context about student response time (i.e., how much faster or slower the student's action was than the average for that problem step).
7) provide support for directly and immediately running the labeled data through a machine-learning tool, such as WEKA or RapidMiner.
8) produce code that can be used to immediately transfer the detectors generated by the EDM Workbench and machine-learning tool into educational software that can use the detectors to react to student metacognitive and motivational behavior in real time.
9) export resultant models of student behavior to tools which enable sophisticated secondary analyses, such as the sequential pattern analysis offered by Jeong’s (2003) DAT.
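The inter-rater reliability check in point 4 is typically operationalized with an agreement statistic such as Cohen's kappa, which corrects observed agreement for the agreement expected by chance. A minimal sketch of that computation for two labelers of the same log data corpus (the function name and label values are illustrative, not part of the Workbench):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same sequence of clips.

    labels_a, labels_b: equal-length lists of category labels
    (e.g., "gaming", "on-task") assigned by each rater.
    """
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed proportion of clips on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b.get(c, 0) for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

Values near 1 indicate strong agreement beyond chance; values near 0 indicate agreement no better than chance, signaling that the labeling guidelines (point 3) need refinement.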
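The response-time context described in point 6 can be distilled by standardizing each action's duration against all recorded actions on the same problem step. A minimal sketch, assuming log rows of the form (student, step, duration in seconds); the function and field names are hypothetical, not the Workbench's actual API:

```python
from statistics import mean, stdev

def response_time_features(log_rows):
    """Compute, for each logged action, how many standard deviations its
    duration lies above or below the average for that problem step.

    log_rows: iterable of (student_id, step_id, duration_seconds).
    Returns a list of (student_id, step_id, z_score) in input order.
    """
    # Group durations by problem step.
    by_step = {}
    for _, step, duration in log_rows:
        by_step.setdefault(step, []).append(duration)
    # Per-step mean and standard deviation (0.0 when only one observation).
    stats = {step: (mean(ds), stdev(ds) if len(ds) > 1 else 0.0)
             for step, ds in by_step.items()}
    features = []
    for student, step, duration in log_rows:
        m, sd = stats[step]
        z = (duration - m) / sd if sd > 0 else 0.0
        features.append((student, step, z))
    return features
```

A strongly negative z-score (much faster than typical for the step) is one of the classic inputs to gaming-the-system detectors, which is why the Workbench distills it automatically for the machine-learning stage in point 7.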
Through the use of a tool such as the one proposed here, the process of developing a detector of relevant metacognitive and motivational behaviors can be sped up by as much as a factor of 100; that is, a detector can be developed in about 1% of the time previously required. The use of “text replays” (cf. Baker, Corbett, & Wagner, 2006; Baker & de Carvalho, 2008), a visualization technique similar to but much less flexible than the visualizations given by CLAV (point 2 above), on previously collected log data has alone been shown to speed detector development by about 40 times, with no reduction in detector accuracy. However, text replays do not provide any of the other capabilities listed above.