Using Automatic Metadata Extraction to Build a Structured Syllabus Repository
Syllabi are important documents created by instructors for students. Students use syllabi to find information and to prepare for class. Instructors often need to find similar syllabi from other instructors to prepare new courses or to improve their existing ones. Thus, gathering freely available syllabi and creating useful services on top of the collection will yield a digital library of value to the educational community. However, gathering syllabi and building a repository from them is complicated by the unstructured nature of syllabus representation and the lack of a unified vocabulary in syllabus construction. In this paper, we propose an intelligent approach to automatically annotate freely available syllabi from the Web, benefiting the educational community by supporting services such as semantic search. We discuss our detailed process for converting unstructured syllabi to structured representations through effective information recognition, segmentation, and classification. Our evaluation results demonstrate the effectiveness of our extractor and also suggest a few aspects in need of improvement. We hope our reported work will also benefit people who are interested in building other genre-specific repositories.
|Main Author:||Yu, Xiaoyan|
|Other Authors:||Fan, Weiguo; Perez-Quinones, Manuel; Fox, Edward; Cameron, William; Teng, GuoFang; Cassel, Lillian|