Dealing with Missing/Incorrect Meta-Data
The Library of Congress makes a huge amount of historical data accessible through its American Memory collections. These collections include old videos, manuscripts, books, pictures, and web archives. Collection items are usually accompanied by meta-data, which may be incorrect, missing, or mutually inconsistent. Poor meta-data leads to low-quality search results, so one of the important directions CADS pursues is remediating such problems automatically. The projects completed in this area are listed below; they take different approaches to meta-data remediation. Chrono/Geo metadata remediation packages are available here for download, and more packages are coming soon.
- Matching Wikipedia with LCSH
- By David Gleich and Ying Wang
- publication
- David's presentation
- Ying's presentation/spring 2008
- Disambiguation Framework for Proper Names
- By Xiangrui Meng and Ying Wang
- Unified Framework presentation/summer 2009
- Geo Name presentation/spring 2008
- Geo/Chrono Metadata Remediation Using External Resources
- By Xiangrui Meng and Ying Wang
- Chrono meta-data/summer 2009
- Geo meta-data/summer 2008
- Comparison of Controlled vs Open Authority for Metadata Remediation
- By Xiangrui Meng and Ying Wang
- Chrono meta-data/summer 2009
- Automatic Keyword/Title Generation
- By Farnaz Ronaghi
- presentation/summer 2009
- presentation/winter 2010
- Automatic Extraction of Chronological and Geographic References
- By Xiangrui Meng and Ying Wang
- presentation/winter 2010


