CiteReader
>> Identify, Quantify and Analyze: ITX CiteReader is a systematic process that scans documents in Portable Document Format (PDF), captures their references, and links them together in a database. This citation data can be used for a variety of purposes, including:
- Identifying the most important works in a given collection.
- Analyzing patterns in citations.
- Providing hyperlinks to help researchers quickly move through a chain of research.
>> Systematic Process: The CiteReader process consists of four steps:
- Capture - Scans PDF documents for the reference section.
- Verify - A workflow application in which human reviewers efficiently confirm the captured references.
- Parse - Divides the captured references into smaller fields (author, title, etc.).
- Match - Establishes a connection between the referring and cited documents either within the database or externally.
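The four steps above can be illustrated with a minimal Python sketch. It is not the ITX implementation: it assumes the PDF text has already been extracted to a string, that references follow a simple "Author (Year). Title." layout, and that matching is done on normalized titles. The names `Reference`, `capture`, `verify`, `parse`, `match`, and `approve` are hypothetical and chosen only for this illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Reference:
    raw: str
    author: str = ""
    year: str = ""
    title: str = ""
    verified: bool = False


def capture(text):
    """Capture: find a 'References' heading and collect the non-empty lines after it."""
    heading = re.search(r"^[ \t]*References[ \t]*$", text, flags=re.MULTILINE | re.IGNORECASE)
    if heading is None:
        return []
    lines = text[heading.end():].splitlines()
    return [Reference(raw=line.strip()) for line in lines if line.strip()]


def verify(refs, approve):
    """Verify: stand-in for human review; `approve` is any callable returning True or False."""
    for ref in refs:
        ref.verified = bool(approve(ref.raw))
    return [ref for ref in refs if ref.verified]


# Assumed citation layout (illustration only): "Author (Year). Title."
PATTERN = re.compile(r"^(?P<author>.+?)\s*\((?P<year>\d{4})\)\.\s*(?P<title>.+?)\.?$")


def parse(refs):
    """Parse: split each verified reference into author, year, and title fields."""
    parsed = []
    for ref in refs:
        m = PATTERN.match(ref.raw)
        if m:
            ref.author, ref.year, ref.title = m["author"], m["year"], m["title"]
            parsed.append(ref)
    return parsed


def normalize(title):
    """Lowercase a title and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def match(refs, collection):
    """Match: link each parsed reference to a document id via its normalized title."""
    index = {normalize(title): doc_id for doc_id, title in collection.items()}
    return {ref.raw: index[normalize(ref.title)]
            for ref in refs if normalize(ref.title) in index}


text = """Body of the paper.
References
Smith, J. (2001). Citation Analysis in Practice.
Doe, A. (1999). Linking Documents.
garbled line
"""
collection = {"doc-17": "Citation analysis in practice"}  # hypothetical document ids

captured = capture(text)
verified = verify(captured, approve=lambda raw: True)  # a reviewer would decide here
parsed = parse(verified)
links = match(parsed, collection)
print(len(captured), len(parsed), links)
```

In this sketch the garbled line is captured but fails to parse, and only the reference whose title exists in the collection is matched, which mirrors why each step has its own success rate.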
>> Accuracy & Quality: CiteReader delivers high accuracy at each step of the process:
- Under most conditions, it can capture and verify references stored in a "Reference" section with nearly 100% accuracy.
- About 90% of verified references can be successfully parsed.
- Matching rates depend on the quality and size of the document collection, but hit rates of 50% and beyond are possible through the use of external databases.