Content ExtRactor and MINEr
CERMINE is a Java library and web service that extracts metadata and content from born-digital scientific articles. It analyses the content of a PDF file and attempts to extract information such as:
- Title of the article
- Journal information (title, etc.)
- Bibliographic information (volume, issue, page numbers, etc.)
- Authors and affiliations
- Keywords
- Abstract
- Bibliographic references
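
Because extraction is best-effort, any of the items above may be absent for a given PDF. The following is a minimal sketch of how calling code might hold the listed fields and check which ones came back empty. The `ArticleMetadata` class, its field names, and `missingFields()` are hypothetical and written only for illustration; they are not part of CERMINE's API, and the values in `Demo` are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical container mirroring the list above; NOT a CERMINE class.
final class ArticleMetadata {
    final String title;
    final String journalTitle;
    final String volume;
    final String issue;
    final String pages;
    final List<String> authors;
    final List<String> affiliations;
    final List<String> keywords;
    final String articleAbstract;
    final List<String> references;

    ArticleMetadata(String title, String journalTitle, String volume, String issue,
                    String pages, List<String> authors, List<String> affiliations,
                    List<String> keywords, String articleAbstract, List<String> references) {
        this.title = title;
        this.journalTitle = journalTitle;
        this.volume = volume;
        this.issue = issue;
        this.pages = pages;
        this.authors = authors;
        this.affiliations = affiliations;
        this.keywords = keywords;
        this.articleAbstract = articleAbstract;
        this.references = references;
    }

    private static boolean isEmpty(String s) {
        return s == null || s.trim().isEmpty();
    }

    private static boolean isEmpty(List<String> l) {
        return l == null || l.isEmpty();
    }

    // Returns the names of fields that are null or empty, in declaration order.
    List<String> missingFields() {
        List<String> missing = new ArrayList<>();
        if (isEmpty(title)) missing.add("title");
        if (isEmpty(journalTitle)) missing.add("journalTitle");
        if (isEmpty(volume)) missing.add("volume");
        if (isEmpty(issue)) missing.add("issue");
        if (isEmpty(pages)) missing.add("pages");
        if (isEmpty(authors)) missing.add("authors");
        if (isEmpty(affiliations)) missing.add("affiliations");
        if (isEmpty(keywords)) missing.add("keywords");
        if (isEmpty(articleAbstract)) missing.add("abstract");
        if (isEmpty(references)) missing.add("references");
        return missing;
    }
}

class Demo {
    public static void main(String[] args) {
        // Placeholder values standing in for an extraction result.
        ArticleMetadata m = new ArticleMetadata(
                "An Example Title", null, "12", "3", "1-10",
                Arrays.asList("A. Author"), Collections.<String>emptyList(),
                Arrays.asList("pdf"), "", Arrays.asList("Example reference"));
        System.out.println("Missing fields: " + m.missingFields());
    }
}
```

Treating every field as optional keeps downstream code from assuming that a title, abstract, or reference list is always present.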