Metadata Extractor
The Metadata Extractor programmatically extracts preservation metadata from a range of file formats, including PDF documents, image files, sound files, Microsoft Office documents, and many others.
Metadata Extractor can be used as part of an archiving and preservation strategy for digital artefacts.
The Tool builds on the Library's work on digital preservation and its logical preservation metadata schema. It is designed to:
1. automatically extract preservation-related metadata from digital files
2. output that metadata in a standard format (XML) for use in preservation activities.
The Tool was designed for preservation processes and activities, but can also be used for other tasks, such as the extraction of metadata for resource discovery.
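The XML output described above can be illustrated with a short Java sketch. It is not the Tool's own code or schema: the class name XmlOutputSketch, the method toXml, and the element names (metadata, filename, size) are hypothetical and chosen only for illustration. It uses the standard javax.xml.stream API to write a set of extracted values as XML.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch: element names are illustrative, not the Tool's schema.
public class XmlOutputSketch {

    // Writes each key/value pair as a child element of a <metadata> root.
    public static String toXml(Map<String, String> metadata) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("metadata");
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            xml.writeStartElement(entry.getKey());
            xml.writeCharacters(entry.getValue()); // special characters are escaped
            xml.writeEndElement();
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Sample values only; a real run would take them from an extracted file.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("filename", "report.pdf");
        metadata.put("size", "1024");
        System.out.println(toXml(metadata));
    }
}
```

Keeping the output step separate from extraction means the same metadata could be written in another format without touching the code that reads the files.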
The Metadata Extract Tool includes a number of 'adapters' that extract metadata from specific file types. Adapters are currently provided for:
1. Images: BMP, GIF, JPEG and TIFF.
2. Office documents: MS Word (versions 2 and 6), WordPerfect, OpenOffice (version 1), MS Works, MS Excel, MS PowerPoint, and PDF.
3. Audio and Video: WAV and MP3.
4. Markup languages: HTML and XML.
If a file type is unknown, the tool applies a generic adapter, which extracts data that the host system 'knows' about any given file (such as size, filename, and date created).
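The kind of information a generic adapter can collect is easy to see in a small Java sketch. This is an assumption about how such an adapter might work, not the Tool's actual implementation: the class name GenericAdapterSketch and the method extract are hypothetical. It asks the host file system for the basic attributes mentioned above, using the standard java.nio.file API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: not the Tool's own classes or API.
public class GenericAdapterSketch {

    // Collects only what the host file system reports about the file;
    // the file's contents are never opened or parsed.
    public static Map<String, String> extract(Path file) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("filename", file.getFileName().toString());
        metadata.put("size", Long.toString(attrs.size()));
        // Some file systems do not record a creation time and report
        // the last-modified time here instead.
        metadata.put("created", attrs.creationTime().toString());
        return metadata;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GenericAdapterSketch <file>");
            return;
        }
        extract(Paths.get(args[0]))
                .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Because it never reads the file's contents, this approach works for any format, which is what makes it suitable as a fallback.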
Capabilities
The tool has both a Microsoft Windows interface and a UNIX command line interface. This allows files to be processed in batches for automation, or individually as required.
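The difference between batch and individual processing can be sketched in Java as follows. This does not reproduce the Tool's command line interface or options, which are not described here: the class name BatchSketch and the method collectFiles are hypothetical. Given a directory, the sketch gathers every regular file beneath it; given a single file, it returns just that file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: not the Tool's command line interface.
public class BatchSketch {

    // Returns the files to process: one file, or all regular files
    // found anywhere under a directory, in sorted order.
    public static List<Path> collectFiles(Path target) throws IOException {
        if (Files.isRegularFile(target)) {
            return Collections.singletonList(target);
        }
        try (Stream<Path> walk = Files.walk(target)) {
            return walk.filter(Files::isRegularFile)
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BatchSketch <file-or-directory>");
            return;
        }
        for (Path file : collectFiles(Paths.get(args[0]))) {
            // A real run would hand each file to an adapter here.
            System.out.println("processing " + file);
        }
    }
}
```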
Requirements:
* Java
This software is free: it can be downloaded and used at no charge.