Extract Metadata Action

Description

This Action retrieves the metadata of the default Document Resource from the Resource Service. It extracts the configured metadata properties based on the mimetype of the Document Resource and stores them as Document Resource Properties. If configured, a default section is used for mimetypes that don't have a specific configuration.

Java Class

The Action is implemented by the Java class de.ecm4u.faw.api.impl.ExtractMetadataAction.

Parameters

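  • ExtractMetadataAction.mimetypeToMetadata: A mapping from mimetype to the metadata properties that shall be extracted. For each mimetype it lists, per section of the Resource Service response, which property is stored under which Document Resource Property name.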

Example Parameters

This parameter defines a metadata mapping for two mimetypes, application/pdf and message/rfc822. Depending on the mimetype of the default Document Resource, one of these mappings is chosen. The metadata is retrieved from the Resource Service and parsed according to the mapping for that mimetype. Since the response of the Resource Service contains sections (connector, content, ...), the mapping specifies which property in which section shall be saved as which Document Resource Property.

Given a response from the Resource Service like this ...

{
    "subresources": {},
    "connector": {
        "name": "lorem-426.pdf",
        "modified": "2018-07-19T17:57:04.855+02:00",
        "size": 24254,
        "title": ""
    },
    "resource": {
        "backend": "http://alf-sbcs.ecm4u.intra/",
        "mimetype": "application/pdf",
        "id": 768,
        "defaultRendition": "previewHtml"
    },
    "content": {
        "producer": "LibreOffice 4.2",
        "creator": "Writer",
        "creationDate": "2015-10-01T12:33:47.000+0200"
    }
}

this parameter can be used ...

ExtractMetadataAction.mimetypeToMetadata:
  application/pdf:
    connector:
      name: resource_name
      title: resource_title
      size: resource_size
    content:
      producer: resource_producer
  message/rfc822:
    connector:
      name: resource_name
      title: resource_title
      size: resource_size
    content:
      subject: resource_subject
      msgId: resource_msgid
      from: resource_from
      to: resource_to
      received: resource_received
      sent: resource_sent

to save these Document Resource Properties:

  • resource_name = "lorem-426.pdf"
  • resource_title = ""
  • resource_size = 24254
  • resource_producer = "LibreOffice 4.2"
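
The lookup described above can be sketched in a few lines of Java. This is only an illustration of the mapping logic, not the code of de.ecm4u.faw.api.impl.ExtractMetadataAction: the class MetadataMappingSketch, its methods and the defaultKey argument are hypothetical names, and the sketch assumes that the response and the parameter have already been parsed into nested maps. The key under which a default section is configured is not stated here, so it is passed in by the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not the actual ExtractMetadataAction implementation.
final class MetadataMappingSketch {

    // Picks the mapping configured for the mimetype. If there is none, falls back
    // to the entry stored under defaultKey (assumed name, supplied by the caller).
    static Map<String, Map<String, String>> selectMapping(
            Map<String, Map<String, Map<String, String>>> config,
            String mimetype,
            String defaultKey) {
        Map<String, Map<String, String>> mapping = config.get(mimetype);
        if (mapping == null && defaultKey != null) {
            mapping = config.get(defaultKey);
        }
        return mapping;
    }

    // Copies every mapped property found in the response to its target property name.
    static Map<String, Object> extract(
            Map<String, Map<String, Object>> response,
            Map<String, Map<String, String>> mapping) {
        Map<String, Object> properties = new LinkedHashMap<>();
        if (mapping == null) {
            return properties; // no mapping and no default: nothing to extract
        }
        for (Map.Entry<String, Map<String, String>> section : mapping.entrySet()) {
            Map<String, Object> values = response.get(section.getKey());
            if (values == null) {
                continue; // section is missing in the response
            }
            for (Map.Entry<String, String> rule : section.getValue().entrySet()) {
                if (values.containsKey(rule.getKey())) {
                    properties.put(rule.getValue(), values.get(rule.getKey()));
                }
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        // Relevant parts of the example response shown above.
        Map<String, Object> connector = new LinkedHashMap<>();
        connector.put("name", "lorem-426.pdf");
        connector.put("title", "");
        connector.put("size", 24254);
        Map<String, Object> resource = new LinkedHashMap<>();
        resource.put("mimetype", "application/pdf");
        Map<String, Object> content = new LinkedHashMap<>();
        content.put("producer", "LibreOffice 4.2");
        Map<String, Map<String, Object>> response = new LinkedHashMap<>();
        response.put("connector", connector);
        response.put("resource", resource);
        response.put("content", content);

        // The application/pdf part of the example parameter.
        Map<String, String> connectorRules = new LinkedHashMap<>();
        connectorRules.put("name", "resource_name");
        connectorRules.put("title", "resource_title");
        connectorRules.put("size", "resource_size");
        Map<String, String> contentRules = new LinkedHashMap<>();
        contentRules.put("producer", "resource_producer");
        Map<String, Map<String, String>> pdfMapping = new LinkedHashMap<>();
        pdfMapping.put("connector", connectorRules);
        pdfMapping.put("content", contentRules);
        Map<String, Map<String, Map<String, String>>> config = new LinkedHashMap<>();
        config.put("application/pdf", pdfMapping);

        String mimetype = (String) response.get("resource").get("mimetype");
        // No default section is configured in this example, so defaultKey is null.
        System.out.println(extract(response, selectMapping(config, mimetype, null)));
    }
}
```

Running main prints the four Document Resource Properties listed above. Properties that are mapped but absent from the response are skipped in this sketch; whether the real Action skips them or stores an empty value is not specified here.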