Programming Images: Using WIC to extract metadata

March 27, 2012 | By Karthik Srinivasan | By Devs, For Devs, Technology

Not surprisingly, a lot of image handling goes on at Getty Images. On the technology side, our image tools span multiple programming languages, frameworks, tools and applications. Most .NET developers are probably used to working with the System.Drawing namespace for common image processing needs like resizing and cropping.

(Photo by Chris Pritchard/Photodisc/Getty Images)

However, there is a catch with System.Drawing that some developers miss, which makes the namespace ill-suited for backend and server processing. It is highlighted on MSDN, and it can be catastrophic given the massive amount of content which goes through our systems:

Classes within the System.Drawing namespace are not supported for use within a Windows or ASP.NET service. Attempting to use these classes from within one of these application types may produce unexpected problems, such as diminished service performance and run-time exceptions. For a supported alternative, see Windows Imaging Components.

Keeping the possibility of performance degradation in mind, we looked around for an image processing library when building a new .NET image processing service. We ended up using Windows Imaging Components (WIC) as it has native support within the .NET framework, provides easy integration, and does exactly what we need it to do. WIC was introduced alongside WPF as its underlying image-processing technology. For those interested in learning more, check out the MSDN topic about Windows Imaging Components. We are using WIC for metadata extraction and some image processing on some (but not all) of our internal systems.

This post will focus on how you can extract metadata from an image using WIC.  Image metadata is extremely important for us as it provides a mechanism for applying context to the vast number of images we handle every day.

WIC, by default, includes several built-in codecs:

Codec | MIME types
BMP (Windows Bitmap Format), BMP Specification v5 | image/bmp
GIF (Graphics Interchange Format 89a), GIF Specification 89a/89m | image/gif
JPEG (Joint Photographic Experts Group), JFIF Specification 1.02 | image/jpeg, image/jpe, image/jpg
PNG (Portable Network Graphics), PNG Specification 1.2 | image/png
TIFF (Tagged Image File Format), TIFF Specification 6.0 | image/tiff, image/tif

The BitmapMetadata class provides support for reading and writing metadata to and from a bitmap image.  Each supported bitmap image format handles metadata differently, but the facility for reading and writing metadata is the same.  This allows us to handle different formats with a clean, uniform interface.

WIC supports the following image metadata schemas: Exchangeable image file (Exif), tEXt (PNG Textual Data), image file directory (IFD), International Press Telecommunications Council (IPTC), and Extensible Metadata Platform (XMP). If a BitmapMetadata is exposed by a BitmapFrame that is obtained by using a BitmapDecoder, it is read-only by default and any operation that tries to modify it will throw an exception. If it is exposed by a BitmapFrame that wraps another BitmapSource, it is mutable on construction. The GetQuery and SetQuery methods can be used to read and construct metadata queries.

Here’s a sample of reading an image and pulling out all metadata elements in a simple Key/Value list:

    using System.Collections.Generic;
    using System.IO;
    using System.Windows.Media.Imaging;

    namespace ImageMetadata
    {
        // One metadata entry: the full query path and the value stored there.
        public class RawMetadata
        {
            public string Key { get; set; }
            public object Value { get; set; }
        }

        public class MetadataProcessing
        {
            private readonly List<RawMetadata> _rawMetadataItems = new List<RawMetadata>();

            public List<RawMetadata> GetMetadata(string imagePath)
            {
                // Read access is enough here, so read-only files can be opened too.
                using (Stream fs = File.Open(imagePath, FileMode.Open, FileAccess.Read))
                {
                    BitmapMetadata bitmapMetaData = null;
                    var photoDecoder = BitmapDecoder.Create(fs,
                        BitmapCreateOptions.DelayCreation, BitmapCacheOption.None);
                    try
                    {
                        bitmapMetaData = photoDecoder.Frames[0].Metadata as BitmapMetadata;
                        // Metadata can be null (not every format carries it), so check before freezing.
                        if (bitmapMetaData != null)
                            bitmapMetaData.Freeze(); // this makes the metadata unmodifiable
                    }
                    finally
                    {
                        // Release the dispatcher associated with the decoder's thread.
                        if (photoDecoder != null && photoDecoder.Dispatcher != null
                            && photoDecoder.Dispatcher.Thread.IsAlive)
                            photoDecoder.Dispatcher.InvokeShutdown();
                        photoDecoder = null;
                    }
                    if (bitmapMetaData != null)
                        Extract(bitmapMetaData, string.Empty);
                }
                return _rawMetadataItems;
            }

            // Walks the metadata tree recursively, recording every query path and its value.
            private void Extract(BitmapMetadata bitmapMetadata, string query)
            {
                foreach (string relativeQuery in bitmapMetadata)
                {
                    string fullQuery = query + relativeQuery;
                    var metadataQueryReader = bitmapMetadata.GetQuery(relativeQuery) ?? string.Empty;
                    var metadataItem = new RawMetadata();
                    metadataItem.Key = fullQuery;
                    metadataItem.Value = metadataQueryReader;
                    _rawMetadataItems.Add(metadataItem);
                    // A nested BitmapMetadata is a sub-tree: descend into it.
                    var innerBitmapMetadata = metadataQueryReader as BitmapMetadata;
                    if (innerBitmapMetadata != null)
                        Extract(innerBitmapMetadata, fullQuery);
                }
            }
        }
    }

For the above code to work, the project must reference the following assemblies: PresentationCore, WindowsBase, and System.Xaml.
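Because each Value in the list is a plain object, printing the results takes a little care. The helper below is a minimal sketch of one way to turn an entry into a readable line. MetadataDump, FormatValue and FormatLine are hypothetical names that are not part of WIC, and the sketch assumes that multi-valued fields may come back as arrays.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper (not part of WIC): renders a metadata key/value pair as text.
public static class MetadataDump
{
    // Turns a single metadata value into a printable string.
    public static string FormatValue(object value)
    {
        if (value == null)
            return string.Empty;

        var text = value as string;
        if (text != null)
            return text;

        // Assumption: multi-valued fields may arrive as arrays; join their elements.
        var array = value as Array;
        if (array != null)
        {
            var parts = new List<string>();
            foreach (object item in array)
                parts.Add(FormatValue(item));
            return "[" + string.Join(", ", parts) + "]";
        }

        // Anything else (numbers, dates, nested objects) falls back to ToString.
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }

    // Formats one entry as "key = value".
    public static string FormatLine(string key, object value)
    {
        return key + " = " + FormatValue(value);
    }
}
```

For example, each item returned by GetMetadata could be printed with Console.WriteLine(MetadataDump.FormatLine(item.Key, item.Value)). A nested BitmapMetadata value falls through to ToString and prints its type name; its children already appear as separate entries in the list.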

Handling Custom Properties

WIC provides the BitmapMetadata.GetQuery("query") method, which returns the value of a specific metadata property. The code above iterates dynamically through all properties. However, if you know the path to the IPTC or XMP field you want to access, you can skip the loop and query that path directly. The following are some examples of paths to XMP and IPTC properties:

XMP:

    { MetaDataField.CaptionWriter, "/xmp/photoshop:CaptionWriter" },
    { MetaDataField.CreateDate, "/xmp/photoshop:DateCreated" },
    { MetaDataField.Country, "/xmp/photoshop:Country" },
    { MetaDataField.BylineName, "/xmp/dc:creator" },
    { MetaDataField.Caption, "/xmp/dc:description" }

IPTC:

    { MetaDataField.CaptionWriter, @"/app13/irb/8bimiptc/iptc/Writer\/Editor" },
    { MetaDataField.CreateDate, "/app13/irb/8bimiptc/iptc/Date Created" },
    { MetaDataField.Source, "/app13/irb/8bimiptc/iptc/Source" },
    { MetaDataField.City, "/app13/irb/8bimiptc/iptc/City" },
    { MetaDataField.Caption, "/app13/irb/8bimiptc/iptc/Caption" },
    { MetaDataField.Credit, "/app13/irb/8bimiptc/iptc/Credit" },
    { MetaDataField.Headline, "/app13/irb/8bimiptc/iptc/Headline" },
    { MetaDataField.State, @"/app13/irb/8bimiptc/iptc/Province\/State" }

With a known path, a specific field can be read directly by passing the path to GetQuery on the BitmapMetadata object, for example:

C#: object headline = bitmapMetaData.GetQuery("/app13/irb/8bimiptc/iptc/Headline");
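When several known paths are needed at once, a small lookup table avoids walking the whole metadata tree. The following is a minimal sketch of that idea, not the code we run in production. KnownFieldReader and ReadFields are hypothetical names, and the query function is passed in as a delegate so the helper does not depend on WPF. It assumes the query function returns null for a path that is absent from the image; if GetQuery throws for your format instead, wrap the call accordingly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: reads a fixed set of named metadata paths in one pass.
public static class KnownFieldReader
{
    // getQuery:    function that returns the value stored at a query path
    //              (for example the GetQuery method of a BitmapMetadata).
    // pathsByName: friendly field name -> metadata query path.
    // Returns only the fields for which a non-null value was found.
    public static Dictionary<string, object> ReadFields(
        Func<string, object> getQuery,
        IDictionary<string, string> pathsByName)
    {
        if (getQuery == null) throw new ArgumentNullException("getQuery");
        if (pathsByName == null) throw new ArgumentNullException("pathsByName");

        var found = new Dictionary<string, object>();
        foreach (KeyValuePair<string, string> entry in pathsByName)
        {
            // Assumption: a missing path yields null rather than an exception.
            object value = getQuery(entry.Value);
            if (value != null)
                found[entry.Key] = value;
        }
        return found;
    }
}
```

With a BitmapMetadata in hand, the call would look like KnownFieldReader.ReadFields(bitmapMetaData.GetQuery, paths), where paths maps a name such as "Headline" to "/app13/irb/8bimiptc/iptc/Headline". Passing the query function as a delegate also makes the helper easy to exercise with a fake in unit tests.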

More XMP and IPTC tags can be found here:

XMP: http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html

IPTC: http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/IPTC.html

  • eric

    How can I read the IPTC EnvelopeRecord Tags using the above query?

    i.e. I would like to read the following via the query above:

    EnvelopeRecordVersion, Destination, FileFormat, FileVersion, ServiceIdentifier, EnvelopeNumber, ProductID, EnvelopePriority,

    DateSent, TimeSent, CodedCharacterSet, UniqueObjectName, ARMIdentifier, ARMVersion

  • eric

    I guess my question is what’s the “Query Path” I should use to get/set the IPTC EnvelopeRecord.

    This does NOT work for the Envelope tag:
    /app13/irb/8bimiptc/iptc/EnvelopeRecordVersion