
The Photographer’s Metadata Survival Guide – In Depth Description

Have you wondered, as I did, how to deal with metadata ? I postponed my research hundreds of times until I noticed how quickly Google search results were evolving. That was the trigger for me : I decided to learn and build my photographer’s metadata survival guide. What is photo metadata, and how can I use it efficiently in my work ?

Photo metadata describes the picture : its title, GPS coordinates, keywords, author, licensing… You can now also write metadata about potentially everything you communicate about on the web, provided the content type is described in a global dictionary. For instance, you can write metadata for a news article, a FAQ or a HowTo.

I will first dive into the different metadata types, then describe why and how I use metadata.

Before getting deeper into the subject, I would like to explain the context in which I wrote this article.

With this post, I do not intend to give an exhaustive description of the metadata world. That would be a big world… Instead, I will describe how I implement metadata with my own environment, gear and software :

  • I am a freelance photographer
  • Fujifilm cameras : X-T2 and X100F.
  • I develop my images with Capture One Pro 11.
  • I also use Photoshop CC and Lightroom CC for several tasks.
  • I use WordPress

Another point before starting with the metadata big picture :

Honestly, learning about metadata is quite fun !

I don’t know why it took me so long to try and know more about metadata…


Metadata : the big picture

What is metadata ?

Photo metadata is, first of all, information about the visual content, the rights and the administration of an image. For instance :

  • Visual Content : headline, caption, keywords.
  • Rights : the creator, copyright information, licensing of the image.
  • Administrative : creation date, color management information, creator work email, Getty references, information about a person in the image.

However, metadata is a larger topic than just photo metadata. You can also use metadata to describe your products and services. You can for instance write metadata to describe a recipe or a book, and use an image in this description. In this case, it’s not just about the image, it’s about the additional information that the image provides about the product. It’s also about all the extra information you can provide about your product.

How is metadata stored ?

There are different standards for writing metadata for an image. We will talk about Exif, IPTC, XMP and HTML.

Search engines like Google and Bing encourage web content providers to write metadata in their webpages in order to describe their content. This metadata, named structured data, gives structured information about the content by referring to a global dictionary.

Metadata can be stored in the image file itself, for instance in the JPEG or in the RAW file. Besides the pixels, these image formats reserve space for Exif, IPTC and XMP metadata.

Metadata can also be stored in sidecar files, which are files that go along with the image file. For instance, we could have MyImage.jpg and MyImage.xmp. Personally, I prefer embedding the metadata inside the image because it helps prevent the metadata from being lost. Anyway, even if the metadata is stored in a sidecar file, I will refer to it as embedded metadata (it is almost the case, as the .xmp file must stay alongside the image file).
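As a small illustration of the sidecar convention, here is a Python sketch that derives the expected sidecar name from an image name. It assumes the "same folder, same name, .xmp extension" convention of the MyImage.jpg / MyImage.xmp example; some applications may name their sidecars differently.

```python
from pathlib import Path

def sidecar_for(image_path: str) -> Path:
    """Return the conventional sidecar path: same folder, same stem, .xmp extension."""
    return Path(image_path).with_suffix(".xmp")

def has_sidecar(image_path: str) -> bool:
    """Tell whether the conventional sidecar file exists next to the image."""
    return sidecar_for(image_path).is_file()
```

A check like has_sidecar("MyImage.jpg") before moving or renaming files is one way to avoid separating an image from its metadata.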

Metadata can also be stored outside of the image file in the HTML page with HTML tags.

Structured data are pieces of code in the webpage that are visible to search engines and their bots but do not appear when the page is displayed.

Image Embedded Metadata

When metadata is embedded, it is present aside from the pixel information in the image file, JPEG or RAW.

File metadata is defined by 3 different specifications : Exif, IPTC and XMP. I will first present each of them and the type of information they add to the image. I will then look at how this metadata can be useful and who uses it.

What is EXIF ?

Metadata in camera : EXIF

Exchangeable image file format is a standard that specifies how to store some of the shooting information and a thumbnail image to the data file created by digital cameras.

The Japanese JEITA association initiated the EXIF standard in 1998. Several versions have been published since then. The latest version 2.3, which was revised on 17 May 2019, was jointly formulated by JEITA and CIPA. Exif is supported by almost all camera manufacturers.

JEITA stands for Japan Electronics and Information Technology Industries Association. Canon Inc., Nikon Corporation, Fuji, Olympus Corporation and Sony Corporation are members of JEITA.

CIPA stands for Camera & Imaging Products Association. It is a Japan-based organization created in 2002 to deal with technologies related to photography. Most of the big Japanese photography brands are also part of CIPA.

Here is a link to JEITA where you will find the up-to-date specifications for the EXIF standard.

Exif metadata tags cover :

  • Date and time
  • Camera settings. Camera model, serial number, aperture, shutter, focal length, ISO…
  • RAW / Jpeg
  • A thumbnail for previewing the picture on the camera’s LCD screen, in file managers, or in photo manipulation software.
  • Copyright information
  • GPS information
  • Colorspace

Which Exif tags are recorded by a Fujifilm X-T2 or X100F camera ?

The Fuji camera will create the Exif data automatically for you. However, you can customize two metadata fields in your camera before shooting :

  • Exif “Artist” : Fujifilm X-T2 “Author”
  • Exif “Copyright” : Fujifilm X-T2 “Copyright”

How to configure the Exif in the Fuji X-T2 : go to the menu SET-UP > SAVE DATA SET-UP > COPYRIGHT INFO > ENTER AUTHOR’S INFO & ENTER COPYRIGHT INFO.

The dialog and the interface are exactly the same on a Fujifilm X100F camera.

Here is the exiftool output for one of my Fujifilm X-T2 raw files (out of camera):

What is IPTC ?

The International Press Telecommunications Council (IPTC) and the Newspaper Association of America (NAA) developed the Information Interchange Model (IIM), the first multimedia news exchange format. IIM is still in use, mainly through the “IPTC fields” for photo metadata.

Phil Harvey documents in detail the IPTC tags.

What is XMP ?

Extensible Metadata Platform (XMP) is an ISO standard, originally created by Adobe Systems Inc., for the creation, processing and interchange of standardized and custom metadata for digital documents and data sets.

Phil Harvey documents in detail the XMP tags.

I will not dig into the details of the XMP specification. However, one notion I had to acquire in order to understand the exiftool results was that of namespaces. A namespace is a set of metadata tags that are described with the XMP specification. For instance, XMP-dc refers to the Dublin Core namespace tags, and XMP-photoshop refers to the Adobe Photoshop namespace tags.
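To see what a namespace looks like inside the XMP itself, here is a minimal Python sketch that reads one Dublin Core tag and one Photoshop tag from a hand-written XMP packet. The sample packet and its values ("Jane Doe", "Paris") are invented for the illustration. The namespace URIs are the standard ones for RDF, Dublin Core and Photoshop.

```python
import xml.etree.ElementTree as ET

# Prefix -> namespace URI. The prefix is only a shorthand; the URI is the real identifier.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "photoshop": "http://ns.adobe.com/photoshop/1.0/",
}

# Hypothetical XMP packet, written by hand for this example.
SAMPLE_XMP = '''<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/">
   <dc:creator><rdf:Seq><rdf:li>Jane Doe</rdf:li></rdf:Seq></dc:creator>
   <photoshop:City>Paris</photoshop:City>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>'''

def read_xmp(xmp_text: str) -> dict:
    """Extract the creator list (Dublin Core) and the city (Photoshop namespace)."""
    root = ET.fromstring(xmp_text)
    creators = [li.text for li in root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)]
    city = root.findtext(".//photoshop:City", namespaces=NS)
    return {"XMP-dc:Creator": creators, "XMP-photoshop:City": city}
```

The keys of the returned dictionary mimic the group names exiftool displays, which is where the XMP-dc and XMP-photoshop labels come from.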

In 2004, IPTC, Adobe and IDEAlliance developed a new metadata specification, IPTC Core, on top of the Extensible Metadata Platform (XMP). It transfers metadata values from IPTC headers to the more modern and flexible XMP format. IPTC then published the IPTC Extension in 2007.

Here is the result of exiftool on a file I edited in Capture One. In each metadata field in Capture One, I typed the name of the field.

---- XMP-iptcCore ----
Location : [IPTC - Image] Location
Intellectual Genre : [IPTC - Image] Intellectual Genre
Country Code : [IPTC - Image] ISO Country Code
Creator Work Email : [IPTC - Contact] E-Mail(s)
Creator Work Telephone : [IPTC - Contact] Phone(s)
Creator Postal Code : [IPTC - Contact] Postal Code
Creator Work URL : [IPTC - Contact] Website(s)
Creator Address : [IPTC - Contact] Address
Creator City : [IPTC - Contact] City
Creator Country : [IPTC - Contact] Country
Creator Region : [IPTC - Contact] State/Province
Scene : [IPTC - Image] IPTC scene
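
If you want to process such a grouped listing in a script, here is a minimal Python sketch that turns it into a dictionary of groups. It assumes the layout shown above ("---- Group ----" header lines followed by "Tag : value" lines). If the same tag appears twice in one group, the last value wins.

```python
def parse_grouped_output(text: str) -> dict:
    """Parse a grouped exiftool text listing into {group: {tag: value}}."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("----") and line.endswith("----"):
            current = line.strip("- ")      # "---- XMP-dc ----" -> "XMP-dc"
            groups[current] = {}
        elif ":" in line and current is not None:
            # Split on the first colon only, so values such as URLs stay intact.
            name, _, value = line.partition(":")
            groups[current][name.strip()] = value.strip()
    return groups
```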

Coordination between Exif, IPTC and XMP

The Metadata Working Group was formed by a consortium of companies in 2007. In 2010, this group released a specification giving recommendations concerning the use of Exif, IPTC and XMP metadata in images.

Some tags are in the 3 specifications. This means that the same information might be in three or more places in your file. For instance, you have the Exif “Artist”, the IPTC “By-line” and the XMP-dc “Creator”, which all represent the creator of the image.

Modern image editors are fully aware of this situation and handle the reconciliation.

For instance, Capture One has only one field for the creator : IPTC – Contact, and stores it in Exif “Artist”, IPTC “By-line” and XMP-dc “Creator”.

As a consequence, I am careful when I change my file metadata, and try not to introduce inconsistencies. When I change metadata, I use a metadata editor like Photoshop or Capture One, which knows how to handle duplicates.
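
To illustrate the kind of inconsistency I mean, here is a minimal Python sketch that compares the three creator tags once they have been read into a dictionary. The dictionary keys are illustrative labels I chose for this example, not the exact output of any particular tool.

```python
# Illustrative labels for the three places where the creator can be stored.
CREATOR_TAGS = ("EXIF:Artist", "IPTC:By-line", "XMP-dc:Creator")

def creator_conflicts(metadata: dict) -> set:
    """Return the distinct creator values if the tags disagree, else an empty set."""
    values = {str(metadata[tag]).strip() for tag in CREATOR_TAGS if metadata.get(tag)}
    return values if len(values) > 1 else set()
```

An empty result means the file is coherent (or the creator is only set in one place). A non-empty result lists the competing values to reconcile.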

Schema in the Metadata Working Group recommendations for metadata.

How to view and edit embedded metadata ?

There are a lot of ways to read the Exif, IPTC and XMP metadata in your image file. As soon as you get an image file from your camera, you can read the Exif. You will have this information with JPEG and RAW files. However, Exif is not used with JPEG 2000 and GIF.

Here are some exif readers and/or writers :

  • You can view and edit metadata with your photo editor, for instance Capture One, Photoshop, Lightroom… A lot of the metadata is available there. However, software often renames the original metadata fields. For instance, the Exif metadata “F Number : 9.0” becomes “Aperture : f/9” in Capture One. Photo editing software also displays only a subset of all existing metadata.
  • You can also use exiftool, developed by Phil Harvey. It reads Exif, IPTC and XMP metadata. I like this tool a lot. It runs on Mac and Windows. Here is how to use it on Windows. Download it, and open a Windows command prompt. Follow these instructions to generate an output “MyPic.txt” file with detailed meta information :
rename "exiftool(-k).exe" "exiftool.exe"

set PATH=.;%PATH%

exiftool -k -a -u -g1 -w txt MyPic.jpg
  • You may code your own program for decoding the metadata of a JPEG. Choose this method only if you need a very specific answer. For instance, here is a Python tutorial : How to extract image metadata with Python.
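
If you prefer scripting to a hand-written decoder, a middle way is to let exiftool do the decoding and read its JSON output from Python. Here is a minimal sketch, assuming exiftool is installed and reachable through your PATH. The function names are mine.

```python
import json
import subprocess

def build_exiftool_command(image_path: str) -> list:
    """Build the command line: -j = JSON output, -a = keep duplicate tags,
    -G1 = prefix every tag with its group name (for instance XMP-dc)."""
    return ["exiftool", "-j", "-a", "-G1", image_path]

def read_metadata(image_path: str) -> dict:
    """Run exiftool and return the metadata of one file as a dictionary."""
    result = subprocess.run(
        build_exiftool_command(image_path),
        capture_output=True, text=True, check=True,
    )
    # exiftool prints a JSON list with one object per input file.
    return json.loads(result.stdout)[0]
```

With this, read_metadata("MyPic.jpg") returns keys that combine the group and the tag name.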

Privacy and metadata

You may want to keep some metadata private, which means removing it from the image file. The decision to keep it or remove it is up to you. This could for example be the case for GPS coordinates, your camera serial number, and maybe also your shooting settings (aperture, ISO…). However, you should be careful, since removing metadata may be illegal in certain jurisdictions.
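
As an alternative to a photo editor, exiftool can delete tags from the command line by assigning them an empty value. Here is a minimal Python sketch that only builds such a command, assuming you want to drop the Exif GPS group and the camera serial number. It does not touch GPS values that may also be stored in XMP, and by default exiftool keeps a backup copy of the file with "_original" appended to its name.

```python
def build_strip_command(image_path: str, remove_gps: bool = True,
                        remove_serial: bool = True) -> list:
    """Build an exiftool command that deletes the selected private tags."""
    args = ["exiftool"]
    if remove_gps:
        args.append("-gps:all=")        # delete every tag of the Exif GPS group
    if remove_serial:
        args.append("-SerialNumber=")   # delete the camera serial number tag
    args.append(image_path)
    return args
```

You could print the result to check it first, then pass it to subprocess.run on a copy of your image.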

If you plan to remove IPTC or XMP tags, edit your file with Capture One or Photoshop. Here is an article where I describe different ways to change metadata in Capture One Pro.

You can also use Capture One Pro to remove the camera Exif specifically. I describe how to proceed in this article : how to strip metadata in Capture One.

The different usages of Exif, IPTC and XMP metadata

Indexing

I use metadata in my indexing and archiving process. With keywords for instance, I can retrieve images from my Capture One catalog related to a certain topic, and work on the resulting set.

Thumbnails for File Explorers

The metadata thumbnail is used when you explore your images on your computer for preview display.

Communication Between Teams

You can also use metadata to pass on information about the image to your client or a media agency. In fact, that’s what metadata was created for. IPTC-Getty is one example. The “Subject Code” field, with its controlled vocabulary, is another. However, I do not use these possibilities at all, so I will not go deeper into this.

Geo Localization

GPS coordinates stored as metadata enable you to use your pictures easily in mapping software.
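
Exif stores a GPS position as degrees, minutes and seconds plus a reference letter (N, S, E or W), whereas most mapping tools expect signed decimal degrees. Here is a minimal Python sketch of the conversion. The function name is mine.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    return -value if ref in ("S", "W") else value
```

For example, 48° 51' 29.6" N becomes roughly 48.8582.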

Color Profiling

ICC color profile metadata helps imaging software display the pixels according to a specific color profile. This helps me color-proof my images before sending them to the printer.

Shooting settings

Once your image is on the web, any application has access to the file and may use and interpret the metadata in its own way.

For instance, some websites use metadata to display extra information about your image. Flickr will display your shooting settings when they are available.

Flickr displays the camera settings.

Web Search Engines

Among the tools on the internet, search engines play a major role. They are often the gate between the world and your pictures. This is why helping search engines do their job with your images is so important.

Web search engines use metadata. However, we don’t know exactly which metadata they use or how they use it. Be it Google, Bing or DuckDuckGo, they never go into details when speaking about their internal algorithms.

Here is a short Google Webmasters video on their use of metadata :

Does Google use EXIF data from pictures as a ranking factor ? – Christian Oliveira, Madrid.

Keep in mind that their algorithm is not published, so we don’t know exactly which image metadata matters for SEO. Another point is that a lot is being done regarding images right now, and SEO practices constantly evolve. For these reasons, I keep my Exif data and enhance it when it’s easy to do and relevant.

Google Images presents keywords between the search field and the result images. Here is a screenshot I made today.

I queried Google Images with “Tour Eiffel”. For instance, the user can choose to refine the search with paris, drawing, wallpaper, nuit… How does Google find those search suggestions ? It could be that Google analyzes, amongst other things, the image metadata. Maybe the XMP keywords we set on the image help SEO with this kind of filter (Photography/Drawing, Town, country…).

Google does extract some IPTC metadata from your image in order to display some copyright information.

Click on “Image credits” in google image result to see the Creator and Copyright

Google Images exposes IPTC metadata in search results whenever it is available :

IPTC IIM field exposed by Google | Parameter in Capture One | Definition of the metadata
By-line | [IPTC – Contact] Creator | Name of the photographer.
Credit | [IPTC – Status] Provider | The credit to person(s) and/or organisation(s) required by the supplier of the image to be used when published.
Copyright Notice | [IPTC – Status] Copyright | Contains any necessary copyright notice for claiming the intellectual property for artwork or an object in the image and should identify the current owner of the copyright of this work with associated intellectual property rights.

Google “Usage rights” advanced search filter for Creative Commons licenses

Since 2009, Google Images allows you to filter the search results with Creative Commons licenses : go to Google Images > Tools > Usage Rights. Then you can choose “Labeled for Reuse”…

However, beware that Google can’t tell if the license label is legitimate.

Google finds this information in the XMP metadata of the image. Note that other applications support this Creative Commons licensing mechanism, for instance Flickr, 500px and Wikimedia Commons.

Creative Commons describes in this document how to use XMP metadata to declare CC licenses.
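
To give an idea of what such a declaration looks like, here is a minimal Python sketch that produces an XMP packet with a license URL and a web statement URL. It reflects my reading of the Creative Commons recommendation (xmpRights:Marked, xmpRights:WebStatement and cc:license), so check the tag names against their document before relying on it. The URLs in the usage example are placeholders.

```python
from xml.sax.saxutils import escape, quoteattr

def cc_license_xmp(license_url: str, web_statement_url: str) -> str:
    """Return a small XMP packet declaring a Creative Commons license."""
    return f'''<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:xmpRights="http://ns.adobe.com/xap/1.0/rights/"
      xmlns:cc="http://creativecommons.org/ns#">
   <xmpRights:Marked>True</xmpRights:Marked>
   <xmpRights:WebStatement>{escape(web_statement_url)}</xmpRights:WebStatement>
   <cc:license rdf:resource={quoteattr(license_url)}/>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>'''
```

For example, cc_license_xmp("https://creativecommons.org/licenses/by/4.0/", "https://example.com/my-license-page") declares a CC BY 4.0 license. Writing the packet into the image file is still the job of a metadata tool.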

For example :

Here is an image example from the web : “Creative Commons 10th Birthday Celebration San Francisco” by tvol can be reused under the CC BY license.

Google Images is also implementing a new feature around licenses. This feature is a mix of embedded metadata and metadata outside of the image file. Therefore I will speak about it later in this post.

Image Metadata out of the Image file : HTML tags

We saw in the previous chapter how to add metadata in the image file (or in the sidecar, which is equivalent). In this part, we are going to see how to add metadata outside of the image file.

This can be done in all sorts of ways, every time you use a business catalog or a database of some sort. So I’m not trying to be exhaustive and present all possible solutions. I will concentrate on the one way of adding metadata outside of the image file that I use in my photography work : HTML tags.

When you display an image on a webpage, you may add HTML tags to enhance it and give details about it. As I am using solely WordPress to edit my HTML pages, I will directly speak about the HTML tags we can use via the WordPress ecosystem.

We can define four HTML-related fields for an image :

  • The title. It is currently discouraged as many user agents do not expose the attribute in an accessible manner as required by this specification.
  • The alternate text. It is the text that screen readers read instead of the picture for the blind and low vision community, and that browsers display when the image cannot be loaded. It is also used by Google to understand the meaning of the image (see Matt’s video below). So this is a very important tag. However, leave it empty if the image is purely decorative. Use real sentences and not juxtaposed keywords.
  • The caption can be displayed on the webpage by your WordPress Theme or a plugin. However it will not be used in the HTML <figcaption> Tag.
  • The description. I don’t use it because I’m not sure how it works.

Google uses alt text along with computer vision algorithms and the contents of the page to understand the subject matter of the image.
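
WordPress writes this HTML for you, but to make the output concrete, here is a minimal Python sketch that builds an img tag with a properly escaped alt text. The function name and the file names are invented for the example.

```python
from html import escape

def img_tag(src: str, alt: str = "", title: str = None) -> str:
    """Build an <img> tag. An empty alt marks the image as purely decorative."""
    attrs = [f'src="{escape(src, quote=True)}"', f'alt="{escape(alt, quote=True)}"']
    if title:
        attrs.append(f'title="{escape(title, quote=True)}"')
    return "<img " + " ".join(attrs) + ">"
```

For example, img_tag("eiffel.jpg", "The Eiffel Tower at night, seen from the Trocadéro") gives a descriptive alternative, while img_tag("divider.png") gives alt="" for a decorative image.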

See Google Image best practices.

Matt Cutts from Google.

BEWARE with WordPress and Alternative texts!

In WordPress, you can define the Alternative Text in the Media Editor, and also in the Gutenberg block of the image in the page or post where you insert the image. This can be misleading. Indeed, the Media Editor Alt text of the image is copied into the Alt text of the Gutenberg block when the block is created. If you then modify the Gutenberg Alt text, it will not change the Alt text of the image in the Media Editor. Likewise, when you change the Alt text of the image in the Media Editor, it will not update the Alt text in the posts where you already referenced the image.

Media Editor : Title and Alternative text only initialize Title and Alternative text in the post.
Title and Alternative text in the page / post.

Structured data : metadata for search engines

Structured data for web search results

Search results have evolved over the past years, and may contain now more than just the attribution, link and short description of the page content.

Just the attribution, link and short description.

For instance, you write a recipe, and describe several steps in order to make a cake. You can also give structured information to search engines that describes your recipe in a way that is predetermined and thus easily understood. You will detail the steps, the time for each step, and the resulting image of each step in a structure. This structure is published, and the main search engines understand more or less the same dictionary and the same description language.

For instance, if we take our recipe again, we can describe it :

  • with the JSON-LD format (JSON for Linked Data)
  • referring to the https://schema.org/ dictionary
  • this is an object of “type Recipe”

So we will write the description of our recipe and create an HTML block in our web page where we will save this JSON code. This code will not show when the webpage is displayed, but search engines will index it and know that this webpage is related to a recipe. In this code, you will add references to your images.

<html>
  <head>
    <title>Non-alcoholic Pina Colada</title>
    <script type="application/ld+json">
    {
      "@context": "https://schema.org/",
      "@type": "Recipe",
      "name": "Non-alcoholic Pina Colada",
      "image": [
      "https://example.com/photo.jpg"
      ],
....
      "recipeInstructions": [
        {
          "@type": "HowToStep",
          "text": "Blend 2 cups of pineapple"
        },
        {
          "@type": "HowToStep",
          "text": "Fill a glass with ice."
        }
      ],
      "video": {
        "@type": "VideoObject",
        "name": "How to make a Party Coffee Cake",
        "description": "This is how you make a Cake.",
        "thumbnailUrl": [
          "https://example.com/photos/1x1/photo.jpg",
 ...

So with structured data, a search result may look like this :

A lot nicer !

The JSON structure of a recipe can be found here.
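
Writing this JSON by hand is error-prone (a stray comma is enough to break it), so generating it can help. Here is a minimal Python sketch that builds a small Recipe block with only the name, the images and the steps. A complete recipe has more properties, so treat it as a starting point. The function name is mine.

```python
import json

def recipe_jsonld(name: str, image_urls: list, steps: list) -> str:
    """Return a <script> block holding a minimal schema.org Recipe description."""
    data = {
        "@context": "https://schema.org/",
        "@type": "Recipe",
        "name": name,
        "image": list(image_urls),
        "recipeInstructions": [{"@type": "HowToStep", "text": step} for step in steps],
    }
    # Escape "</" so that a text value can never close the script tag early.
    body = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"
```

The returned text can be pasted as it is into the head of the page or into an HTML block.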

In conclusion:

1 – With structured data, you can now write metadata about potentially everything you communicate about on the web, provided the object is described in the dictionary.

2 – Your images can be referenced in this structured data.

Let’s listen to this communication Google organized around search results :

Structured data markup always take the priority over the algorithmic extraction of data

Phiroze Parakh, Software Engineer, Google

I think this is key. If you’ve got time to describe your content, I suggest doing it.

Search engines use structured data to help understand the content of websites and enable special Search result features.

Where can you find the documentation for structured data :

Google Images : Images lead to content

Another Google communication related this time to Google Images :

We added some text context.
We changed from ranking only the images to what are the best ranking page leading to what they want to do.
It’s not only about the pixels but it’s what’s the page behind that.

Francois Spies, Product Manager, Google Images

As a photographer, this speaks to me. This means that Google Images is not like Instagram anymore. The main purpose is not to display images. Now, images are a gate to help users go to content, be it a product, a service, a knowledge article… In the long term, Google Images will maybe only display images when they are part of the structured data of an object like a product, a recipe…

An example already available in Google Images : Prominent badges.

If you include structured data, Google Images may display your images as rich results, including a prominent badge, which give users relevant information about your page and can drive better targeted traffic to your site. Google Images supports structured data for the following types:

Google images’ search : product badge

Google Image license metadata

Image licensing is a new Google feature which is now in beta. You can associate more elaborate licensing data to your image, like a licensing page, that Google could display to the image viewer.

With this information, Google will be able to display your image with a licensing badge in Google images for instance. Users will then know who to contact in order to purchase it.

How does Google Image licensing work :

  • either you provide the following IPTC / XMP metadata in your image file : Web Statement of Rights and Licensor URL.
  • or you provide a structured data for the licensing of this image in the webpage that displays it.
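
For the second option, here is a minimal Python sketch of what the structured data could look like. It assumes the ImageObject type with the contentUrl, license and acquireLicensePage properties, so check the property names against the Google FAQ. All URLs in the usage example are placeholders.

```python
import json

def image_license_jsonld(content_url: str, license_url: str,
                         acquire_license_page: str) -> str:
    """Return the JSON-LD text describing the license of one image."""
    data = {
        "@context": "https://schema.org/",
        "@type": "ImageObject",
        "contentUrl": content_url,                    # the image file itself
        "license": license_url,                       # page describing the license
        "acquireLicensePage": acquire_license_page,   # page where one can buy a license
    }
    return json.dumps(data, indent=2)
```

For example, image_license_jsonld("https://example.com/photos/eiffel.jpg", "https://example.com/license", "https://example.com/contact") gives the text to place inside a script tag of type application/ld+json on the page that displays the image.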

Here is a Google FAQ on Image License Metadata in Google Images.

You can start including this information in your photography process now. However, not all the main photo editing software is ready yet. I dug into this topic in this chapter of another post : Zoom in Google Image License Metadata.

My metadata workflow for photography with Fuji cameras and Capture One Pro

In this chapter, I will detail my metadata workflow. This process is always evolving, so this is a snapshot in time.

  • In my camera : I set the CREATOR and COPYRIGHT in my Fujifilm X-T2 and X100F. However, I do not geotag my pictures in camera.
  • I copy my RAF or JPEG on disk in Windows.
  • I use Lightroom to geotag the files. This means that I import the files into Lightroom with no metadata presets applied, I go to MAP and set the geo coordinates with the graphic map interface, which is nicely designed (locations can be named and saved). Then, with the menu Metadata > Save metadata to file, I flush the GPS data to the files (an .xmp file for RAFs, directly into the file for JPEGs) and remove the photos from the Lightroom catalog.
  • I import the files into Capture One
  • I edit the files (color grading, luminosity, crop…)
  • I mainly set the Exif / IPTC / XMP metadata in Capture One.
    • Keywords. I use English and not French.
  • I export the file as JPEG

Now let’s talk about web publication. I’m a WordPress user, so I will focus on this environment :

  • I created a License Page in my website.
  • I use the Yoast plugin for the following structured data : Breadcrumbs, Organization and Logo.
  • I use the “EWWW Image Optimizer” plugin. In Settings > Basic, I unchecked “Remove Metadata”, which is checked by default.
  • When creating content in a post, a page or an Envira gallery, I embed the image(s) and fill in the “Alt Text”.
  • Whenever meaningful, I code a JSON block and insert it as a Gutenberg Custom HTML block in the post or page. I could use a plugin to deal with JSON structured data, however I do not want to be tied to a plugin for this data. This includes the following structured data :
    • Article
    • FAQ
    • How-to
    • Video

Conclusion

We’ve reached the end of this post on photo metadata : where is it, and how to use it.

I hope you enjoyed it and will find some ideas for your photography business. Personally, digging into this metadata subject helped me refine and simplify my process, and also opened a new reflection on how to use my images on the web as gates to my photography products and services.

If you would like more information related to metadata and Capture One Pro, you could read :