Metadata for Multimedia and Non-Text Information

Information Organization and Retrieval (INFOSYS 202)

Andrea Moed, iSchool, UC Berkeley (Tuesday, September 25, 2006)

Today's Lecture

...and once again, we want you to be thinking about the relationships and the tradeoffs between IO and IR, and the specific issues that non-text media introduce

My background in multimedia

Before coming to SIMS, I produced multimedia as an exhibit designer and independent producer

Over the course of my career, multimedia production as I knew it moved from professional-grade hardware and software, to "prosumer" hardware and professional software, to regular camcorders and PCs (or rather, Macs) and software that just happened to come with those things

Coincidentally, I produced more and more multimedia

So I knew that multimedia would be important to the information systems I studied here, which caused me to blather on to Bob about multimedia when I took this class last year. Long story short, here I am.

I have never lectured for an hour and a half before; I don't intend to start now, so I'll stop several times for Discussion (the slide headers will even say "Discussion" sometimes) and we will try to expend some of this huge time slot that way.

What Is Multimedia?

Multimedia Documents: Concepts and Relationships


playlist, album, queue

composite, collage, mix, remix

clip, sample, bite

remake, cover version

layout, presentation, performance

broad-/narrow-/simul-/tele-/web-/pod- cast

installation, environment

Compare to Elaine's discussion in chapter 3 about relationships in the book context, such as "revision, update, translation" and whether the identity of a work is preserved.

Look at the groupings... concepts are related, but not quite the same. Can anyone look at one of the groups and explain the difference between the concepts?

Follow-up: How might the metadata for those various things be different?

Multimedia Metadata in the Wild [1]


Source: Matt Earp

Multimedia Metadata in the Wild [2]



Source: Cycling74.com

Multimedia Metadata in the Wild [3]

Discussion

Lossy compression, format migration, interpretive difficulty, DRM might make it hard

Bob: One really important difference between text and mm organization is that for text almost all encodings are "lossless" so that you don't have to consider that when you create or capture content. For mm this is often one of the most critical considerations. For other works this isn't necessary.

For example, many museums have taken extremely high resolution photos of their most valuable paintings so that they can study the changes in pigmentation and degeneration over time.

Think of the variety of audio formats -- the mp3 revolution was enabled by innovations in audio compression that could radically shrink the encoded file size without a comparable loss in apparent fidelity.
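To make the lossless/lossy distinction concrete, here is a minimal Python sketch. The lossless half uses the standard-library zlib module. The lossy half is a toy: it throws away the low 4 bits of some made-up 8-bit sample values. Real codecs such as mp3 rely on perceptual models rather than simple bit truncation, so treat this only as an illustration of information that cannot be recovered.

```python
import zlib

# Lossless: compressing and then decompressing text returns identical bytes.
text = "Information Organization and Retrieval".encode("utf-8")
restored = zlib.decompress(zlib.compress(text))
text_is_identical = restored == text

# Lossy (toy example): keep only the top 4 bits of each 8-bit sample.
# These sample values are hypothetical stand-ins for audio or pixel data.
samples = [3, 18, 77, 130, 201, 255]
quantized = [s >> 4 for s in samples]        # what gets stored
reconstructed = [q << 4 for q in quantized]  # what playback recovers
max_error = max(abs(a - b) for a, b in zip(samples, reconstructed))

print(text_is_identical)  # True
print(reconstructed)      # [0, 16, 64, 128, 192, 240]
print(max_error)          # 15
```

The text comes back exactly; the samples do not, and nothing stored in the quantized form says what was lost. That is why the choice of encoding has to be made at capture time for mm.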

Curatorial Metadata Creation

Categories for the Description of Works of Art

Object/Work*
Classification*
Titles or Names*
Creation*
Styles/Periods/Groups/Movements
Measurements*
Materials and Techniques*
Inscriptions/Marks
State
Edition
Facture
Orientation/Arrangement
Physical Description
Condition/Examination History
Conservation/Treatment History
Subject Matter*
Context
Descriptive Note
Critical Responses
Related Works
Current Location*
Copyright/Restrictions
Ownership/Collecting History
Exhibition/Loan History
Cataloging History
Related Visual Documentation
Related Textual References*

AUTHORITIES
Person/Corporate Body Authority*
Place/Location Authority*
Generic Concept Authority*
Subject Authority*


* core metadata
Source: http://www.getty.edu/research/conducting_research/standards/cdwa/definitions.html
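One way to think about the "core" marking is as a completeness check on a catalog record. The sketch below assumes a record is a plain Python dict keyed by category name; that representation, the function name, and the sample values are all hypothetical and are not part of CDWA itself. The core list is copied from the asterisked categories above (authorities are left out for brevity).

```python
# Core CDWA categories, copied from the asterisked entries in the list above.
CORE_CATEGORIES = [
    "Object/Work", "Classification", "Titles or Names", "Creation",
    "Measurements", "Materials and Techniques", "Subject Matter",
    "Current Location", "Related Textual References",
]

def missing_core(record):
    """Return the core categories that are absent or empty in a record (a dict)."""
    return [c for c in CORE_CATEGORIES if not record.get(c)]

# Hypothetical, partially filled record for illustration only.
record = {
    "Object/Work": "painting",
    "Titles or Names": "Adoration of the Magi",
    "Creation": "Andrea Mantegna",
    "Styles/Periods/Groups/Movements": "",
}

print(missing_core(record))
```

Each category the check reports as missing is work that someone with the right expertise still has to do, which is where the cost question comes in.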

In multimedia, we really do still organize to enable retrieval. With big collections, there's no hope of finding what you want if it isn't labelled and categorized. And among the professions that have been doing this the longest are curators and art historians.

From the introduction to CDWA at http://www.getty.edu/research/conducting_research/standards/cdwa/introduction.html

The CDWA is a product of the Art Information Task Force (AITF)... Formed in the early 1990s, the task force was made up of representatives from the communities that provide and use art information: art historians, museum curators and registrars, visual resource professionals, art librarians, information managers, and technical specialists.

Bob: the big question here is "what kinds of people/skills are required to be a 'metadata-maker' at each of these levels" -- and at what cost? Why do we need different levels of description? For what kinds of works are we making this investment?

Curatorial Metadata Creation: Image "Subject Matter"

  • Subject Matter - Description: Generic elements
    • "A woman holding a baby... with three men located in front of her"
  • Subject Matter - Identification: Names of subjects; iconography
    • "Balthasar, Melchior..." "Adoration of the Magi"
  • Subject Matter - Interpretation: Symbolic meaning
    • "The Magi represent the Three Ages of Man..."
(Harpring, course reader p. 275)

Andrea Mantegna, "Adoration of the Magi."
Copyright 2006 The J. Paul Getty Trust. All rights reserved.
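The three levels of subject matter above could be kept as separate fields, so that a search can draw on whichever levels have been filled in. This is only a sketch: the class and method names are hypothetical, and the terms are taken from the quoted examples on this slide.

```python
from dataclasses import dataclass, field

@dataclass
class SubjectMatter:
    """Three levels of subject indexing, from generic to interpretive."""
    description: list = field(default_factory=list)     # generic elements
    identification: list = field(default_factory=list)  # named subjects, iconography
    interpretation: list = field(default_factory=list)  # symbolic meaning

    def search_terms(self, levels=("description", "identification", "interpretation")):
        """Collect the indexing terms recorded at the chosen levels."""
        return [term for level in levels for term in getattr(self, level)]

magi = SubjectMatter(
    description=["woman", "baby", "three men"],
    identification=["Balthasar", "Melchior", "Adoration of the Magi"],
    interpretation=["Three Ages of Man"],
)

print(magi.search_terms(levels=("description",)))  # ['woman', 'baby', 'three men']
```

Almost anyone could supply the description terms; identification and interpretation call for increasing art-historical knowledge, which is why keeping the levels apart is useful when deciding who creates which metadata.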

Curatorial Metadata: Discussion

The Context-Capture Approach: Metadata for "the rest of us"?

"This process of manual metadata tagging, subjective and labor-intensive, may work for Corbis, but it's a lot to ask of the rest of us... The metadata most of us attach to our photos is pretty pathetic. We... end up with a hard disk full of photos with names like DSC00012.jpg and DSC00234.jpg. As the years go on, DSC00234.jpg will become an archaeological artifact that might as well be labeled Don't_Know_Don't_Care.jpg."
- David Weinberger, "Point. Shoot. Kiss It Good-bye." Wired, October 2004

Context Metadata: The Exif Standard for Digital Photos

Tag                         Value
Manufacturer                CASIO
Model                       QV-4000
Orientation                 top - left
Software                    Ver1.01
Date and Time               2003:08:11 16:45:32
YCbCr Positioning           centered
Compression                 JPEG compression
x-Resolution                72.00
y-Resolution                72.00
Resolution Unit             Inch
Exposure Time               1/659 sec.
FNumber                     f/4.0
ExposureProgram             Normal program
Exif Version                Exif Version 2.1
Date and Time (original)    2003:08:11 16:45:32
Date and Time (digitized)   2003:08:11 16:45:32
ComponentsConfiguration     Y Cb Cr -
Compressed Bits per Pixel   4.01
Exposure Bias               0.0
MaxApertureValue            2.00
Metering Mode               Pattern
Flash                       Flash did not fire.
Focal Length                20.1 mm
Maker Note                  432 bytes unknown data
FlashPixVersion             FlashPix Version 1.0
Color Space                 sRGB
PixelXDimension             2240
PixelYDimension             1680
File Source                 DSC
InteroperabilityIndex       R98
InteroperabilityVersion     (null)
Source: Exchangeable image file format. (2006, September 11). In Wikipedia, The Free Encyclopedia. Retrieved 04:08, September 25, 2006, from http://en.wikipedia.org/w/index.php?title=Exchangeable_image_file_format&oldid=75121598
[Notes: You get date&time, camera settings; there is room for copyright info; location]
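A few of these values become useful for organizing once they are parsed into real types. The sketch below works on the display strings from the table above, not on the binary Exif data inside a JPEG file; pulling the tags out of an actual file would need an Exif-reading library, which is not shown here. The function names and the tags dict are hypothetical.

```python
from datetime import datetime
from fractions import Fraction

def parse_exif_datetime(value):
    """Parse an Exif timestamp such as '2003:08:11 16:45:32'."""
    return datetime.strptime(value, "%Y:%m:%d %H:%M:%S")

def parse_exposure(value):
    """Parse a display string such as '1/659 sec.' into a fraction of a second."""
    return Fraction(value.replace("sec.", "").strip())

# Values copied from the table above.
tags = {
    "Date and Time (original)": "2003:08:11 16:45:32",
    "Exposure Time": "1/659 sec.",
    "PixelXDimension": "2240",
    "PixelYDimension": "1680",
}

taken = parse_exif_datetime(tags["Date and Time (original)"])
exposure = parse_exposure(tags["Exposure Time"])
megapixels = int(tags["PixelXDimension"]) * int(tags["PixelYDimension"]) / 1_000_000

print(taken.isoformat())     # 2003-08-11T16:45:32
print(exposure)              # 1/659
print(round(megapixels, 2))  # 3.76
```

Note that the Exif timestamp uses colons in the date part, so a general-purpose date parser may not accept it without an explicit format string.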

Context Metadata: Auto-Organization of Personal Photos

Exif Data + GPS + Spatiotemporal Data + Computation = Photographic Context


Source: Naaman, et al. "Context Data in Geo-Referenced Photo Collections." Talk slides (ACM MM 2004).
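As a rough illustration of the "computation" term in that equation, capture times alone can split a photo collection into events. The sketch below is not the method from Naaman et al.; it is a deliberately simple stand-in that starts a new event whenever the gap between consecutive photos exceeds a fixed threshold. The two-hour threshold, the function name, and all timestamps except the first (which is the one from the Exif example) are assumptions.

```python
from datetime import datetime, timedelta

def group_into_events(timestamps, max_gap=timedelta(hours=2)):
    """Split capture times into events wherever the gap exceeds max_gap."""
    events = []
    for ts in sorted(timestamps):
        if events and ts - events[-1][-1] <= max_gap:
            events[-1].append(ts)   # close enough in time: same event
        else:
            events.append([ts])     # first photo, or a long gap: new event
    return events

photos = [
    datetime(2003, 8, 11, 16, 45, 32),
    datetime(2003, 8, 11, 16, 52, 10),
    datetime(2003, 8, 11, 17, 30, 0),
    datetime(2003, 8, 12, 9, 5, 0),
    datetime(2003, 8, 12, 9, 20, 0),
]

events = group_into_events(photos)
print([len(e) for e in events])  # [3, 2]
```

Adding GPS coordinates would allow the same kind of grouping by place, and the two together start to approximate "photographic context" without anyone typing a label.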

Context Metadata: More Application Areas

These apps go beyond aiding retrieval to helping produce new, more numerous compositions of media.

Context Metadata: Discussion

The Social Metadata Approach: Metadata from Use and Re-Use

Is this more critical for multimedia than for textual documents?

get people to explain why

(Bob thinks it has to do with the greater requirement for technology and the greater diversity of encoding formats for mm compared to text)

Social Metadata from Personal Sharing

Social Metadata from Public Sharing and Discussion

Social Metadata from Sampling and Remix

[cite Ryan Shaw's research]

Social Metadata: Discussion

Conclusion: Too Little Multimedia Metadata... Or Too Much?