VAMPIRE: Visual Active Memory Processes and Interactive REtrieval

Introduction to 'Video Annotation'

Video annotation is the other main application scenario in the VAMPIRE project. We concentrate on the annotation of sports videos, in particular tennis. The ever-increasing popularity of sport means that a vast amount of sports footage is recorded every day. For example, each year the British Broadcasting Corporation (BBC) provides coverage of the Wimbledon tennis championships. During this event, up to fifteen different live feeds are recorded simultaneously. At the end of the Wimbledon fortnight, over one thousand tapes of sports-related footage are brought back to the BBC headquarters. All this video data is generated from just one event, and the BBC records hundreds of different sporting events each year.

Ideally, all this sports video should be annotated, and the annotation metadata generated from it should be stored in a database along with the video data. Such a system would allow an operator to retrieve any shot, or any important event within a shot, at a later date. The level of annotation provided by the system should also be adequate to support simple text-based queries. For example, a typical query could be: "Retrieve the point when Venus Williams won the first set in the Wimbledon final 2002." Such a system would have many uses, such as in the production of television sport programmes and documentaries. It would also help to preserve our cultural heritage. Due to the large amount of material being generated, manual annotation is both impractical and very expensive. However, automatic annotation is an extremely challenging computer vision task, as it involves high-level scene interpretation, a holy grail of computer vision.
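To make the idea of querying annotation metadata concrete, the following is a minimal sketch, not the VAMPIRE implementation. It assumes a single hypothetical table `annotations` in an in-memory SQLite database. The column names, the tournament "Example Open", the players "Player A" and "Player B", the tape identifiers and the time codes are all invented placeholders for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a few hypothetical annotation rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE annotations (
               tape_id     TEXT,
               start_s     REAL,     -- start of the clip on the tape, in seconds
               end_s       REAL,     -- end of the clip on the tape, in seconds
               tournament  TEXT,
               year        INTEGER,
               match_round TEXT,
               event       TEXT,     -- e.g. 'point_won' or 'set_won'
               set_no      INTEGER,
               player      TEXT
           )"""
    )
    # Placeholder data only; these are not real match records.
    rows = [
        ("tape-001", 120.0, 135.5, "Example Open", 2002, "final", "point_won", 1, "Player A"),
        ("tape-001", 2710.0, 2731.0, "Example Open", 2002, "final", "set_won", 1, "Player A"),
        ("tape-002", 95.0, 110.0, "Example Open", 2002, "final", "set_won", 2, "Player B"),
    ]
    conn.executemany("INSERT INTO annotations VALUES (?,?,?,?,?,?,?,?,?)", rows)
    return conn


def find_clips(conn, *, tournament, year, match_round, event, set_no, player):
    """Return (tape_id, start_s, end_s) for every clip matching all criteria."""
    cur = conn.execute(
        "SELECT tape_id, start_s, end_s FROM annotations "
        "WHERE tournament = ? AND year = ? AND match_round = ? "
        "AND event = ? AND set_no = ? AND player = ? "
        "ORDER BY tape_id, start_s",
        (tournament, year, match_round, event, set_no, player),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    # "Retrieve the point when Player A won the first set in the final."
    print(find_clips(conn, tournament="Example Open", year=2002,
                     match_round="final", event="set_won", set_no=1,
                     player="Player A"))
```

In this sketch, a text query of the kind described above would first have to be mapped onto the structured fields (tournament, year, round, event, set and player); that mapping step is not shown here.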

In VAMPIRE, we apply the Active Memory concept to the analysis and browsing of a tennis video. Annotation can be provided at all levels, from shot detection to a complete breakdown of the scoring during the match. At present, the system automatically analyses a tennis video to the extent that it can identify the outcome of individual video shots with reasonable accuracy. The application was tested using the same Visual Active Memory system that was used in the augmented reality office scenario.
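As an illustration of how shot outcomes could feed a higher-level scoring breakdown, the sketch below turns a sequence of point winners into the score of a single tennis game under the standard rules (at least four points won, with a margin of two). This is not the contextual analysis used in VAMPIRE; the function name `game_score` and the player labels "A" and "B" are hypothetical, and tie-breaks and set-level scoring are left out.

```python
POINT_NAMES = ["0", "15", "30", "40"]


def game_score(point_winners):
    """Return (score_text, game_winner) for one standard tennis game.

    point_winners is a sequence of "A" or "B", one entry per point, in order.
    game_winner is None while the game is still in progress.
    """
    a = b = 0
    for winner in point_winners:
        if winner == "A":
            a += 1
        elif winner == "B":
            b += 1
        else:
            raise ValueError(f"unknown player label: {winner!r}")
        # A game is won with at least four points and a two-point margin.
        if max(a, b) >= 4 and abs(a - b) >= 2:
            return ("game", "A" if a > b else "B")
    if a >= 3 and b >= 3:
        # From 40-40 onwards only deuce and advantage are announced.
        if a == b:
            return ("deuce", None)
        return ("advantage A" if a > b else "advantage B", None)
    return (f"{POINT_NAMES[a]}-{POINT_NAMES[b]}", None)


if __name__ == "__main__":
    print(game_score(["A", "A", "B"]))
    print(game_score(list("ABABABA")))
```

Because each detected shot outcome is only reasonably accurate, a practical system would need to tolerate occasional wrong point labels; this sketch assumes the labels are correct.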


The main panel provides a top-down summary of the game so far, and can display the corresponding video clips, optionally with the ball and player tracks.


A separate panel can be used to display lower-level information (mosaic, shot boundary / classification etc.) about the progress of individual shots.

Methods used here

  • mosaicing
  • foreground/background separation
  • tennis ball and tennis player tracking
  • tennis court recognition
  • action recognition
  • contextual analysis