Annotation Structure

class momaapi.data.ann.AAct(info, entities, ias, tas, atts, rels)

Class for an atomic action annotation. Atomic actions are unary predicates that actors perform.

Variables
  • id_entity – Entity ID

  • kind_entity – type of the entity

  • cname_entity – Entity class name

  • cid_entity – Entity class ID

  • start – start time of the atomic action in seconds, relative to the start of the activity video

  • end – end time of the atomic action in seconds, relative to the start of the activity video

class momaapi.data.ann.Act(ann, taxonomy)

Class for an activity annotation. An activity is the coarsest level of annotation and consists of a series of sub-activities, each of which is annotated in turn with higher-order interactions.

Variables
  • cname – Activity class name

  • cid – Activity class ID

  • start – Start time of the activity in seconds

  • end – End time of the activity in seconds

  • ids_sact – List of sub-activity IDs
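Because an activity stores sub-activity IDs rather than the sub-activities themselves, resolving them requires some ID-keyed lookup. The following is a minimal sketch of that pattern using a plain dictionary. The index, the IDs, and the class names are all made up for illustration and are not part of the library.

```python
# Hypothetical index: sub-activity ID -> (class name, start, end) in seconds.
# These IDs and class names are invented for this example.
sact_index = {
    "s1": ("step A", 0.0, 12.5),
    "s2": ("step B", 12.5, 30.0),
    "s3": ("step C", 30.0, 41.0),
}

# Stand-in for the ids_sact variable of an activity.
ids_sact = ["s1", "s2", "s3"]

# Resolve each ID to its sub-activity record, preserving order.
sacts = [sact_index[sact_id] for sact_id in ids_sact]

# Total time covered by the sub-activities.
total = sum(end - start for _, start, end in sacts)

print([cname for cname, _, _ in sacts])  # ['step A', 'step B', 'step C']
print(total)                             # 41.0
```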

class momaapi.data.ann.BBox(ann)

Bounding box in the form of [x, y, w, h]. Bounding boxes are used to localize entities.

Variables
  • x – x-coordinate of the top-left corner of the bounding box

  • y – y-coordinate of the top-left corner of the bounding box

  • w – width of the bounding box

  • h – height of the bounding box
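As a minimal sketch of the [x, y, w, h] convention, the stand-in class below (a hypothetical `Box`, not the library's `BBox`) converts a box to corner coordinates and computes its area. It assumes coordinates are measured from the top-left of the frame, with x increasing to the right and y increasing downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a box stored as [x, y, w, h]."""
    x: float  # x-coordinate of the top-left corner
    y: float  # y-coordinate of the top-left corner
    w: float  # width
    h: float  # height

    def corners(self):
        """Return (x1, y1, x2, y2): the top-left and bottom-right corners."""
        return (self.x, self.y, self.x + self.w, self.y + self.h)

    def area(self):
        """Return the area covered by the box."""
        return self.w * self.h


box = Box(x=10, y=20, w=30, h=40)
print(box.corners())  # (10, 20, 40, 60)
print(box.area())     # 1200
```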

class momaapi.data.ann.Clip(ann, neighbors)

A clip is a 1-second (5-frame) video clip centered at a higher-order interaction. The clip is shorter than 1 second (5 frames) if it would otherwise exceed the raw video boundary. Currently, only clips from the test set have been generated.
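The centering and truncation rule can be sketched as follows. This is an illustration of the described behavior, not the library's code. The function name `clip_window` and its parameters are hypothetical, and the sketch works in seconds rather than frames.

```python
def clip_window(hoi_time, video_duration, clip_length=1.0):
    """Return (start, end) of a window centered at hoi_time.

    The window is truncated, not shifted, when it would extend
    past the start or end of the video.
    """
    half = clip_length / 2
    start = max(0.0, hoi_time - half)
    end = min(video_duration, hoi_time + half)
    return start, end


print(clip_window(5.0, 10.0))   # (4.5, 5.5)   full 1-second clip
print(clip_window(0.25, 10.0))  # (0.0, 0.75)  truncated at the video start
print(clip_window(9.75, 10.0))  # (9.25, 10.0) truncated at the video end
```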

class momaapi.data.ann.Entity(ann, kind, taxonomy)

Class for an entity annotation. Entities are the building blocks of interactions. They are either human actors or non-human objects.

Variables
  • id – entity ID

  • kind – kind of the entity, either “actor” or “object”

  • cname – class name of the entity

  • cid – class ID of the entity

  • bbox – bounding box of the entity

class momaapi.data.ann.HOI(ann, taxonomy_actor, taxonomy_object, taxonomy_ia, taxonomy_ta, taxonomy_att, taxonomy_rel)

Class for a higher-order interaction. A higher-order interaction, abbreviated as HOI, is a predicate involving two or more entities.

Variables
  • id – HOI annotation ID

  • time – time of the HOI annotation in seconds, relative to the start of the activity video

  • actors – list of actor entities involved in the interaction

  • ias – list of intransitive actions performed by actors

  • tas – list of transitive actions occurring between actors

  • atts – list of attributes that the actor has

  • rels – list of relationships between entities in the interaction

class momaapi.data.ann.Metadatum(ann)

Metadata class for a video. The metadata contains information for videos in the MOMA-LRG dataset, the properties of which are detailed below.

Variables
  • id – Activity ID

  • fname – File name of the video

  • num_frames – Number of frames in the video

  • width – Width of the video resolution

  • height – Height of the video resolution

  • duration – Duration of the video in seconds

get_fid(time)

Get the frame ID given a timestamp in seconds.

Parameters
  • time (float) – Timestamp in seconds
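The exact conversion the library uses is not documented here. One plausible sketch derives a constant frame rate from `num_frames` and `duration` and returns a zero-based frame index. The function name `get_fid_sketch`, the zero-based indexing, and the rounding rule are all assumptions.

```python
def get_fid_sketch(time, num_frames, duration):
    """Map a timestamp in seconds to a zero-based frame index.

    Assumes a constant frame rate; this is an illustrative convention,
    not necessarily the one used by Metadatum.get_fid.
    """
    if not 0 <= time <= duration:
        raise ValueError("time must lie within [0, duration]")
    fps = num_frames / duration      # frames per second
    fid = int(time * fps)            # floor to the frame containing `time`
    return min(fid, num_frames - 1)  # time == duration maps to the last frame


print(get_fid_sketch(0.0, 300, 10.0))   # 0
print(get_fid_sketch(2.5, 300, 10.0))   # 75
print(get_fid_sketch(10.0, 300, 10.0))  # 299
```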

class momaapi.data.ann.Predicate(ann, kind, taxonomy)

Predicate class, representing unary and binary predicates. Predicates are of the form [src] (cid) [trg], where src refers to the “source entity” performing the action and trg refers to the “target entity” affected by it.

Variables
  • kind – kind of the predicate

  • cname – class name of the predicate

  • id_src – ID of the source entity

  • id_trg – ID of the target entity
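To make the [src] (cid) [trg] form concrete, here is a small sketch built on a hypothetical `PredicateSketch` class that mirrors the documented variables. It prints the class name in place of the class ID for readability, and it assumes that a unary predicate is represented by a missing target. The kinds, class names, and entity IDs in the usage lines are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PredicateSketch:
    """Hypothetical stand-in mirroring the documented Predicate variables."""
    kind: str
    cname: str
    id_src: str
    id_trg: Optional[str] = None  # assumed to be None for unary predicates

    def is_binary(self):
        """A predicate is binary when it has a target entity."""
        return self.id_trg is not None

    def __str__(self):
        # Render as "[src] (cname) [trg]"; omit the target for unary predicates.
        if self.is_binary():
            return f"[{self.id_src}] ({self.cname}) [{self.id_trg}]"
        return f"[{self.id_src}] ({self.cname})"


# Illustrative values only; they are not taken from the dataset.
binary = PredicateSketch(kind="ta", cname="holding", id_src="A", id_trg="1")
unary = PredicateSketch(kind="ia", cname="sitting", id_src="B")

print(binary)  # [A] (holding) [1]
print(unary)   # [B] (sitting)
```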

class momaapi.data.ann.SAct(ann, scale_factor, taxonomy_sact, taxonomy_actor, taxonomy_object, taxonomy_ia, taxonomy_ta, taxonomy_att, taxonomy_rel)

Class for a sub-activity annotation. A sub-activity is a finer-grained level of annotation that refers to a step taken as part of an activity. It is temporally localized within the activity (that is, it has a start and end time in seconds that are relative to the start of the activity).

Variables
  • cname – Sub-activity class name

  • cid – Sub-activity class ID

  • start – Start time of the sub-activity in seconds, relative to the start of the activity video

  • end – End time of the sub-activity in seconds, relative to the start of the activity video

  • ids_hoi – List of higher-order interaction IDs

  • times – Times of the higher-order interactions in the video
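A sub-activity's HOI IDs and times can be combined to check which interactions fall inside its temporal extent. The sketch below assumes that `ids_hoi` and `times` are parallel lists of equal length and that the times share the same reference point as `start` and `end`. Neither assumption is confirmed by this page. The helper name `pair_hois` and the IDs are hypothetical.

```python
def pair_hois(ids_hoi, times, start, end):
    """Pair each HOI ID with its time and keep those inside [start, end]."""
    if len(ids_hoi) != len(times):
        raise ValueError("ids_hoi and times must have the same length")
    return [(hoi_id, t) for hoi_id, t in zip(ids_hoi, times) if start <= t <= end]


# Illustrative values only.
pairs = pair_hois(["hoi_a", "hoi_b", "hoi_c"], [3.0, 4.0, 9.0], start=2.5, end=8.0)
print(pairs)  # [('hoi_a', 3.0), ('hoi_b', 4.0)]
```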