Universal Subtitle Format - Specification (proposal)

document updated: 2004-11-27

document status: proposal (not approved)

Preliminary Remarks

Content

Encoding and Formatting
[X] Interpretation (Rendering, Rasterisation)
[X] Syntax and Symbols
[X] Types
[X] Profiles and Levels
[X] Elements
Attributes
parentheses contain a list of elements the attribute can be assigned to
name                valid inside elements
alignment           position, text, karaoke, image, svg
alpha               fontstyle, text, karaoke, image, svg
back-alpha          fontstyle, font
b64size             b64file
back-color          fontstyle, font
back-style          karaoke, fontstyle, font
code                language
code                languageext
color               fontstyle, font
colorkey            image
duration            subtitle
effect              text, karaoke, image, svg, font, rt
face                fontstyle, font
fall-time           karaoke
family              fontstyle, font
filename            b64file
filetype            b64file
horizontal-margin   position, text, karaoke, image, svg
hot-style           karaoke
italic              fontstyle, font
layer               position, text, karaoke, image, svg
line-align          position, text, karaoke, (image), svg
line-spacing        position, text, karaoke, (image), svg
mode                karaoke
morphgrid           position, text, karaoke, image, svg
name                style, effect, (subtitle, text, karaoke, image, svg)
offset-x            position, text, karaoke, image, svg
offset-y            position, text, karaoke, image, svg
outline-alpha       fontstyle, font
outline-color       fontstyle, font
outline-level       fontstyle, font
position            keyframe
position            rt
primary-alpha       fontstyle, font
profile             svgdefs, svg
rawsize             b64file
relative-to         position, text, karaoke, image, svg
rise-time           karaoke
rotate-x            position, text, karaoke, image, svg
rotate-y            position, text, karaoke, image, svg
rotate-z            position, text, karaoke, image, svg
size                fontstyle, font
scale-x             fontstyle, font
scale-y             fontstyle, font
shadow-alpha        fontstyle, font
shadow-color        fontstyle, font
shadow-level        fontstyle, font
spacing             fontstyle, font
speaker             text, karaoke, image, svg
start               subtitle
stop                subtitle
strikeout           fontstyle, font
style               text, karaoke, image, svg, font, rt, style, fontstyle, position
t                   k
transform           position, text, karaoke, image, svg
underline           fontstyle, font
version             USFSubtitles
vertical-margin     position, text, karaoke, image, svg
weight              fontstyle, font
width               rt
wrap                fontstyle, font
x                   resolution
y                   resolution
Annex

1

Encoding and Formatting

Character Encoding

The file is of type text on systems that distinguish between text and binary files. The entire file, except binary data parts encoded in base64 inside elements intended to contain such data, must be encoded as UTF-8 text according to RFC3629 (see also ISO10646 for the UCS). The USF file does not have a Byte-Order-Mark (BOM); instead the character encoding is indicated in the xml root element. Up to the xml root element there must not be any characters that have a different UTF-8 encoding than their ASCII value.
No other character encodings are valid.
Base64 encoding is the only acceptable way to include binary data in a USF file, and binary data is only allowed inside elements where it is explicitly mentioned that they are intended to contain base64 encoded binary data.
Base64 encoding is defined in RFC2045, section 6.8. Base64 encoded data in USF may ignore the line-length requirement in RFC2045 and be encoded in a single line.
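
The encoding rules above can be sketched as follows. This is a minimal illustration, not part of the specification: the function names decode_usf_text and decode_embedded_base64 are hypothetical, and stripping all whitespace from a base64 payload before decoding is an assumption made so that single-line and wrapped payloads are handled alike.

```python
import base64

UTF8_BOM = b"\xef\xbb\xbf"


def decode_usf_text(raw: bytes) -> str:
    """Decode the bytes of a USF file, rejecting a BOM and invalid UTF-8."""
    if raw.startswith(UTF8_BOM):
        raise ValueError("a USF file must not start with a byte order mark")
    # Strict decoding raises UnicodeDecodeError for anything that is not UTF-8.
    return raw.decode("utf-8", errors="strict")


def decode_embedded_base64(payload: str) -> bytes:
    """Decode base64 element content, given on one line or wrapped over many."""
    # Assumption: whitespace inside the payload carries no meaning.
    compact = "".join(payload.split())
    return base64.b64decode(compact, validate=True)
```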

Compliant software:

Line Breaks, Whitespace and Indentation

Line breaks and whitespace between elements, and inside elements that do not contain content text, are ignored. Line breaks in content text are provided with the <br/> element, therefore line breaks encoded as octet 0x0A (Unix LF) or double-octet 0x0D 0x0A (Windows CR-LF) are ignored and never rendered as line breaks in the output. Consecutive whitespace encoded as octet 0x20 (space) inside the content text is to be collapsed into a single 0x20 space character and any 0x09 (tab) is to be removed for end-user representation (rendering onto video). Editing software may preserve any whitespace and line breaks present and display the content without collapsing space as long as it marks this representation clearly as not the one used for display in playback software or devices.
Whitespace encoded with different characters is discouraged but allowed if the user explicitly wishes to format content text by this means.
Content text is what will be presented to the user as text in a playback software or device. Examples are non-element content in <text> and <karaoke>.
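
A minimal sketch of the normalisation a player might apply to content text before rendering. The function name collapse_content_text is hypothetical, and it assumes that ignored line breaks are dropped rather than replaced by a space, which the text above does not state explicitly.

```python
import re


def collapse_content_text(text: str) -> str:
    """Normalise content text for end-user display."""
    # Line breaks are never rendered; <br/> is used for that purpose.
    text = text.replace("\r\n", "").replace("\n", "")
    # Tabs are removed entirely.
    text = text.replace("\t", "")
    # Runs of 0x20 collapse into a single space.
    return re.sub(r" {2,}", " ", text)
```

Tabs are removed before spaces are collapsed, so a tab surrounded by spaces does not leave a double space behind.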

Compliant software:

2

Interpretation (Rendering, Rasterisation)

This chapter deals with how the final picture is composed. It contains some basic specifications. You will find many terms in this section that are only specified further below. You can still read on without having to read the definitions first; you will most probably find that the later definitions confirm what you first imagined a term to mean.

timing

The document clock indicates the time referred to inside USF as "the global absolute time". Subtitle start and stop use the global absolute time. Subtitle duration is the difference (stop-start). Karaoke times and instances of effects use a time relative to the subtitle start time. Effects or object instance attributes inside of subtitles may use a percentage notation where 0% equals the subtitle start time and 100% equals the subtitle stop time.
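
As an illustration of the two relative notations, the following sketch resolves them to the global absolute time. It assumes times are held as milliseconds; the function names are hypothetical.

```python
def relative_to_absolute_ms(start_ms: int, offset_ms: int) -> int:
    """Resolve a time given relative to the subtitle start time."""
    return start_ms + offset_ms


def percent_to_absolute_ms(start_ms: int, stop_ms: int, percent: float) -> float:
    """Resolve a percentage, where 0% is the start time and 100% the stop time."""
    duration_ms = stop_ms - start_ms
    return start_ms + duration_ms * percent / 100.0
```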

The earliest time a subtitle is displayed is exactly at the start time. The latest time a subtitle is displayed is the last picture before the end time. If you want to compose a picture at time t you must only render the picture if
((t >= subtitle.start) AND (t < subtitle.stop)) or
((t >= subtitle.start) AND (t < subtitle.start + subtitle.duration)) respectively.

Subtitles must be sorted inside a specific <subtitles> block. The sorting order is by the subtitle start time, with the subtitle with lowest start time in the beginning. Subtitles may overlap in time, so that a display software must compose the picture from multiple subtitle objects.
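
The display condition and the ordering requirement can be sketched as below. The Subtitle class and the function names are hypothetical, and times are assumed to be integer milliseconds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subtitle:
    start_ms: int
    stop_ms: int
    text: str


def is_visible(sub: Subtitle, t_ms: int) -> bool:
    """True if t lies in the half-open interval [start, stop)."""
    return sub.start_ms <= t_ms < sub.stop_ms


def is_sorted_by_start(subs: List[Subtitle]) -> bool:
    """True if the block is ordered by non-decreasing start time."""
    return all(a.start_ms <= b.start_ms for a, b in zip(subs, subs[1:]))


def visible_at(subs: List[Subtitle], t_ms: int) -> List[Subtitle]:
    """All subtitles to compose at time t; several may overlap."""
    return [s for s in subs if is_visible(s, t_ms)]
```

Because the interval is half-open, a subtitle that stops at the same instant another one starts is never composed together with it.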

A USF document may contain several <subtitles> elements (blocks). The user or software may select which blocks are used and which ones are invisible and ignored. Ideally each block is identifiable by the combination of language and languageext attributes. If it is not, blocks can still be referred to by their physical coding order in the file - the first coded block has the lowest index and the last coded block has the highest index. This index is implicit; you do not specify it in an attribute.

Levels or other ways for software to state its capability of displaying multiple subtitles at once and parsing more than one block are to be defined.

Compliant software:

coordinate space

The document is rendered into a virtual 3-dimensional space which is then projected onto the two-dimensional output picture. The three-dimensional space has the three spanning vectors X, Y and Z. X is mapped to the horizontal on the screen with positive X pointing to the right. Y is mapped to the up-vector on the screen. Z points from the user into the screen. As such this coordinate space is "left handed". The origin is located at the lower right edge of the screen canvas with Z = 0 at the screen depth where no perspective modification occurs, i.e. one unit corresponds to one screen pixel. If the picture is composed from objects at different Z depths, the planes with the lowest (most negative) layer index are closest to the user and cover the ones with higher layer index. However the space is not truly three-dimensional as layers do not mix. A rotated object cannot have parts on both sides of another layer. As such, an object on layer 2 never covers parts of an object on layer 1.

+y
^
|   ,+z
|  /
| /
|/
*------->+x

  .-------------.
  |     layer 1 |
.-------------. |
|     layer 0 | |
|             | |
|             | |
|             |-'
'-------------'

Besides the layer specified in element attributes, the layer of an object may be shifted by a block constant. That is, it is possible for the rendering software to place objects on the fly on layers according to the block they belong to. For example a renderer may place all objects from a secondary displayed block at layers starting from max_layer(primary block)+1.
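
A small sketch of the block shift described in the example above. The function name shift_block_layers is hypothetical; preserving the relative layer order inside the secondary block is an assumption.

```python
from typing import List


def shift_block_layers(primary_layers: List[int], secondary_layers: List[int]) -> List[int]:
    """Move a secondary block so its lowest layer is max_layer(primary) + 1."""
    if not secondary_layers:
        return []
    base = max(primary_layers, default=-1) + 1
    lowest = min(secondary_layers)
    # Keep the distances between the layers of the secondary block unchanged.
    return [base + (layer - lowest) for layer in secondary_layers]
```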

Compliant software:

colors

Each layer is initially clear black, that is, all four channels are set to zero. An object that is placed on a layer is merged with the existing color on the layer. Thereby the RGB channels are mixed according to the opacity of source (object to place) and target (layer). In the same manner layers are merged together to form the final picture. The layer index does not affect perspective modification (objects on a higher indexed layer do not seem smaller).
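
The text does not give a mixing formula. One common choice that fits the description is the Porter-Duff "over" operator on non-premultiplied colors; the sketch below uses it as an assumption, with channels as floats in the range 0.0 to 1.0 and a hypothetical function name.

```python
from typing import Tuple

RGBA = Tuple[float, float, float, float]


def composite_over(src: RGBA, dst: RGBA) -> RGBA:
    """Merge a source pixel over a target pixel ("over" operator, an assumption)."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        # Both pixels are fully transparent: the result stays clear black.
        return (0.0, 0.0, 0.0, 0.0)

    def mix(s: float, d: float) -> float:
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (mix(sr, dr), mix(sg, dg), mix(sb, db), out_a)
```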

object model

Objects that are rendered to the screen are instantiated. An object is created (instantiated) by placing it inside a <subtitle> element, or by referring to a declaration of it inside an instantiated object. For example, a style definition spawns an instance if a text element refers to it by the style attribute - that instance is the base for the inheritance of the attributes of the text object. Child elements become embedded object instances (members of the parent).

The elements <styles>, <effects>, <svgdefs> and <embedded> contain collections of object definitions. Declarations are not instantiated themselves; only references to them instantiate objects following the declaration. If no object refers to a definition, then there is no object following that definition. On the other hand, if there are multiple references to a definition, then there are as many instantiated objects from that definition.

Object instances may also refer to other object instances. In this case the referred object is cloned and the clone is assigned to the referrer. Of course the reference must point to a uniquely identifiable element (for example one with an id attribute).

Compliant software:

box model

An object that is to be placed on a layer has to obtain a two-dimensional graphic representation. This bitmap of a single object has a rectangular shape that just fits the maximum extent of the object. This is the object's bounding box. The bounding box itself can be used as reference coordinate space for child objects or attributes and this bounding box can change size during the process of rasterising the object. It also is a child object of the object to rasterise and may take a visible representation. For example a <text> object may define the box attribute with extend and fill defined - in this case the size of the bounding box will change in order to fill the background of the object bitmap with fill on an area as big as defined by extend.

Outside of an svg element only top-level elements may define bounding boxes. For example you cannot modify the bounding box of a single letter in a word or the ruby text sitting on top of other text. Of course you can brute force this look by arranging multiple objects in a way to simulate the effect, but this is not an explicit feature of the box model.

3

Syntax and Symbols

Intro

If you already know XML, you may find much of the content of this chapter of little interest. However, be warned that the syntax must be followed strictly (unlike on many sloppily written web pages). Capitalisation of identifiers (tags, attributes, custom names) does matter, elements ("tags") are always closed, attribute values are always enclosed in double quotes, and if they refer to a keyword, capitalisation also matters.
Software must enforce these specifications and refuse to read in or write out malformed source code.

Elements

The USF source code consists of elements ("objects,nodes") that belong to different classes.
Each element has a start token and, if the element is not empty, it also must have an end token. Every element token, in start and end form, begins with a less than sign (<) and ends with a greater than sign (>). Therefore these two characters must not occur anywhere else - in content text they are escaped.
Empty elements may omit the end token by ending the start token content with a forward slash. That is, the slash is placed immediately before the greater than (>) character.
Element start and end tokens begin with the name of the element class right after the opening less than sign ('<'). Element names are always formed from the letters of the English alphabet [a-z][A-Z], the Arabic numerals [0-9] plus the dash and underscore (-,_)(0x2D,0x5F). Case does matter in element names. The element 'USFSubtitles' is defined, 'USFsubtitles' is not defined.
If the opening token contains attributes, there must be at least one whitespace separating the first attribute from the element name.
The list of elements specifies for each element what content and attributes it can and/or must have and also whether it can, must or must not use an end token. All elements that must not use an end token must use the abbreviated form with final forward slash to be closed.
There is exactly one element that breaks the closing and naming rules, which is the special xml comment element that uses the syntax <!--anything-->.

Compliant software:

Nested Elements

Some classes of elements allow other elements of specific classes to be nested inside their elements. These inner elements are then called child elements of the outer element, which is then called the parent element. Some element classes even require child elements of a certain type. All USF elements are nested inside the USFSubtitles root element, some as direct children, others as further nested children of children. The element list states for each element what child element classes it can or must contain, and the tree in the table of contents has already been arranged to approximate the nesting as far as possible.
Some classes of elements contain content text or child elements intermixed with content text. This is specified if the case. Content text must not occur outside of elements of those types.

Compliant software:

Attributes

Attributes are optional or mandatory fields that define properties of an element of a certain class. No attribute may be mentioned more than once in one single element. Attributes are specified in the start (opening) token of an element and separated by one or more 0x20 space characters and optionally by line breaks (LF or CR-LF). An attribute consists of a name that identifies its type, immediately followed by an equal sign, immediately followed by its value enclosed in double quotes (")(0x22).
Example: some-attribute="value"
Attribute names define the type and therefore the valid values an attribute can take. Attribute names are always formed from the letters of the English alphabet [a-z], the Arabic numerals [0-9] plus the dash and underscore (-,_)(0x2D,0x5F). So far all attribute names are all-lowercase, but this is not fixed and may change in the future. Capital letters of the English alphabet [A-Z] are not explicitly forbidden in attribute names, but so far no attributes named with capital letters have been defined. Case does matter in attribute names. The attribute 'face' is defined for <font> elements, but the attributes 'FACE', 'Face' and 'fAcE' are not defined.
Attribute values that provide identifiers, refer to identified elements or contain custom text can contain any UTF-8 characters.
Attributes that provide lists define the proper way to separate list items. Often this will be either the comma or the semi-colon (,,;)(0x2C,0x3B).
Numbers have a specified format and numeric values a defined unit.
Attributes must not be interrupted by whitespaces or line breaks unless otherwise specified in the attribute type.
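
The attribute rules can be illustrated with a strict parser for the attribute part of a start token. This is only a sketch: the name parse_attributes is hypothetical, and the pattern does not handle escaped characters inside values.

```python
import re

# name="value": letters, digits, dash and underscore, then a double-quoted value.
ATTR_RE = re.compile(r'([A-Za-z0-9_-]+)="([^"]*)"')


def parse_attributes(source: str) -> dict:
    """Parse whitespace-separated name="value" pairs, rejecting duplicates."""
    attributes = {}
    rest = source.strip()
    while rest:
        match = ATTR_RE.match(rest)
        if match is None:
            raise ValueError(f"malformed attribute near {rest!r}")
        name, value = match.group(1), match.group(2)
        if name in attributes:
            raise ValueError(f"attribute {name!r} given more than once")
        attributes[name] = value
        tail = rest[match.end():]
        if tail and not tail[0].isspace():
            raise ValueError("attributes must be separated by whitespace")
        rest = tail.lstrip()
    return attributes
```

Names keep their case, so 'face' and 'Face' end up as different keys and a caller can reject the undefined one.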

Compliant software:

4

Types

Intro

This section describes the more common types found in attribute values. Attribute definitions refer to these type declarations if they contain these types.

Compliant software:

Timestamps

USF knows two types of time notations, relative and absolute. The specific attribute class defines whether relative or absolute timestamps are used. No attribute allows the use of both forms.

In the following specifications, these definitions hold:

Omitted hour, minute or millisecond unit values take the default value of zero.
Units are never explicitly mentioned. The unit of a (sub-) value is derived from the presence or absence of structuring symbols (: and .). Other structuring symbols are not allowed; especially worth noting is that the symbol separating milliseconds and seconds must be a point and never a comma.
All (sub-) values are acceptable in decimal notation only.

Absolute timestamps are absolute in the context of the content of this one USFSubtitles element. If a container format contains several USF roots it may specify different offsets to be applied to the different USFSubtitles roots. By default a USFSubtitles root element must be assumed to have offset 0 milliseconds and this offset may only be changed by means outside the scope of USF. Absolute timestamps can take several equivalent forms:

Relative timestamps refer to a previously processed attribute - this process may iterate over several steps, but it must always be possible to compute an absolute timestamp for a relative timestamp in an element that is effectively instantiated. Definitions, such as those found in style and effect definitions, may leave absolute resolving to the instantiated element referring to them (possibly indirectly, over several references).
Relative timestamps have either one of the following forms called structured relative timestamps:

Or the form of an unstructured relative timestamp:

Which form of relative timestamp (structured or unstructured) is acceptable for an attribute class is defined by the attribute. No attribute allows both forms.

Compliant software:

Integer Numbers

An integer value does not have a decimal or fractional part. Its smallest value is one greater than negative infinity and its largest value is one smaller than positive infinity.
Unless otherwise specified in the attribute type declaration, integers are only expressed in decimal notation (base ten).
If an integer is expressed in hexadecimal notation (base sixteen), it is prefixed by a hash symbol (#).
Positive integers are those with a value greater than zero. Some other notations used include:

Integers said to be unsigned are always Z0+.
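
A sketch of an integer reader following these rules. The name parse_integer and the allow_hex switch are hypothetical; accepting an optional leading sign on decimal values, and no sign on hexadecimal values, are assumptions.

```python
import re

DECIMAL_RE = re.compile(r"[+-]?[0-9]+")
HEX_RE = re.compile(r"#[0-9A-Fa-f]+")


def parse_integer(text: str, allow_hex: bool = False) -> int:
    """Read a decimal integer, or a #-prefixed hexadecimal one if allowed."""
    if DECIMAL_RE.fullmatch(text):
        return int(text, 10)
    if allow_hex and HEX_RE.fullmatch(text):
        return int(text[1:], 16)
    raise ValueError(f"not a valid integer: {text!r}")
```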

Compliant software:

Real Numbers

A real number can be an integer or floating point number. The decimal separator symbol, if present, is the dot (.).
Real numbers are always expressed in standard, never scientific notation. Attributes that allow real numbers never allow hexadecimal notation, only decimal notation.
If the decimal part is omitted, the dot must also be omitted. The integer part cannot be omitted even if it is zero.
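
These constraints fit into one pattern, sketched below. The name parse_real is hypothetical, and the optional leading sign is an assumption.

```python
import re

# Integer part is required; a dot must be followed by at least one digit.
REAL_RE = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")


def parse_real(text: str) -> float:
    """Read a real number in plain decimal notation (no exponent, no hex)."""
    if REAL_RE.fullmatch(text) is None:
        raise ValueError(f"not a valid real number: {text!r}")
    return float(text)
```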

Compliant software:

Colors

USF uses hexadecimal color notation in ARGB colorspace with two digits per channel. This means a fully specified color value looks like this: '#AARRGGBB', where each of A, R, G, B stands for one hexadecimal digit. 00 is the smallest value of a channel and FF the largest. If all channels are set to zero ('#00000000'), the color is clear (completely transparent) black. If all channels are set to FF ('#FFFFFFFF'), the color is fully opaque white. A notation omitting the alpha channel is possible. In this case the alpha channel is assumed to have the value FF (fully opaque). Notations with other numbers of digits per channel than two are invalid.
Some attributes only specify one channel (typically the alpha channel) - in this case the value takes the form '#XX' and the channel cannot be omitted even if it is the alpha channel.
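
The color notations can be read as in the following sketch. The function names are hypothetical, and accepting lowercase hexadecimal digits is an assumption.

```python
import re
from typing import Tuple

COLOR_RE = re.compile(r"#([0-9A-Fa-f]{8}|[0-9A-Fa-f]{6})")
CHANNEL_RE = re.compile(r"#([0-9A-Fa-f]{2})")


def parse_color(value: str) -> Tuple[int, int, int, int]:
    """Return (alpha, red, green, blue), each in the range 0..255."""
    match = COLOR_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid color: {value!r}")
    digits = match.group(1)
    if len(digits) == 6:
        # Alpha omitted: assume fully opaque.
        digits = "FF" + digits
    alpha, red, green, blue = (int(digits[i:i + 2], 16) for i in range(0, 8, 2))
    return (alpha, red, green, blue)


def parse_channel(value: str) -> int:
    """Read a single-channel value of the form '#XX'."""
    match = CHANNEL_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid channel value: {value!r}")
    return int(match.group(1), 16)
```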

Compliant software:

Relative Units

Some attributes specify factors or percentages relative to some absolute base value. If the attribute specifies a percentage, the percentage sign (%) must be appended to the value. Percentages may be integers only or real numbers - this is specified by the attribute class. If the attribute specifies a factor, it is a real number without unit that the base value must be multiplied with (1.0 = no change). The size attribute in font and fontstyle elements defines one more relative value: a signed single digit integer ranging from -5 to +5 without unit. This must be interpreted as base+(relative/10), that is, the base value changed by relative times 10%: -5 equals base-50% (=50%) and +5 equals base+50% (=150%).
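
A short sketch of the three relative notations, reading the signed size step as a change of 10% of the base per step, as the -5 and +5 examples suggest. The function names are hypothetical.

```python
def apply_factor(base: float, factor: float) -> float:
    """A factor multiplies the base value; 1.0 means no change."""
    return base * factor


def apply_percentage(base: float, percent: float) -> float:
    """A percentage of the base value; 100 means no change."""
    return base * percent / 100.0


def apply_relative_size(base: float, step: int) -> float:
    """A signed step from -5 to +5, each step worth 10% of the base."""
    if not -5 <= step <= 5:
        raise ValueError("relative size must be in the range -5..+5")
    return base * (1.0 + step / 10.0)
```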

Units

Some attribute classes (but not all) require a unit identifier postfixed to the value.
Unit identifier and meanings:

5

Profiles and Levels

Introduction

Elements and Attributes in this specification are grouped into different profiles and have an assigned status and a level.
Profiles are a coarse grouping to separate the basic elements from the ones that might only be of interest in particular applications such as editing or non-realtime rasterisation.
The status of an element or attribute classifies it as being mandatory to support, soon to be expected to become mandatory or newly introduced for evaluation.
The level of an element or attribute is a hint about its complexity or expected usage frequency.

For compliance, software only needs to support the elements and attributes of the base profile with status stable and level 0.
But it must correctly ignore elements and attributes it does not support. If it does not support a certain element it must ignore all its attributes and child elements or other content.

Profiles

Base

Elements and attributes belonging to the base class are considered important for all appliances.

Fancy

Elements and attributes belonging to the fancy class are expected to be used only when applying advanced styling/typesetting. The members of this class may typically require more processing power than the base class and are expected to not always be rasterisable in realtime.

Object

This class contains elements that are expected to require the software to handle elements in an object-oriented way. It is at the moment expected that all of USFxSVG and functional attributes will fall into this class.

Status

Stable

Elements and attributes labeled stable are a fixed part of the format and may not easily be dropped or changed in a non-backward-compatible way. However it is possible to add optional child elements and attributes, as software must be able to ignore these if it does not yet know them.

Proposal

Elements and attributes labeled as proposal will become stable if they prove non-problematic and useful. However they may still be dropped or changed if this is not the case.
Software should be prepared to support these in foreseeable future (given the profile/level is not out of scope of that software).

Experimental

New elements and attributes that may easily be changed or dropped anytime. Also "private" elements and attributes that are not expected to be used by all USF software may be included in this class. The specification may not list the complete experimental class.

Levels

The lowest complexity and most common elements and attributes are assigned to level 0. Increasing levels designate increased complexity or more specialised usage only.

6

Elements

USFSubtitles

Name: USFSubtitles
Child of: {none}
Content: elements
Mandatory Children: metadata, subtitles (+)
Allowed Children: styles (?), effects (?), svgdefs (?), embedded (?)
   
Mandatory Attributes: version
   
Description: the root element of a USF document.
Topics: container

All other elements are children or subchildren of this element and there must be only one element of this class in one file.

Attributes

version

Description: The USF version used in the document. This attribute declares what complexity the document at maximum has and therefore indicates a parser whether it will be able to understand all of the file or not.
   
Type: real
Mandatory: yes
   
Topics: profile,level

If the content of two USFSubtitles elements is merged to form a single root, the two must have compatible versions and profile/levels and the new values for these will be the highest found in any of the sources.
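
The structural requirements on the root element can be checked with a standard XML parser, as in this sketch. The function name check_root is hypothetical, and requiring exactly one metadata child is an assumption drawn from the absence of a (+) marker in the table above.

```python
import xml.etree.ElementTree as ET
from typing import List


def check_root(xml_text: str) -> List[str]:
    """Return a list of problems found in the root element of a USF document."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "USFSubtitles":
        problems.append("root element must be USFSubtitles")
    if "version" not in root.attrib:
        problems.append("mandatory attribute version is missing")
    if len(root.findall("metadata")) != 1:
        problems.append("exactly one metadata child is expected")
    if not root.findall("subtitles"):
        problems.append("at least one subtitles child is required")
    return problems
```

An empty list means the root passes these checks; the children themselves are not inspected further.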

metadata

Name: metadata
Child of: USFSubtitles
Content: elements
Mandatory Children: title, language, author (+)
Allowed Children: languageext, date, comment, resolution
   
Description: container for global metadata variables
Topics: container

author

Name: author
Child of: metadata
Content: elements
Mandatory Children: name
Allowed Children: email (?), task (?), url (?)
   
Description: container for details of one human contributor to the file
Topics: container, metadata
   
Element Version: 2

name

Name: name
Child of: author
Content: content text
   
Related Elements: email, task, url
   
Description: the name of the person described by this author element
Topics: metadata, author

email

Name: email
Child of: author
Content: content text
   
Related Elements: name, task, url
   
Description: email address of the person described by this author element
Topics: metadata, author

task

Name: task
Child of: author
Content: content text
   
Related Elements: name, email, url
   
Description: description of the work this person contributed to the file (translation,timing etc)
Topics: metadata, author

url

Name: url
Child of: author
Content: content text
Related Elements: name, email, task
   
Description: URL associated with this person, for example his/her website
Topics: metadata, author

comment

Name: comment
Child of: metadata
Content: mixed
Allowed Children: br (*), b (*), i (*), s (*), u (*)
   
Description: a free text comment about the file that can contain basic markup
Topics: content text
   
Element Version: 2

date

Name: date
Child of: metadata
Content: content text
   
Description: date stamp of the file. Format: YYYY-MM-DD (ISO 8601)
Topics: metadata,content text

language

Name: language
Child of: metadata, subtitles
Content: content text
   
Mandatory Attributes: code
   
Related Elements: languageext
   
Description: the language of the content described by this element.
Topics: block attributes

Inside the metadata element, language applies to all elements in the file except those that contain their own language element. A subtitles element can contain a language element to override the global language for this subtitles "block" (all children of that subtitles element).
The content text of this element should provide a short name for the language either in English or in the described language itself.

Attributes

code

Description: ISO 639-2 language code consisting of 3 lowercase characters.
   
Type: text
Range: aaa-zzz
Mandatory: yes

languageext

Name: languageext
Child of: metadata, subtitles
Content: content text
   
Mandatory Attributes: code
   
Related Elements: language
   
Description: the version or edition of the script
Topics: block attributes

Inside the metadata element, languageext applies to all elements in the file except those that contain an own languageext element. A subtitles element can contain a languageext element to override the global version/edition for this subtitles "block" (all children of that subtitles element).
The content text of this element can contain a comment or precision on the selected code attribute value.

Attributes

code

Description: ("Normal | HearingImpaired | DirectorComments | Forced | Children")
   
Type: ("Normal | HearingImpaired | DirectorComments | Forced | Children")
Default: Normal
Inheritance: from parent object

resolution

Name: resolution
Child of: metadata
Content: empty
   
Mandatory Attributes: x, y
   
Description: the video canvas size this script was designed for
Topics: coordinate systems
   
USF Level: base proposal level 0

If the video canvas size on playback differs from the specified, the coordinate system of the runtime instance should be scaled by the factor between real and intended canvas size.
Thereby the smaller of x and y factor should be used for both directions. Canvas aspect ratio of the subtitle coordinate system should not be changed.
If this element is present, both x and y attribute must be specified. If this element is not present, a default value of 640 for x and 480 for y attribute is used.
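
The scaling rule can be written down directly. The function name canvas_scale is hypothetical; the defaults are the 640 by 480 canvas named above.

```python
def canvas_scale(real_x: int, real_y: int, design_x: int = 640, design_y: int = 480) -> float:
    """Uniform scale factor: the smaller of the horizontal and vertical factors."""
    return min(real_x / design_x, real_y / design_y)
```

For a 1280 by 720 playback canvas and the default design size, the horizontal factor is 2.0 and the vertical factor is 1.5, so 1.5 is used for both directions and the aspect ratio of the subtitle coordinate system is preserved.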

Attributes

x

Description: canvas width in pixel
   
Type: integer N+
Range: 1..+inf
Units: px
Units Specified: yes
Default: 640
Mandatory: yes
Inheritance: none
   
Topics: coordinate systems

y

Description: canvas height in pixel
   
Type: integer N+
Range: 1..+inf
Units: px
Units Specified: yes
Default: 480
Mandatory: yes
Inheritance: none
   
Topics: coordinate systems

title

Name: title
Child of: metadata
Content: content text
   
Description: the title of the file (script)

Intended for the title of the movie or video this subtitle file is associated with. It can also be extended to contain more information, but preferably the comment element is used for extended description.

styles

Name: styles
Child of: USFSubtitles
Content: elements
Allowed Children: style (*)
   
Related Elements: effects, svgdefs, embedded
   
Description: container for style (declaration) elements
Topics: container, object declaration

style

Name: style
Child of: styles
Content: elements
Allowed Children: fontstyle, position
   
Mandatory Attributes: name
Allowed Attributes: style
   
Related Elements: effect, keyframe
   
Description: declares a style
Topics: object declaration, styles
   

A style "instance", when referred to from an object instance, inherits from the parent of that object and is itself overridden/modified by attributes explicitly set in the object instance calling it. If an object refers both to a style and an effect, then the effect inherits from and overrides the style. Some objects ignore the position child of a style they call and only use the fontstyle child. Other objects may only use the position child and ignore the fontstyle child.

fontstyle

Name: fontstyle
Child of: style, keyframe
Content: empty
   
Allowed Attributes: style, all of fontstyle family:
face, family, size, color, back-color, outline-color, shadow-color, alpha, primary-alpha, back-alpha, outline-alpha, shadow-alpha, outline-level, shadow-level, weight, italic, underline, strikeout, wrap, spacing, scale-x, scale-y
   
Related Elements: position, font
   
Description: declares the "font" attributes of the parent style.
Topics: object declaration, styles
   

The attributes are mostly identical to those available in the markup element "font".

position

Name: position
Child of: style, keyframe
Content: empty
   
Allowed Attributes: style, all of positional family:
alignment, line-align, line-spacing, horizontal-margin, vertical-margin, offset-x, offset-y, rotate-x, rotate-y, rotate-z, relative-to, layer, transform, morphgrid
   
Related Elements: fontstyle, text, karaoke, image, svg
   
Description: declares the positional attributes of a style.
Topics: object declaration, styles
   

Attributes dealing with coordinate-space, position, scaling and orientation in the parent style or effect keyframe. The allowed attributes are mostly identical to those available in children of subtitle (text,karaoke,image,svg).

effects

Name: effects
Child of: USFSubtitles
Content: elements
Allowed Children: effect (*)
   
Related Elements: styles, svgdefs, embedded
   
Description: container for effect (declaration) elements
Topics: container, object declaration

effect

Name: effect
Child of: effects
Content: elements
Allowed Children: keyframes
   
Mandatory Attributes: name
   
Related Elements: style
   
Description: declares an effect
Topics: object declaration, effects
   

An effect is basically an animated style. It defines a series of keyframes that correspond to styles applied to the caller at times relative to the caller's time. The effect style at a time between specified keyframes is interpolated from the previous and following keyframes. The allowed children fontstyle and position allow attributes that cannot be interpolated. Such attributes must be ignored in an effect.
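
The text does not say which interpolation is used. The sketch below assumes linear interpolation of one numeric attribute, with keyframes given as (time, value) pairs sorted by time, and holds the first or last value outside the keyframe range. The function name is hypothetical.

```python
from typing import List, Tuple


def interpolate(keyframes: List[Tuple[float, float]], t: float) -> float:
    """Linearly interpolate a numeric attribute at time t (an assumption)."""
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return v1
            fraction = (t - t0) / (t1 - t0)
            return v0 + (v1 - v0) * fraction
    # Past the last keyframe: hold its value.
    return keyframes[-1][1]
```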

keyframes

Name: keyframes
Child of: effect
Content: elements
Allowed Children: keyframe
   
Description: container for keyframe elements
Topics: container, object declaration, effects
   

keyframe

Name: keyframe
Child of: keyframes
Content: elements
Allowed Children: fontstyle, position
   
Mandatory Attributes: position
Allowed Attributes: style
   
Related Elements: effect, style
   
Description: declares the style of an effect at a certain time
Topics: object declaration, effects
   

This element is almost identical to the style element, except that it has an attribute indicating the time at which it is valid. Referencing happens via the effect; a keyframe is never referenced directly and therefore needs no name.

Attributes

position

Description: timestamp of the keyframe
   
Type: relative timestamp or real percentage
Range: positive for timestamps, 0..100% for percentages
Units: ms | %
Units Specified: only for percentages
Default: +inf
Mandatory: yes

The default value is used when the value is invalid in an instantiation; this means the keyframe will simply not be applied.
An instance of an effect, as produced when an object instance refers to the effect, inherits from the referring object. If the referring object also refers to a style, the style is applied before the effect. The style coming from the object instance is applied at time zero (to the first keyframe if it is at time zero; otherwise a virtual keyframe with that style is inserted). All following keyframes inherit from the keyframe preceding them.
The effect is inherited by the children of the caller, but they can override it by setting attributes explicitly or by referring to a style or even another effect.
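How a keyframe position might be resolved to a time offset is sketched below. Several points are assumptions rather than statements of the specification: a value without a unit is read as a plain decimal number of milliseconds (the actual relative timestamp notation is defined elsewhere in this document), a percentage is taken relative to the caller's duration, and zero is accepted because a keyframe at time zero is mentioned above. The names resolve_position and caller_duration_ms are hypothetical.

```python
import math

def resolve_position(raw, caller_duration_ms):
    """Resolve a keyframe position string to milliseconds.

    Returns math.inf (the documented default) for invalid values, which
    means the keyframe is never reached and therefore never applied.
    """
    raw = raw.strip()
    try:
        if raw.endswith("%"):
            percent = float(raw[:-1])
            if not 0.0 <= percent <= 100.0:
                return math.inf
            return caller_duration_ms * percent / 100.0
        ms = float(raw)  # assumption: unit-less values are milliseconds
    except ValueError:
        return math.inf
    if not ms >= 0:  # also rejects NaN
        return math.inf
    return ms
```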

svgdefs

Name: svgdefs
Child of: USFSubtitles
Content: elements
Allowed Children: svg (*)
   
Related Elements: styles, effects, embedded
   
Description: container for svg object (declaration) elements
Topics: container, object declaration
   
USF Level: object experimental level 0

subtitles

Name: subtitles
Child of: USFSubtitles
Content: elements
Allowed Children: language (?), languageext (?), subtitle (*)
   
Description: container for a collection of timed text
Topics: container, subtitle block
   

A root element can have several <subtitles> elements to allow for several languages or different editions, such as additional subtitles for hearing-impaired persons. The children of a <subtitles> element must be ordered with the (optional) <language> and <languageext> elements on top, followed by the <subtitle> elements ordered by increasing start time.
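An authoring tool could verify the ordering requirement with a check like the one below. It is only a sketch: it assumes the start times have already been parsed into comparable numbers, and it accepts equal start times, which the text above does not explicitly settle. The function name is hypothetical.

```python
def ordered_by_start(start_times):
    """True if the start times never decrease from one <subtitle> to the next.

    Equal start times are accepted here; that is an assumption.
    """
    return all(a <= b for a, b in zip(start_times, start_times[1:]))
```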

subtitle

Name: subtitle
Child of: subtitles
Content: elements
Allowed Children: text (*), karaoke (*), image (*), svg (*)
   
Mandatory Attributes: start and one of either stop or duration
   
Description: encapsulates a collection of top-level object instances that share the same start and stop time
   

This element creates instances of the objects and puts them on the screen. Everything that is not in a <subtitle> element is not instantiated unless it is referred to by an object inside a <subtitle> element.
Typically you have only one child inside a <subtitle> element, as the different subtitles usually have different timings. Also note that the times of <subtitle> elements may overlap, and you are encouraged to use that instead of spanning one longer element containing lines of different speakers when their speech overlaps.

Attributes

start

Description: start timestamp
   
Type: non-negative, absolute timestamp
Mandatory: yes

This is the exact time when the element may first be rendered. The element may not be rendered before that time, even if you request a snapshot only one millisecond earlier. See timing.

stop

Description: stop timestamp
   
Type: non-negative, absolute timestamp
Mandatory: yes unless the duration attribute is specified

This is the first time the element must no longer be rendered. At this exact time or any later time the element must be turned off. See timing.
Example: subtitle1 has start="1.000" and stop="2.000" and subtitle2 has start="2.000" and stop="3.000".
These subtitles never overlap. At time 2.000 subtitle1 is no longer visible and only subtitle2 is displayed. On the other hand, at time 1.999 only subtitle1 is visible and subtitle2 is not yet turned on.
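The start and stop rules amount to a half-open interval: start is included, stop is excluded. A minimal sketch, assuming times already converted to integer milliseconds (the function name is hypothetical):

```python
def is_visible(start_ms, stop_ms, t_ms):
    """True if a subtitle with the given start and stop is rendered at t_ms.

    The interval is half-open: start is inclusive, stop is exclusive.
    """
    return start_ms <= t_ms < stop_ms
```

Applied to the example above, subtitle1 (1000 to 2000) is visible at 1999 but not at 2000, while subtitle2 (2000 to 3000) is visible at 2000 but not at 1999.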

duration

Description: duration in seconds
   
Type: relative timestamp larger than zero
Mandatory: yes unless the stop attribute is specified

This attribute is an alternative to specifying the stop timecode. The stop timecode is computed from it as stop := start + duration.
This value is always given in seconds (possibly with a fractional part); it never uses a notation containing hours or minutes.
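The computation stop := start + duration can be sketched as below. The sketch assumes both values are already available as seconds (Decimal is used to avoid binary rounding of values such as 1.999) and treats giving both stop and duration as an error, which is an assumption; the text only says that one of the two must be present. The function name is hypothetical.

```python
from decimal import Decimal

def resolve_stop(start, stop=None, duration=None):
    """Return the effective stop time in seconds for a <subtitle>."""
    if (stop is None) == (duration is None):
        # Neither or both given; rejecting "both" is an assumption.
        raise ValueError("specify exactly one of stop or duration")
    if duration is not None:
        if duration <= 0:
            raise ValueError("duration must be larger than zero")
        return start + duration
    return stop
```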

text

Name: text
Child of: subtitle
Content: mixed
Allowed Children: markup elements:
b, br, font, i, ruby, s, u
   
Allowed Attributes: name, style, effect, speaker, all positional attributes, alpha
   
Related Elements: karaoke, image, svg
   
Description: top-level object instance that contains styled text
   

Attributes

speaker

Description: informational, specifies the person speaking the text in this element
   
Type: text

Can be used to group subtitles in the authoring process.

karaoke
Child of: subtitle
Content: mixed
Allowed Children: markup elements:
b, br, font, i, ruby, s, u, k
Allowed Attributes: name, style, effect, speaker, all positional attributes, alpha, hot-style, back-style, rise-time, fall-time, mode
Related Elements: text, image, svg
Description: text element enhanced by karaoke capability
This element must contain a number of <k> elements with a summed up time in their t (time) attributes not exceeding the duration of the parent <subtitle> element.
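The constraint on the summed <k> times can be illustrated with a small schedule calculation. This is a sketch under assumptions: the t values and the subtitle duration are expressed in the same unit, and the first block becomes hot at offset zero of the parent <subtitle>. The function name and the tuple layout are hypothetical.

```python
from itertools import accumulate

def block_schedule(k_times, subtitle_duration):
    """Return (hot_start, hot_end) offsets for each <k> block.

    Raises ValueError if the summed t values exceed the subtitle duration.
    """
    ends = list(accumulate(k_times))
    if ends and ends[-1] > subtitle_duration:
        raise ValueError("summed <k> times exceed the <subtitle> duration")
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))
```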
Attributes
Name: hot-style
Description: style reference for the "hot" block or letter in PerBlock or PerLetter mode
Type: text
USF Level: fancy experimental level 1
Name: back-style
Description: style reference for the not yet highlighted text
Type: text
USF Level: fancy experimental level 1
Name: rise-time
Description: part of the "hot" time of a block that is spent on the transition from back-style to hot-style
Type: real (fractional) or integer time
Range: 0..1 for fractional or integer > 1 for time
Units: none for fractional | ms for time
USF Level: fancy experimental level 1
The sum of rise-time and fall-time must not evaluate to more time than is assigned as hot time for that block (the <k> element's t attribute). When both times are fractional, this simply means the sum must not be larger than 1; for millisecond notation the editor has to take more care.
Name: fall-time
Description: part of the "hot" time of a block that is spent on the transition from hot-style to style
Type: real (fractional) or integer time
Range: 0..1 for fractional or integer > 1 for time
Units: none for fractional | ms for time
USF Level: fancy experimental level 1
The sum of rise-time and fall-time must not evaluate to more time than is assigned as hot time for that block (the <k> element's t attribute). When both times are fractional, this simply means the sum must not be larger than 1; for millisecond notation the editor has to take more care.
Name: mode
Description: defines how the text highlighting progresses
Type: "Continuous" | "PerLetter" | "PerBlock"
Default: Continuous
USF Level: base proposal level 1
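The rule that rise-time plus fall-time must fit into a block's hot time can be checked as sketched below. The sketch follows the ranges given above (values from 0 to 1 are fractions of the hot time, larger values are milliseconds) and assumes the hot time of the block is already known in milliseconds. The names to_ms, transitions_fit and hot_ms are hypothetical.

```python
def to_ms(value, hot_ms):
    """Convert a rise-time or fall-time value to milliseconds.

    Values in 0..1 are fractions of the hot time; values above 1 are
    taken as milliseconds, following the ranges stated above.
    """
    if value < 0:
        raise ValueError("rise-time and fall-time must not be negative")
    if value <= 1:
        return value * hot_ms
    return float(value)

def transitions_fit(rise, fall, hot_ms):
    """True if the rise and fall transitions fit into the block's hot time."""
    return to_ms(rise, hot_ms) + to_ms(fall, hot_ms) <= hot_ms
```

Mixed notation is the case that needs care: a rise-time of 100 (ms) combined with a fall-time of 0.5 fits a 400 ms block, but not a 150 ms block.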

markup elements

These elements can occur in the content of text or karaoke elements, intermixed with content text. Much as in HTML, they can generally be nested in any order as long as the nesting is correct. That is, any one class of these can be a child of any other class of these, including itself. Of course there are nestings that don't make sense, such as respecifying italic inside an already italic part.
Apart from such semantically pointless nestings, there are also exceptions to note:

b (bold)
Child of: markup inside <text> and <karaoke> - explicitly
<font>, <i>, <rb>, <rt>, <s>, <u>, and of course <text> and <karaoke> directly.
Content: mixed
Allowed Children: <br>, <font>, <i>, <k>, <ruby>, <s>, <u>
Related Elements: <font>, <i>, <s>, <u>
Description: sets the enclosed text to bold weight
Is almost equivalent to the weight attribute of <font> set to bold.
In fact you can replace every occurrence of <b> with <font weight="bold"> (when adjusting the ending tag </b> to </font> as well!). It is not fully equivalent because you can modify the weight attribute by setting it to a different value in a child element, while you cannot cancel out a <b> element inside it.

br (line break)
Child of: markup inside <text> and <karaoke> - explicitly
<b>, <font>, <i>, <rt>, <s>, <u>, and of course <text> and <karaoke> directly.
Content: empty (only shorthand notation allowed)
Description: following text is placed on a new line
This element has no ending tag, it must be immediately closed by using the notation <br/>.
Line break is not allowed in the ruby base <rb>. It is allowed in ruby text, <rt>, but discouraged.

font
Child of: markup inside <text> and <karaoke> - explicitly
<b>, <font>, <rb>, <rt>, <i>, <s>, <u>, and of course <text> and <karaoke> directly.
Content: mixed
Allowed Children: <b>, <br>, <font>, <i>, <k>, <ruby>, <s>, <u>
Allowed Attributes: style, all of fontstyle family:
face, family, size, color, back-color, outline-color, shadow-color, alpha, primary-alpha, back-alpha, outline-alpha, shadow-alpha, outline-level, shadow-level, weight, italic, underline, strikeout, wrap, spacing, scale-x, scale-y
Related Elements: <fontstyle> (child of <style>)
Description: sets various font and text attributes
Unlike the simple attribute-less markup elements, <font> can be nested inside itself. There can still be nesting that does not make sense, namely when no attribute is effectively modified in the child because it only confirms the attributes already set. The allowed attributes are identical to the ones available in <fontstyle>. Even the style attribute, which in <fontstyle> specifies the base style, has an equivalent meaning: it serves as the inheritance root in the <font> element and only respects the <fontstyle> element of the referred style (the <position> element is ignored). You could call the inheritance root the instance form of a definition's base.

i (italic)
Child of: markup inside <text> and <karaoke> - explicitly
<b>, <font>, <rb>, <rt>, <s>, <u>, and of course <text> and <karaoke> directly.
Content: mixed
Allowed Children: <b>, <br>, <font>, <k>, <ruby>, <s>, <u>
Related Elements: <b>, <font>, <s>, <u>
Description: sets the enclosed text to italic
Is almost equivalent to the italic attribute of <font> set to yes.
In fact you can replace every occurrence of <i> with <font italic="yes"> (when adjusting the ending tag </i> to </font> as well!). It is not fully equivalent because you can modify the italic attribute by setting it to no in a child element, while you cannot cancel out an <i> element inside it.

k
Child of: markup inside <karaoke> - explicitly
<b>, <font>, <i>, <s>, <u>, and of course <karaoke> directly.
Content: empty
Mandatory Attributes: t
Related Elements: <karaoke>
Description: set a time marker in the text
The text extending from after this element up to the next <k> element or the closing tag of the <karaoke> element will be passed by the karaoke hot spot in the time specified in the element's only, mandatory attribute t.
Attributes
Name: t