Annotating video and text simultaneously

Hello,

I’m trying to build a setup where we can simultaneously annotate data composed of synced text and video. The text is an automatic transcription of the video’s audio track.
Labels are then used to classify movements occurring in the video, with the transcribed text serving as supporting context for the labeling process.
The text should therefore appear as a single line below the video and scroll automatically as the video plays.

Can this be done in the current interface? If not with the built-in objects, can I create a custom object to handle this scenario?

Thanks in advance!

Hi, try using this one if you need a single choice for the whole video:

<View>
  <Header value="Video Movement Classification with Synced Transcript" size="3" style="margin-bottom: 1em;"/>
  <Video name="video" value="$video" height="400" sync="audio" />
  <Audio name="audio" value="$video" sync="transcript" />
  <Paragraphs name="transcript" value="$transcript" layout="dialogue" 
              contextScroll="true" audioUrl="$video" sync="audio" />
  
  <Choices name="movement" toName="video" choice="single">
    <Choice value="Walking" />
    <Choice value="Running" />
    <Choice value="Jumping" />
    <Choice value="Sitting" />
    <Choice value="Other" />
  </Choices>
</View>

<!--{
  "video": "/static/samples/opossum_snow.mp4",
  "transcript": [
    {"author": "Speaker", "text": "The subject enters the frame and starts walking.", "start": 0, "end": 2},
    {"author": "Speaker", "text": "Now the subject is running quickly across the field.", "start": 2, "end": 5},
    {"author": "Speaker", "text": "The subject jumps over an obstacle.", "start": 5, "end": 7},
    {"author": "Speaker", "text": "Finally, the subject sits down to rest.", "start": 7, "end": 10}
  ]
}
-->

Here’s another option if you need labels across the timeline:

<View>
  <Header value="Video Movement Classification with Synced Transcript" size="3" style="margin-bottom: 1em;"/>
  <Video name="video" value="$video" height="400" sync="audio" />
  <TimelineLabels name="timelineLabels" toName="video">
    <Label value="Walking" />
    <Label value="Running" />
    <Label value="Jumping" />
    <Label value="Sitting" />
    <Label value="Other" />
  </TimelineLabels>
  
  <Audio name="audio" value="$video" sync="transcript" />
  <Paragraphs name="transcript" value="$transcript" layout="dialogue" 
              contextScroll="true" audioUrl="$video" sync="audio" />
  
</View>

<!--{
  "video": "/static/samples/opossum_snow.mp4",
  "transcript": [
    {"author": "Speaker", "text": "The subject enters the frame and starts walking.", "start": 0, "end": 2},
    {"author": "Speaker", "text": "Now the subject is running quickly across the field.", "start": 2, "end": 5},
    {"author": "Speaker", "text": "The subject jumps over an obstacle.", "start": 5, "end": 7},
    {"author": "Speaker", "text": "Finally, the subject sits down to rest.", "start": 7, "end": 10}
  ]
}
-->
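
Since the transcript in this setup comes from automatic transcription, the sample data above could be generated with a small script rather than written by hand. Here is a minimal sketch in Python. It assumes your transcription tool returns segments as dicts with `text`, `start`, and `end` (in seconds), which is a hypothetical format you would need to adapt. `build_task` is an illustrative helper, not part of Label Studio.

```python
import json


def build_task(video_url, segments, author="Speaker"):
    """Build one task whose keys match the $video and $transcript variables."""
    transcript = [
        {
            "author": author,
            "text": seg["text"].strip(),
            "start": float(seg["start"]),
            "end": float(seg["end"]),
        }
        # Keep segments in playback order and drop empty ones.
        for seg in sorted(segments, key=lambda s: s["start"])
        if seg["text"].strip()
    ]
    return {"data": {"video": video_url, "transcript": transcript}}


if __name__ == "__main__":
    # Hypothetical transcription output, for illustration only.
    asr_segments = [
        {"text": "The subject enters the frame and starts walking.", "start": 0, "end": 2},
        {"text": "Now the subject is running quickly across the field.", "start": 2, "end": 5},
    ]
    tasks = [build_task("/static/samples/opossum_snow.mp4", asr_segments)]
    print(json.dumps(tasks, indent=2))
```

The printed JSON list can be saved to a file and imported as tasks. The field names (`author`, `text`, `start`, `end`) mirror the sample data above.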

Thank you very much for your answer!
The second example you provided corresponds to my use case.
The only remaining issue is that the text does not scroll automatically, even with the option enabled. The first text (“The subject enters the frame and starts walking.”) also appears at the end of the scrollable paragraph area, even though its start time is 0.
Am I missing some options?
I’m using the Community version of Label Studio.

When you tried my exact example, did it work correctly?

Your example works perfectly.
I narrowed the problem down to my video being a webm file.
After converting it to mp4, the paragraph scrolls correctly through the texts.
Thanks!
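
For anyone hitting the same issue: the thread does not say how the conversion was done or why the webm file failed. One possible approach, assuming ffmpeg is installed and on your PATH, is sketched below in Python. The codec choices (H.264 video, AAC audio) are an assumption, not something confirmed above, and `build_ffmpeg_command` and `convert` are illustrative helper names.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst=None):
    """Return the ffmpeg argument list for a webm -> mp4 conversion."""
    src = Path(src)
    # Default output: same name with an .mp4 extension.
    dst = Path(dst) if dst else src.with_suffix(".mp4")
    return [
        "ffmpeg",
        "-i", str(src),
        "-c:v", "libx264",          # assumed video codec
        "-c:a", "aac",              # assumed audio codec
        "-movflags", "+faststart",  # put the index at the start of the file
        str(dst),
    ]


def convert(src, dst=None):
    """Run the conversion; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Calling `convert("clip.webm")` would write `clip.mp4` next to the source file. The task's `video` value then needs to point at the new file.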