Recommended Label Studio workflows for streaming / large event-based time-series
1) Treat “streaming” as micro-batch tasks (most reliable)
Label Studio is built around tasks (records) rather than a continuously updating stream. The most effective pattern is:
- Ingest live IoT data (MQTT/REST/etc.) into your own storage (DB/object store).
- Periodically cut it into labelable windows (for example 10–60 seconds, or “N samples around each trigger”).
- Create one Label Studio task per window, and point the task to the data via valueType="url" (CSV) or valueType="json" (embedded/hosted JSON).
- Annotators label regions (intervals) or “events” inside the window using TimeSeriesLabels.
This matches how the TimeSeries labeling UI is intended to work (annotate spans on a timeline), and it scales better than trying to label an infinite stream in one task.
Tip: use a “trigger-centered” window (e.g., t_trigger - 2s to t_trigger + 5s) so annotators mostly see the part that matters.
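The trigger-centered windowing described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of Label Studio: the function name cut_window, the (timestamp, value) tuple layout, and the 1 Hz sample data are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


def cut_window(samples, t_trigger, before=2.0, after=5.0):
    """Return the samples with t_trigger - before <= t <= t_trigger + after.

    `samples` is a list of (timestamp_seconds, value) tuples sorted by timestamp
    (a hypothetical in-memory layout; adapt it to your own storage).
    """
    times = [t for t, _ in samples]
    lo = bisect_left(times, t_trigger - before)   # first index inside the window
    hi = bisect_right(times, t_trigger + after)   # one past the last index inside
    return samples[lo:hi]


# Ten seconds of made-up 1 Hz data with a trigger at t = 4 s.
samples = [(float(t), t * 10) for t in range(10)]
window = cut_window(samples, t_trigger=4.0)
```

Each window returned this way would then be written out (for example as a CSV file) and referenced by exactly one task.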
2) Use TimeSeriesLabels for event/interval annotation (true movement, false positive, etc.)
For event-based labels, you typically want to mark a short interval around the event (even if the raw trigger is instantaneous). Label Studio’s TimeSeries region labels are a good fit.
Minimal example (CSV behind a URL):
<View>
  <TimeSeriesLabels name="event" toName="ts">
    <Label value="true_movement" background="#4CAF50"/>
    <Label value="false_positive" background="#F44336"/>
    <Label value="unknown" background="#9E9E9E"/>
  </TimeSeriesLabels>
  <TimeSeries name="ts"
              valueType="url"
              value="$csv"
              sep=","
              timeColumn="timestamp">
    <Channel column="pir" legend="PIR trigger"/>
    <Channel column="accel_mag" legend="Accel magnitude"/>
    <Channel column="temp_c" legend="Temperature"/>
    <Channel column="humidity" legend="Humidity"/>
  </TimeSeries>
</View>
If your timestamps are Unix seconds, set timeFormat="%s" (this is a common gotcha when importing time series). See the Time Series template docs and examples:
https://docs.humansignal.com/templates/time_series.html
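For reference, a window file that lines up with the config above could be produced with Python's standard csv module. This is only a sketch: the helper name window_to_csv and the sample readings are invented for illustration, and the timestamp column is assumed to hold Unix seconds.

```python
import csv
import io

# Column names must match timeColumn and the Channel "column" attributes.
FIELDS = ["timestamp", "pir", "accel_mag", "temp_c", "humidity"]


def window_to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up readings; timestamps are Unix seconds.
rows = [
    {"timestamp": 1700000000, "pir": 0, "accel_mag": 0.98, "temp_c": 21.5, "humidity": 40.1},
    {"timestamp": 1700000001, "pir": 1, "accel_mag": 1.73, "temp_c": 21.5, "humidity": 40.0},
]
csv_text = window_to_csv(rows)
```

The resulting text would be uploaded to whatever storage serves the URL referenced by the task.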
3) Keep overview readable with overviewChannels (but note current limitations/bug)
overviewChannels is intended to control which channels appear in the bottom overview/brush panel (not the main plot) per the tag docs:
https://docs.humansignal.com/tags/timeseries
However, there is an open report that overviewChannels can be ignored (showing all channels) in some versions/repros:
https://github.com/HumanSignal/label-studio/issues/8176
Practical guidance:
- Always pass exact channel column names (not display labels).
- If it still shows everything, assume you hit the known issue and focus on other ways to reduce clutter (next section).
4) Reduce clutter for many sensors: use MultiChannel and/or engineer “summary channels”
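If you have many axes (IMU x/y/z, multiple sensors), showing everything can overwhelm annotators. A common approach is to group related axes under a MultiChannel tag so they share one plot, and to precompute summary channels (for example, a single acceleration magnitude in place of separate x/y/z axes) before export.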
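One summary channel that often helps is the acceleration magnitude, which collapses three IMU axes into a single column such as accel_mag. The sketch below assumes hypothetical raw column names ax, ay and az; substitute whatever your pipeline actually emits.

```python
import math


def accel_magnitude(x, y, z):
    """Euclidean norm of one 3-axis accelerometer sample."""
    return math.sqrt(x * x + y * y + z * z)


def add_summary_channel(rows):
    """Return copies of the rows with an added "accel_mag" column.

    The input keys "ax", "ay", "az" are assumed names, not a Label Studio convention.
    """
    return [
        {**row, "accel_mag": accel_magnitude(row["ax"], row["ay"], row["az"])}
        for row in rows
    ]


rows = [{"timestamp": 0, "ax": 3.0, "ay": 4.0, "az": 0.0}]
out = add_summary_channel(rows)
```

Only the summary column then needs a Channel entry in the labeling config; the raw axes can stay in the file without being plotted.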
5) Important for “event-based”: decide instant vs interval and standardize it
Label Studio will store TimeSeries labels as regions with start/end in the exported annotation JSON. Even if your device produces a single trigger timestamp, it’s usually better for consistency to label a small interval (e.g., trigger_time ± 250ms) so reviewers can see context.
Also note for pre-annotations/predictions: server-side validation can be strict about start/end matching the TimeSeries axis (especially when you use timeColumn). If your axis is discrete, fractional times that don’t exist in the underlying time array can be rejected. This behavior is discussed in an internal support summary and aligns with stricter prediction validation.
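One way to standardize intervals and avoid off-axis times is to snap both ends of the interval to timestamps that actually exist in the window. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the from_name/to_name values come from the example config in section 2, and the shape of the result dict reflects the commonly exported TimeSeries region format, which you should verify against an export from your own Label Studio version.

```python
from bisect import bisect_left


def snap(times, t):
    """Return the value in the sorted list `times` that is closest to t."""
    i = bisect_left(times, t)
    if i == 0:
        return times[0]
    if i == len(times):
        return times[-1]
    before, after = times[i - 1], times[i]
    return before if t - before <= after - t else after


def trigger_to_region(times, t_trigger, half_width=0.25, label="true_movement"):
    """Build one prediction result item around a trigger, snapped to the axis."""
    start = snap(times, t_trigger - half_width)
    end = snap(times, t_trigger + half_width)
    return {
        "from_name": "event",   # TimeSeriesLabels name in the example config
        "to_name": "ts",        # TimeSeries name in the example config
        "type": "timeserieslabels",
        "value": {
            "start": start,
            "end": end,
            "instant": start == end,
            "timeserieslabels": [label],
        },
    }


times = [0.0, 0.5, 1.0, 1.5, 2.0]   # made-up axis sampled at 2 Hz
region = trigger_to_region(times, t_trigger=1.1)
```

With a 2 Hz axis the requested ±250 ms interval widens to the nearest samples, which is the trade-off of snapping: the region is always valid on the axis but may be slightly longer or shorter than requested.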
6) Storage/ingestion patterns: “URL-based tasks” + optional Redis workflow
If your pipeline already produces CSV files (or JSON snapshots) per window/event, host them somewhere reachable and import tasks that reference those URLs.
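A minimal sketch of building the import file for URL-based tasks follows. The helper name build_tasks and the example.com URLs are placeholders; the "csv" key is chosen to match value="$csv" in the example labeling config.

```python
import json


def build_tasks(window_urls):
    """One task per window file; the "csv" key feeds $csv in the labeling config."""
    return [{"data": {"csv": url}} for url in window_urls]


# Placeholder URLs; use wherever your window files are actually hosted.
urls = [
    "https://example.com/windows/device-1/0001.csv",
    "https://example.com/windows/device-1/0002.csv",
]
tasks = build_tasks(urls)
payload = json.dumps(tasks, indent=2)   # JSON text ready to save and import
```

The hosted files need to be reachable from the annotators' browsers, since the TimeSeries tag loads the URL on the client side.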
There is also a user-reported pattern of using Redis as source storage to help sync URLs and imports. The lack of direct PostgreSQL integration, and the workaround of automated conversion plus storage, is discussed here:
https://github.com/HumanSignal/label-studio/issues/7263
7) What’s not supported today: true “single stream, label all sensor_ids in one column”
If your CSV contains multiple logical “groups” in a single column (e.g., sensor_id and values stacked) and you want separate plots per group in one task, that specific format has been called out as not supported in a support thread:
https://label-studio.slack.com/archives/CQ8LYPPJS/p1662469110552389
The usual workaround is to reshape data into separate channels/columns (wide format) or separate tasks.
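The long-to-wide reshape can be sketched in plain Python as below. The (timestamp, sensor_id, value) tuple layout and the function name long_to_wide are assumptions for illustration; with pandas, a pivot would do the same job.

```python
from collections import defaultdict


def long_to_wide(rows):
    """Pivot (timestamp, sensor_id, value) tuples into one row per timestamp.

    Returns (columns, wide_rows), where each sensor_id becomes its own column.
    """
    by_time = defaultdict(dict)
    sensors = set()
    for ts, sensor_id, value in rows:
        by_time[ts][sensor_id] = value
        sensors.add(sensor_id)

    ordered = sorted(sensors)
    columns = ["timestamp"] + ordered
    wide = []
    for ts in sorted(by_time):
        row = {"timestamp": ts}
        for sensor_id in ordered:
            row[sensor_id] = by_time[ts].get(sensor_id)  # None if no reading
        wide.append(row)
    return columns, wide


long_rows = [(0, "a", 1.0), (0, "b", 2.0), (1, "a", 1.5)]
columns, wide = long_to_wide(long_rows)
```

Each resulting sensor column can then get its own Channel entry. How gaps (None here) should be filled or written out depends on your data and is left open in this sketch.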