Why the column “type” can be txt/str, but Filters are stuck on data (and error)
Column display types and filter value types are not the same system in Label Studio.
- In the backend, the Data Manager filter type is validated against a fixed enum (Number, Datetime, Boolean, String, List, Unknown) in label_studio/data_manager/prepare_params.py. Anything else (like data) is not a valid filter type and can raise an error.
Source: prepare_params.py filter Type enum
- Separately, Data Manager columns are generated by get_all_columns(), and task.data.* fields are typed based on the labeling config / inferred data types. Anything not recognized as a media type ends up as String for display.
Source: get_all_columns() column type mapping
So yes: this looks like an incomplete, non-unified UX. Changing a column’s display type doesn’t necessarily produce a compatible filter definition. The reliable path is to ensure your filter payload uses one of the supported filter type values (usually String for your meta_info).
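As a rough illustration of that check, here is a minimal Python sketch that validates a filter item on the client side before you send it. The set of type names comes from the enum listed above; the helper name check_filter_item is hypothetical and not part of Label Studio.

```python
# Filter type values named above as valid for Data Manager filters.
SUPPORTED_FILTER_TYPES = {"Number", "Datetime", "Boolean", "String", "List", "Unknown"}


def check_filter_item(item: dict) -> dict:
    """Hypothetical helper: reject a filter item whose type is not supported."""
    filter_type = item.get("type")
    if filter_type not in SUPPORTED_FILTER_TYPES:
        raise ValueError(
            f"Unsupported filter type {filter_type!r}; "
            f"expected one of {sorted(SUPPORTED_FILTER_TYPES)}"
        )
    return item
```

Calling check_filter_item({"type": "data"}) raises ValueError, which surfaces the problem before the request reaches the backend.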
How to filter on your custom field correctly (recommended)
Use a Data Manager filter that targets task.data.meta_info and set the filter type to String.
For example, via a Data Manager View payload (REST; I have not confirmed that SDK v2 has a helper for “views”, so check whether it is implemented before relying on it):
{
  "project": 123,
  "data": {
    "title": "Images by path fragment",
    "filters": {
      "conjunction": "and",
      "items": [
        {
          "filter": "filter:tasks:data.meta_info",
          "operator": "contains",
          "type": "String",
          "value": "redacted-path/"
        }
      ]
    }
  }
}
The key detail is the filter name format for task.data fields: filter:tasks:data.<field> (here, filter:tasks:data.meta_info).
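If you prefer to build this payload in code, a minimal Python sketch could look like the following. It only uses the standard library. The function names are hypothetical, and the /api/dm/views/ endpoint and the "Token" authorization header are assumptions you should verify against your Label Studio version.

```python
import json
import urllib.request


def build_view_payload(project_id, title, field, value, operator="contains"):
    """Build a view payload that filters task.data.<field> as a String."""
    return {
        "project": project_id,
        "data": {
            "title": title,
            "filters": {
                "conjunction": "and",
                "items": [
                    {
                        # Filter name format for task.data fields.
                        "filter": f"filter:tasks:data.{field}",
                        "operator": operator,
                        "type": "String",
                        "value": value,
                    }
                ],
            },
        },
    }


def create_view(base_url, api_token, payload):
    """POST the payload. Endpoint path and auth scheme are assumptions."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/dm/views/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For instance, build_view_payload(123, "Images by path fragment", "meta_info", "redacted-path/") reproduces the JSON above.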
Why you see base64-ish image paths (even with presigned URLs)
If your task’s data.image value is something like:
/tasks/<some_id>/presign/?fileuri=<BASE64...>
that fileuri is typically a base64-encoded storage URI (gs://, s3://, etc.). This often happens when tasks were created by copying/reusing the output of /api/tasks (which may contain presigned/proxy URLs) instead of importing the original storage URIs.
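To check what a given presign URL points at, you can decode the fileuri parameter yourself. This is a minimal Python sketch assuming the value is plain base64 (standard or URL-safe alphabet) of a UTF-8 URI; the helper name is hypothetical.

```python
import base64
from urllib.parse import urlparse, parse_qs


def decode_presign_fileuri(url: str):
    """Return the decoded storage URI from a presign URL, or None if absent."""
    values = parse_qs(urlparse(url).query).get("fileuri")
    if not values:
        return None
    # parse_qs turns '+' into ' ', so restore it; also map the URL-safe
    # alphabet ('-', '_') onto the standard one before decoding.
    encoded = values[0].replace(" ", "+").replace("-", "+").replace("_", "/")
    # Re-add any '=' padding that was stripped.
    encoded += "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded).decode("utf-8")
```

If the decoded value starts with s3:// or gs://, the task was most likely created from a resolved URL rather than from the original storage URI.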
To avoid generating tasks with these /presign/?fileuri=... values, request unresolved URIs whenever you read tasks from the API for reuse:
/api/tasks?project=<id>&resolve_uri=false
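A minimal Python sketch of that request, using only the standard library. The function names are hypothetical, the "Token" authorization header is an assumption, and the response is returned as parsed JSON without assuming its shape.

```python
import json
import urllib.request
from urllib.parse import urlencode


def tasks_url(base_url, project_id):
    """Build the task listing URL with URI resolution turned off."""
    query = urlencode({"project": project_id, "resolve_uri": "false"})
    return f"{base_url.rstrip('/')}/api/tasks?{query}"


def fetch_tasks(base_url, api_token, project_id):
    """GET the tasks; the auth header format is an assumption."""
    request = urllib.request.Request(
        tasks_url(base_url, project_id),
        headers={"Authorization": f"Token {api_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```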
Practical best practice for your use case (fast filtering)
Keep data.image as the real media pointer, and add additional short searchable fields (like you’re doing) such as:
- meta_info (full key)
- image_prefix (folder/prefix)
- filename (basename)
- dataset, source_id, etc.
Then filter on those fields using filter:tasks:data.<field> + type: String.
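One way to derive those fields at import time is sketched below in Python. It assumes data.image holds a storage URI such as s3://bucket/path/file.jpg and that "full key" means the object key without the bucket; the helper name and the example URI are hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def searchable_fields(storage_uri: str) -> dict:
    """Build task data with short, filterable fields next to the media pointer."""
    # Object key without the scheme and bucket, e.g. "datasets/cats/img_001.jpg".
    key = urlparse(storage_uri).path.lstrip("/")
    return {
        "image": storage_uri,  # keep the real media pointer untouched
        "meta_info": key,
        "image_prefix": posixpath.dirname(key),
        "filename": posixpath.basename(key),
    }
```

Each returned dict can be used as the data of a task, so every field is reachable as filter:tasks:data.<field>.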
Clarifying questions (so I can give the exact payload / workaround for your UI)
- Which edition/version are you on (Community vs Enterprise, and version)?
- When you open a task JSON, is data.image currently s3://... or /tasks/.../presign/?fileuri=...?
- What is the exact error message when you try filtering your meta_info field in the UI?