Create filter for "annotated by" in SDK

Hi all,

I’m trying to do something seemingly simple, but I cannot find any reference for it online.

I’d like to create a view from the SDK where images annotated by a certain user would be shown.

I’m looking for a solution that would look like this:

filters = []
filters.append(Filters.item(Column.updated_by, Operator.CONTAINS, Type.String, Filters.value("userhandle1")))
full_filter = Filters.create("and", filters)

tasks = ls.project.get_tasks(full_filter)

I can’t seem to figure it out.
Thanks a lot!

Hello,

To filter tasks annotated by a specific user using the Label Studio SDK, you can use the Column.annotators field. This field represents the list of annotator IDs who have annotated each task. Here’s how you can achieve this:

from label_studio_sdk import Client
from label_studio_sdk.data_manager import Filters, Column, Operator, Type

# Initialize the Label Studio SDK Client
LABEL_STUDIO_URL = 'https://your-label-studio-url.com'  # Replace with your Label Studio URL
API_KEY = 'your-api-key'  # Replace with your API key

ls = Client(url=LABEL_STUDIO_URL, api_key=API_KEY)
ls.check_connection()

# Get the project
project_id = 1  # Placeholder: replace with your project ID
project = ls.get_project(project_id)

# Get the user ID for 'userhandle1'
users = ls.get_users()
user_id = None
for user in users:
    if user.email == 'userhandle1' or user.first_name == 'userhandle1':
        user_id = user.id
        break

if user_id is None:
    # Stop here with a message instead of sending a filter on a missing ID
    raise SystemExit('User not found')

# Create filter to get tasks where 'annotators' contains the user_id
filters = Filters.create(
    Filters.AND,
    [
        Filters.item(
            Column.annotators,
            Operator.CONTAINS,
            Type.Number,
            Filters.value(user_id)
        )
    ]
)

# Retrieve tasks with the filter
tasks = project.get_tasks(filters=filters)

# Output the task IDs or any other desired information
for task in tasks:
    print(f"Task ID: {task['id']}")

Explanation:

  • Retrieve User ID:
    • We fetch all users using ls.get_users().
    • We search for the user with the email or first name matching 'userhandle1' to get their user_id.
  • Create Filter:
    • We use the Column.annotators column, which contains a list of annotators’ IDs for each task.
    • We use Operator.CONTAINS to check if the annotators list includes the user_id.
    • The Type.Number is used because user_id is a numerical value.
  • Retrieve Tasks:
    • We apply the filter using project.get_tasks(filters=filters) to get all tasks annotated by the specific user.
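
The user lookup step can also be pulled into a small helper, which makes it easy to try without a running server. This is only a sketch: find_user_id is a made-up name, and it assumes the user objects expose id, email and first_name as in the snippet above. The SimpleNamespace objects are stand-ins for what ls.get_users() returns.

```python
from types import SimpleNamespace


def find_user_id(users, handle):
    """Return the id of the first user whose email or first name equals handle, else None."""
    for user in users:
        # getattr with a default keeps the helper tolerant of objects missing a field
        if handle in (getattr(user, "email", None), getattr(user, "first_name", None)):
            return user.id
    return None


# Stand-in user objects (hypothetical data, not real accounts)
fake_users = [
    SimpleNamespace(id=3, email="alice@example.com", first_name="Alice"),
    SimpleNamespace(id=4, email="bob@example.com", first_name="userhandle1"),
]

print(find_user_id(fake_users, "userhandle1"))  # prints 4
```

With the real client this would be called as find_user_id(ls.get_users(), 'userhandle1').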

Got it.

Now, the Label Studio UI offers two different user filters:

  • Annotated by
  • Updated by

Using Column.annotators seems to only consider the former. I’m trying to understand how to filter based on the latter.

In the tasks retrieved by project.get_tasks(...), there seems to be a field 'updated_by': [{'user_id': 4}], but I haven’t figured out how to access this through a filter.

Thanks again,
Best

Answer: The filter needed here is Filters.item("tasks:updated_by_id", Operator.EQUAL, Type.Number, Filters.value(user_id))

There’s no attribute in the Column class for this column, though, so its name has to be passed as a raw string.
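
To double-check what the server-side filter returns, the tasks can also be filtered client-side on the 'updated_by' field mentioned earlier in the thread. A minimal sketch, assuming each task is a dict shaped like the one quoted above ('updated_by': [{'user_id': 4}]); tasks_updated_by is a hypothetical helper name and the sample tasks are invented for illustration.

```python
def tasks_updated_by(tasks, user_id):
    """Return the tasks whose 'updated_by' list mentions user_id."""
    matching = []
    for task in tasks:
        # 'updated_by' may be missing or empty on tasks nobody has touched yet
        updaters = task.get("updated_by") or []
        if any(entry.get("user_id") == user_id for entry in updaters):
            matching.append(task)
    return matching


# Invented sample data in the shape quoted in the thread
sample_tasks = [
    {"id": 1, "updated_by": [{"user_id": 4}]},
    {"id": 2, "updated_by": [{"user_id": 7}]},
    {"id": 3, "updated_by": []},
    {"id": 4},
]

print([t["id"] for t in tasks_updated_by(sample_tasks, 4)])  # prints [1]
```

Against a real project this would be tasks_updated_by(project.get_tasks(), user_id), at the cost of downloading every task first, so the server-side filter remains the better option for large projects.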