Displaying Only Label Values in Data Manager of Annotation Tool

Question:

In the data manager of our annotation tool, the annotation results display the entire content of the result including metadata. I am interested in finding a way to show only the label values, regardless of the label type. Does anyone have suggestions on how to achieve this?

Answer:

Unfortunately, the feature to display only label values directly within the data manager is currently not available but is on our product roadmap for future updates.

However, there is a workaround using the Label Studio SDK that you can implement. The following steps and example script will allow you to extract labels from all annotations and display them in a separate column within the data manager:

  1. Write a script using the Label Studio SDK to go through all tasks in your project.
  2. Extract labels from all task.annotations.result items.
  3. Save these labels as a list in task.data['labels'] through the task update endpoint (PATCH /api/tasks/<id>).

Do note that the labels displayed in the separate data manager column won’t be reactive, meaning the script needs to be rerun occasionally to update the labels with new annotation results.
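Because the column is only a snapshot, a rerun really only needs to touch tasks whose labels have changed since the last run. As a rough sketch, a check like the one below could be called inside the script's loop before sending the update. The helper name labels_changed is hypothetical and not part of the Label Studio SDK; it only compares plain Python data.

```python
def labels_changed(task_data, new_labels):
    """Return True when the stored 'labels' column differs from new_labels.

    task_data is the task's 'data' dict; new_labels is the freshly
    extracted list of label names. Order is ignored in the comparison.
    """
    stored = task_data.get('labels')
    if stored is None:
        # The column has never been written for this task
        return True
    return sorted(stored) != sorted(new_labels)


# Hand-written examples (not real project data)
print(labels_changed({'text': 'hi'}, ['Person']))
print(labels_changed({'text': 'hi', 'labels': ['B', 'A']}, ['A', 'B']))
```

Skipping unchanged tasks keeps repeated runs cheap on large projects, since only tasks with new or edited annotations trigger a PATCH request.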

Here’s a basic example script:

from label_studio_sdk import Client

# Initialize the Label Studio SDK Client
LABEL_STUDIO_URL = 'http://localhost:8080'  # Replace with actual Label Studio URL
API_KEY = 'your_api_key_here'  # Replace with actual Label Studio API key

# Connect to Label Studio
ls = Client(url=LABEL_STUDIO_URL, api_key=API_KEY)

# Specify the project ID
PROJECT_ID = 1  # Replace with your project ID

# Get the project and all of its tasks
project = ls.get_project(PROJECT_ID)
tasks = project.get_tasks()

# Process each task
for task in tasks:
    # result['value']['labels'] is itself a list, so flatten it across
    # every result of every annotation on the task
    labels = [
        label
        for annotation in task.get('annotations') or []
        for result in annotation.get('result') or []
        for label in (result.get('value') or {}).get('labels', [])
    ]

    # Remove duplicates; sort so the stored list is stable between runs
    unique_labels = sorted(set(labels))

    # Update the task with the new label list (PATCH /api/tasks/<id>)
    task_data = task['data']
    task_data['labels'] = unique_labels
    project.update_task(task['id'], data=task_data)

print('All tasks have been updated with extracted labels.')

Before running the script, ensure that you have replaced LABEL_STUDIO_URL, API_KEY, PROJECT_ID, and any other placeholders with the correct values for your setup. Additionally, the script assumes that your labels are stored under result['value']['labels'], as they are for the Labels tag. Other control tags use a different key (for example, Choices stores its values under result['value']['choices']), so if your results are structured differently, you will need to modify the path within the script.
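If your project mixes several control tags, one way to collect values regardless of the label type is to check a list of candidate keys instead of a single path. The sketch below is a minimal illustration on a hand-written task dict. The function name extract_label_values and the LABEL_KEYS tuple are assumptions made for this example, not part of the SDK, and the key list should be adjusted to the tags used in your labeling configuration.

```python
# Keys under result['value'] that are assumed to hold label names.
# Extend this tuple to match the control tags in your labeling config.
LABEL_KEYS = ('labels', 'choices', 'rectanglelabels', 'polygonlabels')


def extract_label_values(task, label_keys=LABEL_KEYS):
    """Return the sorted, de-duplicated label names found in a task dict."""
    found = set()
    for annotation in task.get('annotations') or []:
        for result in annotation.get('result') or []:
            value = result.get('value') or {}
            for key in label_keys:
                entry = value.get(key, [])
                # Tolerate a bare string as well as a list of strings
                if isinstance(entry, str):
                    entry = [entry]
                found.update(entry)
    return sorted(found)


# Hand-written sample task, shaped like the task dicts the script iterates over
sample_task = {
    'id': 1,
    'data': {'text': 'Ada Lovelace visited London.'},
    'annotations': [
        {'result': [
            {'type': 'labels', 'value': {'start': 0, 'end': 12, 'labels': ['Person']}},
            {'type': 'choices', 'value': {'choices': ['Positive']}},
        ]},
        {'result': [
            {'type': 'labels', 'value': {'start': 21, 'end': 27, 'labels': ['Location']}},
            {'type': 'labels', 'value': {'start': 0, 'end': 12, 'labels': ['Person']}},
        ]},
    ],
}

print(extract_label_values(sample_task))  # ['Location', 'Person', 'Positive']
```

In the script, the list returned by this function could take the place of unique_labels before the task is updated.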