How do I upload annotations in the MS COCO format for images?

Hello,

I have labeled images for the object detection task in MS COCO dataset format.
I have Label Studio (v 1.13.1), which is running in a Docker container.
Is there a way to upload existing MS COCO annotations for images to an existing project in Label Studio?

Thank you for your help.

Yes, you can import your existing MS COCO annotations into your Label Studio project. To do this, you can use the Label Studio Converter, which is part of the Label Studio SDK. The converter allows you to transform your MS COCO dataset into the Label Studio JSON format, which can then be imported into your existing project.

Here are the steps to import your COCO annotations:

  1. Install the Label Studio SDK: Since you’re running Label Studio in a Docker container, you may need to run the converter outside the container or install the SDK inside the container. You can install the Label Studio SDK using pip:
pip install label-studio-sdk
  2. Use the Label Studio Converter to convert COCO annotations: Use the converter to transform your COCO annotations into Label Studio JSON format. Run the following command, replacing the paths with your actual paths:
label-studio-converter import \
  --input-format COCO \
  --input-file /path/to/annotations.json \
  --output-file /path/to/output/label_studio_annotations.json

This command converts the COCO annotations.json file into a format that Label Studio can import.
  3. Import the converted annotations into Label Studio: In the Label Studio web interface:

  • Go to your existing project.
  • Click on the Import button.
  • Upload the label_studio_annotations.json file generated in the previous step.
  • Ensure that the tasks and annotations are correctly imported into your project.
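
The import step above can also be done without the web interface. The following is a minimal sketch, assuming the `POST /api/projects/{id}/import` endpoint of the Label Studio REST API and the legacy `Authorization: Token ...` header; the function name, base URL, project ID, and token are placeholders and do not come from this thread. The sketch only builds the request and does not send it.

```python
import json
import urllib.request


def build_import_request(base_url, project_id, api_token, tasks):
    """Build (but do not send) a request that imports tasks into a project.

    Assumes the Label Studio endpoint POST /api/projects/{id}/import and
    token-based authentication; adjust if your instance differs.
    """
    url = f"{base_url.rstrip('/')}/api/projects/{project_id}/import"
    body = json.dumps(tasks).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {api_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder task list; in practice this would be the parsed contents of
# the converted label_studio_annotations.json file.
example_tasks = [{"data": {"image": "/data/local-files/?d=images/image_1.png"}}]

# Placeholder URL, project ID, and token.
request = build_import_request("http://localhost:8080/", 1, "YOUR_TOKEN", example_tasks)
print(request.get_method(), request.full_url)

# To actually send it: urllib.request.urlopen(request)
```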

Hello, thank you very much for your instructions and explanations.

I tried to follow them.

I tried to install the Label Studio SDK with pip using the command

pip install label-studio-sdk

in a conda virtual environment and in the Docker container with Label Studio.
I also tried it in a Poetry environment with the command:

poetry add label-studio-sdk

After that, I tried to run the Label Studio Converter with the command:

label-studio-converter import \
 --input-format COCO \
 --input-file data/my_coco_annotation.json \
 --output-file data/label_studio_annotations.json

And I got this error:

label-studio-converter: command not found

I searched for it and found this git repo (link).

As I understand it, this tool has been archived and its code has been included in the Label Studio SDK.

I found the converter classes:

from label_studio_sdk.converter import Converter

c = Converter()

c.convert_to_json()

Unfortunately, I wasn’t able to figure out how to use them properly.

I decided to use the tool from the repository.
I installed Label Studio Converter with the command:

pip install label-studio-converter

I was able to convert my MS COCO annotations to Label Studio project JSON with the command:

label-studio-converter import coco \
  -i data/my_coco_annotation.json \
  -o data/label_studio_annotations.json

After the conversion finished, Label Studio Converter printed the following instructions to the console:

INFO:root:Reading COCO notes and categories from /data/my_coco_annotation.json
INFO:root:Found 2 categories, 5 images and 75 annotations
WARNING:root:Segmentation in COCO is experimental
INFO:root:Saving Label Studio JSON to /data/label_studio_annotations.json

  1. Create a new project in Label Studio
  2. Use Labeling Config from "/data/label_studio_annotations.label_config.xml"
  3. Setup serving for images [e.g. you can use Local Storage (or others):
     https://labelstud.io/guide/storage.html#Local-storage]
  4. Import "/data/label_studio_annotations.json" to the project
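
The counts reported in the log above can be checked independently before converting. This is a minimal sketch, assuming the standard COCO top-level keys `categories`, `images`, and `annotations`; the function names and the sample data are illustrative only and are not part of the converter.

```python
def summarize_coco(coco):
    """Count categories, images, and annotations in a parsed COCO dict."""
    return {
        "categories": len(coco.get("categories", [])),
        "images": len(coco.get("images", [])),
        "annotations": len(coco.get("annotations", [])),
    }


def find_orphan_annotations(coco):
    """Return IDs of annotations whose image_id matches no image entry."""
    image_ids = {image["id"] for image in coco.get("images", [])}
    return [
        annotation["id"]
        for annotation in coco.get("annotations", [])
        if annotation["image_id"] not in image_ids
    ]


# Tiny made-up dataset; a real file would be loaded with json.load().
sample_coco = {
    "categories": [{"id": 0, "name": "cat"}, {"id": 1, "name": "dog"}],
    "images": [{"id": 0, "file_name": "images/image_1.png"}],
    "annotations": [
        {"id": 10, "image_id": 0, "category_id": 0, "bbox": [1, 2, 3, 4]},
        {"id": 11, "image_id": 5, "category_id": 1, "bbox": [5, 6, 7, 8]},
    ],
}

summary = summarize_coco(sample_coco)
orphans = find_orphan_annotations(sample_coco)
print(summary, orphans)
```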

Following the instructions printed in the terminal:

  1. I created a new project in Label Studio.
  2. I pasted the Labeling Config provided by the converter (data/label_studio_annotations.label_config.xml) into the project’s Labeling Config in the Web UI.
  3. I imported the JSON file produced by the converter (data/label_studio_annotations.json) through the Web UI.

After this step, the paths to the images appear in the project, but the images themselves are not displayed.

I connected local storage according to the documentation [link](https://labelstud.io/guide/storage.html#Local-storage) to hold the images used in the project.

To enable local storage, I added the following environment variables to the mydata/.env file, which is passed to the Docker container that runs Label Studio:

LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/data/local_storage
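
To see where an imported task expects its image to be on disk, it can help to trace the image reference back to a path inside the container. This is a minimal sketch, assuming the converter wrote references of the form `/data/local-files/?d=<relative path>` (as far as I know, its default image root URL) and that Label Studio resolves the `d` parameter relative to LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT; the function name is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import parse_qs, urlparse


def resolve_local_file(image_url, document_root):
    """Map a /data/local-files/?d=... reference to a path under document_root.

    Assumes the 'd' query parameter holds a path relative to the document
    root. Returns None if the URL has no 'd' parameter.
    """
    query = parse_qs(urlparse(image_url).query)
    if "d" not in query:
        return None
    # Strip a leading slash so the join stays inside the document root.
    relative = query["d"][0].lstrip("/")
    return str(PurePosixPath(document_root) / relative)


document_root = "/label-studio/data/local_storage"
resolved = resolve_local_file("/data/local-files/?d=images/image_1.png", document_root)
print(resolved)
```

Under these assumptions, a `file_name` of `images/image_1.png` only resolves if an `images` folder sits directly under the document root.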

I then launched Label Studio with the new environment variables:

docker run -it -p 8080:8080 --env-file $(pwd)/mydata/.env -v $(pwd)/mydata:/label-studio/data heartexlabs/label-studio:latest 

In the local storage settings, I pointed the storage path to the images folder of dataset1:

/label-studio/data/local_storage/images

The images from the dataset now appear in the project with their annotations displayed on them.

Everything works, thank you )


I still have a few questions that I would also like to clarify )

As I understand it, when we upload images to Label Studio, it adds hashes to the file names.
For example, we upload “image_1.png” through the Data Manager Web UI.
Label Studio shows us the filename of this picture as “1d2f1q2d4fg3_image_1.png”.

If we upload images to the project, their names will change. However, when we convert our original MS COCO annotation file, we get a Label Studio project json file with the original filenames of the images.

In this situation, Label Studio would not be able to locate the images by their new names.
Does this mean that using local storage or cloud storage for the image files lets us preserve the original image names specified in the MS COCO annotation?

Originally, the MS COCO dataset stores images in a separate “./images” folder:

  "images": [
    {
      "width": 6125,
      "height": 17375,
      "id": 0,
      "file_name": "images\/image_1.png"
    },

Therefore, when connecting a dataset stored locally, the full path to the dataset is specified like this:

/label-studio/data/local_storage/images

It would be great to store the images in folders named after the dataset:

/label-studio/data/local_storage/dataset1

but in this case, the images are not found.

If we put the images folder inside the dataset folder as a subfolder, the images are not found either:

/label-studio/data/local_storage/dataset1/images

Do we need to remove the “images/” folder prefix from all image paths in the original MS COCO annotation file before converting it to Label Studio project JSON?

  "images": [
    {
      "width": 6125,
      "height": 17375,
      "id": 0,
      "file_name": "image_1.png"
    },
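
If rewriting the paths turns out to be necessary, it could be done with a short script before conversion. This is a minimal sketch, assuming the standard COCO `images[].file_name` field; the function name and the `dataset1/images` prefix are illustrative. Whether Label Studio then finds the files still depends on the local storage path and on the image root URL used during conversion.

```python
import copy
import posixpath


def rewrite_file_names(coco, new_prefix=""):
    """Return a copy of a COCO dict with each image's file_name reduced to
    its base name and optionally prefixed with new_prefix."""
    result = copy.deepcopy(coco)
    for image in result.get("images", []):
        # Normalize Windows-style separators before taking the base name.
        base = posixpath.basename(image["file_name"].replace("\\", "/"))
        image["file_name"] = posixpath.join(new_prefix, base) if new_prefix else base
    return result


# The JSON value "images\/image_1.png" decodes to "images/image_1.png".
sample = {
    "images": [
        {"width": 6125, "height": 17375, "id": 0, "file_name": "images/image_1.png"}
    ]
}

flat = rewrite_file_names(sample)
nested = rewrite_file_names(sample, "dataset1/images")
print(flat["images"][0]["file_name"])
print(nested["images"][0]["file_name"])
```

The input dict is left unchanged, so the original annotation can still be written back out or compared against the rewritten copy.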