I am labeling images in Label Studio 1.12.1 and have cloud storage connected to a Microsoft Azure Blob Storage container. Whenever the target storage is synced, it uploads files without an extension that are in the Label Studio JSON format. Is there any way to have it export in VOC format so that both the images and the .xml files are uploaded?
Hey @wdm230, thanks for posting here and welcome to the community!
Currently, Label Studio does not support exporting annotations in formats other than its native JSON format when syncing with cloud storage. However, you can achieve your goal by using the label-studio-converter tool to convert the exported JSON files into VOC format and then uploading the converted files to your Azure Blob Storage container.
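To give a feel for what the conversion involves, here is a minimal sketch of how one rectangle annotation maps onto a VOC object entry. It is not the converter's actual code: the function name rect_to_voc_object is made up for illustration, the rounding is my own choice, and it assumes a rectanglelabels result where x, y, width and height are percentages of the image size.

```python
import xml.etree.ElementTree as ET


def rect_to_voc_object(result):
    """Build a VOC <object> element from one Label Studio rectangle result.

    Hypothetical helper for illustration; assumes a `rectanglelabels` result.
    """
    value = result["value"]
    img_w = result["original_width"]
    img_h = result["original_height"]

    # Label Studio stores x, y, width, height as percentages of the image size,
    # while VOC expects absolute pixel corners.
    xmin = round(value["x"] / 100 * img_w)
    ymin = round(value["y"] / 100 * img_h)
    xmax = round((value["x"] + value["width"]) / 100 * img_w)
    ymax = round((value["y"] + value["height"]) / 100 * img_h)

    obj = ET.Element("object")
    ET.SubElement(obj, "name").text = value["rectanglelabels"][0]
    box = ET.SubElement(obj, "bndbox")
    for tag, coord in (("xmin", xmin), ("ymin", ymin), ("xmax", xmax), ("ymax", ymax)):
        ET.SubElement(box, tag).text = str(coord)
    return obj


# Made-up example annotation on an 800x600 image
example = {
    "original_width": 800,
    "original_height": 600,
    "value": {"x": 10.0, "y": 20.0, "width": 50.0, "height": 25.0,
              "rectanglelabels": ["cat"]},
}
print(ET.tostring(rect_to_voc_object(example), encoding="unicode"))
```

In practice the converter handles this for you, along with the rest of the VOC file (filename, image size, and so on).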
You can install the converter with pip:
pip install label-studio-converter
And here's a sample script that you can modify:
import os
from label_studio_converter import Converter
from azure.storage.blob import BlobServiceClient
# Initialize Azure Blob Storage client
blob_service_client = BlobServiceClient.from_connection_string("your_connection_string")
container_client = blob_service_client.get_container_client("your_container_name")
# Path to the directory containing the JSON files
json_dir = "/path/to/json/files"
voc_output_dir = "/path/to/voc/output"
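# Convert JSON to VOC format.
# Converter needs the project's labeling config (an XML file path or the XML string);
# the config path below is a placeholder.
converter = Converter(config="/path/to/labeling_config.xml", project_dir=json_dir)
for json_file in os.listdir(json_dir):
    # Files written by target-storage sync may have no extension; adjust this filter if so
    if json_file.endswith(".json"):
        json_path = os.path.join(json_dir, json_file)
        # is_dir=False because a single file, not a directory, is passed
        converter.convert(json_path, voc_output_dir, format="VOC", is_dir=False)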
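# Upload VOC files to Azure Blob Storage.
# os.walk is used because the converter may write into subdirectories
# (for example Annotations/), which os.listdir + open() would fail on.
for root, _dirs, files in os.walk(voc_output_dir):
    for voc_file in files:
        voc_path = os.path.join(root, voc_file)
        # Use the path relative to the output directory as the blob name
        blob_name = os.path.relpath(voc_path, voc_output_dir).replace(os.sep, "/")
        blob_client = container_client.get_blob_client(blob_name)
        with open(voc_path, "rb") as data:
            blob_client.upload_blob(data, overwrite=True)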
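Since your synced annotation files sit in the container without an extension, you may first want to pull them down into a local folder with a .json suffix. Here is a rough sketch of that step. The function name download_annotations and the "annotations/" prefix are placeholders I made up; container_client is assumed to be an azure.storage.blob ContainerClient like the one created in the script.

```python
import os


def download_annotations(container_client, prefix, json_dir):
    """Download every blob under `prefix` into json_dir, adding a .json extension.

    Hypothetical helper; `container_client` is assumed to behave like
    azure.storage.blob.ContainerClient (list_blobs / download_blob).
    """
    os.makedirs(json_dir, exist_ok=True)
    saved = []
    for blob in container_client.list_blobs(name_starts_with=prefix):
        base = os.path.basename(blob.name)
        if not base:
            # Skip "directory" placeholder blobs whose name ends with a slash
            continue
        # Target-storage exports have no extension, so add one locally
        local_name = base if base.endswith(".json") else base + ".json"
        local_path = os.path.join(json_dir, local_name)
        with open(local_path, "wb") as f:
            f.write(container_client.download_blob(blob.name).readall())
        saved.append(local_path)
    return saved
```

You would call it as download_annotations(container_client, "annotations/", "/path/to/json/files") before running the conversion, with the prefix changed to wherever your target storage writes its files.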
Let me know if this helps.