How can I restore the labeling from individual files stored in Target Storage?

Hello @makseq

I was trying to do the same thing and stumbled upon this discussion.

I added the env variable to my docker run command as such:
docker run --rm -p 80:8080 -e FUTURE_SAVE_TASK_TO_STORAGE=1 -v /home/ec2-user/mydata:/label-studio/data heartexlabs/label-studio:latest

The saved annotations don’t seem to differ, though. The format is the same before and after adding the env variable, and it seems to match what you were recommending. So I was wondering whether something has been implemented in a recent version of LS to include that (I am on 1.14.0).

Here is the format I get:

{
    "id": 5,
    "result": [
        {
            "original_length": 3,
            "value": {
                "start": 0.20736288504883546,
                "end": 2.129977460555973,
                "channel": 0,
                "labels": [
                    "Siren"
                ]
            },
            "id": "smw56",
            "from_name": "label",
            "to_name": "audio",
            "type": "labels",
            "origin": "manual"
        }
    ],
    "created_username": " antoine.purier@microdb.fr, 1",
    "created_ago": "0\u00a0minutes",
    "completed_by": {
        "id": 1,
        "first_name": "",
        "last_name": "",
        "email": "antoine.purier@microdb.fr"
    },
    "task": {
        "id": 21859,
        "data": {
            "audio": "s3://dbflash/audio/ref_mic/Paris17_2022_04_12_09_39_02_1649749066_ref_mic.wav"
        },
        "meta": {},
        "created_at": "2024-12-06T14:53:45.190681Z",
        "updated_at": "2024-12-11T08:30:27.196447Z",
        "is_labeled": true,
        "overlap": 1,
        "inner_id": 10926,
        "total_annotations": 1,
        "cancelled_annotations": 0,
        "total_predictions": 0,
        "comment_count": 0,
        "unresolved_comment_count": 0,
        "last_comment_updated_at": null,
        "project": 2,
        "updated_by": 1,
        "file_upload": null,
        "comment_authors": []
    },
    "was_cancelled": false,
    "ground_truth": false,
    "created_at": "2024-12-11T09:10:48.736560Z",
    "updated_at": "2024-12-11T09:10:48.736595Z",
    "draft_created_at": null,
    "lead_time": 5.257,
    "import_id": null,
    "last_action": null,
    "project": 2,
    "updated_by": 1,
    "parent_prediction": null,
    "parent_annotation": null,
    "last_created_by": null
}

I then tried to import it by adding this file (with a .json extension) to my source storage. It syncs without any error, but I can’t see the audio or the annotation.
I tried to import it via the import button but I get the following error:

Note: My source storage is dbflash/tasks and the referenced audio is in another location, dbflash/audio/ref_mic. I also tried a more standard JSON task file without annotations, stored in dbflash/tasks and referencing audio in dbflash/audio/ref_mic, and that works. It is only this annotated task file that does not work.

  • Is this format correct or does it need further processing?
  • Do I still need to include the env variable at launch or has something been implemented in the latest LS version?
  • What is the proper way to import such an annotation file?
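
To frame the first question: as far as I understand the documented import format, a task file is expected to carry "data" at the top level with an "annotations" list next to it, whereas the file above is an annotation with the task nested inside it. Below is a minimal Python sketch of the reshaping I have in mind. The function name annotation_to_task and the choice of which annotation fields to keep are my own assumptions, not something taken from the Label Studio codebase.

```python
import json


def annotation_to_task(annotation: dict) -> dict:
    """Reshape one per-annotation file into a task with nested annotations.

    Assumption: only "data" from the nested task and a few annotation
    fields are carried over; ids and timestamps are dropped so that
    Label Studio can assign fresh ones on import.
    """
    task = annotation["task"]
    # Hypothetical selection of fields to keep; "result" is the essential one.
    keep = ("result", "was_cancelled", "ground_truth", "lead_time")
    nested = {key: annotation[key] for key in keep if key in annotation}
    return {"data": task["data"], "annotations": [nested]}


if __name__ == "__main__":
    # Trimmed-down version of the file shown above.
    exported = {
        "id": 5,
        "result": [
            {
                "value": {"start": 0.2, "end": 2.1, "channel": 0, "labels": ["Siren"]},
                "from_name": "label",
                "to_name": "audio",
                "type": "labels",
            }
        ],
        "task": {"id": 21859, "data": {"audio": "s3://dbflash/audio/ref_mic/example.wav"}},
        "was_cancelled": False,
        "ground_truth": False,
    }
    print(json.dumps(annotation_to_task(exported), indent=4))
```

If this is the right direction, the converted file would be what goes into the source storage or the import dialog instead of the raw file from the target storage.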