Enabling Async Taxonomy Feature

Dmitry Kuleshov: Hi!
We are trying to create a view for an NLP labelling task. One of the texts has thousands of choices, and if we provide them as Choice in Choices or Label in Labels, it kills UI performance.
Can someone advise on this case? Thanks

Jo Booth (HumanSignal): Hi Dmitry,

For this scenario we are currently working on a feature called “Async Taxonomy”, wherein the taxonomy can be loaded from a .json file. When this feature is used the taxonomy choices are not part of the labeling config, and performance will be much better.

It’s not generally available yet, but if you’re running open source Label Studio, you can try it out by enabling the relevant feature flag on a recent version of the app: fflag_feat_front_lsdv_5451_async_taxonomy_110823_short

Then you’d need to provide a URL hosting a flat file containing your taxonomy. The file’s structure should be like the following:

{
  "items": [
    {
      "alias": "AN",
      "value": "Antarctica",
      "children": [
        {
          "alias": "AQ",
          "value": "Antarctica",
          "children": [
            {
              "value": "McMurdo Station"
            }
          ]
        },
        {
          "alias": "GS",
          "value": "South Georgia and the South Sandwich Islands",
          "children": [
            {
              "value": "Grytviken"
            }
          ]
        },
        {
          "alias": "TF",
          "value": "French Southern Territories",
          "children": [
            {
              "value": "Port-aux-Français"
            }
          ]
        }
      ]
    }
  ]
}

Your labeling config should contain something like this:

<View>
  <Text name="text" value="$text"/>
  <Taxonomy name="taxonomy" toName="text" apiUrl="YOUR_TAXONOMY_URL">
  </Taxonomy>
</View>

Again this feature isn’t generally available, so it’s quite likely you’ll encounter some rough edges in testing this out. But, if you’re running open source Label Studio and interested in trying out some of our newest functionality, this is an option.
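
With thousands of choices, the taxonomy file is easier to generate than to write by hand. The following is a minimal Python sketch, not part of Label Studio: it assumes the file only needs the "items", "alias", "value" and "children" keys shown in the example above, and the helper name `taxonomy_from_paths` and the output filename are hypothetical.

```python
import json


def taxonomy_from_paths(paths):
    """Merge root-to-leaf paths into the nested {"items": [...]} structure.

    Each path is a list of (alias, value) pairs; alias may be None.
    Nodes that share the same alias and value under the same parent are merged.
    """
    root = {"children": []}
    for path in paths:
        parent = root
        for alias, value in path:
            siblings = parent.setdefault("children", [])
            # Linear search among siblings; fine for a few thousand entries.
            node = next(
                (n for n in siblings
                 if n["value"] == value and n.get("alias") == alias),
                None,
            )
            if node is None:
                node = {"value": value}
                if alias is not None:
                    node = {"alias": alias, "value": value}
                siblings.append(node)
            parent = node
    return {"items": root["children"]}


# Sample data taken from the example structure in this thread.
paths = [
    [("AN", "Antarctica"), ("AQ", "Antarctica"), (None, "McMurdo Station")],
    [("AN", "Antarctica"),
     ("GS", "South Georgia and the South Sandwich Islands"),
     (None, "Grytviken")],
    [("AN", "Antarctica"), ("TF", "French Southern Territories"),
     (None, "Port-aux-Français")],
]
taxonomy = taxonomy_from_paths(paths)

# ensure_ascii=False keeps names such as "Port-aux-Français" readable.
with open("taxonomy.json", "w", encoding="utf-8") as f:
    json.dump(taxonomy, f, ensure_ascii=False, indent=2)
```

The resulting taxonomy.json can then be hosted at whatever URL the apiUrl attribute points to.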

Dmitry Kuleshov: Thanks a lot! I’ll try it

Dmitry Kuleshov: Can you point me to documentation on how to enable ff?
I’ve created config.json file with

{
    "global": {
        "featureFlags": {
            "fflag_feat_front_lsdv_5451_async_taxonomy_110823_short": true
        }
    }
}

and started latest docker image with start --config config.json
but I don’t see requests from LS to apiURL

Dmitry Kuleshov: from helm chart it looks like ENV variable fflag_feat_front_lsdv_5451_async_taxonomy_110823_short should be set to true

Jo Booth (HumanSignal): I believe the env variable approach should work, yes :+1: let me know if you run into issues with it!

Dmitry Kuleshov: I did)
I use this to start label-studio:

docker run -it -p 8080:8080 -e "fflag_feat_front_lsdv_5451_async_taxonomy_110823_short=true" heartexlabs/label-studio:latest label-studio start

<View>
  <Text name="text" value="$text"/>
  <Taxonomy name="taxonomy" toName="text" apiUrl="http://127.0.0.1:8888/codes">
  </Taxonomy>
</View>

Project template

{"items":[{"alias":"S28222S","value":"Partial traumatic amputation of left breast, sequela"}]}

example of API response

but I don’t see requests from label-studio to API
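
To check whether any request reaches the taxonomy endpoint at all, a throwaway local server that logs every hit can help. This is a minimal sketch using only the Python standard library; it reuses the /codes path, port 8888 and sample response from this thread. The CORS header is an assumption (it only matters if the request is issued from the browser), and `start_server` and `fetch_taxonomy` are hypothetical helper names.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Sample payload copied from the API response quoted in this thread.
TAXONOMY = {
    "items": [
        {"alias": "S28222S",
         "value": "Partial traumatic amputation of left breast, sequela"}
    ]
}


class TaxonomyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the taxonomy only at /codes, ignoring any query string.
        if self.path.split("?")[0] != "/codes":
            self.send_error(404)
            return
        body = json.dumps(TAXONOMY, ensure_ascii=False).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        # Assumption: the request may come from the browser, so allow
        # cross-origin reads.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()
        self.wfile.write(body)


def start_server(port=8888):
    """Start the server in a background thread and return it."""
    server = HTTPServer(("127.0.0.1", port), TaxonomyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_taxonomy(url):
    """Fetch and decode the JSON served at url."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

To use it interactively, call `start_server()` and keep the process alive (for example with `input()`). `BaseHTTPRequestHandler` logs each incoming request to stderr, so an empty log means nothing is calling the endpoint.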

Jo Booth (HumanSignal): Could you try this one more time with version 1.9.0, which was just released / published to Docker Hub? https://hub.docker.com/layers/heartexlabs/label-studio/1.9.0/images/sha256-261eb2ca1c9c43dc2ea4c7c7197475d9c31cd6e0d039d943c0ebda3cce9e9d7a?context=explore|https://hub.docker.com/layers/heartexlabs/label-studio/1.9.0/images/sha256-261eb2ca1[…]4c7c7197475d9c31cd6e0d039d943c0ebda3cce9e9d7a?context=explore

Dmitry Kuleshov: thanks it works
is it possible to configure single choice for taxonomy?

Jo Booth (HumanSignal): we have the somewhat related leafsOnly="true" as a config option where only leaf nodes in the taxonomy can be selected, but afaik we don’t have a way of restricting the user to selecting only one choice yet (though I would need some time to fully confirm this)

Note: This post was generated by the Label Studio Archive Bot from a conversation in the Label Studio Slack, a gathering place for the Label Studio community. Someone in the community thought this was worth sharing!

archivebot: Gosh, this is an interesting conversation - I’ve filed a copy at http://community.labelstud.io/t/enabling-async-taxonomy-feature/73 for future reference!

Jo Booth (HumanSignal): @Dmitry Kuleshov re: limiting number of selections, we’ll eventually have maxUsages support in Async Taxonomy, which will be what you’re looking for, but it doesn’t work yet.

Dmitry Kuleshov: Thanks a lot for your help!

Dmitry Kuleshov: Is there a PR or discussion somewhere regarding this feature?
My question is: can we display both alias and value during search? Otherwise it’s a bit confusing

Dmitry Kuleshov: at least from my testing with alias and value I got a strange result:
when an item has both alias and value, search filters by alias

Jo Booth (HumanSignal): This is the PR containing most of the implementation; there have been several others containing fixes and improvements in the same repo: https://github.com/HumanSignal/label-studio-frontend/pull/1526

We always welcome github issues! Would be great to get some more specific insights into the issue you’re experiencing with aliases and values (in the context of search, as I understand it)?