Hello - I have followed the instructions to configure a Tesseract engine for my annotation project here: https://docs.humansignal.com/tutorials/tesseract
However, the Autodetect labels are not automatically displaying the text as shown in the tutorial. I am using the same config as the one on the instructions page above. I tested the Tesseract connection and it returns a 200 response. Any pointers to resolve the issue would be appreciated.
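For reference, this is roughly how I checked the backend outside of the Label Studio UI (a minimal sketch, assuming the Tesseract backend from the tutorial is running locally on port 9090 and exposes the usual label-studio-ml-backend /health endpoint; the URL is just my local setup):

import requests

# Assumed local address of the Tesseract ML backend container; adjust to your setup.
BACKEND_URL = "http://localhost:9090"

# label-studio-ml-backend serves a /health endpoint; this returns 200 for me,
# matching the "200" I see when validating the connection in the Label Studio UI.
resp = requests.get(f"{BACKEND_URL}/health", timeout=10)
print(resp.status_code, resp.text)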
As a temporary workaround, I am manually preprocessing my documents using Tesseract and importing the resulting JSON files into Label Studio as described here: https://labelstud.io/blog/improve-ocr-quality-for-receipt-processing-with-tesseract-and-label-studio/.
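For completeness, this is roughly the preprocessing script I run for that workaround (a minimal sketch along the lines of the blog post, assuming pytesseract is installed and the image is available locally; the file path and the "Label1" label are placeholders tied to my config below, not taken from the article):

# Run OCR with pytesseract and convert the word boxes into Label Studio
# predictions matching the labeling config below (names "bbox" / "transcription").
import json
import pytesseract
from PIL import Image
from pytesseract import Output

IMAGE_PATH = "ReceiptSwiss.jpg"   # placeholder path to a local copy of the sample receipt
IMAGE_URL = "https://upload.wikimedia.org/wikipedia/commons/0/0b/ReceiptSwiss.jpg"

image = Image.open(IMAGE_PATH)
width, height = image.size
data = pytesseract.image_to_data(image, output_type=Output.DICT)

results = []
for i, text in enumerate(data["text"]):
    if not text.strip():
        continue
    # Convert pixel boxes to the percentage coordinates Label Studio expects.
    value = {
        "x": 100.0 * data["left"][i] / width,
        "y": 100.0 * data["top"][i] / height,
        "width": 100.0 * data["width"][i] / width,
        "height": 100.0 * data["height"][i] / height,
        "rotation": 0,
    }
    region_id = f"word-{i}"
    results.append({
        "id": region_id,
        "from_name": "bbox",
        "to_name": "image",
        "type": "rectanglelabels",
        "original_width": width,
        "original_height": height,
        "value": {**value, "rectanglelabels": ["Label1"]},  # placeholder label
    })
    results.append({
        "id": region_id,
        "from_name": "transcription",
        "to_name": "image",
        "type": "textarea",
        "original_width": width,
        "original_height": height,
        "value": {**value, "text": [text]},
    })

task = {"data": {"ocr": IMAGE_URL}, "predictions": [{"result": results}]}
with open("tasks.json", "w") as f:
    json.dump([task], f, indent=2)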
Share your setup!
What version of Label Studio you're using (for example, 1.10.0).
1.12.0
How you installed Label Studio (for example, pip, brew, Docker, etc.).
Docker
Your labeling configuration.
<View>
  <Image name="image" value="$ocr" zoom="true" zoomControl="false"
         rotateControl="true" width="100%" height="100%"
         maxHeight="auto" maxWidth="auto"/>
  <RectangleLabels name="bbox" toName="image" strokeWidth="1" smart="true">
    <Label value="Label1" background="green"/>
    <Label value="Label2" background="blue"/>
    <Label value="Label3" background="red"/>
  </RectangleLabels>
  <TextArea name="transcription" toName="image"
            editable="true" perRegion="true" required="false"
            maxSubmissions="1" rows="5" placeholder="Recognized Text"
            displayMode="region-list"/>
</View>
Sample data to help us reproduce the issue.
You can use the sample receipt from Wikimedia Commons here: https://upload.wikimedia.org/wikipedia/commons/0/0b/ReceiptSwiss.jpg