I’m trying to integrate Label Studio with a custom S3-compatible storage endpoint hosted at http://s3.company.internal, but Label Studio isn’t parsing the domain name correctly (see label-studio/label_studio/io_storages/s3/utils.py on the develop branch of HumanSignal/label-studio on GitHub). At that line, registered_domain comes back as an empty string, which causes the next line to fail and raise the “unrecognized S3 domain” exception.
I’ve verified that the URL is correct, and other tools can access it without issues.
I’ve attempted to troubleshoot using urlparse directly, and the result shows that while the domain (internal) and subdomain (s3.company) are extracted, the registered_domain is empty.
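For reference, here is a minimal, stdlib-only sketch of why a suffix-based splitter can end up with an empty registered_domain for a host like this. It is a simplified illustration and not tldextract's or Label Studio's actual code: `KNOWN_SUFFIXES` and `split_host` are hypothetical names, and the tiny suffix set is an assumption standing in for a real public suffix list in which "internal" does not appear.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for a public suffix list.
# Assumption: "internal" is not a recognized public suffix.
KNOWN_SUFFIXES = {"com", "org", "net", "co.uk"}


def split_host(url):
    """Split a URL's host into subdomain, domain, suffix, registered_domain."""
    host = urlparse(url).hostname or ""
    labels = host.split(".") if host else []

    # Look for the longest known suffix by trying progressively shorter tails.
    suffix = ""
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in KNOWN_SUFFIXES:
            suffix = candidate
            labels = labels[:i]
            break

    # Whatever is left: the last label is the domain, the rest is the subdomain.
    domain = labels[-1] if labels else ""
    subdomain = ".".join(labels[:-1])

    # A registered domain only exists when a known suffix was found.
    registered_domain = f"{domain}.{suffix}" if domain and suffix else ""
    return {
        "subdomain": subdomain,
        "domain": domain,
        "suffix": suffix,
        "registered_domain": registered_domain,
    }


if __name__ == "__main__":
    print(split_host("http://s3.company.internal"))
    print(split_host("https://s3.amazonaws.com"))
```

With no recognized suffix, the last label ("internal") is treated as the domain and everything before it as the subdomain, which matches the split described above and leaves registered_domain empty.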
Has anyone faced similar issues with custom S3 domains or endpoints? Any advice on how to make Label Studio correctly handle custom domains like s3.company.internal?
I am using Label Studio 1.16.0. I’d prefer to keep the custom domain as it follows a similar format to all of my other endpoints.
Thanks for writing in with this error, and great work digging into the issue! One thing to note: if the tldextract code is being reached at all, that implies an exception has already been raised while Label Studio was trying to interact with the storage endpoint. If you can view the logs of the running Label Studio instance, you should find the details of the full logged exception on stderr.
By reading these logs, it should be possible to figure out what’s causing the exception and remedy the issue.