Table of Contents
· Introduction
· Overview
∘ Goal
∘ Why semiautomatic?
∘ Entering Label Studio
∘ 1 frontend + 2 backends
· Implementation (Local)
∘ 1. Install git and docker & download backend code
∘ 2. Set up frontend to get access token
∘ 3. Set up backend containers
∘ 4. Connect containers
∘ 5. Happy labeling!
· GCP Deployment
∘ 1. Select project/Create new project and set up billing account
∘ 2. Create VM instance
∘ 3. Set up VM environment
∘ 4. Follow previous section & set up everything on VM
· GCS Integration
∘ 1. Set up GCS buckets
∘ 2. Create & set up service account key
∘ 3. Rebuild backend containers
∘ 4. SDK upload images from source bucket
∘ 5. Set up Target Storage
· Acknowledgement
· References
Creating training data for image segmentation tasks remains a challenge for individuals and small teams. And if you are a student researcher like me, finding a cost-efficient way is especially important. In this post, I will talk about one solution that I used in my capstone project where a team of 9 people successfully labeled 400+ images within a week.
Thanks to the Politecnico di Milano Gianfranco Ferré Research Center, we obtained thousands of fashion runway show images from Gianfranco Ferré’s archival database. To explore, manage, enrich, and analyze the database, I employed image segmentation for smarter cataloging and fine-grained research. Image segmentation of runway show photos also lays the foundation for creating informative textual descriptions, enabling better search engines and text-to-image generative AI approaches. Therefore, this blog will detail:
- how to create your own backend with Label Studio, on top of the existing Segment Anything backend, for semiautomatic image segmentation labeling,
- how to host on Google Cloud Platform for group collaboration, and
- how to employ Google Cloud Storage buckets for data versioning.
Code in this post can be found in this GitHub repo.
Goal
Segment and identify the names and typologies of fashion clothing items in runway show images, as shown in the first image.
Why semiautomatic?
Wouldn’t it be nice if a trained segmentation model out there could perfectly recognize every piece of clothing in runway show images? Sadly, there isn’t one. There are trained models tailored to fashion or clothing images, but nothing matches our dataset perfectly. Each fashion designer has their own style and preferences for certain clothing items, colors, and textures, so if a segmentation model reaches even 60% accuracy, we call it a win. We then still need humans in the loop to correct what the model got wrong.
Entering Label Studio
Label Studio provides an open-source, customizable, and free-of-charge community version for various types of data labeling. Because you can create your own backend, I could connect the Label Studio frontend to the trained segmentation model backend (mentioned above) so labelers can further improve upon the auto-predictions. Furthermore, Label Studio already has an interface that looks somewhat similar to Photoshop, along with a series of segmentation tools that come in handy for us:
- Brush & eraser
- Magic Wand for similar-color pixel selection
- Segment Anything backend, which harnesses the power of Meta’s SAM and lets you segment the object within a bounding box you draw.
1 frontend + 2 backends
To recap, we want two backends connected to the frontend: one runs the segmentation prediction, and the other speeds up labelers’ corrections when the predictions are wrong.
Now, let’s fire up the app locally. That is, you will be able to use the app on your laptop or local machine completely for free, but you won’t be able to invite your labeling team to collaborate from their laptops yet. We will cover teamwork with GCP in the next section.
1. Install git and docker & download backend code
If you don’t have git or docker on your laptop or local machine yet, please install them. (Note: you can technically bypass the step of installing git if you download the zip file from this GitHub repo. If you do so, skip the following.)
Then, open up your terminal and clone this repo to a directory you want.
git clone https://github.com/AlisonYao/label-studio-customized-ml-backend.git
If you open up the label-studio-customized-ml-backend folder in your code editor, you can see that most of the code is adapted from the Label Studio ML backend repo, but this directory also contains frontend template code and SDK code adapted from the Label Studio SDK.
2. Set up frontend to get access token
Following the official guidelines of the Segment Anything backend, run the following in your terminal:
cd label-studio-customized-ml-backend/label_studio_ml/examples/segment_anything_model
docker run -it -p 8080:8080 \
-v $(pwd)/mydata:/label-studio/data \
--env LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true \
--env LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/data/images \
heartexlabs/label-studio:latest
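If everything started correctly, the frontend container should now be serving on port 8080. A quick sanity check from another terminal (assuming the default port mapping above) might look like:

```shell
# The Label Studio image should appear with 0.0.0.0:8080->8080/tcp under PORTS
docker ps --filter "ancestor=heartexlabs/label-studio:latest"

# A 200 (or a 302 redirect to the login page) means the server is answering
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/
```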
Then, open your browser and type http://0.0.0.0:8080/ and you will see the frontend of Label Studio. Proceed to sign up with your email address. Now, there is no project yet so we need to create our first project by clicking Create Project. Create a name and description (optional) for your project.
Upload some images locally. (We will talk about how to use cloud storage later.)
For Labeling Setup, click Custom template on the left and copy-paste the HTML code from the label-studio-customized-ml-backend/label_studio_frontend/view.html file. You do not need the four lines of Headers if you don’t want to show image metadata in the labeling interface. Feel free to modify the code here to your needs, or click Visual to add or delete labels.
Now, click Save and your labeling interface should be ready.
On the top right, click the user settings icon, then Account & Settings, and you should be able to copy your access token.
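Before wiring the token into the backends, you can verify it works by calling the Label Studio REST API with it (a hypothetical check; substitute your real token):

```shell
# A JSON list of projects confirms the token is valid; a 401 response means it is not
curl -s -H "Authorization: Token YOUR_ACCESS_TOKEN" http://localhost:8080/api/projects/
```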
3. Set up backend containers
In the label-studio-customized-ml-backend directory, there are many backends, thanks to the Label Studio developers. We will be using the customized ./segmentation backend for segmentation prediction (container 1) and ./label_studio_ml/examples/segment_anything_model for faster labeling (container 2). The former uses port 7070 and the latter uses port 9090, making them easy to distinguish from the frontend’s port 8080.
Now, paste your access token into the two docker-compose.yml files in the ./segmentation and ./label_studio_ml/examples/segment_anything_model folders.
environment:
- LABEL_STUDIO_ACCESS_TOKEN=6dca0beafd235521cd9f23d855e223720889f4e1
Open up a new terminal and cd into the segment_anything_model directory as you did before. Then, fire up the Segment Anything container.
cd label-studio-customized-ml-backend/label_studio_ml/examples/segment_anything_model
docker build . -t sam:latest
docker compose up
Then, open up another new terminal, cd into the segmentation directory, and fire up the segmentation prediction container.
cd label-studio-customized-ml-backend/segmentation
docker build . -t seg:latest
docker compose up
At this point, we have successfully started all three containers, and you can double-check that they are running.
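One way to double-check is to list the running containers and their port mappings (the ports follow the setup above):

```shell
# Expect three containers: frontend (8080), SAM backend (9090), segmentation backend (7070)
docker ps --format "table {{.Image}}\t{{.Ports}}\t{{.Status}}"
```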
4. Connect containers
The access token we pasted earlier has already done most of the work of connecting the containers, so we are almost done. Now, go to the frontend you started a while back and click Settings in the top right corner. Click Machine Learning on the left and click Add Model.
Be sure to use the URL with port 9090 and toggle on interactive preannotation. Finish adding by clicking Validate and Save.
Similarly, do the same with the segmentation prediction backend.
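The exact URLs to enter depend on your Docker networking; on Docker Desktop, for instance, the Label Studio container can usually reach the backends via host.docker.internal (an assumption — adjust the host for your setup):

```shell
# URLs to enter under Settings -> Machine Learning -> Add Model
# Segment Anything backend (interactive preannotation):
#   http://host.docker.internal:9090
# Segmentation prediction backend:
#   http://host.docker.internal:7070
```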
Then, I like to toggle on Retrieve predictions when loading a task automatically. This way, every time we refresh the labeling page, the segmentation predictions will be automatically triggered and loaded.
5. Happy labeling!
Here is a demo of what you should see if you follow the steps above.
If we are not happy with the prediction of, say, the skirt, we can delete it and use the purple magic (Segment Anything) to quickly relabel it.
I’m sure you can figure out how to use the brush, eraser and magic wand on your own!