[hook is missing]
What's Wayground?
Wayground is an education platform that helps teachers drive student outcomes with tools to create content such as assessments and lessons, track student understanding, browse content libraries, and more.
Wayground's premium plan lets teachers create assessments without limits and gives them access to premium question types like Labeling.
Problem statement
The project began as part of a broader effort to improve the premium question type experience across the platform, specifically the QT creation experience for teachers and ease of use for students, so that teachers would build greater trust in premium features over time.
We started with the Labeling QT, commonly used by science, biology, and social studies teachers for prompts like "Label the parts of a flower" and "Annotate the map with landmarks". It had the highest drop-off rate of all premium question types because it was tedious and unintuitive to create.
How the Labeling QT originally worked
In the old experience, teachers created a labeling question inside a WYSIWYG-style editor that mirrored what students would eventually see.
[image or illustration of how the old creation UI looked like, maybe a isometric layout of 3-4 creation UI that show the steps that teachers have to take]
In practice, WYSIWYG made question creation awkward and confusing. Using a combination of Hotjar recording analysis, conversations with teachers, and usage metrics, we catalogued the problems with our creation experience.
[bento of problems teachers faced when creating Labelling questions]
- Most Labeling questions used a simple prompt like "Label this image", so asking for the prompt upfront was unnecessary.
- Selecting an image was the first thing most teachers did, but the UI didn't mirror that flow, which added complexity.
- Teachers were leaving the platform to search for images on Google, even though they could search Google Images from within the platform; our UI did a poor job of surfacing that support.
- Many images found via Google Images already had labels baked into the image.
- Creating all the labels and positioning them manually was tedious.
- Labels sometimes covered important parts of the image.
- Teachers wanted to manipulate labels directly on the image, but couldn't.
Initial ideas
We knew question creation needed to be image-first, which breaks the WYSIWYG paradigm.
Using vision models to accelerate question creation
We wanted to use LLMs to reduce the time it takes to create a Labeling question. Writing prompts to generate images from scratch is hard, and throwing a prompt box at the teacher is a poor experience: teachers are still getting used to prompting LLMs, and image generation is expensive.
We tried various approaches for generating labels without requiring teachers to type long prompts.
- Given an image, let the teacher select a category of labels (for a plant cell: cell components, tissue layers, or cell functions).
- Then select the number of labels to generate.
- We later dropped the label-count step, since VLMs were good at estimating how many labels are typically added to a given image.
- We realized the category of labels is essentially the question itself! We could make the flow feel more magical by simply asking the teacher what question they want to ask.
- For labels, we tried a pattern where hovering over a suggested question previews the labels that would be added to the image.
[talk about how we handled, latency, label placement issues, issues with label hallucinations] [add a visual talking about early iterations on label placement]
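One concrete placement problem was overlapping labels: positions suggested by the model often collided with each other. A minimal sketch of one way to resolve this with a greedy top-to-bottom pass over normalized bounding boxes (the `Label` shape, field names, and gap value here are illustrative assumptions, not our production code):

```python
from dataclasses import dataclass

@dataclass
class Label:
    text: str
    x: float  # normalized [0, 1] top-left corner
    y: float
    w: float  # normalized width
    h: float  # normalized height

def overlaps(a: Label, b: Label) -> bool:
    # Axis-aligned rectangle intersection test.
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)

def resolve_overlaps(labels: list[Label], gap: float = 0.01) -> list[Label]:
    # Greedy pass: place labels top-to-bottom, pushing each one
    # below any already-placed label it collides with. Mutates
    # the input labels' y positions.
    placed: list[Label] = []
    for lab in sorted(labels, key=lambda l: l.y):
        moved = True
        while moved:
            moved = False
            for prev in placed:
                if overlaps(lab, prev):
                    lab.y = prev.y + prev.h + gap
                    moved = True
        placed.append(lab)
    return placed
```

In practice a nudge like this is only a fallback; keeping labels near their anatomical targets matters more than strict non-overlap, which is why teachers also needed direct manipulation on the canvas.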
Label creation experience
V1 approach (shipped 2024)
[flow of the image-first labelling experience which is shown alongside a video of how the question works]
- Image selection comes first: Google Images search or upload.
- A toggle improves search quality by filtering out images that already contain labels.
- Once an image is selected, AI analyzes the chosen image.
- The system generates a set of plausible Labeling questions and corresponding labels.
- The teacher picks a question, and the labels are added to the image.
- The teacher can edit the labels or pick a different question.
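One way to guard against hallucinated labels in a flow like the one above is to validate model output before anything reaches the canvas. A hypothetical sketch, assuming the VLM returns JSON of questions with normalized label coordinates (the schema and field names are illustrative, not the actual API):

```python
import json
from dataclasses import dataclass

@dataclass
class LabelSuggestion:
    text: str
    x: float  # normalized [0, 1] position on the image
    y: float

@dataclass
class QuestionSuggestion:
    question: str
    labels: list[LabelSuggestion]

def parse_suggestions(raw: str) -> list[QuestionSuggestion]:
    """Parse model JSON into suggestions, dropping labels whose
    coordinates fall outside the image (a common failure mode when
    the model hallucinates positions), and dropping questions that
    end up with no usable labels."""
    out: list[QuestionSuggestion] = []
    for q in json.loads(raw):
        labels = [
            LabelSuggestion(l["text"], l["x"], l["y"])
            for l in q.get("labels", [])
            if 0.0 <= l.get("x", -1) <= 1.0 and 0.0 <= l.get("y", -1) <= 1.0
        ]
        if labels:
            out.append(QuestionSuggestion(q["question"], labels))
    return out
```

Filtering invalid suggestions server-side keeps the "pick a question" step trustworthy: the teacher only ever sees options that can actually be rendered on the image.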
V2 experience (shipped 2026)
Why this project was unique
What makes this project meaningful to me is not just that it improved a workflow, but how it used AI.
In 2024, the most obvious way to make something “AI-powered” was to add a prompt box. We deliberately chose not to do that.
Instead, we built AI into the existing teacher workflow:
- the teacher picks an image
- the system understands the image
- it proposes likely questions and labels
- the teacher reviews visually
- the teacher stays in control
That made the AI feel:
- faster
- less intimidating
- more aligned with teacher intent
- more useful in practice
To me, this was a stronger model of AI product design: don’t ask users to speak AI’s language when the system can meet them in theirs.
Impact
The redesign meaningfully improved the creation experience for Labeling.
Quantitative outcomes
- Save rate increased from ~48–52% to ~75%
- Average creation time reduced from ~8 minutes to ~5.5–6 minutes
- AI suggestions were accepted about 1 in 3 times
- Labeling question creation volume increased, though not dramatically
Qualitative outcomes
- the flow became much easier for new users to understand
- teachers had less manual setup work
- the question type became a much better demo surface for premium value
- internally, it helped shift perception of what useful AI in Wayground could look like
This was also one of the first strong examples at Wayground of AI being deeply integrated into product workflow, rather than existing as a standalone novelty.
[Add metric chart + maybe a small quote/callout from internal reception or demo impact]