Anatomy Illustrator AI
This is a submission for the Google AI Studio Multimodal Challenge
What I Built
I built Anatomy Illustrator AI 🎨, a web app designed to take the headache out of creating educational diagrams.
Have you ever needed a specific anatomical illustration for a presentation or study guide, only to find nothing that quite fits?
You might find a diagram of the heart, but it’s missing the labels you need. Or you find one with the right labels, but the style is all wrong.
Anatomy Illustrator AI solves this problem.
It’s a simple, two-step tool for anyone—students, teachers, and medical creators—to generate beautiful, custom-labeled anatomical diagrams on the fly.
Step 1: You describe the structure you want to see.
Step 2: You list the labels you want to add.
The AI handles the rest, giving you a clean, accurate, and ready-to-use illustration in seconds. It’s like having a professional medical illustrator at your fingertips. 🧠✨
Demo
Here’s a look at the applet in action!
1. The User Interface
The app starts with a clean and focused interface. You have two main inputs: one for describing the anatomical structure and another for listing the labels.
2. Image Generation & Selection
After you submit your prompt, the app uses Imagen 4 to generate two high-quality illustrations. This gives you creative control to pick the one that best matches your vision.
3. AI-Powered Labeling
Once you select an image, Gemini gets to work. It analyzes the image and your text list to add clear, accurate labels with leader lines. The final result is a polished, professional diagram.
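Before the label list reaches the model, it helps to tidy up what the user typed. Here is a minimal sketch of that cleanup, assuming the labels arrive as one comma-separated string. The `parseLabels` helper is hypothetical and not taken from the app's source.

```typescript
// Hypothetical helper: turn the raw label field into a clean list.
// Assumes labels are comma-separated, as in the app's second input.
function parseLabels(raw: string): string[] {
  const seen = new Set<string>();
  const labels: string[] = [];
  for (const piece of raw.split(",")) {
    // Trim the ends and collapse runs of inner whitespace.
    const label = piece.trim().replace(/\s+/g, " ");
    if (label === "") continue; // skip empty entries such as "a,,b"
    const key = label.toLowerCase();
    if (seen.has(key)) continue; // skip case-insensitive duplicates
    seen.add(key);
    labels.push(label);
  }
  return labels;
}
```

Deduplicating up front is a design choice: a repeated label would otherwise ask the model to annotate the same structure twice.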
How I Used Google AI Studio
Google AI Studio was my sandbox for bringing this idea to life. I used it extensively to test and refine the prompts that power the entire experience.
My application relies on a powerful, two-stage multimodal pipeline:
- Image Generation: I use the imagen-4.0-generate-001 model for the initial creation step. I experimented in AI Studio to craft a prompt that consistently produces clean, textbook-quality illustrations with neutral backgrounds, perfect for labeling.
- Image Editing & Labeling: This is where the magic happens. I leverage the gemini-2.5-flash-image-preview model. My prompt instructs the model to take an input image and a text string of labels and intelligently add them to the illustration. AI Studio was essential for figuring out how to ask the model to create professional-looking leader lines and legible text.
This project is a perfect example of chaining different AI models together to create a cohesive and powerful user workflow.
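To make the two stages concrete, here is a rough sketch of how the prompts for each model might be assembled. Only the two model IDs come from this post. The prompt wording, the `StagePlan` shape, and the function names are assumptions for illustration, not the prompts the app actually uses.

```typescript
// Model IDs are the ones named in the post; everything else here is assumed.
const IMAGE_MODEL = "imagen-4.0-generate-001";
const EDIT_MODEL = "gemini-2.5-flash-image-preview";

// Stage 1: hypothetical prompt wording for a clean, label-ready base image.
function buildIllustrationPrompt(description: string): string {
  return [
    `Textbook-quality anatomical illustration of: ${description.trim()}.`,
    "Plain neutral background.",
    "Do not include any text, labels, or leader lines.",
  ].join(" ");
}

// Stage 2: hypothetical prompt wording asking for labels on an existing image.
function buildLabelingPrompt(labels: string[]): string {
  return [
    "Add the following labels to this illustration:",
    labels.join(", ") + ".",
    "Draw a thin leader line from each label to the structure it names.",
    "Use legible text and do not alter the illustration itself.",
  ].join(" ");
}

// One entry per stage: which model to call and what to ask it.
interface StagePlan {
  model: string;
  prompt: string;
}

// Describe the chain as data: the image model first, the editing model second.
function planPipeline(description: string, labels: string[]): StagePlan[] {
  return [
    { model: IMAGE_MODEL, prompt: buildIllustrationPrompt(description) },
    { model: EDIT_MODEL, prompt: buildLabelingPrompt(labels) },
  ];
}
```

Keeping the prompts in small pure functions like these makes them easy to paste into AI Studio, tweak, and compare without touching the rest of the app.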
Multimodal Features
The core of Anatomy Illustrator AI is its deep integration of multimodal features. It’s not just using text or images; it’s using them together to create something new.
- Text-to-Image Creation
  The journey starts by translating a user’s written concept (e.g., “A cross-section of the human eye”) into a rich visual. This empowers users to create the exact base image they need without any artistic skill.
- Image-and-Text Editing
  This is the most critical multimodal feature. The app sends both an image (the user’s selected illustration) and text (the comma-separated labels) to Gemini in a single request.

Why is this so powerful?
It enhances the user experience by abstracting away a complex task. Instead of needing a photo editor and a steady hand, the user just provides a list.
The AI understands the visual context of the image and the semantic meaning of the labels, placing them accurately.
This creates a seamless, intuitive, and incredibly useful tool for education and content creation.
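As an illustration of what "an image and text in a single request" can look like, the sketch below builds one request body with two parts: the selected illustration as base64 inline data, and the label list as text. The part shapes follow the Gemini API's inline-data convention as I understand it, so treat them as an assumption and check the current SDK docs. `buildLabelingRequest` and the instruction wording are hypothetical.

```typescript
// Model ID from the post; the request shape below is an assumption.
const LABELING_MODEL = "gemini-2.5-flash-image-preview";

// A part is either inline image data (base64) or plain text.
type RequestPart =
  | { inlineData: { mimeType: string; data: string } }
  | { text: string };

interface LabelingRequest {
  model: string;
  contents: { role: string; parts: RequestPart[] }[];
}

// Hypothetical builder: bundles the chosen image and the labels into one request.
function buildLabelingRequest(
  imageBase64: string,
  mimeType: string,
  labelList: string
): LabelingRequest {
  if (imageBase64.trim() === "") throw new Error("An image is required.");
  if (labelList.trim() === "") throw new Error("At least one label is required.");
  return {
    model: LABELING_MODEL,
    contents: [
      {
        role: "user",
        parts: [
          // Image first, then the instruction that refers to it.
          { inlineData: { mimeType: mimeType, data: imageBase64 } },
          { text: "Add these labels with leader lines: " + labelList.trim() },
        ],
      },
    ],
  };
}
```

An object of this shape would then be handed to whichever client call the app uses to reach the model. Because the image and the labels travel together, the model can relate each label to what it sees in the illustration.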