Automating Fine-tuning Dataset Creation using Multimodal Generative AI Models

Leveraging the latest multimodal models from Anthropic to automate the tedious and often error-prone process of creating labeled datasets for training and fine-tuning Generative AI models.

Gary A. Stafford
16 min read · Jul 31, 2024

Introduction

Recently, my peer Deepti Tirumala and I developed and presented a talk on fine-tuning generative AI foundation models, Build a Personalized Avatar with Amazon Titan Image Generator, at the 2024 AWS New York Summit. The talk was well attended, and community interest afterward was strong.

[Image: 2024 AWS New York Summit talk]
[Image: Sample output from a fine-tuned Amazon Titan Image Generator model]

While researching the topic for the talk, we fine-tuned dozens of copies of the Amazon Titan Image Generator foundation model and similar text-to-image models from Stability AI, including Stable Diffusion XL (SDXL) and Stable Diffusion 3 Medium (SD3). We performed the fine-tuning on several AI platforms, including Amazon Bedrock, Amazon SageMaker, Civitai, and locally with ComfyUI and Stable Diffusion web UI (A1111). We experimented with different dataset sizes…
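The automation the title describes — using an Anthropic multimodal model to label training images — can be sketched at its simplest as sending each image to the Messages API and asking for a caption. The sketch below is a minimal illustration, not the article's actual pipeline; the model name, prompt wording, and helper function names are my own assumptions.

```python
import base64
from pathlib import Path

def build_caption_request(image_bytes: bytes, media_type: str = "image/jpeg") -> dict:
    """Build a Messages API payload asking the model to caption one image.

    The model name and prompt below are illustrative placeholders.
    """
    return {
        "model": "claude-3-5-sonnet-20240620",
        "max_tokens": 200,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": media_type,
                            # The API expects base64-encoded image data as a string
                            "data": base64.b64encode(image_bytes).decode("utf-8"),
                        },
                    },
                    {
                        "type": "text",
                        "text": (
                            "Write a one-sentence caption describing this image "
                            "for a text-to-image fine-tuning dataset."
                        ),
                    },
                ],
            }
        ],
    }

def caption_image(path: str) -> str:
    """Send one image file to the model and return its caption.

    Requires the `anthropic` package and an ANTHROPIC_API_KEY in the environment.
    """
    import anthropic  # pip install anthropic

    client = anthropic.Anthropic()
    response = client.messages.create(**build_caption_request(Path(path).read_bytes()))
    return response.content[0].text
```

Looping `caption_image` over an image directory and writing each caption to a sidecar `.txt` file would yield the image/caption pairs that tools like Amazon Bedrock, ComfyUI, and A1111 expect for fine-tuning.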


Gary A. Stafford

Area Principal Solutions Architect @ AWS | 10x AWS Certified Pro | Polyglot Developer | DataOps | GenAI | Technology consultant, writer, and speaker