Generative AI for Creative Applications Using Storia Lab
In a recent presentation at the Zilliz Unstructured Data Meetup, Mihail Eric, the founder of Storia AI, introduced Storia Lab, a suite of APIs designed to integrate advanced image editing functionalities into your applications.
Watch the replay of Mihail’s meetup talk.
This talk focused on enhancing generative AI imagery, detailing how we can make nuanced edits while preserving the integrity of the original images. Mihail discusses practical solutions for image editing needs, such as background removal, text correction, and stylistic adjustments in images, making advanced image editing accessible to developers via APIs across various platforms.
Let's discuss one solution at a time and then look at how you can integrate Storia Lab APIs with the Milvus vector database to build advanced GenAI applications such as multimodal retrieval augmented generation (RAG).
Textify API for Text Correction
The Textify API corrects or changes existing text while maintaining the original font and style. It addresses a common issue in AI-generated visuals, where text often comes out as gibberish or contains errors. Let's take a look at the following example. The image below was generated using Midjourney and misspells "Happy" in the phrase "Happy Birthday."
Fig 1- Input image Generated via Midjourney
To correct this image, you call the Textify API with parameters that specify the area of the image to correct and the text that needs replacement. Textify then replaces the gibberish with meaningful text while maintaining the original font and style. Look at the output below:
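The call described above can be sketched as a request body. This is a minimal illustration under stated assumptions, not Storia Lab's documented schema: the URL, the field names (`image`, `region`, `text`), and the pixel-box format are all hypothetical, and the image bytes are a placeholder. Check the Storia Lab API documentation for the real endpoint and parameters.

```python
import base64
import json

# Hypothetical endpoint, used only for illustration.
TEXTIFY_URL = "https://api.example.com/v1/textify"


def build_textify_request(image_bytes, box, new_text):
    """Build a JSON-serializable body for a hypothetical Textify call.

    box is (left, top, right, bottom) in pixels and marks the text to replace.
    """
    left, top, right, bottom = box
    if left >= right or top >= bottom:
        raise ValueError("box must have positive width and height")
    if not new_text.strip():
        raise ValueError("new_text must not be empty")
    return {
        # Assumed field names; the real API may expect a file upload instead.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "region": {"left": left, "top": top, "right": right, "bottom": bottom},
        "text": new_text,
    }


# Placeholder bytes stand in for the real image file contents.
payload = build_textify_request(b"\x89PNG placeholder", (120, 40, 480, 110), "Happy")
body = json.dumps(payload)  # what an HTTP client would POST to TEXTIFY_URL
```

Validating the region and the replacement text before sending keeps obviously malformed requests from ever reaching the service.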
Fig 2- Output image showing gibberish corrected by Textify API
You can see the image now contains the correct happy birthday message. Let's see how Storia Lab handles image background modifications.
Background Removal and Replacement APIs
The background removal API removes image backgrounds using a background removal model. Storia Lab automatically performs this task when you submit an image via the API. This capability is handy for enhancing visibility when displaying images against a neutral or non-distracting backdrop. Look at the following side-by-side image showing the results of removing a background using the background removal API.
Fig 3- Side-by-side comparison showing the results of removing an image background using Storia Lab
In addition to background removal, Storia Lab offers a background replacement API. It works differently because it requires a prompt. To change the background of an AI-generated image or one of your own, you pass the input image and a prompt describing the background you would like the output image to have. Storia Lab’s background replacement model then reconstructs the background of your image to match the prompt. Let's take a look at the results:
Fig 4- Left- Input generated via Midjourney; Right- Output with the background replaced using the prompt "modern motion graphics, squares, Gen Z"
The model does an excellent job of replacing the background to match the prompt's three cues: modern motion graphics, squares, and Gen Z. Apart from backgrounds, you may need to remove unwanted elements from an image. Let's see how Storia Lab handles this request.
Removing Unwanted Elements Using the Cleanup API
The Defect Cleanup API removes unwanted elements, such as objects, defects, or text, from images. You mark the area around each element, and the cleanup model then processes the image to remove them without requiring manual editing. Take a look at this image, for example.
Fig 5- Left- Input of the original photograph; Right- Output with the people in the background removed
The background contains random people, but we want to focus on the child. To remove them, we mark the areas that contain them. Storia Lab’s cleanup model then removes the marked elements and recreates those areas to match the rest of the image.
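One common way to express "marked areas" to an inpainting-style model is a binary mask. The sketch below builds such a mask from rectangles in plain Python. It assumes rectangular marks and a 0/1 grid; the source does not say whether Storia Lab expects a mask, bounding boxes, or brush strokes, so treat `build_mask` and its box format as hypothetical.

```python
def build_mask(width, height, boxes):
    """Return a height x width grid of 0/1, where 1 marks pixels to remove.

    boxes holds (left, top, right, bottom) rectangles in pixels;
    right and bottom are exclusive.
    """
    mask = [[0] * width for _ in range(height)]
    for left, top, right, bottom in boxes:
        # Clamp each box to the image bounds so stray coordinates are ignored.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                mask[y][x] = 1
    return mask


# Two marked regions on a tiny 8x4 image; the second one overflows the right edge.
mask = build_mask(8, 4, [(1, 0, 3, 2), (6, 2, 10, 4)])
for row in mask:
    print("".join(str(value) for value in row))
```

In a real application the same grid would be produced at the image's full resolution, typically from a user's brush strokes or a detection model rather than hand-written boxes.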
So far, we have looked at how Storia Lab makes nuanced edits to original or AI-generated images. Let us now shift gears and look at another of Storia Lab’s generative features, which turns a sketch into a realistic picture.
Sketch-to-Image Conversion API
The Sketch-to-Image Conversion API transforms basic sketches into detailed digital art or photorealistic images. It takes a sketch as the input, along with a prompt describing what the output should look like. The more specific and descriptive the prompt, the closer the result will be to what you have in mind. Let’s see the results of turning a living room sketch into a realistic image.
Fig 6- Left- Input of the original sketch; Right- Output for the prompt "photorealistic living room in green, burnt orange and gold accents"
The above image shows a realistic living room generated from a sketch. It contains all the colors specified in the prompt. As you have seen, when it comes to generative models, what you specify in the prompt is what you get as the output. To learn more about prompting, go through this prompt engineering guide to familiarize yourself with the different methods involved.
Apart from the API solutions above, Mihail briefly touches on Storia Lab's potential in multimodal applications. Let's slow down and look at that in more detail.
Integrating Storia Lab with Milvus for Advanced Multimodal RAG Applications
Milvus is an open-source vector database designed to handle billion-scale vectors efficiently. It leverages vector search to retrieve various data types. Vector search, also known as similarity or nearest neighbor search, converts data into high-dimensional vectors using an embedding model and finds the most similar vectors based on their proximity in the vector space.
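To make the idea of proximity in vector space concrete, here is a brute-force nearest neighbor search using cosine similarity in plain Python. It is only an illustration of the concept: the three-dimensional vectors and file names are made up, real embeddings have hundreds of dimensions, and Milvus relies on indexes to avoid comparing the query against every stored vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the keys of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Toy embeddings; in practice an embedding model would produce these.
embeddings = {
    "birthday_card.png": [0.9, 0.1, 0.0],
    "living_room.png": [0.1, 0.8, 0.3],
    "party_banner.png": [0.8, 0.2, 0.1],
}
nearest = top_k([1.0, 0.0, 0.0], embeddings, k=2)
print(nearest)
```

The query points along the first axis, so the two images whose vectors lean most heavily on that axis rank highest.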
Multimodal RAG applications process multiple data types for retrieval and generation augmentation tasks. Utilizing Milvus's vector search capabilities enhances their efficiency and robustness.
Combining Milvus with Storia Lab's APIs can further enhance multimodal RAG applications, enabling customized content creation, automated workflows, and personalized recommendations.
Here are some practical use cases that demonstrate the synergy between Storia AI and Milvus:
Multimodal Content Creation: This application would allow users to generate and edit images based on text prompts or sketches. Using Milvus, the app searches a vector database for images that match a user's description or sketch. Then, Storia Lab’s APIs enable users to modify or enhance these images, seamlessly integrating textual and visual creativity.
Image Search and Recommendation: Create a dynamic image search engine where users can query images using text or an example image. Milvus manages the vector representations for efficient similarity searching, while Storia Lab's editing capabilities refine the output, providing tailored image recommendations or enhancements.
Visual Content Curation: This application curates and customizes visual content (images/videos) from multiple sources based on user preferences or themes. Milvus retrieves similar content, and Storia Lab’s tools make final adjustments, like background removal or stylistic edits, enhancing the user's curation process.
Multimodal E-commerce: Enhance an e-commerce platform by allowing customers to search for products using text and visual inputs. Milvus supports similarity searches for product images, and Storia Lab's APIs aesthetically enhance these images or correct visual errors, improving the shopping experience.
Creative Design Tools: These comprehensive tools support designers and incorporate image generation, manipulation, and vector search. Designers initiate projects based on sketches or text prompts, find similar existing images with Milvus, and use Storia Lab’s APIs to refine and personalize the designs, fostering a creative and efficient design process.
Visual Content Moderation: For platforms hosting user-generated content, develop a system that uses Milvus for similarity searches to identify and flag inappropriate images. Storia Lab’s editing tools automatically adjust or remove problematic elements, ensuring effective and efficient content moderation.
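The use cases above cover only multimodal applications that combine images and text. Vector search is not limited to those modalities, however: with a suitable embedding model, Milvus can also store and search vectors derived from video, audio, and other data types. Which modalities you use depends on your application's needs.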
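Most of these use cases share one shape: retrieve candidate images by similarity, then pass each one to an editing step. The sketch below captures that shape with injected functions so it runs on its own. Everything in it is hypothetical: `fake_search` stands in for a Milvus similarity query, `fake_edit` stands in for a Storia Lab API call, and the file names and returned fields are invented for illustration.

```python
def retrieve_then_edit(query_vector, search, edit, prompt, limit=3):
    """Find similar images, then request the same edit for each of them.

    search(query_vector, limit) returns image identifiers, most similar first.
    edit(image_id, prompt) returns whatever the editing service reports back.
    """
    hits = search(query_vector, limit)
    return [edit(image_id, prompt) for image_id in hits]


def fake_search(query_vector, limit):
    # Stand-in for a vector database query; ignores the vector entirely.
    catalog = ["sofa_01.png", "sofa_02.png", "lamp_07.png", "rug_03.png"]
    return catalog[:limit]


def fake_edit(image_id, prompt):
    # Stand-in for an image editing API call, e.g. background replacement.
    return {"source": image_id, "prompt": prompt, "status": "queued"}


results = retrieve_then_edit(
    [0.2, 0.7, 0.1],
    fake_search,
    fake_edit,
    prompt="plain white studio background",
    limit=2,
)
for result in results:
    print(result["source"], "->", result["status"])
```

Keeping the search and edit steps behind plain function arguments makes it straightforward to swap the stand-ins for real clients later and to test the pipeline without network access.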
Conclusion
Storia AI is an AI-powered image editor that offers various tools to generate and edit images through simple text prompts or clicks. It aims to simplify image editing tasks without requiring advanced skills.
As generative AI continues to evolve and multimodal applications become increasingly prevalent, the synergy between Storia Lab and Milvus can enable developers to unlock new realms of creativity, functionality, and user experiences.
For more details about this topic, watch the replay of Mihail's talk on YouTube.