In this hands-on tutorial, we’ll unlock the creative potential of Stability AI’s diffusion models, Stable Diffusion v1.5, Stable Diffusion v2-base, and the newer Stable Diffusion 3 Medium, to generate eye-catching imagery. Running entirely in Google Colab with a Gradio interface, we’ll get side-by-side comparisons of three pipelines, rapid prompt iteration, and GPU-accelerated inference. Whether you’re a marketer looking to elevate your brand’s visual narrative or a developer looking to prototype AI-driven content workflows, this tutorial shows how Stability AI’s openly available models can be deployed quickly and at no infrastructure cost, letting you focus on storytelling, engagement, and real-world results.
!pip install huggingface_hub
from huggingface_hub import notebook_login
notebook_login()
We install the huggingface_hub library, then import and invoke the notebook_login() function, which prompts you to authenticate your notebook session with your Hugging Face account so you can access and manage models, datasets, and other Hub resources. The login matters here because Stable Diffusion 3 Medium is a gated checkpoint: you need to accept its license on the Hub before the weights can be downloaded.
!pip uninstall -y torchvision
!pip install --upgrade torch torchvision --index-url https://download.pytorch.org/whl/cu118
!pip install --upgrade diffusers transformers accelerate safetensors gradio pillow
We first force-uninstall any existing torchvision to clear potential conflicts, then reinstall torch and torchvision from the CUDA 11.8-compatible PyTorch wheels, and finally upgrade the key libraries (diffusers, transformers, accelerate, safetensors, gradio, and pillow) so that you have the latest versions for building and running GPU-accelerated generative pipelines and web demos.
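After the installs finish, it can help to confirm which versions actually landed in the runtime before loading any models. The following is a minimal sketch that uses only the standard library; the helper name installed_versions is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages named in the install cell above.
wanted = ["torch", "torchvision", "diffusers", "transformers",
          "accelerate", "safetensors", "gradio", "pillow"]
for name, ver in installed_versions(wanted).items():
    print(f"{name}: {ver or 'not installed'}")
```

If a package shows up as not installed, or torch reports a build that does not match the CUDA wheels you asked for, restarting the Colab runtime and re-running the install cell is usually the first thing to try.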
import torch
from diffusers import StableDiffusionPipeline, StableDiffusion3Pipeline
import gradio as gr
machine = "cuda" if torch.cuda.is_available() else "cpu"
We import PyTorch alongside two pipeline classes from the Diffusers library, StableDiffusionPipeline (which serves both the v1.x and v2.x checkpoints) and StableDiffusion3Pipeline, as well as Gradio for building interactive demos. We then check for CUDA availability and set the machine variable to “cuda” if a GPU is present; otherwise it falls back to “cpu”, so the models run on the best hardware available.
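One caveat with the CPU fallback: the loading cells in this tutorial request float16 weights, and half precision is generally aimed at GPUs, so on a CPU it may be slow or unsupported for some operations. A hedged sketch of picking the precision from the device string is shown here. The helper choose_precision is hypothetical, and the commented line shows one way its result could be passed to from_pretrained.

```python
def choose_precision(device: str) -> str:
    """Return the name of a torch dtype suited to the given device string."""
    # Assumption: use half precision only when a CUDA GPU is in use.
    return "float16" if device.startswith("cuda") else "float32"


# Possible use in the loading cells (assumes torch is imported and
# `machine` holds the device string chosen above):
#   torch_dtype=getattr(torch, choose_precision(machine))
print(choose_precision("cuda"), choose_precision("cpu"))
```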
pipe1 = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    safety_checker=None
).to(machine)
pipe1.enable_attention_slicing()
We load the Stable Diffusion v1.5 model in half precision (float16) without the built-in safety checker, transfer it to your chosen device (the GPU, if available), and then enable attention slicing to reduce peak VRAM usage during image generation.
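To see why half precision matters on a shared Colab GPU, the arithmetic can be sketched directly: each float32 parameter takes 4 bytes and each float16 parameter takes 2. The parameter count below is purely illustrative and is not the size of any model in this tutorial; the estimate also covers weights only, not activations or other runtime overhead.

```python
def weights_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


params = 1_000_000_000  # illustrative count, not a real model size
print(f"float32: {weights_gib(params, 4):.2f} GiB")  # float32: 3.73 GiB
print(f"float16: {weights_gib(params, 2):.2f} GiB")  # float16: 1.86 GiB
```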
pipe2 = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-base",
    torch_dtype=torch.float16,
    safety_checker=None
).to(machine)
pipe2.enable_attention_slicing()
We load the Stable Diffusion v2 “base” model in 16-bit precision without the default safety filter, transfer it to your chosen device, and activate attention slicing to reduce memory usage during inference.
pipe3 = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
    safety_checker=None
).to(machine)
pipe3.enable_attention_slicing()
We pull in Stability AI’s Stable Diffusion 3 “medium” checkpoint in 16-bit precision, transfer it to your chosen device, and enable attention slicing to reduce GPU memory usage during generation. The safety_checker=None argument is carried over from the earlier pipelines; the SD3 pipeline does not appear to include a safety-checker component, so the argument likely has no effect here.
def generate(prompt, steps, scale):
    # Gradio sliders may deliver floats; the pipelines expect an integer step count.
    steps = int(steps)
    img1 = pipe1(prompt, num_inference_steps=steps, guidance_scale=scale).images[0]
    img2 = pipe2(prompt, num_inference_steps=steps, guidance_scale=scale).images[0]
    img3 = pipe3(prompt, num_inference_steps=steps, guidance_scale=scale).images[0]
    return img1, img2, img3
This function runs the same text prompt through all three loaded pipelines (pipe1, pipe2, pipe3) using the specified number of inference steps and guidance scale, then returns the first image from each, which makes it well suited to comparing outputs across Stable Diffusion v1.5, v2-base, and v3-medium.
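When comparing models side by side, it is often useful to know how long each one took as well as what it produced. The sketch below generalizes the idea to a dictionary of labeled pipelines and records wall-clock time per model. The compare helper and the fake_pipe stand-in are hypothetical; the stand-in only mimics the call shape used above (a callable returning an object with an images list) so the example runs without a GPU or a model download.

```python
import time
from types import SimpleNamespace


def compare(pipelines, prompt, steps, scale):
    """Run one prompt through each pipeline; return {label: (image, seconds)}."""
    results = {}
    for label, pipe in pipelines.items():
        start = time.perf_counter()
        output = pipe(prompt, num_inference_steps=int(steps), guidance_scale=scale)
        results[label] = (output.images[0], time.perf_counter() - start)
    return results


# Stand-in for a real pipeline: same call shape, no model involved.
def fake_pipe(prompt, num_inference_steps, guidance_scale):
    return SimpleNamespace(images=[f"image for {prompt!r}"])


demo_results = compare({"fake": fake_pipe}, "a beach at sunset", 20, 7.5)
for label, (image, seconds) in demo_results.items():
    print(f"{label}: {image} in {seconds:.4f}s")
```

With the real pipelines loaded, the same helper could be called as compare({"SD v1.5": pipe1, "SD v2-base": pipe2, "SD v3-medium": pipe3}, prompt, steps, scale).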
def select(choice):
    # Echo the radio selection back as a Markdown confirmation.
    return f"✅ You selected: **{choice}**"

with gr.Blocks() as demo:
    gr.Markdown("## AI Social-Post Generator with 3 Models")
    with gr.Row():
        prompt = gr.Textbox(label="Prompt", placeholder="A vibrant beach sunset…")
        steps = gr.Slider(1, 100, value=50, step=1, label="Inference Steps")
        scale = gr.Slider(1.0, 20.0, value=7.5, step=0.1, label="Guidance Scale")
    btn = gr.Button("Generate Images")
    with gr.Row():
        out1 = gr.Image(label="Model 1: SD v1.5")
        out2 = gr.Image(label="Model 2: SD v2-base")
        out3 = gr.Image(label="Model 3: SD v3-medium")
    sel = gr.Radio(
        ["Model 1: SD v1.5", "Model 2: SD v2-base", "Model 3: SD v3-medium"],
        label="Select your favorite"
    )
    txt = gr.Markdown()
    # Wire the button to generation and the radio to the confirmation text.
    btn.click(fn=generate, inputs=[prompt, steps, scale], outputs=[out1, out2, out3])
    sel.change(fn=select, inputs=sel, outputs=txt)

demo.launch(share=True)
Finally, this Gradio app builds a three-column UI where you can enter a text prompt, adjust the inference steps and guidance scale, then generate and display images from SD v1.5, v2-base, and v3-medium side by side. It also features a radio selector that lets you pick your preferred model output and displays a simple confirmation message when a choice is made.
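If you want the radio selector to do more than confirm a single choice, one possible extension is to keep a running tally of favorites across prompts. The sketch below is a hypothetical variation on the confirmation callback, not part of the original notebook; the names votes and record_vote are made up for illustration, and the module-level counter is shared by everyone using the same running demo.

```python
from collections import Counter

votes = Counter()  # running tally, shared across all sessions of the demo


def record_vote(choice):
    """Count the choice and return a Markdown summary of all votes so far."""
    if choice is None:  # nothing selected yet
        return ""
    votes[choice] += 1
    lines = [f"✅ You selected: **{choice}**", ""]
    for label, count in votes.most_common():
        lines.append(f"- {label}: {count}")
    return "\n".join(lines)


print(record_vote("Model 1: SD v1.5"))
```

Inside the Blocks context, this could be wired in place of the confirmation callback with sel.change(fn=record_vote, inputs=sel, outputs=txt).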
In conclusion, by integrating Stability AI’s diffusion models into an easy-to-use Gradio app, you’ve seen how to prototype, compare, and deploy visuals suited to today’s platforms. From A/B-testing creative directions to automating campaign assets at scale, Stability AI’s models offer the performance, flexibility, and community support to strengthen your content pipeline.
Nikhil is an intern consultant at Marktechpost. He is pursuing an integrated dual degree in Materials at the Indian Institute of Technology, Kharagpur. Nikhil is an AI/ML enthusiast who is always researching applications in fields like biomaterials and biomedical science. With a strong background in Material Science, he is exploring new advancements and creating opportunities to contribute.