
How to Generate Images from Text with Stable Diffusion Models | by Lucas Soares | Sep, 2022



Quick introduction to text-to-image generation using Hugging Face’s diffusers package

In this article, I will show you how to get started with text-to-image generation with stable diffusion models using Hugging Face’s diffusers package.

A while back I got access to the DALL-E 2 model by OpenAI, which allows you to create stunning images from text. So I started playing around with it and generated some pretty amazing images.

Image by the author. Generated with DALL-E 2.

However, my credits ran out, so I decided to look for alternatives and came across this incredible article by Hugging Face, which explains how to run stable diffusion models using their diffusers package.

So let’s dive into how to generate images from text using diffusers!

First things first, the steps to generate images from text with the diffusers package are:

  1. Make sure you have GPU access
  2. Install requirements
  3. Enable external widgets on Google Colab (for Colab notebooks)
  4. Log in to Hugging Face with your user token
  5. Initialize the StableDiffusionPipeline
  6. Move the pipeline to the GPU
  7. Run inference with PyTorch’s autocast module

So, for this project, since I am more or less following the Colab notebook by Hugging Face, we will assume you have access to a Colab notebook with a GPU enabled. Let’s begin!

1. Make sure you have GPU access

!nvidia-smi

Image by the author: my nvidia-smi output.
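If you want to double-check from Python as well, here is a minimal sanity-check sketch using PyTorch (which comes preinstalled on Colab); this snippet is my addition, not part of the original notebook:

import torch

# True only if a CUDA-capable GPU is visible to PyTorch
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. a Tesla T4 on free Colab
else:
    print("No GPU found - enable one under Runtime > Change runtime type")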

Ok, great! Now that we know we have access to a GPU, let’s set up the requirements for this project.

2. Install requirements

There are seven main packages for this project, five of which you actually need to install on Colab:

  • diffusers==0.2.4 — the main package for running the pipeline
  • transformers — Hugging Face’s package with many pre-trained models for text, vision and audio
  • scipy — Python package for scientific computing
  • ftfy — Python package for handling Unicode issues
  • ipywidgets>=7,<8 — package for building widgets on notebooks
  • torch — PyTorch package (no need to install if you are in Colab)
  • pillow — Python package to process images (no need to install if you are in Colab)

To install everything you actually need in Google Colab, just run:

!pip install diffusers==0.2.4
!pip install transformers scipy ftfy
!pip install "ipywidgets>=7,<8"
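If you want to confirm that pip picked up the pinned version, here is a quick check (assuming the install cell above ran without errors):

import diffusers

print(diffusers.__version__)  # should print 0.2.4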

3. Enable external widgets on Google Colab (for Colab notebooks)

# enabling widgets (to be able to log in to Hugging Face)
from google.colab import output

output.enable_custom_widget_manager()

4. Log in to Hugging Face with your user token

# log in to Hugging Face (requires an access token)
from huggingface_hub import notebook_login

notebook_login()

You should see a widget where you will input your access token from Hugging Face. After you input it, you should see something like this:

# Expected Output
Login successful
Your token has been saved to /root/.huggingface/token
Authenticated through git-credential store but this isn't the helper defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub.
Run the following command in your terminal in case you want to set this credential helper as the default

git config --global credential.helper store

5. Initialize the StableDiffusionPipeline

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
    use_auth_token=True,
)

Here, just as in their Colab notebook, we are using the v1-4 model, whose weights will be downloaded. Once that is done, we can move on to the next step. Feel free to try out the other models for comparison (see the sketch below)!
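As a sketch of what trying another model looks like, here is the same call pointed at an earlier checkpoint; I am assuming the CompVis/stable-diffusion-v1-3 model ID here, and the remaining arguments simply mirror the v1-4 call above:

import torch
from diffusers import StableDiffusionPipeline

# hedged sketch: swapping in an earlier checkpoint for comparison
# ("CompVis/stable-diffusion-v1-3" is an assumed alternative model ID)
pipe_v13 = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-3",
    revision="fp16",
    torch_dtype=torch.float16,
    use_auth_token=True,
)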

6. Move the pipeline to the GPU

pipe = pipe.to("cuda")

7. Run inference with PyTorch’s autocast module

from torch import autocast

prompt = "photo of a panda surfing"

with autocast("cuda"):
    image = pipe(prompt)["sample"][0]

image.save("panda_surfer.png")
image

Output

Image by the author. Generated with code from this Colab notebook authored by Hugging Face.

As we can see, the results are impressive. You will naturally find some variability in the results you get, but there are parameters you can tweak, like guidance_scale, the number of inference steps, and the random seed (for deterministic outputs), that should help you get more consistent results.
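For example, here is a minimal sketch of passing those parameters explicitly, reusing the pipe object from step 5; the argument names (guidance_scale, num_inference_steps, generator) follow the pipeline call in diffusers 0.2.4, while the specific values are just illustrative:

import torch
from torch import autocast

# a seeded generator makes the sampling deterministic (reproducible)
generator = torch.Generator("cuda").manual_seed(42)

prompt = "photo of a panda surfing"

with autocast("cuda"):
    image = pipe(
        prompt,
        guidance_scale=7.5,       # how strongly the image should follow the prompt
        num_inference_steps=50,   # more steps are slower but usually sharper
        generator=generator,      # fixed seed for reproducible outputs
    )["sample"][0]

image.save("panda_surfer_seeded.png")

With a fixed seed, rerunning the same prompt should give you the same image, which makes it much easier to isolate the effect of the other parameters.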

