Stable Diffusion XL 1.0
Overview
The DataCrunch Inference Service offers a Stable Diffusion XL 1.0 endpoint for generating images from text prompts. This page describes the endpoint's features, its generation modes, and its API parameters.
Endpoint features
Safety Filter: Option to enable or disable a safety filter for content moderation.
Style Templates: Support for applying predefined styles to the generated images.
Limited LoRA Support: Load a fine-tuned LoRA by ID, or the public offset LoRA, for custom model adjustments.
Examples of API Usage
The following examples demonstrate how to interact with the service using different features.
Simple Base SDXL (No Refiner)
To generate an image without the refining process, set is_ensemble=false and refiner=false.
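As a minimal sketch of the settings above, the request body for base-only generation could be built like this. The prompt text is illustrative, and sending the parameters as a flat JSON object is an assumption; the exact request envelope and endpoint URL are not specified here.

```python
import json

# Base-only generation: both refiner modes are switched off.
# Assumption: parameters are sent as a flat JSON object.
payload = {
    "prompt": "a lighthouse on a cliff at sunset",  # illustrative prompt
    "is_ensemble": False,
    "refiner": False,
}

# Serialize to the JSON string that would go in the request body.
body = json.dumps(payload)
print(body)
```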
Ensemble of Expert Denoisers
To run in the Ensemble of Experts mode, set is_ensemble=true and refiner=false.
The base model first denoises for num_inference_steps * (1 - refiner_ratio) steps. The refiner model then runs for the remaining num_inference_steps * refiner_ratio steps.
For detailed information on parameters and their effects, refer to the Ensemble of Expert Denoisers documentation.
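The split between the two models can be sketched as a small helper. The function name split_steps is hypothetical, and rounding to the nearest whole step is an assumption; how the service rounds fractional step counts is not documented here.

```python
from typing import Tuple


def split_steps(num_inference_steps: int, refiner_ratio: float) -> Tuple[int, int]:
    """Return (base_steps, refiner_steps) for the ensemble mode.

    Assumption: fractional step counts are rounded to the nearest integer.
    """
    if not 0.0 <= refiner_ratio <= 1.0:
        raise ValueError("refiner_ratio must be between 0 and 1")
    # The refiner gets its share of the steps; the base model gets the rest.
    refiner_steps = round(num_inference_steps * refiner_ratio)
    base_steps = num_inference_steps - refiner_steps
    return base_steps, refiner_steps


base_steps, refiner_steps = split_steps(40, 0.1)
print(base_steps, refiner_steps)
```

With num_inference_steps=40 and refiner_ratio=0.1, this gives 36 base steps and 4 refiner steps.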
Refine the Denoised Base Image
The two-step pipeline operates as follows: Initially, the image undergoes a full denoising process using the base model. Subsequently, the refiner model is applied in an image-to-image pipeline to the output of the base model.
To enable this pipeline, set is_ensemble to false and refiner to true.
The number of steps for the base model and for the refiner is controlled independently by num_inference_steps and num_inference_steps_refiner, respectively. Each phase also uses its own guidance value: guidance_scale for the base model and guidance_scale_refiner for the refiner.
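A request body for this two-step pipeline might look like the sketch below. The step counts and guidance values are illustrative, the helper name two_step_payload is hypothetical, and the flat JSON layout is an assumption rather than a documented request format.

```python
import json


def two_step_payload(prompt: str) -> dict:
    """Build a payload that runs the base model fully, then the refiner."""
    return {
        "prompt": prompt,
        "is_ensemble": False,               # not the ensemble mode
        "refiner": True,                    # refine the base model's output
        "num_inference_steps": 40,          # base model steps (illustrative)
        "guidance_scale": 4.0,              # base model guidance
        "num_inference_steps_refiner": 20,  # refiner steps (illustrative)
        "guidance_scale_refiner": 1.0,      # refiner guidance
    }


request_body = json.dumps(two_step_payload("a watercolor fox in a forest"))
print(request_body)
```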
API Specification
API Parameters
prompt (str, required): Prompt text.
height (int, optional): Height of the output image. Setting aspect_ratio overrides this value. Defaults to 1024.
width (int, optional): Width of the output image. Setting aspect_ratio overrides this value. Defaults to 1024.
num_inference_steps (int, optional): Number of inference (denoising) steps. Defaults to 50.
guidance_scale (float, optional): Scaling factor for guidance. Specifies how closely to follow the text prompt. Defaults to 4.0.
num_images_per_prompt (int, optional): Number of images to generate per prompt. Defaults to 1.
seed (int, optional): Seed for the random number generator. Defaults to 42.
negative_prompt (str, optional): Negative prompt text.
seed_image (str, optional): Base64-encoded seed image string.
strength (float, optional, Range: [0.05, 1.0]): How much noise is added to the seed_image before generation. Defaults to 0.2.
scheduler (str, optional): Scheduler to use. Supported schedulers: DDIM, K_EULER, EulerA, DPMSolverMultistep, KarrasDPM, PNDM, HeunDiscrete. Defaults to DDIM.
timestep_spacing (str, optional): Timestep spacing for the scheduler. Supported values: linspace, trailing, leading. Defaults to linspace.
guidance_scale_refiner (float, optional): Scaling factor for refiner guidance (corresponds to guidance_scale). Defaults to 1.0.
refiner (bool, optional): Whether to use the refiner model. Defaults to false.
num_inference_steps_refiner (int, optional): Number of inference steps for the refiner, applied when is_ensemble=false. Defaults to 50.
style_selected (str, optional): Apply the specified style to the provided prompt; see supported styles.
is_ensemble (bool, optional): Whether to use the Ensemble of Expert Denoisers pipeline. Defaults to false.
refiner_ratio (float, optional): Requires is_ensemble=true. The fraction of num_inference_steps to run the refiner for. For example, if num_inference_steps=40 and refiner_ratio=0.1, the base model runs for 40 * (1 - 0.1) = 36 steps and the refiner for 40 * 0.1 = 4 steps. Values over 0.2 start to produce unnatural-looking images. Defaults to 0.2.
aspect_ratio (str, optional): Aspect ratio of the output image. Setting this value overrides the width and height values.
lora_id (str, optional): Fine-tuned LoRA ID to load (the LoRA file must exist on the DataCrunch platform).
lora_name (str, optional): Public LoRA to load. Currently only lora_name="offset" is supported (corresponding to "sd_xl_offset_example-lora_1.0.safetensors").
safety_filter (bool, optional): Whether to use the NSFW filter. Defaults to true.
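To illustrate the seed_image and strength parameters, the sketch below encodes image bytes as Base64 and checks strength against the documented range before building a payload. The helper names are hypothetical, and sending plain Base64 without a data-URI prefix is an assumption; the expected encoding details are not specified here.

```python
import base64


def encode_seed_image(image_bytes: bytes) -> str:
    """Base64-encode raw image bytes into an ASCII string."""
    return base64.b64encode(image_bytes).decode("ascii")


def image_to_image_payload(prompt: str, image_bytes: bytes, strength: float = 0.2) -> dict:
    """Build a payload that starts generation from a seed image."""
    # The documented range for strength is [0.05, 1.0].
    if not 0.05 <= strength <= 1.0:
        raise ValueError("strength must be in the range [0.05, 1.0]")
    return {
        "prompt": prompt,
        "seed_image": encode_seed_image(image_bytes),
        "strength": strength,
    }


# Placeholder bytes stand in for the contents of a real image file.
example = image_to_image_payload("same scene at night", b"not-a-real-image", strength=0.35)
print(example["strength"], len(example["seed_image"]))
```

In practice the bytes would come from reading an image file in binary mode, for example with open(path, "rb").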