Image Block

Learn the basics of image blocks!

The Image block enables visual elaboration, concept exploration, information layering, and structured visual composition. It serves as a bridge between textual and visual information, allowing you to iterate on design ideas, experiment with aesthetic styles and color palettes, and create complex visual narratives by combining multiple images into one.

Models

Text to Image Models

Model

Description

Best For

Flux Dev [Default]

A lightweight text-to-image model optimized for speed and cost-efficiency.

Rapid prototyping and cost-effective image generation for style fusion, moodboards, conceptual prototyping, and abstract compositions. Not optimized for realistic content generation.

Flux Pro 1.1

An advanced version of Flux Dev that produces higher-quality outputs with more detailed and nuanced images.

Commercial use and high-quality marketing visuals.

Google Imagen 4

Google's latest text-to-image diffusion model at the time of writing. It is part of the Imagen family, known for high visual fidelity and text alignment.

Generating photorealistic, text-aligned visuals with high compositional accuracy, making it ideal for advertising, cinematic storyboarding, and precise design mockups.

GPT Image

Multimodal model by OpenAI capable of generating coherent and context-aware visuals from detailed text prompts.

Narrative-driven image generation and branding mockups.

Ideogram 3.0

Capable of generating a variety of artistic and illustrative visuals.

Concept art and creative storytelling.

Luma Photon

Photorealistic image generation with a focus on lighting and shading accuracy.

Realistic renders for architectural visualizations and product mockups.

Recraft V3

Capable of generating a variety of artistic and illustrative visuals.

Concept art and creative storytelling.

Stable Diffusion 3.5

An improved version of Stable Diffusion (SD) with enhanced text interpretation and faster generation speeds.

General-purpose image generation and rapid iteration.

Image to Image Models

Model

Description

Best For

GPT Image [Default]

A multimodal model built into GPT-4o that can generate, edit, and transform images based on text and image inputs with high fidelity and contextual understanding.

Editing and transforming existing images using text and visual input. Optimized for tasks like inpainting, style transfer, background replacement, product mockups, and storyboard refinement with contextual precision.

Flux Kontext Max

A powerful image-to-image model that understands both visual and textual context to perform high-fidelity edits, extensions, style transfers, and typography changes while keeping scenes consistent, all in a single model.

Transforming or extending existing images: outpainting scenes, swapping styles, editing text, or maintaining character continuity, using precise natural-language commands across multiple iterations.

Flux Canny

Utilizes edge detection to preserve structural composition while generating new images based on a text prompt.

Precise composition control through edge detection, especially for images with a lot of detail.

Flux Depth

Employs depth mapping to maintain spatial relationships in the generated images for accurate perspectives.

High accuracy in perspective and object positioning through 3D spatial consistency.

Flux Redux

Produces slight variations of an input image without the need for a text prompt, allowing for easy refinement.

High creative freedom with loose compositional guidance.

Google Gemini 2.0 Flash

Lightweight, fast multimodal model optimized for speed and efficiency.

Instant Q&A on images or documents, and quickly interpreting visual input to generate context-aware output.

Runway Gen-4 Image Reference

Lets users guide image generation using 1–3 uploaded reference images, enabling consistent character design, visual style, or framing across outputs.

Maintaining stylistic or character continuity in iterative image creation for concept art, storyboarding, or branded visual assets.
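
To make the idea of edge-based structural guidance (as described for Flux Canny) more concrete, here is a minimal sketch of what an edge map is. It is an illustration only: it uses a simple Sobel gradient with a threshold rather than the full Canny algorithm, it is not how Flux Canny is implemented, and the function name `sobel_edge_map`, the `threshold` value, and the sample image are all hypothetical.

```python
# Illustrative only: a tiny Sobel edge map in pure Python.
# This is NOT Flux Canny's implementation; it only shows the kind of
# structural information an edge map keeps from an input image.

def sobel_edge_map(gray, threshold=2.0):
    """Return a binary edge map (1 = edge) for a 2D list of grayscale values."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    # Border pixels are left as 0 because the 3x3 kernel needs all neighbors.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal intensity gradient (Sobel x kernel).
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical intensity gradient (Sobel y kernel).
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[y][x] = 1 if magnitude >= threshold else 0
    return edges

# Hypothetical 5x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edge_map = sobel_edge_map(image)
for row in edge_map:
    print("".join("#" if value else "." for value in row))
```

In this toy image, only the interior pixels along the dark-to-bright boundary are marked as edges. The flat regions carry no edge information, which is why an edge map can pin down composition while leaving color and texture open to the text prompt.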

