Analyze Image
This document explains the Analyze Image node, which uses AI vision to extract information and insights from images.
Node Inputs
Required Fields
- Image File: Upload an image or PDF (JPG, PNG, GIF, WEBP, or PDF)
- Prompt: The question or instruction for the analysis. A detailed, specific prompt produces more accurate output
Optional Fields
- Use Link: Enable to use direct image URLs
  - Only supports publicly accessible media links (e.g., https://example.com/image.jpg)
  - Does not support Google Drive, Dropbox, or other file-sharing links
  - URL must point directly to the image file
- Temperature: Controls analysis creativity (0-1)
  - 0: More focused, consistent
  - 1: More creative, varied
- Cache Response: Save responses for reuse
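The Use Link constraints above can be checked before a URL reaches the node. The following is a minimal sketch, not part of the product: the function name, the extension list (taken from the image formats listed under Image File, with `.jpeg` added as an assumption), and the blocked host list are all illustrative. It only inspects the URL text, so it cannot confirm that a link is actually publicly accessible.

```python
from urllib.parse import urlparse

# Image formats listed under "Image File"; ".jpeg" is an assumed alias of JPG.
# PDF is left out because the documentation does not say PDFs work via link.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")

# File-sharing hosts named as unsupported (illustrative, not exhaustive).
FILE_SHARING_HOSTS = ("drive.google.com", "docs.google.com", "dropbox.com")


def looks_like_direct_image_url(url: str) -> bool:
    """Heuristic check that a URL points directly at an image file."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    host = (parsed.hostname or "").lower()
    # Reject the listed file-sharing hosts and any of their subdomains.
    if any(host == h or host.endswith("." + h) for h in FILE_SHARING_HOSTS):
        return False
    # The path (ignoring any query string) must end in an image extension.
    return parsed.path.lower().endswith(IMAGE_EXTENSIONS)


if __name__ == "__main__":
    for candidate in (
        "https://example.com/image.jpg",
        "https://www.dropbox.com/s/abc/photo.png?dl=0",
        "https://example.com/gallery",
    ):
        print(candidate, looks_like_direct_image_url(candidate))
```

A check like this is most useful in a step that runs before the node, so that share links are caught early rather than producing a failed analysis.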
Show As Input
The node allows you to configure certain parameters as dynamic inputs. You can enable these in the “Configure Inputs” section:
- Use Link: Boolean
  - true/false to use image URL instead of file upload
  - When enabled, allows input of publicly accessible image URLs
  - Remember: Only direct media links are supported
- Prompt: String
  - The specific question or instruction for analyzing the image
  - Example: “Describe the main objects in this image”
- image_model_preference: String
  - Name of the AI model to use for image analysis
  - Accepted values: “GPT-4o Vision”, “Claude 3 Haiku”, etc.
- Cache Response: Boolean
  - true/false to enable/disable response caching
  - Helps reduce API calls for identical inputs
- Temperature: Number
  - Value between 0 and 1
  - Controls analysis consistency and creativity
When enabled as inputs, these parameters can be dynamically set by previous nodes in your workflow. If not enabled, the values set in the node configuration will be used.
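To illustrate the types listed above, here is a minimal sketch of how a preceding step might assemble and validate these values. The function name and the dictionary keys are assumptions made for illustration; the documentation does not specify how the platform actually passes values between nodes. Parameters left as `None` are omitted, mirroring the rule that the node configuration is used when an input is not set dynamically.

```python
from typing import Any, Dict, Optional


def build_dynamic_inputs(
    prompt: Optional[str] = None,
    use_link: Optional[bool] = None,
    image_model_preference: Optional[str] = None,
    cache_response: Optional[bool] = None,
    temperature: Optional[float] = None,
) -> Dict[str, Any]:
    """Collect only the parameters that should override the node configuration."""
    inputs: Dict[str, Any] = {}
    if prompt is not None:
        if not prompt.strip():
            raise ValueError("Prompt must not be empty")
        inputs["Prompt"] = prompt  # String
    if use_link is not None:
        inputs["Use Link"] = bool(use_link)  # Boolean
    if image_model_preference is not None:
        inputs["image_model_preference"] = image_model_preference  # String
    if cache_response is not None:
        inputs["Cache Response"] = bool(cache_response)  # Boolean
    if temperature is not None:
        # The documented range is 0 to 1, inclusive.
        if not 0 <= temperature <= 1:
            raise ValueError("Temperature must be between 0 and 1")
        inputs["Temperature"] = float(temperature)  # Number
    return inputs


if __name__ == "__main__":
    print(build_dynamic_inputs(
        prompt="Describe the main objects in this image",
        temperature=0.2,
    ))
```

Validating the temperature range and the prompt before the node runs surfaces configuration mistakes earlier in the workflow.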
Node Output
- Analysis: AI’s detailed response about the image
Node Functionality
The Analyze Image node can:
- Process images with AI vision
- Extract text from images
- Generate descriptions
- Answer queries about content
- Identify objects and scenes
- Read image-based PDFs
Available AI Models
- GPT-4o vision
- GPT-4o mini vision
- Claude 3.5 Sonnet
- Claude 3 Haiku
- Gemini 1.5 Pro
- Gemini 1.5 Flash
Common Use Cases
- Text Extraction:
- Visual Description:
- Object Detection:
Important Considerations
- Advanced models (GPT-4o and Claude 3.5) cost 20 credits per run; standard models cost 2 credits per run
- You can reduce the cost to 1 credit per run by providing your own API key on the credentials page
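The credit figures above translate into a simple estimate for a batch of runs. This is a rough sketch with hypothetical function names; it takes the pricing tier as an argument rather than a model name, because the documentation does not map every model to a tier, and it assumes the own-API-key rate of 1 credit applies to both tiers.

```python
# Credits per run as stated above.
CREDITS_PER_RUN = {"advanced": 20, "standard": 2}
# Assumed to apply to both tiers when your own API key is configured.
OWN_KEY_CREDITS_PER_RUN = 1


def credits_per_run(tier: str, own_api_key: bool = False) -> int:
    """Return the credit cost of a single run for the given tier."""
    if tier not in CREDITS_PER_RUN:
        raise ValueError(f"Unknown tier: {tier!r}")
    if own_api_key:
        return OWN_KEY_CREDITS_PER_RUN
    return CREDITS_PER_RUN[tier]


def estimate_credits(runs: int, tier: str, own_api_key: bool = False) -> int:
    """Estimate the total credits for a number of runs."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return runs * credits_per_run(tier, own_api_key)


if __name__ == "__main__":
    print(estimate_credits(50, "advanced"))        # 50 runs at 20 credits each
    print(estimate_credits(50, "standard"))        # 50 runs at 2 credits each
    print(estimate_credits(50, "advanced", True))  # 50 runs at 1 credit each
```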
In summary, the Analyze Image node helps extract meaning and information from images using powerful AI vision models.