This document explains the Analyze Image node, which uses AI vision to extract information and insights from images.

Node Inputs

Required Fields

  • Image File: Upload an image (JPEG or PNG)
  • Prompt: Question or instruction for the analysis. A detailed prompt gives more accurate output

Optional Fields

  • Use Link: Enable to supply an image URL instead of uploading a file
  • Temperature: Controls how creative the analysis is, on a scale from 0 to 1
  • Cache Response: Save responses for reuse
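
The fields above can be pictured as a small settings object with a few sanity checks. This is only a sketch for illustration: the class name, attribute names, and default values are hypothetical and are not the node's actual API. The only rules it encodes are the ones stated in the lists above (JPEG/PNG files, a required prompt, a temperature between 0 and 1, and a URL when Use Link is enabled).

```python
from dataclasses import dataclass
from typing import List, Optional

# JPEG/PNG, per the Required Fields list.
ALLOWED_EXTENSIONS = (".jpg", ".jpeg", ".png")


@dataclass
class AnalyzeImageInputs:
    """Hypothetical container mirroring the node's fields (not an official API)."""

    prompt: str
    image_file: Optional[str] = None  # path or name of the uploaded image
    image_url: Optional[str] = None   # only used when use_link is True
    use_link: bool = False
    temperature: float = 0.5          # placeholder default, assumed for this sketch
    cache_response: bool = False

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the inputs look usable."""
        problems = []
        if not self.prompt.strip():
            problems.append("prompt is required")
        if self.use_link:
            if not self.image_url:
                problems.append("image_url is required when use_link is enabled")
        elif not self.image_file:
            problems.append("image_file is required")
        elif not self.image_file.lower().endswith(ALLOWED_EXTENSIONS):
            problems.append("image_file must be a JPEG or PNG")
        if not 0.0 <= self.temperature <= 1.0:
            problems.append("temperature must be between 0 and 1")
        return problems
```

For example, AnalyzeImageInputs(prompt="Extract all text visible in this image", image_file="receipt.png").validate() returns an empty list, while a .gif file or a temperature of 1.5 would each add a message.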

Show As Input Options

You can expose these fields as inputs:

  • Prompt
  • Temperature

Node Output

  • Analysis: AI’s detailed response about the image

Node Functionality

The Analyze Image node:

  • Processes images with AI vision
  • Extracts text from images
  • Generates descriptions
  • Answers queries about content
  • Identifies objects and scenes

Available AI Models

  • GPT-4o vision
  • GPT-4o mini vision
  • Claude 3.5 Sonnet
  • Claude 3 Haiku
  • Gemini 1.5 Pro
  • Gemini 1.5 Flash

Common Use Cases

  1. Text Extraction:
     Prompt: "Extract all text visible in this image"
     Use: Scanning documents, reading signs
  2. Visual Description:
     Prompt: "Describe this image in detail"
     Use: Accessibility, content cataloging
  3. Object Detection:
     Prompt: "List all objects in this image"
     Use: Inventory, scene analysis
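
Since a more detailed prompt tends to give more accurate output, one way to work with the prompts above is to keep them as starting points and append task-specific instructions. The sketch below assumes nothing about the node itself: the dictionary keys and the build_prompt helper are hypothetical names used only for illustration.

```python
# Base prompts copied from the use cases above; the keys are hypothetical labels.
USE_CASE_PROMPTS = {
    "text_extraction": "Extract all text visible in this image",
    "visual_description": "Describe this image in detail",
    "object_detection": "List all objects in this image",
}


def build_prompt(use_case: str, extra_detail: str = "") -> str:
    """Return the base prompt, optionally followed by extra instructions."""
    base = USE_CASE_PROMPTS[use_case]  # raises KeyError for an unknown use case
    extra = extra_detail.strip()
    return f"{base}. {extra}" if extra else base


print(build_prompt("text_extraction", "Preserve line breaks."))
```

The resulting string would go into the node's Prompt field.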

Important Considerations

  1. Advanced models (GPT-4o and Claude 3.5) cost 20 credits per run; standard models cost 2 credits per run
  2. You can reduce the cost to 1 credit per run by adding your own API key on the credentials page
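
To get a feel for how these costs add up, the arithmetic can be sketched as below. The credit values come from the list above. Which exact model names count as "advanced" is an assumption made for this example (here, "GPT-4o vision" and "Claude 3.5 Sonnet"), and the function names are hypothetical.

```python
# Credit costs per run, as stated in the considerations above.
ADVANCED_COST = 20
STANDARD_COST = 2
OWN_KEY_COST = 1

# Assumption for this sketch: these two names are the "advanced" models.
ADVANCED_MODELS = {"GPT-4o vision", "Claude 3.5 Sonnet"}


def credits_per_run(model: str, own_api_key: bool = False) -> int:
    """Credits charged for a single run of the node."""
    if own_api_key:
        return OWN_KEY_COST
    return ADVANCED_COST if model in ADVANCED_MODELS else STANDARD_COST


def estimate_credits(model: str, runs: int, own_api_key: bool = False) -> int:
    """Total credits for a number of runs with the same model."""
    if runs < 0:
        raise ValueError("runs must be non-negative")
    return runs * credits_per_run(model, own_api_key)
```

Under these assumptions, 100 runs would cost 2000 credits on GPT-4o vision, 200 credits on Gemini 1.5 Flash, and 100 credits on either model with your own API key.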

In summary, the Analyze Image node helps extract meaning and information from images using powerful AI vision models.