
VisualInsight: An End-to-End Image Analysis Application Using Google Generative AI (Gemini) | by Yotam Braun | Jan, 2025


Below are some of the core services that power VisualInsight.

1. LLM Service (app/services/llm_service.py)

Handles the interaction with Google Gemini for image analysis.

import google.generativeai as genai
import os
from datetime import datetime
from PIL import Image
from utils.logger import setup_logger

logger = setup_logger()

class LLMService:
    def __init__(self):
        genai.configure(api_key=os.getenv('GOOGLE_API_KEY'))
        self.model = genai.GenerativeModel('gemini-1.5-flash-002')

        self.prompt = """
        Analyze this Image and provide:
        1. Image type
        2. Key information
        3. Important details
        4. Notable observations
        """

    def analyze_document(self, image: Image.Image) -> dict:
        try:
            logger.info("Sending request to LLM")
            # Generate content directly with the PIL image
            response = self.model.generate_content([
                self.prompt,
                image
            ])

            return {
                "analysis": response.text,
                "timestamp": datetime.now().isoformat()
            }

        except Exception as e:
            logger.error(f"LLM analysis failed: {str(e)}")
            raise Exception(f"Failed to analyze document: {str(e)}")

What’s Happening Here?

  • The constructor configures Google Generative AI (Gemini) with an API key read from the environment.
  • A default prompt outlines the kind of analysis we want.
  • The analyze_document method sends the prompt and image to Gemini and returns the text analysis alongside a timestamp (a standalone usage sketch follows this list).
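
To exercise the service on its own, here is a minimal sketch of calling it outside Streamlit. It assumes you run from the app/ directory with GOOGLE_API_KEY exported; sample.png is a placeholder, not a file from the project.

# Standalone usage sketch (sample.png is a placeholder)
from PIL import Image
from services.llm_service import LLMService

service = LLMService()
result = service.analyze_document(Image.open("sample.png"))
print(result["analysis"])   # Gemini's free-text analysis
print(result["timestamp"])  # ISO-8601 time of the request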

2. S3 Service (app/services/s3_service.py)

Uploads files to AWS S3 with timestamped keys and generates presigned URLs for private access.

import boto3
import os
from datetime import datetime
from utils.logger import setup_logger

logger = setup_logger()

class S3Service:
    def __init__(self):
        self.s3_client = boto3.client(
            's3',
            aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
            aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
            region_name=os.getenv('AWS_REGION', 'us-east-1')
        )
        self.bucket_name = os.getenv('S3_BUCKET_NAME')

    def upload_file(self, file):
        """Upload file to S3 and return the URL"""
        try:
            # Generate unique filename
            timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
            file_key = f"uploads/{timestamp}_{file.name}"

            # Upload to S3
            self.s3_client.upload_fileobj(
                file,
                self.bucket_name,
                file_key
            )

            # Generate presigned URL that expires in 1 hour
            url = self.s3_client.generate_presigned_url(
                'get_object',
                Params={
                    'Bucket': self.bucket_name,
                    'Key': file_key
                },
                ExpiresIn=3600
            )

            logger.info(f"File uploaded successfully: {url}")
            return url

        except Exception as e:
            logger.error(f"S3 upload failed: {str(e)}")
            raise Exception(f"Failed to upload file to S3: {str(e)}")

Figure 6: The AWS S3 bucket that stores uploaded images and analysis results

Core Features:

  • Uses boto3 to interact with AWS S3.
  • Generates a time-stamped key for each file.
  • Creates a presigned URL for private file access without opening up the entire bucket (illustrated in the sketch below).
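
As a quick sanity check of the presigned-URL flow, the returned link can be fetched like any ordinary HTTPS URL until it expires. The sketch below is illustrative only: it assumes the AWS environment variables are set, and test.png plus the requests dependency are placeholders, not part of the project.

# Illustrative check of the presigned URL (test.png is a placeholder)
import requests
from services.s3_service import S3Service

s3 = S3Service()
with open("test.png", "rb") as f:
    url = s3.upload_file(f)  # a plain file object also exposes .name

response = requests.get(url)  # no AWS credentials needed for the link itself
print(response.status_code)  # 200 while valid; 403 once ExpiresIn (3600 s) passes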

3. The Streamlit Application (app/main.py)

Provides the user interface for file uploads, analysis initiation, and displaying results.

import streamlit as st
import os
from dotenv import load_dotenv
from services.s3_service import S3Service
from services.llm_service import LLMService
from utils.logger import setup_logger
from PIL import Image

# Load environment variables
load_dotenv()

# Setup logging
logger = setup_logger()

# Initialize services
s3_service = S3Service()
llm_service = LLMService()

def main():
    st.title("Document Analyzer")

    uploaded_file = st.file_uploader("Upload a document", type=['png', 'jpg', 'jpeg'])

    if uploaded_file:
        # Display image
        image = Image.open(uploaded_file)
        st.image(image, caption='Uploaded Document', use_column_width=True)

        if st.button('Analyze Document'):
            with st.spinner('Processing...'):
                try:
                    # Analyze with LLM directly
                    logger.info("Starting document analysis")
                    analysis = llm_service.analyze_document(image)

                    # Rewind the upload stream before the S3 upload: rendering
                    # the preview consumed it, so S3 would otherwise receive
                    # an empty body
                    uploaded_file.seek(0)

                    # Upload to S3 for storage
                    logger.info(f"Uploading file: {uploaded_file.name}")
                    s3_url = s3_service.upload_file(uploaded_file)

                    # Display results
                    st.success("Analysis Complete!")
                    st.json(analysis)

                except Exception as e:
                    logger.error(f"Error processing document: {str(e)}")
                    st.error(f"Error: {str(e)}")

if __name__ == "__main__":
    main()

  • Streamlit handles the UI: file upload, display, button triggers.
  • LLMService and S3Service are orchestrated together to handle the AI query and file upload.
  • Real-time logs, produced via the setup_logger helper, report status and surface any issues (a sketch of the helper follows this list).
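
One piece the article doesn't show is the setup_logger helper imported by all three modules. Here is a minimal sketch of what app/utils/logger.py might contain; the logger name and format string are assumptions, not the author's code.

import logging

def setup_logger(name: str = "visualinsight") -> logging.Logger:
    """Return a console logger, reusing its handler across repeated imports."""
    # name and format are assumptions; the article never shows this file
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers on re-import
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter(
            "%(asctime)s %(levelname)s %(message)s"
        ))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger

Guarding on logger.handlers matters here because all three modules call setup_logger() at import time; without it, each call would attach another handler and every log line would be printed several times.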
