VisualInsight: An End-to-End Image Analysis Application Using Google Generative AI (Gemini)

by Yotam Braun, January 2025


Below are some of the core services that power VisualInsight.

1. LLM Service (app/services/llm_service.py)

Handles the interaction with Google Gemini for image analysis.
```python
import google.generativeai as genai
import os
from datetime import datetime
from PIL import Image

from utils.logger import setup_logger

logger = setup_logger()


class LLMService:
    def __init__(self):
        genai.configure(api_key=os.getenv('GOOGLE_API_KEY'))
        self.model = genai.GenerativeModel('gemini-1.5-flash-002')
        self.prompt = """
        Analyze this image and provide:
        1. Image type
        2. Key information
        3. Important details
        4. Notable observations
        """

    def analyze_document(self, image: Image.Image) -> dict:
        try:
            logger.info("Sending request to LLM")
            # Generate content directly with the PIL image
            response = self.model.generate_content([
                self.prompt,
                image
            ])
            return {
                "analysis": response.text,
                "timestamp": datetime.now().isoformat()
            }
        except Exception as e:
            logger.error(f"LLM analysis failed: {str(e)}")
            raise Exception(f"Failed to analyze document: {str(e)}")
```
What’s Happening Here?
- I configure the Google Generative AI (Gemini) client with an API key.
- A default prompt outlines the kind of analysis we want.
- The analyze_document method sends the prompt and image to Gemini and returns its text-based analysis along with a timestamp.
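To exercise the service outside of Streamlit, a short script like the following should work, assuming GOOGLE_API_KEY is set in your environment (sample.png is just a placeholder path):

```python
from PIL import Image

from services.llm_service import LLMService

# Hypothetical local test; "sample.png" is a placeholder path
llm = LLMService()
result = llm.analyze_document(Image.open("sample.png"))

print(result["timestamp"])
print(result["analysis"])
```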
2. S3 Service (app/services/s3_service.py)

Uploads files to AWS S3 with timestamped keys and generates presigned URLs for private access.
```python
import boto3
import os
from datetime import datetime

from utils.logger import setup_logger

logger = setup_logger()


class S3Service:
    def __init__(self):
        self.s3_client = boto3.client(
            's3',
            aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
            aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
            region_name=os.getenv('AWS_REGION', 'us-east-1')
        )
        self.bucket_name = os.getenv('S3_BUCKET_NAME')

    def upload_file(self, file):
        """Upload a file to S3 and return a presigned URL."""
        try:
            # Generate a unique, timestamped key
            timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
            file_key = f"uploads/{timestamp}_{file.name}"

            # Rewind the stream: Streamlit/PIL may already have read it,
            # and upload_fileobj uploads from the current position
            file.seek(0)

            # Upload to S3
            self.s3_client.upload_fileobj(
                file,
                self.bucket_name,
                file_key
            )

            # Generate a presigned URL that expires in 1 hour
            url = self.s3_client.generate_presigned_url(
                'get_object',
                Params={
                    'Bucket': self.bucket_name,
                    'Key': file_key
                },
                ExpiresIn=3600
            )
            logger.info(f"File uploaded successfully: {url}")
            return url
        except Exception as e:
            logger.error(f"S3 upload failed: {str(e)}")
            raise Exception(f"Failed to upload file to S3: {str(e)}")
```
Figure 6: The AWS S3 bucket that stores uploaded images and analysis results
Core Features:
- Uses boto3 to interact with AWS S3.
- Generates a time-stamped key for each file.
- Creates a presigned URL for private file access without making the entire bucket public (see the sketch below).
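To sanity-check the flow end to end, you can upload a local file and fetch it back through the presigned URL. A rough sketch, assuming the AWS environment variables above are set; sample.png is a placeholder and requests is an extra dependency not used elsewhere in the app:

```python
import requests

from services.s3_service import S3Service

s3 = S3Service()

# Any readable binary file object with a .name attribute works here
with open("sample.png", "rb") as f:
    url = s3.upload_file(f)

# The presigned URL grants temporary access to the private object
response = requests.get(url, timeout=30)
print(response.status_code)  # expect 200 until the URL expires (1 hour)
```

Once the one-hour expiry passes, the same URL returns an access-denied error, which is exactly the point of presigned access.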
3. The Streamlit Application (app/main.py)

Provides the user interface for file uploads, analysis initiation, and result display.
```python
import streamlit as st
import os
from dotenv import load_dotenv
from PIL import Image

from services.s3_service import S3Service
from services.llm_service import LLMService
from utils.logger import setup_logger

# Load environment variables
load_dotenv()

# Setup logging
logger = setup_logger()

# Initialize services
s3_service = S3Service()
llm_service = LLMService()


def main():
    st.title("Document Analyzer")

    uploaded_file = st.file_uploader("Upload a document", type=['png', 'jpg', 'jpeg'])

    if uploaded_file:
        # Display the uploaded image
        image = Image.open(uploaded_file)
        st.image(image, caption='Uploaded Document', use_column_width=True)

        if st.button('Analyze Document'):
            with st.spinner('Processing...'):
                try:
                    # Analyze with the LLM directly
                    logger.info("Starting document analysis")
                    analysis = llm_service.analyze_document(image)

                    # Upload to S3 for storage
                    logger.info(f"Uploading file: {uploaded_file.name}")
                    s3_url = s3_service.upload_file(uploaded_file)

                    # Display results
                    st.success("Analysis Complete!")
                    st.json(analysis)
                except Exception as e:
                    logger.error(f"Error processing document: {str(e)}")
                    st.error(f"Error: {str(e)}")


if __name__ == "__main__":
    main()
```
- Streamlit handles the UI: file upload, image display, and the button that triggers analysis.
- LLMService and S3Service are orchestrated together to handle the AI query and file upload.
- Real-time logs inform you of the status and highlight any issues.
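One practical footnote: every credential above flows in through os.getenv, so a missing variable otherwise surfaces as a confusing runtime error. A small startup check catches that early; this sketch reuses the variable names from the services above (AWS_REGION is omitted because it has a default):

```python
import os

from dotenv import load_dotenv

load_dotenv()

# Every variable the services above read via os.getenv()
REQUIRED_VARS = [
    "GOOGLE_API_KEY",        # LLMService
    "AWS_ACCESS_KEY_ID",     # S3Service
    "AWS_SECRET_ACCESS_KEY", # S3Service
    "S3_BUCKET_NAME",        # S3Service
]

missing = [name for name in REQUIRED_VARS if not os.getenv(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```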
