Case Study — Healthcare & MedTech

AI-Powered OCR Prescription Processing & Medicine Intelligence Platform

Digital Health Platform

A conversational AI health assistant that processes handwritten and printed prescriptions via OCR, identifies medicines through a multi-stage RAG pipeline, manages family health profiles, and delivers intelligent medication reminders.

Client Overview

A digital health startup needed an intelligent prescription processing system that patients could use directly through a messaging platform. The core challenge was replacing traditional OCR - which outputs raw text and struggles with doctor handwriting, faded ink, and non-standard layouts - with a Vision LLM that understands the context of a medical document, not just the characters in it.

The goal was to let users photograph a prescription, have the system extract and identify all medicines with full clinical detail, and attach the results to the correct patient profile - all within a single conversational flow.

The platform also needed to handle multi-patient households, lab report analysis, and automated medication reminder scheduling.

Business Challenges

The client faced several operational and technical challenges before engaging Aviasole:

Traditional OCR engines output raw character strings and cannot interpret the structure or intent of a medical document - doctor shorthand, overlapping text, and poor scan quality produce unusable output
Handwritten prescriptions vary significantly in style, abbreviation, and layout, making rule-based text extraction brittle and unmaintainable
Even when text is extracted correctly, medicine names have spelling variants, regional brand names, and generic equivalents that a simple database lookup cannot resolve
Managing patient identity becomes complex when one user submits prescriptions for multiple family members
Lab reports require a separate processing pipeline with abnormal result detection and historical trend tracking
Doctors seeing a patient for the first time have no consolidated view of their prescription history, lab results, and diagnoses - leading to repeated tests and incomplete consultations
Reminder scheduling needed to stay synchronized across database and cache without creating duplicate notifications

Solution Provided

Aviasole designed and built a full-stack AI health assistant comprising a FastAPI backend, a multimodal prescription agent powered by Google Gemini, and a multi-stage medicine RAG pipeline backed by PostgreSQL and Redis. The system processes prescription images end-to-end - from OCR extraction through patient identity resolution, medicine matching, scan logging, and reminder creation - within a single API call.

Key Features & Capabilities

Vision LLM & OCR Layer

Uses Google Gemini as a Vision LLM rather than a traditional OCR engine, enabling the model to understand the semantic structure of a prescription - not just extract characters
Reads handwritten and printed prescriptions from photographs taken on any mobile device, handling variable image quality, rotation, and lighting conditions
Identifies and separates medicine names, patient name, doctor name, clinic, diagnosis, and prescription date as distinct structured fields from a single image
Interprets doctor shorthand and abbreviations in context - for example, understanding that “Tab.” means tablet or “BD” means twice daily - rather than passing them as raw strings
Handles mixed-language prescriptions where medicine names may appear in English while surrounding text is in a regional language
Extracts the complete medicine list in one pass, including partially legible names, ensuring nothing is silently dropped
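
To make the idea of "distinct structured fields" concrete, the sketch below shows one way the model's reply could be parsed into typed records. It assumes the Vision LLM is prompted to answer in JSON; the class names, field names, and the `legible` flag are illustrative assumptions, not the platform's actual schema, and the model call itself is deliberately left out.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExtractedMedicine:
    raw_name: str                     # name exactly as read from the image
    form: Optional[str] = None        # e.g. "tablet", expanded from "Tab."
    frequency: Optional[str] = None   # e.g. "twice daily", expanded from "BD"
    legible: bool = True              # False when the name was only partly readable


@dataclass
class PrescriptionExtraction:
    patient_name: Optional[str] = None
    doctor_name: Optional[str] = None
    clinic: Optional[str] = None
    diagnosis: Optional[str] = None
    prescription_date: Optional[str] = None
    medicines: List[ExtractedMedicine] = field(default_factory=list)


def parse_extraction(model_reply: str) -> PrescriptionExtraction:
    """Turn a JSON reply into typed fields; unexpected keys raise TypeError."""
    data = json.loads(model_reply)
    medicines = [ExtractedMedicine(**m) for m in data.pop("medicines", [])]
    return PrescriptionExtraction(medicines=medicines, **data)
```

Keeping partially legible names in the list with `legible=False`, rather than discarding them, is one way to honour the "nothing is silently dropped" requirement while still letting later stages treat those entries with caution.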

Agentic Processing Pipeline

Operates as a ReAct agent with automatic tool calling, separating the vision extraction step from the medicine intelligence step
The agent invokes the medicine lookup tool exactly once with the full extracted list, keeping the flow auditable and preventing partial saves
Generates a human-readable, language-aware summary for the patient using only the verified tool response - never inferring dosages or side effects from the image independently
Agent conversation history is retained per request, enabling full traceability of what was extracted and what the model resolved
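
The "exactly once" rule for the medicine lookup tool can be enforced in code rather than left to the prompt. The following is a minimal sketch of such a guard; `SingleCallTool` and its `calls` audit list are hypothetical names introduced for illustration, and the real agent framework's tool interface may differ.

```python
from typing import Callable, Dict, List


class SingleCallTool:
    """Wraps a lookup function so a second call in the same request fails loudly."""

    def __init__(self, fn: Callable[[List[str]], Dict[str, dict]]):
        self._fn = fn
        self.calls: List[List[str]] = []  # audit trail of every argument list seen

    def __call__(self, medicine_names: List[str]) -> Dict[str, dict]:
        if self.calls:
            # A second call would mean the list was split, risking a partial save.
            raise RuntimeError("medicine lookup already invoked for this request")
        self.calls.append(list(medicine_names))
        return self._fn(medicine_names)
```

One wrapper instance would be created per request, so the recorded `calls` list doubles as part of the per-request trace of what was extracted and looked up.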

Multi-Stage Medicine RAG Pipeline

Four-stage sequential lookup: in-memory cache, exact database match, trigram fuzzy search, and vector semantic search
Each stage handles a different class of OCR noise - clean names, partial names, misspellings, and phonetic variants respectively
Medicines that match at none of the stages are flagged as unresolved pharmaceutical terms and added to the missing medicines queue for automated scraper follow-up
Matched results include brand name, salt composition, uses, side effects, safety advice, and visual medicine attributes
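
The four-stage cascade can be sketched in plain Python as below. This is an illustration under stated assumptions, not the production code: dictionaries stand in for Redis and PostgreSQL, `trigram_similarity` only approximates what a trigram index computes, `toy_embed` is a deliberately crude stand-in for a real embedding model, and both thresholds are invented for the example.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple


def trigrams(text: str) -> set:
    """Character trigrams, padded with two leading spaces and one trailing space."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of the two strings."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def toy_embed(text: str) -> List[float]:
    """Stand-in embedder (letter counts); a real system would call an embedding model."""
    return [float(text.lower().count(chr(c))) for c in range(ord("a"), ord("z") + 1)]


def lookup(
    name: str,
    cache: Dict[str, str],
    catalog: Dict[str, str],
    embed: Callable[[str], List[float]] = toy_embed,
    fuzzy_min: float = 0.4,    # assumed threshold
    vector_min: float = 0.8,   # assumed threshold
) -> Tuple[Optional[str], str]:
    """Return (record, stage) where stage names the step that resolved the name."""
    key = name.strip().lower()
    # Stage 1: in-memory cache.
    if key in cache:
        return cache[key], "cache"
    # Stage 2: exact match against the catalog.
    if key in catalog:
        cache[key] = catalog[key]
        return catalog[key], "exact"
    # Stage 3: best trigram match, accepted only above the fuzzy threshold.
    best = max(catalog, key=lambda c: trigram_similarity(key, c), default=None)
    if best is not None and trigram_similarity(key, best) >= fuzzy_min:
        return catalog[best], "trigram"
    # Stage 4: nearest neighbour by cosine similarity of embeddings.
    query_vec = embed(key)
    best = max(catalog, key=lambda c: cosine(query_vec, embed(c)), default=None)
    if best is not None and cosine(query_vec, embed(best)) >= vector_min:
        return catalog[best], "vector"
    # Nothing matched: the caller would queue this name for scraper follow-up.
    return None, "missing"
```

Ordering the stages from cheapest to most expensive means most clean names never reach the fuzzy or vector steps, and returning the stage name alongside the record keeps each match explainable.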

Family Identity & Profile Management

Resolves the correct family member profile before saving any prescription data
Supports multi-patient households under a single user account
When a patient name is new or ambiguous, the scan is held in a confirmation queue while a medicine preview is still returned to the user - no OCR data is lost
Profile data includes name, age, gender, and linked scan history per family member
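
The resolve-or-hold behaviour described above might look like the sketch below. The `Account` structure, the matching rule (exact full name, otherwise any profile sharing a name token), and the response keys are all assumptions made for illustration; the real matching logic is not described in this case study.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Account:
    profiles: Dict[str, int]                          # lower-cased full name -> profile id
    pending: List[dict] = field(default_factory=list)  # confirmation queue


def resolve_profile(account: Account, patient_name: str, scan: dict) -> dict:
    """Save against an exact profile match, otherwise hold the scan for confirmation."""
    key = (patient_name or "").strip().lower()
    exact = account.profiles.get(key)
    if exact is not None:
        return {"status": "saved", "profile_id": exact, "medicines": scan["medicines"]}
    # New or ambiguous name: list plausible profiles and keep the scan intact.
    candidates = [pid for name, pid in account.profiles.items()
                  if key and key in name.split()]
    account.pending.append({"scan": scan, "candidates": candidates})
    # The medicine preview is still returned so the user sees results immediately.
    return {"status": "needs_confirmation", "candidates": candidates,
            "medicines": scan["medicines"]}
```

Because the full scan is stored in the queue entry, confirming the patient later only needs to attach a profile id; nothing has to be re-extracted from the image.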

Lab Report Processing

Parallel pipeline for diagnostic lab reports using the same multi-stage matching logic applied to test parameter names
Results stored per test with computed normal, high, low, and critical flags
Returns trend data comparing the current report against prior reports for the same patient, surfacing changes in abnormal values over time
Supports multi-page PDF lab reports via a shared scan identifier
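
As an illustration of the computed flags and trend comparison, the sketch below classifies a value against a reference range and compares it with the prior report. The `ReferenceRange` shape, the rule that critical bounds take precedence, and the test name and numbers used with it are hypothetical; real ranges depend on the test, the lab, and the patient.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ReferenceRange:
    low: float
    high: float
    critical_low: Optional[float] = None
    critical_high: Optional[float] = None


def flag_result(value: float, ref: ReferenceRange) -> str:
    """Return 'critical', 'low', 'high', or 'normal'; critical bounds are checked first."""
    if ref.critical_low is not None and value <= ref.critical_low:
        return "critical"
    if ref.critical_high is not None and value >= ref.critical_high:
        return "critical"
    if value < ref.low:
        return "low"
    if value > ref.high:
        return "high"
    return "normal"


def compare_reports(previous: Dict[str, float], current: Dict[str, float],
                    ranges: Dict[str, ReferenceRange]) -> Dict[str, dict]:
    """Flag each current value and, where a prior value exists, add the direction of change."""
    out = {}
    for test, value in current.items():
        ref = ranges[test]
        entry = {"value": value, "flag": flag_result(value, ref)}
        if test in previous:
            prior = previous[test]
            entry["previous_flag"] = flag_result(prior, ref)
            entry["direction"] = "up" if value > prior else "down" if value < prior else "flat"
        out[test] = entry
    return out
```

Storing the flag per test, rather than recomputing it at read time, keeps historical reports stable even if reference ranges are later revised.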

Doctor Consultation Report

Generates a structured summary report for the doctor before or during a consultation, compiled entirely from the patient’s scanned data
Consolidates medicines from all past prescriptions, current diagnoses, and latest lab report results into a single document
Surfaces abnormal lab values prominently so the doctor can identify critical findings at a glance without reviewing individual reports
Includes prescription history with doctor names, clinic details, and dates, giving the consulting doctor full context on prior treatments
Report is generated on demand from existing scan and lab data - no manual data entry required from the patient or clinic staff

Medication Reminder Engine

Creates time-aware reminders linked to specific medicines and prescription scans
Each reminder is inserted into the database and then immediately mirrored to a Redis cache for low-latency notification delivery
Built-in deduplication prevents duplicate reminders when the same prescription is rescanned
Supports one-time and recurring reminder schedules with configurable time slots
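
A minimal sketch of reminder deduplication with a write-through cache follows. Plain dictionaries stand in for the database table and for Redis, and the choice of deduplication key (profile, medicine, time slot, deliberately excluding the scan id so that a rescan maps to the same key) is an assumption rather than the platform's documented rule.

```python
from typing import Dict, Tuple


class ReminderStore:
    def __init__(self) -> None:
        # Stand-in for a table with a unique constraint on the dedup key.
        self.db: Dict[Tuple[int, int, str], dict] = {}
        # Stand-in for the Redis cache read by the notification sender.
        self.cache: Dict[str, dict] = {}

    def create(self, profile_id: int, medicine_id: int, time_slot: str, scan_id: str) -> bool:
        """Insert a reminder and mirror it to the cache; return False for a duplicate."""
        key = (profile_id, medicine_id, time_slot)
        if key in self.db:
            return False  # same medicine and slot already scheduled, e.g. after a rescan
        record = {"profile_id": profile_id, "medicine_id": medicine_id,
                  "time_slot": time_slot, "scan_id": scan_id}
        self.db[key] = record
        # Cache write happens only after the insert succeeds, keeping both in step.
        self.cache[f"reminder:{profile_id}:{medicine_id}:{time_slot}"] = record
        return True
```

Writing to the cache only after a successful insert means the cache never holds a reminder the database rejected, which is one simple way to keep the two stores synchronized.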

Data Pipeline & Medicine Coverage

Medicine database seeded from a scraped catalog of widely used drugs and loaded into PostgreSQL
A scheduled daily job processes the missing medicines queue, scraping and adding new entries automatically
Embedding rebuild job regenerates vector representations when the underlying model is updated
Admin dashboard provides the operations team direct control over the medicine catalog
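
The embedding rebuild job can be pictured as a pass that re-embeds only the rows produced by an older model. In the sketch below, the `model_version` column, the row layout, and the `embed` callable are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence


def rebuild_stale_embeddings(
    rows: List[Dict],
    embed: Callable[[str], Sequence[float]],
    model_version: str,
) -> int:
    """Re-embed rows whose stored model version differs; return how many were updated."""
    updated = 0
    for row in rows:
        if row.get("model_version") != model_version:
            row["embedding"] = list(embed(row["name"]))
            row["model_version"] = model_version
            updated += 1
    return updated
```

Tracking the model version per row makes the job idempotent: running it again after a completed rebuild touches nothing.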

Technology Stack

AI & Agent Layer

Google Gemini Vision LLM for prescription image understanding, structured extraction, and context-aware OCR
ReAct agent pattern with automatic tool calling for auditable, step-by-step processing
pgvector for approximate nearest-neighbour semantic search

Backend

Python and FastAPI for the API server
Connection-pooled PostgreSQL for all relational data
Redis for medicine name cache and reminder state

Database

PostgreSQL with trigram indexing for fuzzy text matching
pgvector extension for vector embedding storage and search
Automatic schema migration on startup

Frontend

React admin dashboard for drug catalog and missing medicine management

Cloud & Infrastructure

AWS deployment with scalable compute and storage
Scheduled background jobs for scraping and embedding maintenance

AI-Ready Enhancements

Multilingual Vision LLM extraction for prescriptions written in regional scripts and mixed-language documents
Fine-tuned OCR model for low-quality or damaged prescription photographs
AI-driven drug interaction detection across a patient’s full prescription history
Intelligent abnormal lab result alerts with clinical context surfaced through the agent
AI Agents for proactive health reminders based on extracted diagnosis and prescription patterns
Predictive refill reminders based on dosage duration and prescription history

Business Impact

Vision LLM replaced brittle traditional OCR, enabling reliable extraction from handwritten, printed, and mixed-language prescriptions
End-to-end processing from photograph to structured medicine data completed within a single API call
Multi-stage RAG pipeline resolves brand names, generics, abbreviations, and OCR noise without any manual matching rules
Family profile management supports multi-patient households under a single account
Doctor consultation reports give physicians a consolidated view of prescriptions, diagnoses, and lab results without any manual preparation
Missing medicines are logged and automatically scraped, improving database coverage continuously
Reminder deduplication prevents duplicate notifications when the same prescription is rescanned

Outcome

The platform launched as a fully automated health assistant capable of reading prescription photographs - including handwritten doctor notes - and returning clinically structured medicine data in seconds. Replacing rule-based OCR with a Vision LLM was the foundational shift that made the rest of the system reliable: accurate extraction fed accurate matching, which fed accurate reminders. The family identity system enabled household-level health record management without requiring separate accounts per family member, and the architecture is designed to extend to additional document types, languages, and notification channels.

Services Used

Agentic AI Systems
Generative AI Integration
AI-Powered SaaS Development
Data Engineering & AI Infrastructure

Technologies

Python / FastAPI
Google Gemini
PostgreSQL
pgvector
Redis
React
AWS

Key Results

4-stage RAG cascade for medicine lookup
~95% medicine identification accuracy
2 document types processed: prescriptions and lab reports
Zero OCR data lost during identity confirmation
