Why longevity?
- Longevity care is longitudinal and data-heavy. Patients accumulate lab panels, imaging, and clinical notes over years.
- Clinicians and individuals need fast, explainable triage: What’s abnormal? What changed? What should I read next?
- A generalizable pipeline that works “in a weekend” helps teams experiment faster and validate real-world impact.
What we shipped at the hackathon
- A Next.js/React UI with a dead-simple uploader and a results table.
- Client-side text extraction for PDFs and images to support mixed inputs quickly.
- A structured LLaMA prompt that returns summary, keywords, categories, abnormal flags, suggested filename, and PubMed titles.
- Supabase Storage for raw files; a Postgres table, `documents`, for structured metadata.
- A Supabase Edge Function to process stored PDFs server-side (useful for background jobs and batch workflows).
Architecture
- Upload → Extract text → LLM analysis → Persist → Render
- Two processing paths:
  - Client-led: immediate feedback, great for demos and small files.
  - Server-led (Edge Function): scalable, secure, and good for background/batch processing.
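One way to make the choice between the two paths explicit is a small routing function. The sketch below is illustrative only: the 5 MB cutoff, the `UploadContext` shape, and the `choosePath` name are assumptions, not part of the hackathon code.

```typescript
// Which of the two processing paths handles a given upload.
type ProcessingPath = "client" | "server";

interface UploadContext {
  sizeBytes: number; // size of the uploaded file
  isBatch: boolean;  // true when the file belongs to a background/batch job
}

// Hypothetical cutoff: anything larger is routed to the Edge Function.
const CLIENT_MAX_BYTES = 5 * 1024 * 1024;

function choosePath(ctx: UploadContext): ProcessingPath {
  // Batch work always goes server-side, regardless of size.
  if (ctx.isBatch) return "server";
  return ctx.sizeBytes <= CLIENT_MAX_BYTES ? "client" : "server";
}
```

Keeping the rule in one function makes the cutoff easy to tune once real file sizes are known.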
Key moving parts:
- Frontend: Next.js + Tailwind
- OCR/Parsing: `pdf-parse`, `tesseract.js`
- AI: LLaMA chat completions API with a rigid, parse-friendly prompt
- Backend: Supabase (Storage for blobs, Postgres for metadata)
- Serverless: Supabase Edge Function for server-side PDF processing
Product walk-through
Home page
<h1 className="text-3xl font-bold leading-tight text-gray-900">
Medical Document Analysis
</h1>
</div>
</header>
<main>
<div className="max-w-7xl mx-auto sm:px-6 lg:px-8">
<MedicalDocUploader />
- Clean landing with a single CTA: upload a document.
Upload → extract → analyze → persist (client path)
- Extracts text conditionally based on file type.
- Calls LLaMA with a structured prompt to ensure predictable parsing.
- Uploads the original file to Supabase Storage and inserts metadata into `documents`.
const handleFileUpload = useCallback(async (event: React.ChangeEvent<HTMLInputElement>) => {
  try {
    setIsUploading(true);
    setError(null);
    const file = event.target.files?.[0];
    if (!file) return;

    // Extract text based on file type
    const text = file.type === 'application/pdf'
      ? await extractTextFromPDF(file)
      : await extractTextFromImage(file);

    // Analyze the text with Llama
    const analysis = await analyzeWithLlama(text, file.name);

    // Upload to Supabase Storage
    const { data: uploadData, error: uploadError } = await supabase.storage
      .from('medical-documents')
      .upload(analysis.renamed_file, file);
    if (uploadError) throw uploadError;

    // Store metadata in Supabase
    const { data: metaData, error: metaError } = await supabase
      .from('documents')
      .insert({
        filename: file.name,
        renamed_file: analysis.renamed_file,
        file_url: uploadData.path,
        summary: analysis.summary,
        keywords: analysis.keywords,
        categories: analysis.categories,
        word_count: countWords(text),
        report_type: detectReportType(text),
        threshold_flags: analysis.threshold_flags,
        pubmed_refs: analysis.pubmed_refs,
        ai_notes: analysis.ai_notes,
        status: 'processed',
        version: 1
      })
      .select()
      .single();
    if (metaError) throw metaError;

    setDocuments(prev => [...prev, metaData]);
  } catch (err) {
    setError(err instanceof Error ? err.message : 'An error occurred');
  } finally {
    setIsUploading(false);
  }
}, []);
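The handler above calls `countWords` and `detectReportType` without showing them. A minimal sketch of what they could look like follows. The keyword rules and the report type labels (`lab`, `imaging`, `clinical_note`, `unknown`) are hypothetical placeholders, not the project's actual logic.

```typescript
// Count whitespace-separated tokens; blank or empty text yields 0.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Hypothetical keyword rules, checked in order; the first match wins.
const REPORT_TYPE_RULES: Array<[string, RegExp]> = [
  ["lab", /\b(hemoglobin|cholesterol|glucose|reference range)\b/i],
  ["imaging", /\b(mri|ct scan|x-ray|ultrasound|radiology)\b/i],
  ["clinical_note", /\b(chief complaint|assessment|history of present illness)\b/i],
];

function detectReportType(text: string): string {
  for (const [type, pattern] of REPORT_TYPE_RULES) {
    if (pattern.test(text)) return type;
  }
  // Fall back to a neutral label rather than guessing.
  return "unknown";
}
```

Since `report_type` is a `not null` column, returning an explicit fallback label keeps the insert from failing on documents that match no rule.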
The prompt that makes it reliable
- The LLM is only as useful as its prompt structure. We force a schema, so parsing is straightforward and less brittle than free-form responses.
const analyzeWithLlama = async (text: string, originalFilename: string) => {
const prompt = `Analyze this medical document and provide a detailed analysis in the following format:
1. Summary: Provide a clear, plain-English summary
2. Keywords: Extract key medical terms and their values (if any)
3. Categories: Classify into these categories: ${VALID_CATEGORIES.join(", ")}
4. Filename: Suggest a clear, descriptive filename
5. Threshold Flags: Identify any abnormal values and mark as "high", "low", or "normal"
6. PubMed References: Suggest relevant PubMed articles (just article titles)
7. Additional Notes: Any important medical guidance or observations
Document text:
${text}
Please format your response exactly as follows:
Summary: [summary]
Keywords: [key:value pairs]
Categories: [categories]
Filename: [filename]
Flags: [abnormal values]
References: [article titles]
Notes: [additional guidance]`;
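Because every field sits behind a fixed label, the response can be split with a few lines of string handling. The following is a sketch of such a parser, not the project's actual one. `splitSections` and `parseKeywords` are hypothetical names, and the comma-separated `key: value` layout for keywords is an assumption about how the model fills in that field.

```typescript
const LABELS = ["Summary", "Keywords", "Categories", "Filename", "Flags", "References", "Notes"] as const;
type Label = (typeof LABELS)[number];

function emptySections(): Record<Label, string> {
  return { Summary: "", Keywords: "", Categories: "", Filename: "", Flags: "", References: "", Notes: "" };
}

// Split a labelled response into raw sections. Lines that do not start
// with a known label are treated as a continuation of the previous section.
function splitSections(response: string): Record<Label, string> {
  const sections = emptySections();
  let current: Label | null = null;
  for (const line of response.split(/\r?\n/)) {
    const match = /^\s*(\w+):\s*(.*)$/.exec(line);
    const label = match ? LABELS.find((l) => l === match[1]) : undefined;
    if (match && label) {
      current = label;
      sections[current] = match[2].trim();
    } else if (current && line.trim() !== "") {
      sections[current] += (sections[current] ? "\n" : "") + line.trim();
    }
  }
  return sections;
}

// Turn "key: value, key: value" into an object. Values that themselves
// contain commas would need a stricter format; this sketch does not handle them.
function parseKeywords(raw: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const pair of raw.split(/[,\n]/)) {
    const idx = pair.indexOf(":");
    if (idx === -1) continue;
    const key = pair.slice(0, idx).trim();
    if (key) out[key] = pair.slice(idx + 1).trim();
  }
  return out;
}
```

A missing label simply leaves its section as an empty string, so downstream code can decide on a fallback instead of crashing on an unexpected response.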
Server-side processing (Edge Function)
- Useful for background jobs, webhook-driven processing, or scaling beyond client limits.
- Downloads the file from Supabase Storage, extracts text, calls the same LLaMA prompt, and inserts a `documents` row.
// Prepare LLaMA prompt
const prompt = `Analyze this medical document and provide a detailed analysis in the following format:
1. Summary: Provide a clear, plain-English summary
2. Keywords: Extract key medical terms and their values (if any)
3. Categories: Classify into these categories: ${VALID_CATEGORIES.join(", ")}
4. Filename: Suggest a clear, descriptive filename
5. Threshold Flags: Identify any abnormal values and mark as "high", "low", or "normal"
6. PubMed References: Suggest relevant PubMed articles (just article titles)
7. Additional Notes: Any important medical guidance or observations
Document text:
${text}
Please format your response exactly as follows:
Summary: [summary]
Keywords: [key:value pairs]
Categories: [categories]
Filename: [filename]
Flags: [abnormal values]
References: [article titles]
Notes: [additional guidance]`
// Insert into Supabase
const { data: insertData, error: insertError } = await supabase
.from('documents')
.insert(documentData)
.select()
.single()
Implementation details
Database schema (Supabase Postgres)
Use JSONB where the structure can vary or expand over time.
-- documents table
create table if not exists public.documents (
id bigint generated always as identity primary key,
created_at timestamp with time zone default now() not null,
user_id uuid null,
filename text not null,
renamed_file text not null,
file_url text not null,
summary text not null,
keywords jsonb not null default '{}'::jsonb,
categories text[] not null default '{}',
word_count integer not null,
report_type text not null,
threshold_flags jsonb not null default '{}'::jsonb,
pubmed_refs jsonb not null default '{}'::jsonb,
ai_notes text not null default '',
status text not null check (status in ('uploaded','processed','failed')),
user_notes text null,
version integer not null default 1
);
-- Optional: RLS policies for multi-tenant setups
alter table public.documents enable row level security;
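-- Example policies (tune for your auth model).
-- Note: the select policy below lets every authenticated user read all rows;
-- for PHI, scope it to the owner with: using (auth.uid() = user_id).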
create policy "Allow read to authenticated users"
on public.documents for select
to authenticated
using (true);
create policy "Insert own documents"
on public.documents for insert
to authenticated
with check (auth.uid() = user_id);
create policy "Update own documents"
on public.documents for update
to authenticated
using (auth.uid() = user_id);
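With the insert policy checking `auth.uid() = user_id`, rows written from the client need `user_id` set, and the upload handler as excerpted earlier does not set it. A typed row builder is one way to make that hard to forget. This is a sketch: `NewDocument` and `buildDocumentRow` are hypothetical names, and the JSONB field shapes are assumptions, since the schema leaves them open.

```typescript
type DocumentStatus = "uploaded" | "processed" | "failed";

// Insertable columns of public.documents (id and created_at are generated).
interface NewDocument {
  user_id: string | null;
  filename: string;
  renamed_file: string;
  file_url: string;
  summary: string;
  keywords: Record<string, string>;        // assumed shape of the jsonb column
  categories: string[];
  word_count: number;
  report_type: string;
  threshold_flags: Record<string, string>; // assumed shape of the jsonb column
  pubmed_refs: unknown;                    // jsonb; shape not specified in the schema
  ai_notes: string;
  status: DocumentStatus;
  version: number;
}

// Attach the owner and the bookkeeping fields in one place.
function buildDocumentRow(
  userId: string,
  fields: Omit<NewDocument, "user_id" | "status" | "version">
): NewDocument {
  return { ...fields, user_id: userId, status: "processed", version: 1 };
}
```

The result can be passed to `.insert(...)` as in the client handler; the `userId` would come from the authenticated session.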
Storage bucket
- Create a `medical-documents` bucket.
- Lock down access if you’re storing PHI; consider signed URLs and RLS on the `storage.objects` table.
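For a private bucket, downloads can go through short-lived signed URLs. The sketch below assumes the supabase-js v2 storage client, whose `createSignedUrl(path, expiresIn)` resolves to `{ data: { signedUrl }, error }`. The `StorageLike` interface and the `getDownloadUrl` helper are hypothetical and exist only to keep the example self-contained.

```typescript
// Minimal structural type for the part of the storage client used here.
interface StorageLike {
  from(bucket: string): {
    createSignedUrl(
      path: string,
      expiresInSeconds: number
    ): Promise<{ data: { signedUrl: string } | null; error: { message: string } | null }>;
  };
}

// Return a time-limited download URL for a stored file.
async function getDownloadUrl(
  storage: StorageLike,
  path: string,
  ttlSeconds = 60
): Promise<string> {
  const { data, error } = await storage
    .from("medical-documents")
    .createSignedUrl(path, ttlSeconds);
  if (error || !data) {
    throw new Error(error ? error.message : "No signed URL returned");
  }
  return data.signedUrl;
}
```

In the app this would be called as `getDownloadUrl(supabase.storage, row.file_url)`, with a TTL short enough that a leaked link expires quickly.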
Environment configuration
- Never hardcode secrets client-side. Use environment variables and server-side access.
- For local dev, rely on `.env.local` and do not commit it.
Example variables to configure:
- SUPABASE_URL
- SUPABASE_ANON_KEY (safe to read client-side if your RLS policies are correct)
- SUPABASE_SERVICE_ROLE_KEY (server-only; never expose to the browser)
- LLAMA_API_KEY (server-only)
For the Edge Function, set:
- `SUPABASE_URL`
- `SUPABASE_ANON_KEY` (or the service role key if needed)
- `LLAMA_API_KEY`
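A missing variable is easier to diagnose at startup than in the middle of a request. One possible guard is sketched below; `requireEnv` is a hypothetical helper, not part of the project.

```typescript
// Fail fast when any required variable is missing or empty.
function requireEnv(
  env: Record<string, string | undefined>,
  names: readonly string[]
): Record<string, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
  const out: Record<string, string> = {};
  for (const name of names) out[name] = env[name] as string;
  return out;
}
```

On the Next.js server this could be called with `process.env`, and inside the Edge Function with `Deno.env.toObject()`, passing the variable names listed above.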
Deploying the Edge Function
- Install the Supabase CLI
- Link your project
- Deploy function
supabase functions deploy process-medical-pdf
supabase functions list
supabase functions serve process-medical-pdf --no-verify-jwt
Wire the function behind an HTTP trigger or call it from your app to process files already stored in the bucket.
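Calling the function from the app could look like the sketch below. It assumes the supabase-js v2 `functions.invoke(name, { body })` call. The `{ path }` request body is a guess at what the function expects, and `FunctionsLike` and `requestProcessing` are hypothetical names used to keep the example self-contained.

```typescript
// Minimal structural type for the part of the functions client used here.
interface FunctionsLike {
  invoke(
    name: string,
    options: { body: Record<string, unknown> }
  ): Promise<{ data: unknown; error: { message: string } | null }>;
}

// Ask the Edge Function to process a file that is already in the bucket.
async function requestProcessing(
  functions: FunctionsLike,
  storagePath: string
): Promise<unknown> {
  const { data, error } = await functions.invoke("process-medical-pdf", {
    body: { path: storagePath }, // assumed payload shape
  });
  if (error) throw new Error("process-medical-pdf failed: " + error.message);
  return data;
}
```

In the app, `supabase.functions` would be passed as the first argument.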
UX considerations
- Drag-and-drop uploader with clear accept types.
- Progress and error state visibility.
- Terse, readable summaries with expandable details.
- Badges for categories and flags for abnormalities.
Reliability strategies
- Structured prompt → predictable parsing.
- Keep LLM temperature low to moderate (0.3–0.7); lower values reduce run-to-run variance in the formatted output.
- Validate parsed JSON fields; default to safe fallbacks.
- Track `version` and `status` to support re-processing and migrations.
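The validation step can be sketched as a few small coercion helpers that never throw and always return something storable. All names here are hypothetical. The `"unknown"` flag value is an assumption added so that unrecognized output is not silently recorded as normal; the prompt itself only defines "high", "low", and "normal".

```typescript
type Flag = "high" | "low" | "normal" | "unknown";

// Coerce a model-provided flag to a known value.
function normalizeFlag(value: unknown): Flag {
  const v = typeof value === "string" ? value.trim().toLowerCase() : "";
  return v === "high" || v === "low" || v === "normal" ? v : "unknown";
}

// Keep only categories from the allowed list; anything else is dropped.
function keepValidCategories(values: unknown, valid: readonly string[]): string[] {
  if (!Array.isArray(values)) return [];
  return values.filter(
    (v): v is string => typeof v === "string" && valid.indexOf(v) !== -1
  );
}

// Make a suggested filename safe for storage, or fall back to the original.
function safeFilename(suggested: unknown, fallback: string): string {
  if (typeof suggested !== "string") return fallback;
  const cleaned = suggested
    .trim()
    .replace(/[^A-Za-z0-9._-]+/g, "_") // collapse unsafe characters
    .replace(/^[_.]+/, "");            // no leading dots or underscores
  return cleaned.length > 0 ? cleaned : fallback;
}
```

The allowed list passed to `keepValidCategories` would be the same `VALID_CATEGORIES` array that is interpolated into the prompt.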
Security and compliance
- Treat all uploads as potentially sensitive (PHI).
- Don’t expose secrets in the browser. Move LLaMA calls server-side if needed.
- Consider de-identification or redaction at upload.
- Encrypt at rest (Supabase handles storage encryption), and use HTTPS for all calls.
- RLS on `documents` and signed URLs for downloads.
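As a rough illustration of redaction at upload, extracted text can be passed through pattern-based masking before it is sent to the LLM. The patterns below are illustrative assumptions covering a few US-style identifiers only. They are a starting point for experimentation and fall well short of real de-identification, which has to handle names, dates, addresses, and record numbers too.

```typescript
// Hypothetical rules: replacement token plus a global pattern.
const REDACTION_RULES: Array<[string, RegExp]> = [
  ["[EMAIL]", /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g],
  ["[SSN]", /\b\d{3}-\d{2}-\d{4}\b/g],
  ["[PHONE]", /\(?\b\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b/g],
];

// Apply every rule in order and return the masked text.
function redact(text: string): string {
  return REDACTION_RULES.reduce(
    (acc, [token, pattern]) => acc.replace(pattern, token),
    text
  );
}
```

Running this on the extracted text, before `analyzeWithLlama`, keeps the matched identifiers out of the prompt while leaving lab values untouched.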
Performance and cost
- OCR (`tesseract.js`) can be CPU-heavy; pre-processing images helps (deskew, denoise, contrast).
- Use server-side processing for large PDFs or batch jobs.
- Cache repeated LLM calls when re-processing the same file or version.
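The caching idea can be sketched as an in-memory cache keyed by file identity and version. `AnalysisCache` is a hypothetical helper, and the key scheme (for example a content hash plus the `version` column) is an assumption. A real deployment would more likely persist results, since Edge Function instances do not share memory.

```typescript
// Cache in-flight and completed analyses by key so the LLM is called once per key.
class AnalysisCache<T> {
  private readonly entries = new Map<string, Promise<T>>();

  // key should identify the input, e.g. "<contentHash>:v<version>" (hypothetical scheme).
  getOrRun(key: string, run: () => Promise<T>): Promise<T> {
    const hit = this.entries.get(key);
    if (hit) return hit;
    const pending = run().catch((err: unknown) => {
      // Do not cache failures; a later call should retry.
      this.entries.delete(key);
      throw err;
    });
    this.entries.set(key, pending);
    return pending;
  }
}
```

Storing the promise rather than the resolved value means two concurrent requests for the same key share a single LLM call.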
What we’d build next
- Normalized lab values with medical ontologies (e.g., LOINC) and unit conversions.
- Trend analysis across time and change detection.
- Confidence scoring and a reviewer checklist for clinical safety.
- Human-in-the-loop editing with audit trails.
- Export to FHIR-compatible bundles.
Demo script (5 minutes)
- Upload a lab report PDF.
- Show immediate “Processing document…” state.
- Reveal results: summary, categories, abnormal flags, and PubMed suggestions.
- Click through the stored file link (signed URL if private).
- Open Supabase Studio to show the corresponding `documents` row.
Credits
- Built at the Caltech Longevity Hackathon by our team in a sprint focused on turning complex medical paperwork into fast, explainable insights.