Under the hood

Every trick we taught the AI.

You sent the paperwork. Here's exactly how we turn it into answers — extraction, validation, decisions, and a learning system that gets sharper with every document.

What are you extracting?

Alembic handles any document type. Here are the ones our customers use most.

Invoices on Autopilot

Vendor name, line items, PO number, due date, payment terms — extracted and validated before you finish your coffee. Let the data flow straight into your accounting system, matched and ready to pay.

Contracts Without the Squinting

Auto-renew clauses, termination dates, liability caps, payment schedules — surfaced instantly so you never miss a deadline that costs you money. Alembic reads the fine print so you can make decisions instead of highlighting PDFs at midnight.

Claims, Sorted in Seconds

Policy number, claimant details, incident date, damage estimates, coverage limits — pulled from even the messiest scanned forms. Process claims faster, catch inconsistencies earlier, and stop losing hours to manual data entry on every filing.

Compliance Without the Chaos

License numbers, expiration dates, filing deadlines, regulatory classifications — tracked and structured automatically across every document in the stack. Stay audit-ready without dedicating someone's entire week to spreadsheet maintenance.

Expenses That File Themselves

Merchant, amount, date, category, tax — extracted from crumpled receipts, email confirmations, and blurry phone photos alike. Your team stops hoarding shoeboxes of paper, and month-end close gets a whole lot shorter.

Your Documents, Your Schema

Medical records, shipping manifests, permit applications, academic transcripts — if it has data, Alembic extracts it. Just tell the AI what fields you need and it builds the extraction on the fly. No templates, no training sets.

How it actually works

Six layers between your documents and structured data. Each one built to save you time and earn your trust.

Intelligent Extraction

Your documents have answers. Alembic reads them like a person would.

Most extraction tools strip your PDFs down to raw text and pray the formatting survives. Alembic sends the actual document — layout, tables, handwriting, all of it — directly to AI agents that see the page the way you do. Describe what you need in plain English, and the system designs the schema, extracts the data, and gets smarter with every correction.

  • Vision-first processing — PDFs, scans, and images go straight to AI, not through lossy text conversion that destroys table structure
  • Schema-free onboarding — tell the AI what data you need in conversation, and it builds and refines the extraction schema for you
  • Multi-model agents assign the right AI to each task — fast models for simple fields, powerful models for complex reasoning
  • Every correction you make becomes a permanent memory pattern, so the same mistake never happens twice
PDF
Order Form with LegalZoom — Productiv Order Form
Verified
pdf · 3m ago · $0.27
Customer Information (10 fields)
  company_name: LegalZoom
  street_address: 101 N. Brand Blvd
  city: Glendale
  state_province: CA
Contract Terms (6 fields)
  order_start_date: May 1, 2026
  payment_terms: Net 30
  auto_renew: No
  total_fees: $14,280.00
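
The multi-model routing described in this section could look roughly like the sketch below. The tier names, the `FieldSpec` type, and the `needs_reasoning` flag are all hypothetical, chosen only to illustrate the idea of sending simple fields to a fast model and clause-reading fields to a stronger one. This is not Alembic's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical model tiers; the page does not name the real models.
FAST_MODEL = "fast-model"
STRONG_MODEL = "strong-model"


@dataclass(frozen=True)
class FieldSpec:
    """One field in an extraction schema (illustrative type, not Alembic's)."""
    name: str
    needs_reasoning: bool  # True when the value must be inferred from clause text


def pick_model(spec: FieldSpec) -> str:
    """Send plain lookups to the fast tier and reasoning-heavy fields to the strong tier."""
    return STRONG_MODEL if spec.needs_reasoning else FAST_MODEL


def plan_extraction(specs: list[FieldSpec]) -> dict[str, list[str]]:
    """Group field names by the model tier that would handle them."""
    plan: dict[str, list[str]] = {FAST_MODEL: [], STRONG_MODEL: []}
    for spec in specs:
        plan[pick_model(spec)].append(spec.name)
    return plan


# Field names taken from the order form example; the flags are assumptions.
ORDER_FORM_FIELDS = [
    FieldSpec("company_name", needs_reasoning=False),
    FieldSpec("payment_terms", needs_reasoning=False),
    FieldSpec("auto_renew", needs_reasoning=True),
]

if __name__ == "__main__":
    print(plan_extraction(ORDER_FORM_FIELDS))
```

Grouping fields by tier before calling any model keeps the routing decision in one place, so the split between tiers can be changed without touching the extraction code.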
Decisions & Triage

High confidence gets approved. Low confidence gets flagged. You handle the interesting stuff.

Alembic doesn't just extract data and dump it in your lap. It makes decisions — auto-approving values it's certain about, flagging the ones that need a second look, and asking you directly when something is genuinely ambiguous. Your review queue only contains the extractions that actually need a human brain.

  • Confidence-driven automation approves clean extractions instantly so you only review what matters
  • Flagged values come with the AI's reasoning — you see why it's uncertain, not just that it is
  • One-click approve, reject, or correct — no context switching, no hunting through documents
  • Configurable thresholds let you dial between full automation and full manual review
Software Licensing Agreements
20 documents · Last upload 1d ago
Needs attention · 1
PDF
Order Form with Reddit
65%
Error
Needs review · 4
PDF
AppCues Renewal 2025.docx
65%
Needs Review
PDF
TheyDo — Renewal 2024.docx
72%
Needs Review
PDF
Order Form with Culture Amp
75%
Needs Review
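
A minimal sketch of confidence-driven triage with a configurable threshold, assuming confidence is a number between 0 and 1. The function name, the enum, and the 0.85 default are illustrative assumptions; the sample values are the percentages shown in the review queue above.

```python
from enum import Enum


class Triage(Enum):
    AUTO_APPROVE = "auto_approve"
    NEEDS_REVIEW = "needs_review"


def triage(confidence: float, approve_at: float = 0.85) -> Triage:
    """Approve values at or above the threshold and flag everything else.

    approve_at=0.0 approves everything (full automation); any value above
    1.0 sends everything to a human (full manual review).
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be within [0, 1], got {confidence}")
    return Triage.AUTO_APPROVE if confidence >= approve_at else Triage.NEEDS_REVIEW


if __name__ == "__main__":
    # Confidence values taken from the queue mockup above.
    for name, confidence in [
        ("AppCues Renewal 2025.docx", 0.65),
        ("Order Form with Culture Amp", 0.75),
    ]:
        print(name, triage(confidence).value)
```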
Source Traceability

Every number has a receipt.

When your CFO asks "where did this figure come from?" you shouldn't have to dig through a filing cabinet. Every value Alembic extracts is pinned to its exact location — the page, paragraph, table cell, or line item in the source document. Click any field, see the proof. For regulated industries, that's not a nice-to-have — it's the whole point.

  • Field-level source mapping links every extracted value to its exact position in the original document
  • Visual highlighting shows you precisely where the AI found each data point — no ambiguity
  • Full extraction audit trail records what was extracted, when, by which model, and whether a human modified it
  • Export source mappings alongside your data for compliance documentation and downstream audit requirements
PDF · Invoice-2026-0847.pdf · Traced
  vendor_name: Meridian Consulting LLC (pg 1, para 2)
  invoice_total: $14,280.00 (pg 2, table 1)
  due_date: 2026-04-15 (pg 1, header)
  po_number: PO-2026-0847 (pg 1, ref line)
  payment_terms: Net 30 (pg 2, footer)
  billing_contact: Missing (ask user?)
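
One way to picture field-level source mapping is as a small record that travels with each extracted value. The `SourceRef` and `TracedField` types below are hypothetical names for illustration; the values come from the invoice example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceRef:
    """Where a value was found in the source document (illustrative type)."""
    page: int
    locator: str  # for example "para 2", "table 1", "header"


@dataclass(frozen=True)
class TracedField:
    """An extracted value together with its source location, if any."""
    name: str
    value: Optional[str]
    source: Optional[SourceRef]

    def citation(self) -> str:
        """Render the field with its source, or mark it as missing."""
        if self.value is None or self.source is None:
            return f"{self.name}: missing"
        return f"{self.name} = {self.value} (pg {self.source.page}, {self.source.locator})"


if __name__ == "__main__":
    fields = [
        TracedField("vendor_name", "Meridian Consulting LLC", SourceRef(1, "para 2")),
        TracedField("invoice_total", "$14,280.00", SourceRef(2, "table 1")),
        TracedField("billing_contact", None, None),
    ]
    for traced in fields:
        print(traced.citation())
```

Keeping the location on the same record as the value means an export can carry both together, which is what the compliance bullet above asks for.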
AI Standups

Your AI agent shows up every morning with a plan.

Every extraction space gets its own AI agent — and it doesn't just wait for instructions. It runs standups, proactively surfacing what needs attention, resolving what it can on its own, and bringing you only the decisions that require a human call. Priority-coded diamonds tell you at a glance what's urgent, what's routine, and what's already handled. When everything's caught up, it celebrates. You will find this unreasonably satisfying.

  • Per-space AI orchestrator monitors your documents, triages issues by severity, and resolves routine problems autonomously
  • Priority diamonds give you an instant visual read — critical in red, important in amber, routine in green
  • One-click decisions on flagged items — approve the AI's recommendation or override it, right from the standup view
  • When your queue hits zero, the agent lets you know with a moment of genuine delight — because clearing your inbox should feel like something
DISTLR Intelligence
3 need you
Invoices · 3 documents need vendor verification · Decide
Contracts · Auto-renew clause ambiguous on 2 docs · Review
Receipts · 47 processed, 2 low-confidence dates · 2 handled
Claims · 12 resolved by memory patterns · Handled
All caught up.
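
As a rough sketch of how a standup view might order its items, the code below sorts open items by severity and falls back to the "All caught up." message when nothing is left. The types are hypothetical, and the priority assigned to each sample item is an assumption, since the page does not say which items are critical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    CRITICAL = 0   # shown in red
    IMPORTANT = 1  # shown in amber
    ROUTINE = 2    # shown in green


COLORS = {
    Priority.CRITICAL: "red",
    Priority.IMPORTANT: "amber",
    Priority.ROUTINE: "green",
}


@dataclass
class StandupItem:
    space: str
    summary: str
    priority: Priority
    handled: bool = False  # True when the agent resolved it on its own


def standup(items: list[StandupItem]) -> list[str]:
    """List open items with the most severe first, or report an empty queue."""
    open_items = sorted((i for i in items if not i.handled), key=lambda i: i.priority)
    if not open_items:
        return ["All caught up."]
    return [f"[{COLORS[i.priority]}] {i.space}: {i.summary}" for i in open_items]


if __name__ == "__main__":
    # Summaries come from the mockup above; the priorities are assumed.
    for line in standup([
        StandupItem("Invoices", "3 documents need vendor verification", Priority.IMPORTANT),
        StandupItem("Contracts", "Auto-renew clause ambiguous on 2 docs", Priority.CRITICAL),
        StandupItem("Claims", "12 resolved by memory patterns", Priority.ROUTINE, handled=True),
    ]):
        print(line)
```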
Governance & Rules

Trust the AI. Then verify it with your own rules.

AI extraction is only as useful as the guardrails around it. Alembic lets you define validation rules, approval workflows, and compliance checks that run on every extraction — before data ever leaves the system. Set the policies once, and they enforce themselves across every document, every space, every time.

  • Custom validation rules catch domain-specific errors — date ranges, value thresholds, required field combinations, format constraints
  • Approval workflows route sensitive extractions through designated reviewers before data is finalized
  • Learning system turns reviewer corrections into permanent rules, so governance improves automatically over time
  • Full activity logging for compliance — every extraction, decision, correction, and approval is recorded and exportable
Validation Rules
4 active
if invoice_total > $10,000 → Require approval
if confidence < 85% → Flag for review
if vendor = new → Compliance check
if due_date < today + 7d → Priority escalation
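
The four rules above can be read as simple predicates over an extraction. Here is a minimal sketch under that reading. The `Extraction` type, the action names, and the `evaluate` function are hypothetical; only the conditions and thresholds come from the rule list.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Extraction:
    """The fields the four example rules look at (illustrative type)."""
    invoice_total: float
    confidence: float  # 0.0 to 1.0
    vendor_is_new: bool
    due_date: date


def evaluate(ex: Extraction, today: date) -> list[str]:
    """Return the actions triggered by the four example rules, in rule order."""
    actions: list[str] = []
    if ex.invoice_total > 10_000:
        actions.append("require_approval")
    if ex.confidence < 0.85:
        actions.append("flag_for_review")
    if ex.vendor_is_new:
        actions.append("compliance_check")
    if ex.due_date < today + timedelta(days=7):
        actions.append("priority_escalation")
    return actions


if __name__ == "__main__":
    invoice = Extraction(14280.00, 0.98, vendor_is_new=False, due_date=date(2026, 4, 15))
    print(evaluate(invoice, today=date(2026, 4, 1)))
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the date rule easy to test.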
API & Integrations

Use the UI. Or don't. The API does everything.

Alembic runs fully headless. Every capability you see in the dashboard — uploading documents, configuring schemas, triggering extractions, reviewing results — is available through the REST API. If your system can make an HTTP request, it can run Alembic. For teams that live in their own tools, Alembic becomes invisible infrastructure that just delivers clean data where you need it.

  • 32 REST API endpoints cover the full lifecycle — upload, extract, review, approve, export, and configure, all programmatically
  • Webhooks push extraction results and status changes to your systems in real time — no polling, no delays
  • SSE streaming lets you watch extractions happen live, with field-by-field progress for long documents
  • Batch operations process hundreds of documents in a single call, with per-document status tracking and error handling
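
For the SSE streaming mentioned above, a client needs to turn the raw event stream into discrete events. The sketch below parses the standard `event:` and `data:` line format used by Server-Sent Events. The event names and payload shape in the sample are assumptions, since the page does not document what Alembic actually sends.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_name, data) pairs from Server-Sent Events lines.

    A blank line ends an event, lines starting with ':' are comments,
    and an event with no explicit name defaults to "message".
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional leading space
        if name == "event":
            event = value
        elif name == "data":
            data.append(value)


# Hypothetical stream; real event names and payloads may differ.
SAMPLE_STREAM = [
    "event: field_extracted\n",
    'data: {"field": "vendor_name", "value": "Meridian Consulting LLC"}\n',
    "\n",
    ": keep-alive\n",
    "event: done\n",
    "data: {}\n",
    "\n",
]

if __name__ == "__main__":
    for name, payload in parse_sse(SAMPLE_STREAM):
        print(name, json.loads(payload))
```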
GET /api/v1/spaces/{id}/documents/{docId}
// Response
{
  "vendor_name": "Meridian Consulting LLC",
  "invoice_total": 14280.00,
  "due_date": "2026-04-15",
  "payment_terms": "Net 30",
  "status": "approved",
  "confidence": 0.98,
  "source": {
    "page": 1,
    "paragraph": 2,
    "verified": true
  }
}
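
A minimal sketch of calling the endpoint above from Python using only the standard library. The host name, the bearer-token header, and the helper names are assumptions for illustration; the page does not specify how the API is hosted or authenticated. The path and the response fields are taken from the example.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://alembic.example.com"  # hypothetical host


def build_request(space_id: str, doc_id: str, token: str) -> urllib.request.Request:
    """Build the GET request for one document; bearer auth is an assumption."""
    url = (
        f"{BASE_URL}/api/v1/spaces/{quote(space_id, safe='')}"
        f"/documents/{quote(doc_id, safe='')}"
    )
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_document(space_id: str, doc_id: str, token: str) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(build_request(space_id, doc_id, token), timeout=30) as resp:
        return json.load(resp)


def is_ready_to_use(doc: dict, min_confidence: float = 0.85) -> bool:
    """True when the document is approved and meets the confidence floor."""
    return doc.get("status") == "approved" and doc.get("confidence", 0.0) >= min_confidence


# Trimmed copy of the response shown above, used to exercise the helper offline.
SAMPLE_RESPONSE = """{
  "vendor_name": "Meridian Consulting LLC",
  "invoice_total": 14280.00,
  "status": "approved",
  "confidence": 0.98,
  "source": {"page": 1, "paragraph": 2, "verified": true}
}"""

if __name__ == "__main__":
    doc = json.loads(SAMPLE_RESPONSE)
    print(doc["vendor_name"], is_ready_to_use(doc))
```

Building the request separately from sending it lets the URL and headers be checked without touching the network.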

Seeing is faster than reading.

You just scrolled through everything Alembic can do. Now upload a document and watch it happen in about forty seconds.

Extract your first document free
No credit card · No setup · 5 free documents
Compare plans