Turns inbound RFQ emails into priced, ready-to-send quotation drafts. The operator goes from "type everything" to "review and send".
Watches a shared sales inbox. When email arrives the system decides whether it is a Request For Quotation, parses the customer and the line items, matches each line to a product in your catalog, looks up any prior quotation for the same customer in your ERP, prepares a draft quotation, and hands the operator a dashboard with scored matches. The operator reviews, fixes anything that looks wrong, and sends.
Before: sales reps copy-pasted item descriptions from emails into spreadsheets. Looking up SKUs took minutes per line, and some RFQs had 80 lines. Customers waited 2–7 days for a quote; many bought elsewhere. There was no visibility into the backlog: stuck RFQs were forgotten until the customer chased them. After: median time-to-quote dropped to under 30 minutes for cases the system handles cleanly, sustained over 107 days of live operation as of writing.
A worker polls the shared sales mailbox every 30 seconds. New emails get stored and queued for analysis.
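The polling step can be sketched as a small loop with the mailbox and queue dependencies injected (the function names and signatures here are assumptions for illustration; real implementations would talk to IMAP and Postgres):

```typescript
// Hypothetical sketch of the mailbox poller. fetchNewEmails and enqueue
// are injected so the loop itself stays testable.
type Email = { id: string; subject: string; body: string };

async function pollOnce(
  fetchNewEmails: () => Promise<Email[]>,
  enqueue: (e: Email) => Promise<void>,
): Promise<number> {
  const emails = await fetchNewEmails();
  for (const e of emails) {
    await enqueue(e); // store the email and queue it for analysis
  }
  return emails.length;
}

// Production wiring: run one poll every 30 seconds.
function startWorker(
  fetchNewEmails: () => Promise<Email[]>,
  enqueue: (e: Email) => Promise<void>,
): ReturnType<typeof setInterval> {
  return setInterval(() => void pollOnce(fetchNewEmails, enqueue), 30_000);
}
```

Injecting the two side-effecting calls keeps the 30-second cadence and the per-email handling separately testable.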
An LLM decides if the email is an RFQ. If yes, a second LLM call extracts the customer, the line items (description, quantity, specs), the project name, and the due date — all into a strict JSON schema.
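The exact schema is not shown in the source; a plausible shape for the extraction payload, with a minimal guard before the result is queued, might look like this (all field names are assumptions):

```typescript
// Hypothetical shape of the strict JSON schema the extraction call fills.
type RfqExtraction = {
  customer: string;
  projectName: string | null;
  dueDate: string | null; // ISO 8601 date, if the email states one
  lineItems: {
    description: string;
    quantity: number;
    specs: Record<string, string>; // e.g. { material: "316L", size: "DN50" }
  }[];
};

// The LLM returns raw text; parse and minimally validate before queuing.
function parseExtraction(raw: string): RfqExtraction {
  const data = JSON.parse(raw) as RfqExtraction;
  if (!data.customer || !Array.isArray(data.lineItems)) {
    throw new Error("extraction missing required fields");
  }
  return data;
}
```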
For each line item: first try an exact SKU match. If there is none, run a vector search across the ~3,500 product variants, take the top 20 candidates, and have an LLM score each against a structured rubric. Pick the best, record its 0–100 score, and lock the match if the score is ≥ 85.
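The cascade above can be sketched as follows; the LLM scoring step is stubbed out as an injected function, and all names are illustrative rather than taken from the real codebase:

```typescript
// Sketch of the three-stage match cascade (hypothetical names; the LLM
// rubric scorer is passed in as a plain function).
type Variant = { sku: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Stage 2: vector search narrows ~3,500 variants to k candidates.
function topCandidates(query: number[], catalog: Variant[], k = 20): Variant[] {
  return [...catalog]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k);
}

function matchLine(
  lineSku: string | null,
  lineEmbedding: number[],
  catalog: Variant[],
  scoreWithLlm: (v: Variant) => number, // Stage 3: rubric score, 0-100
): { sku: string; score: number; locked: boolean } | null {
  // Stage 1: an exact SKU match short-circuits the pipeline.
  if (lineSku) {
    const exact = catalog.find((v) => v.sku === lineSku);
    if (exact) return { sku: exact.sku, score: 100, locked: true };
  }
  const scored = topCandidates(lineEmbedding, catalog).map((v) => ({
    sku: v.sku,
    score: scoreWithLlm(v),
  }));
  if (scored.length === 0) return null;
  const best = scored.reduce((a, b) => (b.score > a.score ? b : a));
  return { ...best, locked: best.score >= 85 }; // lock only at 85+
}
```

The ordering matters for cost: the exact match is free, the cosine pass is cheap, and only the 20 survivors ever reach the LLM.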
Search the ERP (read-only) for recent quotations from the same customer. If a similar one exists, surface it next to the new draft so the operator can see context.
Dashboard shows scored matches, easy wins, and gaps. Operator fixes any wrong matches, the system learns from the override, then sends a PDF quotation via email.
Live operation
107 days
Median time-to-quote
< 30 min
Catalog size
~3,500 variants
Daily LLM cost
$0.15–$0.40
Cache hit rate
~70%
These are not mockups. Every screenshot below is from the system running in production.






For anyone evaluating the system from an engineering angle: why these choices, and what was traded off.
Job queue, app state, embeddings, caches, audit log — all in one database. One backup, one connection pool, one metric set.
Embeddings find the right neighborhood fast and cheap. LLMs pick the right answer from 20 candidates. Putting them in series gives both.
Every match is scored 0–100. Auto-select fires only at 85 and above; below 30 the line is flagged "no match". Operators sort by score and triage from the bottom up.
Never write. A wrong write into the ERP costs far more than any benefit. Sync runs every 6 hours; staleness is acceptable.
Re-running a classification, a match, or a send must not create duplicates. The job table enforces this with a unique index.
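The idea can be sketched in a few lines; in Postgres it is a unique index on something like (job_type, entity_id) plus INSERT ... ON CONFLICT DO NOTHING, and here an in-memory Set stands in for the index (names are assumptions):

```typescript
// Idempotent enqueue sketch. The Set plays the role of the unique index;
// a second insert with the same (jobType, entityId) key is a no-op.
type Job = { jobType: string; entityId: string };

function makeQueue() {
  const seen = new Set<string>();
  const jobs: Job[] = [];
  return {
    enqueue(job: Job): boolean {
      const key = `${job.jobType}:${job.entityId}`;
      if (seen.has(key)) return false; // duplicate: ON CONFLICT DO NOTHING
      seen.add(key);
      jobs.push(job);
      return true;
    },
    size: () => jobs.length,
  };
}
```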
If I rebuilt this today I would skip the embeddings-as-JSON-arrays approach and use pgvector from day one. The cosine math in JS works at this scale but pgvector would let the catalog grow without re-architecting. I would also add a built-in evaluation harness so changes to the matcher can be measured before they ship.
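For a sense of what that migration buys: with pgvector the in-process cosine loop collapses into one indexed SQL query. Table and column names below are assumptions, not the real schema:

```typescript
// Schema side (run once):
//   CREATE EXTENSION IF NOT EXISTS vector;
//   ALTER TABLE product_variants ADD COLUMN embedding vector(1536);
// Query side: <=> is pgvector's cosine-distance operator, so ordering by it
// returns the nearest neighbors first.
const topKQuery = `
  SELECT sku,
         1 - (embedding <=> $1::vector) AS cosine_similarity
  FROM product_variants
  ORDER BY embedding <=> $1::vector
  LIMIT $2
`;
```

With an HNSW or IVFFlat index on the embedding column, this stays fast well past 3,500 rows.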
Share the workflow and the systems you use today. Within 24 hours we reply with scope, KPIs, timeline, and a SAR estimate.
Start now