I think one of the most important and interesting questions about using pgvector is performance, and in particular how it compares to Qdrant, Weaviate, etc. This post hints at that where it says "I presented a lightning talk called Vectors are the new JSON where I shared use-cases and some upcoming challenges with improving PostgreSQL and pgvector performance for querying vector data. Some problems to tackle (many of which are in progress!) involve adding more parallelism to pgvector, adding support for indexing for vectors with more than 2,000 dimensions, and leverage hardware acceleration where possible to speed up calculations.". But the post doesn't give any numbers. I tried to read the linked lightning talk, but it's on https://www.slideshare.net/, which is a really weird website now -- every few slides it tried to force me to watch a 30-second commercial!?
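For context on why parallelism and hardware acceleration matter here: the core of a vector query is a distance computation over float arrays, repeated for many candidate rows. A rough Python sketch of that per-row work (illustrative only, not pgvector internals -- a real index like IVFFlat or HNSW avoids scanning every row):

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine distance between two vectors (what pgvector's <=> operator computes)."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(query: np.ndarray, rows: np.ndarray, k: int) -> np.ndarray:
    """Brute-force nearest-neighbor scan: distance to every stored vector, keep k nearest.
    This whole loop is what SIMD and parallel workers would speed up."""
    dists = 1.0 - (rows @ query) / (np.linalg.norm(rows, axis=1) * np.linalg.norm(query))
    return np.argsort(dists)[:k]

# toy data: 1000 stored vectors of dimension 8
rng = np.random.default_rng(0)
rows = rng.standard_normal((1000, 8))
query = rng.standard_normal(8)
nearest = top_k(query, rows, k=5)
```

The `rows @ query` matrix-vector product is where hardware acceleration pays off, which is presumably what the quoted post is alluding to.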
I'm a big fan of Postgres, and my app backends are usually very Postgres-heavy. But yeah, I don't get why you'd want to use Postgres for AI inference unless there's some performance reason.
I'm assuming the cost/effort of standing up whatever customized Postgres instance supports efficient ML inference (i.e., not using the CPU alone) is more than orchestrating something else.
Someone has an opportunity to prepackage and even sell such a product (Postgres has a BSD-style license, so you can totally close the sources and sell it as your own product).
Also, I think there are many cases where smaller models can totally run on CPU.
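To illustrate that point: inference for a small model is just a couple of matrix multiplications, which a CPU handles fine at modest batch sizes. A minimal sketch with toy weights (not any real model):

```python
import numpy as np

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)

def tiny_mlp(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Forward pass of a one-hidden-layer network: two matmuls and a ReLU.
    At this scale, CPU inference is entirely reasonable."""
    return relu(x @ w1) @ w2

rng = np.random.default_rng(1)
w1 = rng.standard_normal((16, 32))   # input dim 16 -> hidden dim 32
w2 = rng.standard_normal((32, 4))    # hidden dim 32 -> 4 output scores
batch = rng.standard_normal((100, 16))
scores = tiny_mlp(batch, w1, w2)     # shape (100, 4)
```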
Right, but they can (and do) also prepackage ML inference services that don't involve Postgres. I think the small model use case would be the biggest reason, same as how I've sometimes used a Postgres table as a cache in a pinch.
> Right, but they can (and do) also prepackage ML inference services that don't involve Postgres.
but then you would need some data processing/warehousing infra integrated to produce datasets for inference and then do something with the inference results. Having one DB with everything packaged would reduce complexity significantly.
The ML training workflow could also be integrated into this DB, so you could have a few queries to generate data for training, train the model, generate data for inference, produce inference results, and do something with those results.
I was just looking at inference, meaning you need some separate infra to do the training either way. If the DB does the training too, then yeah. I think we're pretty far from that, and Postgres by itself might not be the platform for it. There are platforms with SQL-like interfaces that use databases underneath, but there's a lot happening on top.
I think it's all a few months' work, given that you can run arbitrary py/c code in a Postgres extension, so it's mostly a matter of data wrangling: convert a pg table to recordio, run train/inference, convert the results back to a pg table, and wrap the logic in pl/py or pl/c functions.
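The table → train → infer → table round trip described above can be sketched in plain Python. Here lists stand in for pg tables and a trivial threshold "model" stands in for real training; in an actual PL/Python extension these functions would run inside the database (all names here are hypothetical):

```python
def fetch_training_rows():
    # stand-in for: SELECT feature, label FROM training_data
    return [(0.2, 0), (0.4, 0), (0.6, 1), (0.9, 1)]

def train(rows):
    # toy "training": threshold halfway between the two class means
    neg = [f for f, y in rows if y == 0]
    pos = [f for f, y in rows if y == 1]
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2

def infer(model, features):
    # stand-in for a pl/py function applied per row, writing back to a results table
    return [1 if f >= model else 0 for f in features]

model = train(fetch_training_rows())
predictions = infer(model, [0.1, 0.8])
```

The real work in the extension is the (de)serialization at each boundary, not the control flow, which is why it reads like a data-wrangling project.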