Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Show HN: PgTyped – Typesafe SQL in TypeScript and Postgres (github.com/adelsz)
269 points by alde on May 25, 2020 | hide | past | favorite | 88 comments


There are some similar projects, like sqlx [1] for Rust. My problem with these is that they don't help to solve the actually hard problems.

While nice to have, preventing bugs with static SQL is usually easy to do by writing a few tests. Most of the SQL related bugs I have encountered were due to queries with dynamic/conditional joins, filters and sorting - and almost every project using a database needs those.

Approaches like this don't help there. That requires heavy-weight solutions that are more cumbersome to use and need a strong type system, like diesel [2] (Rust), Slick [3] (Scala) and some similar Haskell projects.

[1] https://github.com/launchbadge/sqlx

[2] https://github.com/diesel-rs/diesel

[3] https://scala-slick.org/


> preventing bugs with static SQL is usually easy to do by writing a few tests

I've heard the same argument about TypeScript vs JavaScript and it's something dynamic typing proponents often say but in practice I find immense value in having the types autocompleted and checked in the editor - and I've worked plenty on both sides, current project is substantial RoR codebase, I've worked with Python and node.js backends on mature codebases. Eventually all these languages have some sort of static type hinting efforts to improve tooling - typescript being most successful.

The best thing I saw in this space was F# type providers which didn't require a pre-build step - the language had a mechanism for writing custom type providers that would look up the data source during compilation - unfortunately I didn't get to use it on any real world projects.


F# also has support for analyzers that can achieve similar functionality in case you don't want to take a dependency on a type provider.

https://github.com/Zaid-Ajaj/Npgsql.FSharp.Analyzer

https://github.com/aaronpowell/FSharp.CosmosDb#fsharpcosmosd...


ORMs like Diesel are definitely very useful. The problem I have with them is that their ORM abstraction often leaks. Fixing these abstraction leaks is a hard problem [1]. Ofc, there has been attempts to reconcile relational DBs with OOP like languages, but they are not very popular. [2]

PgTyped and some similar libs try to solve a simpler problem (typing static queries) and can be used to build more complex solutions when needed.

Writing query result/param type assertions by hand and using tests to guarantee type synchronization between DB and code wasn't maintainable on most projects I have seen.

[1] https://en.wikipedia.org/wiki/Object-relational_impedance_mi...

[2] https://en.wikipedia.org/wiki/The_Third_Manifesto


I may just not have enough context on the issue, but i've not had significant issues with SqlAlchemy's hybrid ORM/Query Builder approach.


While I like SQLAlchemy a lot, it has to be said that it leaks like a sieve.

Personally I think the problem is not that ORMs are leaky abstractions, it's that they often pretend not to be. SQLAlchemy just leaks unapologetically, if you struggle with ORM you can drop right down to using an DSL to construct SQL or just write SQL directly without feeling like you have to fight the ORM and mess with undocumented internals you shouldn't be touching.


Actually, after my quick scan of the readme, I think pgTyped takes a different approach than the other tools listed. It is a YeSQL-style tool with build-time code generation. The code generation is based on prepared statement metadata rather than table metadata. Like other YeSQL tools, the function name comes from a DocComment annotation in the .sql but the query params and the result set columns come from the prepared statement.


I'd be curious to hear more about the issues with joins, and dynamic conditionals. I've been working on a type provider lib for sql in kotlin[1].

The join problems I've seen are either the joined table has changed, or altered. The return type of a field may change.

The hard problem I encountered was doing things like json aggs, multiple joins, etc. I'm trying to address this by doing type safe aggregate/join functions. Secondly is query compilation. Compiling the output record of advanced queries into an automatic data class.

1: https://gitlab.com/AnimusDesign/kotlin-frm


While the scenario you present is a legitimate reason for using a query builder, it doesn't justify the investment in using a query builder for the vast majority of db calls. The edge case is not an appropriate goal to impose the cost of using a DSL for the majority of cases. Instead, custom roll an implementation for those complicated calls and stick with a parameterized sql library.


> While nice to have, preventing bugs with static SQL is usually easy to do by writing a few tests

Why not both?


I really like the unique approach of the annotated SQL files and can definitely see some use cases where it would be good to declutter the SQL from the code. For me personally, I'd be hesitant to add another build tool to my already bloated toolchain. Could create a special Babel-style "import" type that automatically transforms your code (JIT)? It could remove some of the friction in adoption (for Babel users at least).

Another one in a similar vein with strict typing and really nice SQL interpolation for Postgres: https://github.com/gajus/slonik


Can't say enough good things about slonik. Have been using it in production for a while now and I love it. For people like me who believe "abstracting away SQL" is a mistake, but still want protections against SQL injections and a nice API, slonik is a godsend.


Same, I really like this approach. Some of the benefits are:

- better separation of concerns

- better integration with SQL tools (syntax highlighting, autocompletion, etc)

- way easier to run/test/debug your queries into a database client

- better languages analysis of your projects (e.g. % of SQL in your GitHub/GitLab repo)

--

If anyone interested in applying this approach in your Python projects, I recommend this package: https://github.com/mcfunley/pugsql


I wonder if the JIT compiling will work asynchronously - from the readme it's getting the types from the live database schema.


Its actually doing the with what looks like a custom async messaging queu. Pretty cool.


I came here to mention a similar approach, which last time I looked was a very compelling experiment[1], but its original author has actually built out a real library, Zapatos[2] which looks very very good.

[1]: https://github.com/jawj/mostly-ormless

[2]: https://jawj.github.io/zapatos/


Looks like Zapatos still requires the user to manually specify param/result types for custom SQL queries?


Zapatos author here. Yes, it does. But for most of what you’d use an ORM for, you probably won’t need custom queries.


Hey I’m not sure this is the best venue, but I’m trying to make the case for getting my org off of sequelize, and your library is right in line with my goals. The hardest sell is going to be publicly visible test coverage. Would you welcome a dedicated effort from an early adopter to introduce tests?


Yes, that would be very welcome. I suggest you keep me in the loop from the start to make sure we end up with something we’re both happy with.


As a maintainer of a similar project[0], it's great to see another entry in this space.

sqlc currently has great support for Go and experimental support for Kotlin. I'm planning on adding TypeScript support in the future, so it's great to see that others in the TypeScript community find this workflow useful.

[0] https://github.com/kyleconroy/sqlc


Looks pretty cool. What I really want though is a library that let's me write plain SQL queries which are then mapped into nested objects in a smart way without too much manual work (I know Postgres can do JSON stuff, but the queries look pretty complicated for what little they actually do).

Say `SELECT * FROM user LEFT JOIN post ON user.id = post.id` would be mapped to `[{userId: 1, name: renke1, posts: [{postId: 2, title: "foo"]]`.

You probably need some kind of meta data to figure out how tables and thus objects relate to each other though.

Basically, I want to be able to leverage the full power of modern databases without being constrainted by typical ORM limitations. Also, I don't need features like lazy loading, sessions, caches and things like that.

A great advantage is that you can (provided you have some test data) easily test your queries while you develop a new feature (think IntelliJ IDEA where you can simply execute an SQL query on the fly).


GrahpQL might be an answer.Though an incomplete answer at this point due to a mismatch between plain GraphQL and SQL;

And the real issue is how to define and where to place a single source of truth for the schema an operations. So far we saw approaches where:

- GraphQL schema is generated from SQL tables. Makes total sense for a project or a company that looks to capitalize on customers with existing databases (e.g., PostGraphile, Hasura);

- SQL schema is generated from a GraphQL schema;

- SQL schema and TypeScript CRUD resolvers are generated from GraphQL schema (graphback);

- a language is introduced and GraphQL and SQL are generated from that language (Prisma);

- a library and a set of decorators are used to define both GraphQL schema and a typed ORM schema within a standard language (e.g, TypeGraphQL + TypeScript + some ORM such as TypeORM).


The system that I’ve most enjoyed using recently has been hasura. It’s basically a plug and play graphql api that sits on top of a Postgres db. I love it. You can call your database directly from your clients, with row level permissions, and real time subscriptions. It’s just great.

hasura.io


You never mentioned the language, so maybe Dapper? https://github.com/StackExchange/Dapper/


Writing literal SQL in JS would IMO need more tools than just a preprocessor like you describe.

The few times I tried it (mostly in tests to check that the ORM is working properly) the #1 thing I was missing is a prettier-plugin that automatically formats SQL in the same way it currently works for `html` tagged templates.

I completely agree to the 'constrained by ORM' and 'useless features' part though. Postgres `json_agg` is a godsend and I love to be able to reason over simple joins and queries.

BTW, my own approach to use `json_agg`, `json_build_object` and json columns within a typesafe query-building DSL is this: https://github.com/hoeck/typesafe-query-builder

But its mostly for replacing simple ORM fetches, it wont do complex analytical queries. For that I'd like to write SQL directly as query-DSLs tend to quickly stop being usable in that situation.


One of my favorite features of WebStorm is the (official) database plugin which highlights SQL queries inside JS strings AND has autocomplete and refactoring support that actually uses the live database schema.


But unfortunately returned values have no type definition unless manually provided.


You should check objection.js[1] as I think it would get you what you describe and probably more. It is a relational query builder built on top of knex, which is a query builder.

[1] https://vincit.github.io/objection.js/


I wrote the original typescript bindings for objection. One of the library authors is a contributor to knex.js, which objection uses for query building. Both are quite nice.

With either library, you aren't writing naked SQL, but something like `query.where("updated_at", ">", Date.now())`. Tsc comes into play when you send typed objects into your .where, .insert, ... methods and what you get back from the query.

It's close enough to SQL that you don't lose expressiveness, you're not behind walls of magick due to orm, and you don't have to worry about Bobby Droptables.


Thanks, I'll check it out. But on first glance it doesn't seem to let me write actual SQL queries. Also, it doesn't seem to be typesafe, or is it?


It says it has typescript support.


I didn't check those typings, but I don't think the model itself is type safe from the looks of it.


It is. See my ggp reply.


We’ve accomplished this with a combination of postgraphile and graphql-codegen. Our setup is basically the same as the postgraphile starter app. [0]

It’s pretty awesome.

[0] https://github.com/graphile/starter


If you were willing to give up a bit of magic, you could probably build this as a thin layer over PgTyped.

The API could be something like this:

Query.sql

  SELECT * FROM user LEFT JOIN post ON user.id = post.id
Application.ts

  const results = await Query()
  const nested = nest(results, {
    parentFields: ['userId', 'name'],
    childFields: ['postId', 'title'],
    childName: 'posts'
  )
If you wanted, you wouldn't really have to specify child fields, since they'd just whatever wasn't a parent field. It'd take a bit more work to get it to do multiple levels of nesting, but after a point it doesn't make sense to write queries that return so much duplicate data anyway.


> but after a point it doesn't make sense to write queries that return so much duplicate data anyway

This is what I'm constantly wondering. At what point does it stop being good to return the user table results again and again and just switch to, for example, an IN query to get the posts?


Zapatos can generate the hairy JSON stuff for you, including lateral joins that are equivalent to your example query, which would be:

    const result = await db.select(‘user’, db.all, { lateral: { posts: db.select(‘post’, { userId: db.parent(‘id’) }) } }).run(pool);
And result will have the structure you asked for, and be automatically typed as such.

(Sorry, can’t manage helpful indentation from my phone).

See: https://jawj.github.io/zapatos/index.html#joins-as-nested-js...


Thanks, Zapatos looks really nice. That library seems to be basically what I want (aside from writing SQL for joins, but I think it's just not possible in the way I imagined). The documentation is really nice. Much of what is written in it aligns with my thinking. I'll definitely give it a try!


I just don't get why people write queries like that in a convoluted way which is hard to decipher and possibly optimize with so many noises whose knowledge becomes useless on the next language/framework of choice than a straight forward SQL whose knowledge can live for decades.


I’m not exactly sure what you’re getting at here. The whole point of this library is that one call = one predictable query, and the SQL it generates is certainly no more convoluted than required to produce the desired result.


What was so hard to grasp with my comment?

If your query was meant to say,

SELECT * FROM users JOIN posts ON posts.userId = users.id

the ORM has so many unnecessary characters just to get to this straight forward SQL and you have to learn how to make sure the way you write will generate the SQL that you want that uses the proper index everytime you change framework is just something I wouldn't do.

I didn't say the SQL generated is convoluted but the way you need to write so many characters that doesn't hold a meaning is convoluted.


The problem is that there's no obvious, simple way to bring the results of this query, structured and typed appropriately (with posts nested inside users), into our running code.

So we have to choose what we think is the least-worst compromise.

Since I wrote the library I am obviously anything but impartial, but the compromise I outlined above is one I'm pretty happy with (and it's much easier to read if you add a few line breaks). It's a bit noisier than your JOIN query, but for the price of that noise you get type safety and an appropriate JSON structure for the result.


Thanks! A grouping feature will definitely be useful, I have been thinking about a good way to add it to pgtyped.

Will grouping fields by tables they belong to good enough? Or is there some different grouping logic you have in mind?


That's hard to say, because I am not sure what I really want.

…but let's say I have this result (from an arbitrary query).

  | user.id | user.name | post.id | post.title |
  +---------+-----------+---------+------------+
  | 1       | renke1    | 1       | first      |
  | 2       | alde      | 2       | second     |
  | 2       | alde      | 3       | third      |
Now I would like to tell the library: hey, an user can have many posts (1:n), please map this to nested objects.

Of course I don't want to write `SELECT user.id, user.name … FROM …` but just `SELECT * FROM …` (because a table may have a dozen of columns and I don't want to spell out every single one). So the query might have to be rewritten on-the-fly to make the correct projection (otherwise it would be hard to know to which object a value belongs).

I am not sure if that's something your library should do though.

And thinking even more about it, I think this approach wouldn't really work for views (and probably other things) where it's not really clear from which tables the data actually comes from (at least not by only looking at the query).

I guess what I really want is library that takes my SQL query, reads my mind and gives me back some nested objects… and let's not talk about inserts…


but select * is an anti-pattern. SQL queries should only return the columns that you need not that you might need -- usually. Lazy loading and sessions etc., like SQLAlchemy does to get around the N+1 query problem be damned.


I didn't mean the `SELECT STAR` in a literal sense, but more like, please select only the stuff we need. But indeed with PgTyped, as far as I understand it, it wouldn't work because it creates interface from SQL queries.

To achieve what I want you need to do it like all the other ORMs where you have some kind of description of your model and how it maps to tables and columns. Unlike the usual ORM I want to write SQL queries which are then rewritten in an intelligent manner.

So something like query(`SELECT STAR FROM user LEFT JOIN post …`, UserWithPostsModel). Since the library would know the target model, it could rewrite the `SELECT STAR` to something that only asks for the data it needs.

In other words I want to execute arbitrary queries that a mapped into ad-hoc models (unlike typical ORMs where the model usually maps directly to tables).

  STAR = *


Ahh I understand you now.


Postgres is pretty good at JSON. I think this would work for your example (assuming the implied schema), presumably in PgTyped too:

SELECT user.*, json_agg(post) as posts FROM user LEFT JOIN post ON user.id = post.id GROUP BY user.id

It's not as simple as you'd like (and you have to wrap the whole thing in a subquery + another json_agg if you really want Postgres to return a single JSON string) but it's not actually a lot more complicated, and avoids having to rely on magic that's likely brittle and hard to debug (or only really useful in a small subset of cases).


It exists! You want Scala Slick plain SQL http://scala-slick.org/doc/3.0.0/sql.html

See tsql string interpolation with typed result sets!


You'd be constrained by whatever is trying to parse SQL queries to figure out what object structure to return instead.



I mean no disrespect to the TypeORM author(s), but I was burned several times by unexpected behavior that resulted in serious data integrity issues. I would caution that anyone adopting this library test any usage heavily, especially anything to do with relations


Yes. My brief experience with TypeORM made me feel like it was written by people who really understand TypeScript, but maybe don’t understand SQL so well. It was a brief experience because I quickly lost trust in it.


What version did you use? Have the issues been raised and addressed? Curious because I am using it in production and now I'm worried. Would love more information about reproducing the issues as well.


MongoDB support is a work in progress for them too but otherwise a fun library.


Is there anything like this for Rust or C++?

I like the idea of code generation instead of doing the work at runtime (like in ORMs). This is like making your database schema the IDL spec.


There's Diesel[0] for Rust which is a full ORM. It's by Siân Griffin[1] who, as I understand it, is also behind a lot of how rail's ActiveRecord works.

0: https://diesel.rs/

1: https://twitter.com/sgrif


Just to clarify a bit for other readers since I've worked with diesel for while, diesel isn't a "full" orm, as there are no real helpers provided to you outside of "we can map the result of a db query into a struct(s) that you specify" and some really nice guarantees for compile time queries. Other than that, your struct is a pretty dumb mapped representation and it's on the implementers of the application code to provide sugar for better access patterns. For people coming from something like active record, this is (in my opinion) closer to Arel than ActiveRecord, or closer to sqlalchemy core than sqlalchemy orm. As an example, you won't necessarily be able to do `MyStruct.join(OtherStruct)` and have it magically figure out how to query the database and map the results out of the box.


Clarification: compile time query building, not querying. Due to inlining from the compiler, you can almost entirely construct the query at compile time and shave it down to a few string concatenations.


https://github.com/launchbadge/sqlx perhaps

Also hugsql for clojure and pugsql for python.


Sqlx looks nice! I had some ideas about porting pgtyped to rust and utilizing macros to do query type inference at build time, but was worried that such db-connected macros will slow down the build process. Nice to see that it worked out for sqlx.


Lots of comments here about similar projects in a different language, but the fact that this targets TypeScript is explicitly what makes it interesting to me. Using regular Javascript database libraries, even ones that have type definitions, require a lot of double typing.

I've been relatively satisfied with TypeORM, but one thing that's been a hurdle for me to some extent is its reliance on experimental decorators, and the resulting incompatibility with Babel - which in turn makes it harder to integrate with the wider ecosystem, e.g. Next.js.

As far as I can see on first glance, there's nothing here yet that makes it incompatible with Babel, so my tip would be to make it an explicit goal to keep it that way :)


As usual for Babel, there’s a plugin for that: https://github.com/leonardfactory/babel-plugin-transform-typ...


Yeah, but then you still have the same problem of ecosystem divergence. (In addition to the fact that I don't like using unstandardised features in the first place...)


Opening a can of worms for sure, but the reliance on Babel in the JS community is not a good thing to me. It's another reason why I prefer TypeScript, as in TSC, not whatever equivalent babel happens to support.


I do agree that sticking to TSC keeps things a lot simpler, and that should be the default setup.

Pandora's box has been opened though. Majority of the JS community seem to have adopted Babel in their workflow.

Even as I question it, I must admit it's pretty sweet to have a transpiler in the buid toolchain. It frees one from the browser/backward compatibility question, more or less, and opens up the language to be extensible - for better or worse.

For example, Babel macros ¹ is an interesting concept, using the language to extend the language during compile time.

To bring it back to the topic at hand (PgTyped, Typesafe SQL) - one thing that I'd like to see in TSC is a way to output type definitions (either inlined metadata or external JSON schema), to be consumed for run-time type checking. That would "complete the circle" for me.

TypeScript "compiler plugins" seem to be on the roadmap, or at least under consideration ². Some may see that as opening a proverbial can of worms, down a similar road to Babel. I wouldn't disagree, but it'd be so useful!

¹ https://github.com/kentcdodds/babel-plugin-macros

² https://github.com/Microsoft/TypeScript/issues/16607


Oh agreed on all fronts :). Being able to add plugins is amazing, and there have been times I've almost added ttypescript for its plugin support. But then I remember the road that leads you down, and avoid it.


You might find io-ts [1] interesting. It allows to write composable parsers that do both runtime checks and have correct result types.

Opt-in runtime checks for SQL queries are also on the roadmap for PgTyped. You are welcome to open an issue to track our progress there if you think this feature will be useful for you.

[1] https://github.com/gcanti/io-ts


Thank you - yes, I've dabbled with io-ts, though I'm still learning how to wield its power.

> Opt-in runtime checks for SQL queries

Ah right, I do see in the repo for PgTyped that it's on the roadmap. That does sound useful!

I'm hoping that one of these days, TypeScript will support a more generic runtime type check, somehow closer to the language/TSC-level rather than userland libraries, plugins/transforms, or a parallel compiler/build step.

At the moment, I use a fork of typescript-json-schema ¹ for my purposes.

¹ https://github.com/YousefED/typescript-json-schema


Would be interesting to hear what about it it is that makes it a bad thing - unless it's just the fact that it's a single project?


I'm in the camp that Babel 6 was "useless by default" in the bad sense. It's great that when you use tsc you know exactly what features are supported or not, and don't have to change anything, or worry about compatibility between things. And I'm starting to all the babel-preset-* packages have just created a bunch of busywork for the maintainers.

Then there's the issue of tooling requiring Babel. react-hot-loader should not require Babel. graphql tooling should not require Babel.

I'd argue when you're knee-deep in Babel, you're no longer writing JS, you're writing BabelScript. Only your BS capabilities are different than Jane's BS capabilities, which are also different from Bob's setup over there.


Ah, I hear you - I'm not a fan of enabling a bunch of arbitrary Babel plugins either. But many metaframeworks (Next.js, Create React App) use Babel behind the scenes and use its TypeScript plugin for out-of-the-box TypeScript support. That's very useful, as long as the libraries you use don't use any of the Babel-incompatible features - which is most of the time, but not for TypeORM.

So yes, it's exactly because I don't want to be configuring all kinds of language support for my projects that I'd recommend being able to work with (plain!) Babel with a TypeScript plugin.


My favorite SQL library has been Go-Jet in Go: https://github.com/go-jet/jet

It has a different approach from PgTyped, which generates type-safe TypeScript code from SQL, whereas Go-Jet generates type-safe SQL from Go code

I'd love to try something along the lines of PgTyped and see how the two solutions compare though


Almost every top-level comment is someone shilling another project, usually in Golang as if that's even related.

Let's have some Show HN etiquette.


This is similar to sqlc for Golang: https://github.com/kyleconroy/sqlc

If you're looking for the ability to generate type-safe SQL – given you write SQL correctly – this project is pretty good.

Aalso a fan of SQLBoiler (https://github.com/volatiletech/sqlboiler) for Golang, for simple type safety:

`models.Accounts(models.AccountWhere.ID.EQ(id)).One(ctx, db)`.

Though SQLBoiler breaks with left joins, as it auto-generates your structs and maps results 1-1 with table definitions. In this case you have to custom type something, either using sqlc or squirrel.


Looks cool! And the header image looks awesome! Did you use any tool to do it?


Thanks! Not really, just basic vector shapes and an isometric projection grid to make sure perspective is right.


You might also want to check Kanel out! https://github.com/kristiandupont/kanel


Looks a little like the Typescript equivalent of Xo (https://github.com/xo/xo) for Go. Especially with Go, getting help with some initial scaffolding can be a huge timesaver. I'm assuming it's a similar gain for Typescript.


A better equivalent in Go is https://github.com/kyleconroy/sqlc


thanks! I'll try this - one of my pet peeves with Xo is handling nullable types and more advanced types like JSONB required editing the generated code significantly to make it work. Hopefully sqlc solves that.


So basically Dapper TS? I’m very interested!


I sort of have something like this for PHP and MySQL. https://github.com/ellisgl/GeekLab-GLPDO2


How is it different from Slonik? https://github.com/gajus/slonik


It’s cool, and I don’t fault the author for working on something that obviously gives him joy, but save yourself a bunch of trouble and avoid this kind of thing. The queries showcased are the least interesting of the set of queries you’ll ultimately end up with I’m a mature project.


an other lib with light ORM: https://www.npmjs.com/package/pogi


lol. This reminds me Hibernate xml mappings :D




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: