> the software implementation is much less trivial Aren't most geospatial tools ...

jandrewrogers · 2025-05-04T04:08:40 1746331720

Many do use simple geometry. This causes endless headaches for people who are not cartographers, because they don’t expect it. The good geospatial tools usually support spheroidal models, but that is not the default; you have to know to enable it explicitly (many people assume it is the default).

An additional issue is that the spheroidal implementations have undergone very little optimization, perhaps because they are not the defaults. So when people figure out how to turn them on, performance is suddenly terrible. Now you have people that believe spheroidal implementations are terribly slow, when in reality they just used a pathologically slow implementation. Really good performance-engineered spheroidal implementations are much faster than people assume based on the performance of open source implementations.
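To make the gap between "simple geometry" and a curved-Earth model concrete, here is a minimal Python sketch. It is an illustration only, not any tool's actual code: the function names are hypothetical, and the curved case uses a spherical haversine with a mean Earth radius as a stand-in, which is itself simpler than the spheroidal models discussed above.

```python
import math

# Assumption: mean Earth radius in meters; a sphere, not a spheroid.
EARTH_RADIUS_M = 6_371_008.8


def planar_degrees_distance(lon1, lat1, lon2, lat2):
    """Naive Euclidean distance that treats lon/lat as x/y (result in degrees)."""
    return math.hypot(lon2 - lon1, lat2 - lat1)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters on a sphere (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # One degree of longitude at the equator and at 60 degrees north.
    for lat in (0.0, 60.0):
        planar = planar_degrees_distance(0.0, lat, 1.0, lat)
        curved = haversine_m(0.0, lat, 1.0, lat)
        print(f"lat={lat:5.1f}  planar={planar:.3f} deg  great-circle={curved / 1000:.1f} km")
```

The planar calculation reports the same length for both segments, while the great-circle length at 60°N is roughly half of the value at the equator (about 55.6 km versus about 111 km on this sphere).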

jofer · 2025-05-04T14:34:01 1746369241

For what it's worth, you _can't_ use spherical approaches for most data. They're only used for points, in practice. Your spatial data is inherently stored/generated in ways that don't allow spherical approaches as soon as you start working with polygons, let alone things like rasters.

Yes, spherical representations of polygon data exist, but the data you import has already been "split" and undoing that is often impossible, or at best non-trivial. And then rasters are fundamentally impossible to represent that way.

Analysis uses projections for that reason. Spherical approaches aren't fundamentally "better" for most use cases. They're only strictly better if everything you're working with is a point.

There's more to geospatial than point datasets.

groggo · 2025-05-04T19:44:06 1746387846

A good point. Certainly for raster analysis it doesn't make sense.

But any type of vector data could be modeled on a sphere, right? Points, shapes, lines. And I say "better" because even the best-suited projection will have some small amount of distortion.

Either way, most things use planar geometry, so projections are necessary, and you need to have some understanding of how all that works.

jofer · 2025-05-04T22:18:25 1746397105

You can model polygons on a sphere, but the issue is that the data you're starting with is already in a cartesian representation. You actually can't easily convert between the two for complex geometries in cases where they cross the antimeridian/poles. So trying to do anything other than points is difficult in practice, unless you're natively generating data from scratch in a spherical representation, which is rare.
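A small Python sketch of why the antimeridian is awkward for data stored as cartesian lon/lat. The helper names are hypothetical, and this only looks at the longitude extent of a ring; real conversion of complex geometries involves much more than this.

```python
def naive_lon_extent(lons):
    """Longitude width computed as if lon were a plain cartesian x coordinate."""
    return max(lons) - min(lons)


def wrapped_lon_extent(lons):
    """Width of the smallest longitude arc that contains every point.

    Computed as 360 minus the largest empty gap between sorted longitudes,
    including the gap that wraps around the antimeridian.
    """
    s = sorted(lon % 360.0 for lon in lons)
    gaps = [b - a for a, b in zip(s, s[1:])]
    gaps.append(s[0] + 360.0 - s[-1])  # gap across the wrap-around
    return 360.0 - max(gaps)


if __name__ == "__main__":
    # Longitudes of a small box straddling the antimeridian (179E to 179W).
    ring_lons = [179.0, -179.0, -179.0, 179.0]
    print(naive_lon_extent(ring_lons))    # cartesian reading of the ring
    print(wrapped_lon_extent(ring_lons))  # reading on the circle of longitude
```

Read as cartesian coordinates, the 2-degree-wide box appears to span 358 degrees, which is the kind of ambiguity that has to be resolved (and often cannot be, once the data has been split) before a spherical representation is usable.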

Demiurge · 2025-05-04T02:19:48 1746325188

This is not really a problem, unless you’re trying to simulate 3D orbits or physics. The crossover from geo INFORMATION systems to geo simulation systems is a bit rough, but projections and calculations in projected cartesian space are enough for many typical questions, like distance, area, routing. However, even topology support starts getting specialized, and the use cases are more niche. I think it’s asking a bit too much from a database/storage layer to do efficient calculations outside of those supported by GEOS. At this point, you might want to import the relevant data into higher level applications.

jandrewrogers · 2025-05-04T04:25:29 1746332729

Speaking for myself, I was not referring to any kind of simulation systems. This is a standard requirement of many operational geospatial data models, and there are a lot of these in industry. Anything that works from a projection is a non-starter; it causes demonstrable issues for geospatial analysis at scale or if any kind of precision is required. Efficient calculation just means efficient code; there is nothing preventing this from existing in open source beyond people writing it. Yes, you may be able to get away with it if your data model is small, both geographically and in data size, but that does not describe every company.

It is entirely possible to do this in databases. That is how it is actually done. The limitations of GEOS are not the limitations of software; it is not a particularly sophisticated implementation (even PostGIS doesn’t use it for the important parts, last I checked). To some extent you are affirming that there is a lack of ambition in this part of the market in open source.

urschrei · 2025-05-04T11:16:16 1746357376

I wouldn't say it's correct to say that GEOS isn't particularly sophisticated. A lot of (certainly not all) GEOS algorithms are now ported from JTS, the primary author of which is Martin Davis (aka Dr JTS), who works at Crunchy Data, who provide the PostGIS extension. So the chain (again, mostly) goes JTS -> GEOS -> {PostGIS, Shapely} -> … . Martin's work is at the cutting edge of open-source GIS-focused computational geometry, and has been for a long time (of course, industry has its own tools, but that's not what we're talking about).

I can sort of see your point about the merits of global, spheroidal geometry, certainly from a user's perspective. But there's no getting around the fact that the geometry calculations are both slower (I'm tempted to say "inherently"…) and far more complex to implement (just look at how painful it is to write a performant, accurate r- or r*-tree for spherical coordinates) along every dimension. That's not going to change any time soon, so the projection workflow probably isn't going anywhere.

Demiurge · 2025-05-04T13:36:53 1746365813

You're right in that pretty much anything can be done via an API exposed via a database function. However, as they say... if it can be done, does that mean it should? Now, I agree that having more sophisticated multi-dim calculations would be cool, but I've just rarely run into needing or even wanting to do this, over many projects, some involving accurate simulations. In practice, the database has always been for storing and querying data, which can be extremely accurate. I am probably the first person to abuse SQL and I've written some 3D ECEF rotation code in SQL, but it was a hack for a deadline, not because it was the right thing to do. All the projects I've worked with had external models or components that did the "precise work" using complex code that I would never dare to make dependent on any database.

I'm actually curious, speaking for yourself, what kind of analysis you're doing where something like NAD83 or UTM does not give you enough precision? Is this actually "real world" geospatial data? If I have a soil model, I have a very localized analysis, and if I have a global climate model, we're talking kilometers for grid cells. In all these cases, the collected data has built-in geolocation error MUCH greater than most decent projections...

So, what analysis are you doing where you need centimeter precision at a global scale of thousands of kilometers? Sounds really interesting. The only time I've seen this is in space flight simulations, where the error really accumulates into the future.
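For a rough sense of scale on the UTM side of this question, here is a Python sketch of the point scale factor of a transverse Mercator projection. It is an approximation under stated assumptions: it uses the spherical formula k = k0 / sqrt(1 - B^2) with B = cos(lat) * sin(dlon), whereas real UTM is defined on an ellipsoid, and the function names are hypothetical.

```python
import math

# UTM applies a scale factor of 0.9996 on each zone's central meridian.
UTM_K0 = 0.9996


def tm_scale_factor_sphere(lat_deg, dlon_deg, k0=UTM_K0):
    """Point scale factor of a transverse Mercator projection on a sphere.

    dlon_deg is the longitude offset from the central meridian.
    Spherical approximation, not the ellipsoidal formula used by real UTM.
    """
    b = math.cos(math.radians(lat_deg)) * math.sin(math.radians(dlon_deg))
    return k0 / math.sqrt(1.0 - b * b)


def scale_error_ppm(k):
    """Linear scale error in parts per million relative to true scale (k = 1)."""
    return (k - 1.0) * 1e6


if __name__ == "__main__":
    # Equator: central meridian, zone edge (3 degrees), and well outside a zone.
    for dlon in (0.0, 3.0, 10.0, 30.0):
        k = tm_scale_factor_sphere(0.0, dlon)
        print(f"dlon={dlon:5.1f} deg  k={k:.6f}  error={scale_error_ppm(k):10.1f} ppm")
```

Under this approximation the linear scale error at the equator stays within roughly -400 to +1000 ppm inside a single 6-degree zone, and it grows quickly once data extends well beyond the zone, which is where the two positions in this thread diverge.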