More

exDM69 · 2026-01-11T09:23:38 1768123418

Using printf in shaders is awesome, it makes a huge difference when writing and debugging shaders. Vulkan and GLSL (and Slang) have a usable printf out of the box, but HLSL and D3D do not.

Afaik the way it works in Vulkan is that all the string formatting is actually done on the CPU. The GPU writes only writes the data to buffers with structs based on the format string.

All the shader prints are captured by tools such as Renderdoc, so you can easily find the vertex or pixel that printed something and then replay the shader execution in a debugger.

I only wish that we would've had this 20 years ago, it would have saved me so much time, effort and frustration.

pjmlp · 2026-01-11T19:49:11 1768160951

Maybe because Pix is quite good?

exDM69 · 2026-01-12T10:04:42 1768212282

Using printf and debuggers are complementary.

Finding which pixel to debug, or just dumping some info from the pixel under mouse cursor (for example) is better done with a simple printf. Then you can pick up the offending pixel/vertex/mesh/compute in the debugger if you still need it.

You get both, a debugger and printf related tooling in Renderdoc and it's better than either of those alone.

I've been writing a lot of GPU code over the past few years (and the few decades before it) and shader printf has been a huge productivity booster.

exDM69 · 2026-01-08T12:13:13 1767874393

Maybe this would be a suitable application for "Fibonacci hashing" [0][1], which is a trick to assign a hash table bucket from a hash value. Instead of just taking the modulo with the hash table size, it first multiplies the hash with a constant value 2^64/phi where phi is the golden ratio, and then takes the modulo.

There may be better constants than 2^64/phi, perhaps some large prime number with roughly equal number of one and zero bits could also work.

This will prevent bucket collisions on hash table resizing that may lead to "accidentally quadratic" behavior [2], while not requiring rehashing with a different salt.

I didn't do detailed analysis on whether it helps on hash table merging too, but I think it would.

[0] https://probablydance.com/2018/06/16/fibonacci-hashing-the-o... [1] https://news.ycombinator.com/item?id=43677122 [2] https://accidentallyquadratic.tumblr.com/post/153545455987/r...

SkiFire13 · 2026-01-08T12:47:17 1767876437

> This will prevent bucket collisions on hash table resizing

Fibonacci hashing is really adding another multiplicative hashing step followed by dropping the bottom bits using a shift operation instead of the top bits using an and operation. Since it still works by dropping bits, items that were near before the resize will still be near after the resize and it won't really change anything.

attractivechaos · 2026-01-08T14:17:41 1767881861

Exactly. And khashl uses Fibonacci hashing. Without salting, it has the same problem.

exDM69 · 2026-01-07T17:51:07 1767808267

> why is it not trivial to add a path stage as an alternative to the vertex stage?

Because paths, unlike triangles are not fixed size or have screen space locality. Paths consist of multiple contours of segments, typically cubic bezier curves and a winding rule.

You can't draw one segment out of a contour on the screen and continue to the next one, let alone do them in parallel. A vertical line segment on the left hand side going bottom to top of your screen will make every pixel to the right of it "inside" the path, but if there's another line segment going top to bottom somewhere the pixel and it's outside again.

You need to evaluate the winding rule for every curve segment on every pixel and sum it up.

By contrast, all the pixels inside the triangle are also inside the bounding box of the triangle and the inside/outside test for a pixel is trivially simple.

There are at least four popular approaches to GPU vector graphics:

1) Loop-Blinn: Use CPU to tessellate the path to triangles on the inside and on the edges of the paths. Use a special shader with some tricks to evaluate a bezier curve for the triangles on the edges.

2) Stencil then cover: For each line segment in a tessellated curve, draw a rectangle that extends to the left edge of the contour and use two sided stencil function to add +1 or -1 to the stencil buffer. Draw another rectangle on top of the whole path and set the stencil test to draw only where the stencil buffer is non-zero (or even/odd) according to the winding rule.

3) Draw a rectangle with a special shader that evaluates all the curves in a path, and use a spatial data structure to skip some. Useful for fonts and quadratic bezier curves, not full vector graphics. Much faster than the other methods for simple and small (pixel size) filled paths. Example: Lengyel's method / Slug library.

4) Compute based methods such as the one in this article or Raph Levien's work: use a grid based system with tessellated line segments to limit the number of curves that have to be evaluated per pixel.

Now this is only filling paths, which is the easy part. Stroking paths is much more difficult. Full SVG support has both and much more.

> In fact, you could likely use the geometry stage to create arbitrarily dense vertices based on path data passed to the shader without needing any new GPU features.

Geometry shaders are commonly used with stencil-then-cover to avoid a CPU preprocessing step.

But none of the GPU geometry stages (geometry, tessellation or mesh shaders) are powerful enough to deal with all the corner cases of tessellating vector graphics paths, self intersections, cusps, holes, degenerate curves etc. It's not a very parallel friendly problem.

> Why is this not done?

As I've described here: all of these ideas have been done with varying degrees of success.

> Is the CPU render still faster than these options?

No, the fastest methods are a combination of CPU preprocessing for the difficult geometry problems and GPU for blasting out the pixels.

exDM69 · 2026-01-07T12:09:09 1767787749

> NV_path_rendering solved this in 2011.

By no means is this a solved problem.

NV_path_rendering is an implementation of "stencil then cover" method with a lot of CPU preprocessing.

It's also only available on OpenGL, not on any other graphics API.

The STC method scales very badly with increasing resolutions as it is using a lot of fill rate and memory bandwidth.

It's mostly using GPU fixed function units (rasterizer and stencil test), leaving the "shader cores" practically idle.

There's a lot of room for improvement to get more performance and better GPU utilization.

exDM69 · 2025-12-20T10:33:27 1766226807

Yes, sadly this isn't a part of standard C or C++.

It is available as a language extension in Clang and GCC and widely used (e.g. by the Linux kernel).

Unfortunately it is not supported by the third major compiler out there so many projects can't or don't want to use it.

exDM69 · 2025-12-17T12:19:15 1765973955

> tons of drivers that dont implement the extensions that really improve things.

This isn't really the case, at least on desktop side.

All three desktop GPU vendors support Vulkan 1.4 (or most of the features via extensions) on all major platforms even on really old hardware (e.g. Intel Skylake is 10+ years old and has all the latest Vulkan features). Even Apple + MoltenVK is pretty good.

Even mobile GPU vendors have pretty good support in their latest drivers.

The biggest issue is that Android consumer devices don't get GPU driver updates so they're not available to the general public.

pjmlp · 2025-12-17T13:36:07 1765978567

Neither do laptops, where not using the driver from the OEM with whatver custom code they added can lead to interesting experiences, like power configuration going bad, not able to handle the mixed GPU setups, and so on.

exDM69 · 2025-12-12T10:41:56 1765536116

The drones here aren't your neighbor's kids' quadrotors. Some sightings over airports have been large (>2m) fixed wing aircraft travelling at 200 km/h. Even the quads are pretty fast. And they can appear out of nowhere, taking off from the ground near the target.

Shooting them down from the ground is next to impossible. They don't hover around waiting for someone to come by with a shotgun in their hand, catching them by land (ie. chasing them in a car) is not feasible.

Just to give an idea how hard it is to hit airborne targets from the ground with traditional guns: I once spent an afternoon shooting at a slow moving fixed wing target drone with tracer rounds from a 12.7mm anti-aircraft machine gun. There were about 50 of us taking turns, each with a few hundred rounds to shoot at the damn thing and the target aircraft didn't get a single hit.

My guess is that the drones are conducting signals intelligence, listening to radar signals and radio comms around sensitive installations (airports, military bases) and surveying the response time to a sighting.

tim333 · 2025-12-12T12:01:57 1765540917

We need interceptor drones like https://www.youtube.com/watch?v=1m-xH6OLY_Y

something that could help Ukraine would be to make thousands in the EU. There's already a factory going up in the UK.

tim333 · 2025-12-13T01:49:18 1765590558

Actually a couple of drone factories (https://thedefensepost.com/2025/11/24/stark-uk-drone-uav-pro... and https://militarnyi.com/en/news/ukrspecsystems-in-the-united-...)

but the interceptors are Project Octopus:

>The initial pilot batch, consisting of up to 1,000 drones, will be built in the UK at state-owned facilities. The Octopus drone will become the first Ukrainian combat drone to be serially produced in a NATO country, with Ukraine retaining full intellectual property and technological control. https://euromaidanpress.com/2025/10/26/uk-to-build-pilot-bat...

and they are being secretive about the design.

zelphirkalt · 2025-12-12T11:32:41 1765539161

If you watch some videos from Ukraine, you will see, that shotguns can hit them with much increased chances. So if possible without endangering civilians around the place where drones are sighted, I say get finally started, take the shotguns out and take out Russian resources. Just for the time of war make private drone flying without special permission and preflight registration illegal, then take down any drone that moves and is not registered to have a flight at the time. This also won't be giving away very critical military knowledge either.

One thing I don't know about shotguns is, how dangerous falling projectiles are. How much velocity they accumulate. That could be a real problem with this approach.

exDM69 · 2025-12-12T11:40:00 1765539600

I've seen my fair share of frontline combat videos from Ukraine.

The hard part isn't shooting a drone when it is in shotgun range. It's getting the shooter close enough to the drone to have a chance of taking the shot in the first place.

For example the drones mentioned in the article can fly at 2.5km altitude at 140km/h.

zelphirkalt · 2025-12-12T11:59:25 1765540765

I guess the only solution then is to already have people in places where it counts. I would suspect military bases have more than enough people. But then again the drones can just fly too high, at which point it becomes a cost/benefit tradeoff, or futuristic laser weapons.

exDM69 · 2025-12-12T12:25:53 1765542353

Anything involving people on the ground is just too slow.

It takes radars, interceptor drones, sensor networks, etc. Stuff like this is in active development but not widely deployed yet.

exDM69 · 2025-12-10T17:39:00 1765388340

> Rust has native SIMD support

std::simd is nightly only.

> while in standard C there is no way to express that.

In ISO Standard C(++) there's no SIMD.

But in practice C vector extensions are available in Clang and GCC which are very similar to Rust std::simd (can use normal arithmetic operations).

Unless you're talking about CPU specific intrinsics, which are available to in both languages (core::arch intrinsics vs. xmmintrin.h) in all big compilers.

exDM69 · 2025-12-08T11:24:30 1765193070

> If the DoD enforces the requirement for Ada, Universities, job training centers, and companies will follow

DoD did enforce a requirement for Ada but universities and others did not follow.

The JSF C++ guidelines were created for circumventing the DoD Ada mandate (as discussed in the video).

p_l · 2025-12-08T15:15:11 1765206911

TL;DR Ada programmers were more expensive

adolph · 2025-12-08T15:50:28 1765209028

Since when was expense a problem for defense spending?

In the video, the narrator also claims that Ada compilers were expensive and thus students were dissuaded from trying it out. However, in researching this comment I founds that the Gnat project has been around since the early 90s. Maybe it wasn't complete enough until much later and maybe potential students of the time weren't using GNU?

  The GNAT project started in 1992 when the United States Air Force awarded New 
  York University (NYU) a contract to build a free compiler for Ada to help 
  with the Ada 9X standardization process. The 3-million-dollar contract 
  required the use of the GNU GPL for all development, and assigned the 
  copyright to the Free Software Foundation.

https://en.wikipedia.org/wiki/GNAT

0xffff2 · 2025-12-08T18:41:35 1765219295

Take a look at job adds for major defense contractors in jurisdictions that require salary disclosure. Wherever all that defense money is going, it's not engineering salaries. I'm a non-DoD government contractor and even I scoff at the salary ranges that Boeing/Lockheed/Northrup post, which often feature an upper bound substantially lower than my current salary while the job requires an invasive security clearance (my current job doesn't). And my compensation pales in comparison to what the top tech companies pay.

jll29 · 2025-12-08T17:06:20 1765213580

The DOD could easily have organized Ada hackathons with a lot of prize money to "make Ada cool" if they had chosen to in order to get the language out of the limelight. They could also have funded developing a free, open source toolchain.

jandrese · 2025-12-08T19:17:16 1765221436

Ada would never have been cool.

Ironically I remember one of the complaints was it took a long time for the compilers to stabilize. They were such complex beasts with a small userbase so you had smallish companies trying to develop a tremendously complex compiler for a small crowd of government contractors, a perfect recipe for expensive software.

I think maybe they were just a little ahead of their time on getting a good open source compiler. The Rust project shows that it is possible now, but back in the 80s and 90s with only the very early forms of the Internet I don't think the world was ready.

skepti3 · 2025-12-08T19:31:51 1765222311

Out of curiosity:

1: If you had to guess, how high is the level of complexity of rustc?

2: How do you think gccrs will fare?

3: Do you like or dislike the Rust specification that originated from Ferrocene?

4: Is it important for a systems language to have more than one full compiler for it?

jandrese · 2025-12-08T19:41:30 1765222890

Given how much memory and CPU time is burned compiling Rust projects I'm guessing it is pretty complex.

I'm not deep enough into the Rust ecosystem to have solid opinions on the rest of that, but I know from the specification alone that it has a lot of work to do every time you execute rustc. I would hope that the strict implementation would reduce the number of edge cases the compiler has to deal with, but the sheer volume of the specification works against efforts to simplify.

skepti3 · 2025-12-08T19:19:09 1765221549

> They could also have funded developing a free, open source toolchain.

If the actual purpose of the Ada mandate was cartel-making for companies selling Ada products, that would have been counter-productive to their goals.

Not that compiler vendors making money is a bad thing, compiler development needs to be funded somehow. Funding for language development is also a topic. There was a presentation by the maker of Elm about how programming language development is funded [0].

[0]: https://youtube.com/watch?v=XZ3w_jec1v8

adolph · 2025-12-08T22:27:24 1765232844

Is the Gnat compiler not sufficiently free and open source? It does not fulfill the comment calling for "toolchain" however.

Edit: Thanks for that video. It is an interesting synthesis ad great context.

p_l · 2025-12-08T22:28:37 1765232917

GNAT exists because DoD funded a free, open source toolchain.

p_l · 2025-12-08T20:34:40 1765226080

Since on paper government cares about cost efficiency and you have to consider that in your lobbying materials.

Also it enables getting cheaper programmers who where possible might be isolated from the actual TS materiel to develop on the cheap so that the profit margin is bigger.

It gets worse outside of the flight side JSF software - or so it looks like from GAO reports. You don't turn around a culture of shittiness that fast, and I've seen earlier code in the same area (but not for JSF) by L-M... and well, it was among the worst code I've seen. Including failing even basic requirement of running on a specific version of a browser at minimum.

exDM69 · 2025-12-08T09:04:39 1765184679

Correct me if I'm wrong but during this timeframe (circa 2005), Java was not open source at all. OpenJDK was announced in 2006 and first release was 2008, by which time the days Java in the browser were more or less over.