
47 · 2025-08-21T03:36:00 1755747360

This isn't theoretical. In my 20 years in retail and logistics, I've seen these libraries repeatedly fail in production. Real-world examples include:

* Invoices: Totals get pushed to a new page with no repeated <thead> header. This is a classic failure of CSS table rendering across page breaks. Properties like page-break-inside: avoid are notoriously inconsistent in browser print-to-PDF engines. Line items get split mid-row because the engine doesn't understand the semantic integrity of the data.

* Bills of Lading & Manifests: These documents are infamous for unpredictable page breaks. One page cuts a row in half, the next duplicates headers, the next drops content entirely. This often stems from complex flexbox or grid layouts that the PDF rendering engine struggles to paginate deterministically.

* Shipping Labels: A barcode or QR code shifting by a few pixels is often a DPI or scaling artifact. The browser rendering at a logical 96 DPI doesn't translate perfectly to a 300 or 600 DPI thermal printer format, introducing rounding errors that are catastrophic for scanners. Addresses drift outside the printable area because CSS spacing properties (margin, padding) can be interpreted differently by the print media engine versus the screen engine.

* Digital Forms: This is a classic failure of absolute vs. relative positioning. When you overlay HTML form fields on a scanned PDF background (a common requirement), the HTML box model's flow layout simply cannot guarantee pixel-perfect alignment with the fixed grid of the underlying image. I've seen teams resort to printing, using white-out, and hand-filling forms because the software couldn't align (x, y) coordinates.

* Tickets & Passes: Scanner rejection due to incorrect sizing is often due to the browser engine's "print scaling" or "fit-to-page" logic, which can be difficult to disable and varies between environments (e.g., a local Docker container vs. an AWS Lambda function with different system fonts or libraries installed).
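The scaling problem in the shipping-label case comes down to simple arithmetic. A minimal sketch, assuming a barcode whose bars are each exactly one CSS pixel wide at 96 DPI and a 300 DPI printer that snaps every edge to a whole dot; the function names are hypothetical and only illustrate the rounding:

```python
from typing import List


def px_to_dots(px: float, css_dpi: int = 96, printer_dpi: int = 300) -> int:
    """Snap a CSS-pixel coordinate to the nearest whole printer dot."""
    return round(px * printer_dpi / css_dpi)


def module_widths_in_dots(n_modules: int, module_px: float = 1.0) -> List[int]:
    """Widths of n adjacent, equally wide bars after snapping their edges."""
    edges = [px_to_dots(i * module_px) for i in range(n_modules + 1)]
    return [right - left for left, right in zip(edges, edges[1:])]


# 1 px at 96 DPI is 3.125 dots at 300 DPI, so identical bars cannot all
# come out the same width: some land on 3 dots, others on 4.
widths = module_widths_in_dots(8)
print(widths)
```

Bars that were identical on screen end up as a mix of 3-dot and 4-dot bars on the label, which is the kind of drift a scanner can reject.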

This always turns into a long tail of support tickets. The only truly reliable solution is to bypass the HTML/CSS rendering model entirely and build the document on a canvas with an absolute coordinate system. This means using libraries like FPDF (PHP), ReportLab (Python), or lower-level tools like iText/PDFBox (Java), where you aren't "converting" a document, you are drawing it. You place text at (x, y), draw a line from (x1, y1) to (x2, y2), and manage page breaks and object placement explicitly.
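To make "drawing, not converting" concrete without tying it to any one library, here is a minimal, library-free sketch of the layout pass. It assigns every row an explicit page and y coordinate, never splits a row across pages, and repeats the header on each page. All names and dimensions (`layout_rows`, `Placement`, the A4 height in points, the margin, the row height) are illustrative assumptions; in a real system each placement would be handed to a drawing call in a library such as FPDF or ReportLab.

```python
from dataclasses import dataclass
from typing import List

PAGE_HEIGHT = 842.0  # A4 height in points (assumed page size)
MARGIN = 36.0        # top and bottom margin in points (assumed)
ROW_HEIGHT = 18.0    # fixed row height in points (assumed)


@dataclass
class Placement:
    page: int   # zero-based page index
    y: float    # distance from the top of the page, in points
    text: str


def layout_rows(rows: List[str], header: str) -> List[Placement]:
    """Place a header plus rows on pages; a row never straddles a page break."""
    placements: List[Placement] = []
    page = 0
    y = MARGIN
    placements.append(Placement(page, y, header))
    y += ROW_HEIGHT
    for row in rows:
        # Explicit page break: start a new page and repeat the header.
        if y + ROW_HEIGHT > PAGE_HEIGHT - MARGIN:
            page += 1
            y = MARGIN
            placements.append(Placement(page, y, header))
            y += ROW_HEIGHT
        placements.append(Placement(page, y, row))
        y += ROW_HEIGHT
    return placements


placed = layout_rows([f"item {i}" for i in range(100)], header="SKU | Qty | Price")
```

Because the page break is a plain comparison on known numbers, the same input always yields the same placements, which is the determinism the HTML route struggles to give you.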

It's not cheap. The initial build cost is high because every layout is effectively a small, programmatic "CAD project". You can't just "throw HTML at it". But the payoff in reliability is immense. It becomes a set-and-forget system that produces identical documents every time, which stops the endless firefighting.

Yes, two years later it can be painful to update when the original developer is gone. But I would take that trade-off any day over constantly battling with imprecise, non-deterministic tools. In twenty years of building systems where documents are mission-critical, "close enough" rendering was almost never good enough.

aszen · 2025-08-21T10:47:37 1755773257

Yeah, exactly. We were using FPDF heavily but have now switched to Typst, since it's faster to iterate on complex documents.

itsgabriel · 2025-08-21T18:28:32 1755800912

Have you looked at something like LaTeX or Typst? They come with their own layout engine, so potentially less tedious work like specifying exact positions.

47 · on Dec 5, 2024

Diátaxis is a great way to structure documentation, but I think its real value is in simplifying how we think about writing docs.

It shifts the focus from trying to cram everything into one ‘perfect document’ to recognizing that different users have different needs.

Like, tutorials are for learning by doing, guides are for solving specific problems, reference is for quick lookups, and explanations dive into the ‘why.’

That clarity alone can help you write useful docs.

That being said, sticking too rigidly to any system can be a trap.

3abiton · on Dec 5, 2024

Isn't the documentation task highly dependent on the goal and prospective users? Or is there a unified paradigm?

47 · on July 11, 2024

A great companion read is Martin Fowler’s “Accounting Patterns”[1]. Having built and maintained systems that manage financial events for over a decade, I wish I had read these patterns earlier.

[1] https://martinfowler.com/apsupp/accounting.pdf

arsenico · on July 11, 2024

I think Fowler's work is an underrated must-read for anyone who works in domains related to moving money. It makes engineering practices and architecture principles feel logical.

jakjak123 · on July 13, 2024

Yes, and I was lucky enough to read his stuff on finance within 6 months of starting work! He has some very good design ideas for many things. Just, as always, treat them as tools in your toolbox and not as dogma.

47 · on Aug 31, 2021

https://helloeffie.com/

Never used it myself, but I have been curious whether it actually works.

nose · on Aug 31, 2021

A hot shower can achieve similar results: https://www.reddit.com/r/ShowerScience/comments/302aqp/showe...

pronik · on Aug 31, 2021

More like "is it released after all those years". I've been very interested in the concept, but that's the last I've heard from them in the last 3-4 years.

dtgriscom · on Sept 1, 2021

Their blog was last updated almost two years ago.

47 · on June 20, 2021

I think it is correct, as a Doppelgänger is supposed to be the evil version of oneself.

ant6n · on June 20, 2021

Username checks out https://memory-alpha.fandom.com/wiki/47

47 · on Nov 1, 2018

Article is on a mission to engineer remarkably better furniture experiences. To accomplish this goal, we ourselves manage relationships with factories and suppliers, ocean shipping, warehousing, customer service, quality assurance, operations, the transportation network and final-mile delivery.

We are building software systems to make an impact on each and every one of the areas mentioned above. We are a fast-growing startup and were recently named Canada's fastest-growing startup[0]. Come help us build remarkably better furniture experiences.

We are hiring for the following positions:

Software Engineer, Senior Software Engineer, Principal Software Engineer

See more details at https://www.article.com/careers

[0] https://www.canadianbusiness.com/lists-and-rankings/growth-5...

47 · on June 1, 2018

Article is on a mission to engineer remarkably better furniture experiences. To accomplish this goal, we ourselves manage relationships with factories and suppliers, ocean shipping, warehousing, customer service, quality assurance, operations, the transportation network and final-mile delivery.

We are building software systems to make an impact on each and every one of the areas mentioned above. We are a 5-year-old startup and we are growing at an exponential rate. Come help us build remarkably better furniture experiences.

We are hiring for the following positions:

Software Engineer

Principal Software Engineer

Product Manager

See more details at https://www.article.com/careers

47 · on May 1, 2018

Article is on a mission to engineer remarkably better furniture experiences. To accomplish this goal, we manage our own factories, ocean shipping, warehousing, customer service, quality assurance, operations, transportation network and final-mile delivery.

We are building software systems to make an impact on each and every one of the areas mentioned above. We are a 5-year-old startup and we are growing at an exponential rate. Come help us build remarkably better furniture experiences.

We are hiring for the following positions:

Software Engineer, Front End Engineer, Principal Software Engineer, Product Manager

See more details at https://www.article.com/careers

47 · on April 27, 2018

If you really care about your customers, you should be worried about false positives. I hope that as a business you do not cancel customer orders just because your fraud detection system has flagged them.

Depending on your scale, you may be using third parties like Sift Science or Stripe Radar, or rolling your own fraud detection system.

Flagging orders as potential fraud is the easier part these days. The difficult part is coming up with a process to verify these flagged orders. This process needs to be simple and quick, because essentially you are saying to your customer: we think you are a fraud, can you prove that you're not?

The merchant checks that banks offer for verifying flagged orders are extremely cumbersome. They require you to call a special phone number (which is different for each bank) and provide the customer's name, billing address, billing phone and credit card information. Then they can only tell you whether it is a match or not. They can't tell you whether the card has been reported stolen or anything else, for privacy reasons. At scale this is a very time-consuming process. It becomes even more cumbersome if you are a security-conscious business and do not store customer credit card information. In that case you have to contact the customer and ask them to call you to provide their credit card information again.

There are solutions like 3D Secure, but they are not widely supported and add their own problems. It is high time credit card companies started providing merchants with a second-factor check for transactions. For example, once a transaction is placed with a merchant, they could trigger a second-factor check whereby the bank automatically sends a code to the customer's email/phone number on file. If the customer is able to provide the correct code, the merchant can proceed with the order.
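A minimal sketch of the kind of check being proposed, under heavy assumptions: the class and method names are hypothetical, no real bank or card-network API is implied, and actually delivering the code to the customer is out of scope. It only shows the shape of the flow: issue a one-time code for a transaction, then accept it at most once.

```python
import hmac
import secrets
from typing import Dict


class SecondFactorCheck:
    """Hypothetical issuer-side one-time-code check for a transaction."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}

    def start(self, transaction_id: str) -> str:
        """Create a 6-digit code; a real issuer would send it to the contact on file."""
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[transaction_id] = code
        return code

    def verify(self, transaction_id: str, submitted: str) -> bool:
        """Check a submitted code; each issued code allows a single attempt."""
        expected = self._pending.pop(transaction_id, None)
        if expected is None:
            return False
        # Constant-time comparison on bytes avoids leaking how many digits matched.
        return hmac.compare_digest(expected.encode(), submitted.encode())


check = SecondFactorCheck()
code = check.start("order-1001")
print(check.verify("order-1001", code))
```

The merchant never sees card details in this flow; it only learns whether the customer could produce the code the bank sent.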

Fraud detection will always remain a point of contention between customers and businesses. I just hope businesses make sensible decisions based on their situation. For example, I have seen legitimate customers matching every one of the cases mentioned in the article.

kristianc · on April 27, 2018

The OP has written extensively about this subject in the past, and I get the sense that he is intimately aware of the risk of false positives, however catching a high volume of fraud could for him literally be the difference between staying in business and not. His fraud tolerance is going to be much much lower than a large vendor.

dcbadacd · on April 28, 2018

Reading all of these issues, I'm really flabbergasted that you have such problems. Like, my bank offers me temporary non-physical credit cards with small limits for 1€/month/piece, and that's what I use for all my online purchases. Do US banks really not have that option? The second thing that I often use (where possible) is wire transfer; it requires my ID card and the payment is done in seconds.

This thread has honestly made me really appreciate what I have available to me compared to some countries.

dylz · on April 28, 2018

Very few banks have that option, and the ones that do are bordering on user hostile, and the temporary cards don't have usable/tolerable features for this.

A wire costs $50-100 (or more for international) per transaction, no matter what the amount.

A bank transfer (ACH) can take several weeks or more depending on how much both banks trust each other and the type of account you have. Here's a fun read: https://engineering.gusto.com/how-ach-works-a-developer-pers...

malka · on April 28, 2018

> For example maybe once a transaction is placed with a merchant. They can trigger a 2nd factor check where by the bank automatically send a code to their email/phone number on file. If the customer is able to provide a correct code merchant can proceed with the order.

Isn't that what 3D Secure provides? With 3D Secure, I receive a code from my bank through SMS, and I then transmit this code to the payment processor.

47 · on April 18, 2018

Do you provide a local database? Making a web service call for every request seems like a performance bottleneck.

meritt · on April 18, 2018

I've always had good luck with Maxmind's local database [1] offering. It bewilders me how many companies today create SaaS offerings and refuse to offer on-prem versions. It's like they intentionally want to avoid customers with serious needs (speed and security being the most common need for on-prem) who are willing to pay serious amounts of money.

[1] https://www.maxmind.com/en/geoip2-databases

rvnx · on April 18, 2018

Let's say it like this:

- https://api.ipdata.co/1.1.1.1

City name: Research

- https://www.maxmind.com/en/geoip2-precision-demo?ip=1.1.1.1

City name: Research

Oh, that must be quite a coincidence.

Nuh nuh nuh, nobody would ever want to launch a SaaS with a database they have stolen.

lightbyte · on April 18, 2018

They aren't using a copy of MaxMind's DB, see:

https://www.maxmind.com/en/geoip2-precision-demo?ip=185.10.6... (No data)

https://api.ipdata.co/185.10.68.114 (lots of data)

kawsper · on April 18, 2018

I wonder if Maxmind have put in some "Trap Streets" in their dataset. https://en.wikipedia.org/wiki/Trap_street

jonathan-kosgei · on April 18, 2018

Hi, unfortunately we don't. However, performance is very important to us, which is why we have 11 endpoints around the world and average ~65ms response times; see status.ipdata.co.

jimktrains2 · on April 19, 2018

65ms feels like a lot. I guess you can cache it so you're only calling once per IP.
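A minimal sketch of that caching idea, assuming an in-process cache is acceptable. `make_cached_lookup` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real HTTPS request so the example runs without a network:

```python
from functools import lru_cache
from typing import Callable, Dict, List


def make_cached_lookup(fetch: Callable[[str], Dict[str, str]], maxsize: int = 100_000):
    """Wrap a slow per-IP lookup so a cached IP is not fetched again."""

    @lru_cache(maxsize=maxsize)
    def lookup(ip: str) -> Dict[str, str]:
        return fetch(ip)

    return lookup


# Stand-in for the network call; a real client would make an HTTPS request here.
calls: List[str] = []


def fake_fetch(ip: str) -> Dict[str, str]:
    calls.append(ip)
    return {"ip": ip, "city": "unknown"}


lookup = make_cached_lookup(fake_fetch)
lookup("203.0.113.7")
lookup("203.0.113.7")  # served from the cache, no second fetch
print(len(calls))  # prints 1
```

Note that `lru_cache` has no expiry, so a long-running service would probably want a time-based cache instead, since IP-to-location data changes over time.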

jonathan-kosgei · on April 19, 2018

I understand what you mean, but every other provider in the tests I've done comes in at double our response times, some even over plain HTTP. We only serve requests over HTTPS.

More importantly the performance is consistent and you'd get the same performance wherever you were in the world.

jimktrains2 · on April 20, 2018

Why compare to another API endpoint? You should compare it against accessing a local database.

jonathan-kosgei · on April 20, 2018

Comparing a network call vs filesystem I/O wouldn't be a useful comparison for someone deciding between different third-party API providers.

jimktrains2 · on April 20, 2018

Why not? I can load an IP database locally, so it's an option and a "competitor" to a 3rd party API. That's what I'm comparing it to, not against other APIs.

jonathan-kosgei · on April 20, 2018

It's pretty obvious that hitting your local disk is going to be a lot faster than making a network call. Make the right decision for your use case.

JoeAltmaier · on April 20, 2018

Maybe in this case. But historically, exactly this decision point has oscillated. It's caused reversals in distributed computer design for decades. First the networks were slow and disks were (relatively) fast - so each machine had one. Then networks went Ethernet, and disks started disappearing. Disks got down to a few ms access time and they came back. Then Gigabit came around. Then SSD.

Today I'd say it depends upon exactly what your network data source latency measures out to. The answer could go either way.

jonathan-kosgei · on April 20, 2018

That's a pretty interesting history and put in that perspective I see what you mean and totally agree with you.