This isn't theoretical. In my 20 years in retail and logistics, I've seen these libraries repeatedly fail in production. Real-world examples include:
* Invoices: Totals get pushed to a new page with no repeated <thead> header. This is a classic failure of CSS table rendering across page breaks. Properties like page-break-inside: avoid are notoriously inconsistent in browser print-to-PDF engines. Line items get split mid-row because the engine doesn't understand the semantic integrity of the data.
* Bills of Lading & Manifests: These documents are infamous for unpredictable page breaks. One page cuts a row in half, the next duplicates headers, the next drops content entirely. This often stems from complex flexbox or grid layouts that the PDF rendering engine struggles to paginate deterministically.
* Shipping Labels: A barcode or QR code shifting by a few pixels is often a DPI or scaling artifact. Browser rendering at a logical 96 DPI doesn't translate perfectly to a 300 or 600 DPI thermal printer format, introducing rounding errors that are catastrophic for scanners. Addresses drift outside the printable area because CSS spacing (margin, padding) can be interpreted differently by the print media engine versus the screen engine.
* Digital Forms: This is a classic failure of absolute vs. relative positioning. When you overlay HTML form fields on a scanned PDF background (a common requirement), the HTML box model's flow layout simply cannot guarantee pixel-perfect alignment with the fixed grid of the underlying image. I've seen teams resort to printing, using white-out, and hand-filling forms because the software couldn't align (x, y) coordinates.
* Tickets & Passes: Scanner rejection due to incorrect sizing is often caused by the browser engine's "print scaling" or "fit-to-page" logic, which can be difficult to disable and varies between environments (e.g., a local Docker container vs. an AWS Lambda function with different system fonts or libraries installed).
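The rounding problem behind the shipping-label case can be shown with plain arithmetic. This is a minimal sketch, not how any particular browser or printer driver works: it assumes a hypothetical barcode whose bars are each 1 CSS pixel wide and assumes every bar edge is snapped to the nearest printer dot. The function names are made up for illustration.

```python
from fractions import Fraction

CSS_DPI = 96  # logical resolution the browser lays out in


def css_px_to_dots(px, printer_dpi):
    """Exact (fractional) number of printer dots covered by `px` CSS pixels."""
    return Fraction(px) * printer_dpi / CSS_DPI


def snapped_bar_widths(bar_px, count, printer_dpi):
    """Widths in whole dots of `count` adjacent bars, each `bar_px` CSS pixels
    wide, when every bar edge is snapped to the nearest printer dot.
    Assumption: edge snapping is a simplification of what real engines do."""
    edges = [round(css_px_to_dots(i * bar_px, printer_dpi)) for i in range(count + 1)]
    return [right - left for left, right in zip(edges, edges[1:])]


if __name__ == "__main__":
    for dpi in (300, 600):
        # One CSS pixel is 3.125 dots at 300 DPI and 6.25 dots at 600 DPI,
        # so bars that should be identical come out with unequal widths.
        print(dpi, float(css_px_to_dots(1, dpi)), snapped_bar_widths(1, 8, dpi))
```

Because 96 does not divide 300 or 600 evenly, some bars end up one dot wider than their neighbours, which is the kind of error a scanner is sensitive to.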
This always turns into a long tail of support tickets. The only truly reliable solution is to bypass the HTML/CSS rendering model entirely and build the document on a canvas with an absolute coordinate system. This means using libraries like FPDF (PHP), ReportLab (Python), or lower-level tools like iText/PDFBox (Java), where you aren't "converting" a document, you are drawing it. You place text at (x, y), draw a line from (x1, y1) to (x2, y2), and manage page breaks and object placement explicitly.
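To make "drawing, not converting" concrete, here is a minimal sketch of explicit page-break management in plain Python. It deliberately does not call FPDF, ReportLab, iText or PDFBox; it only computes where each piece of text would go, and each resulting command would then be handed to whatever drawing call your library provides. The DrawText type, the layout_table function and all default sizes are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawText:
    page: int   # 1-based page number
    x: float    # points from the left edge
    y: float    # points from the top edge
    text: str


def layout_table(header, rows, page_height=792.0, margin=36.0,
                 row_height=18.0, x=36.0):
    """Place a header and rows at explicit coordinates, repeating the header
    on every page and never splitting a row across a page break."""
    page = 1
    y = margin
    commands = [DrawText(page, x, y, header)]
    y += row_height
    for row in rows:
        # Explicit page break: start a new page if this row would not fit whole.
        if y + row_height > page_height - margin:
            page += 1
            y = margin
            commands.append(DrawText(page, x, y, header))
            y += row_height
        commands.append(DrawText(page, x, y, row))
        y += row_height
    return commands


if __name__ == "__main__":
    for cmd in layout_table("SKU  QTY  PRICE", [f"item {i}" for i in range(5)],
                            page_height=100.0, margin=10.0, row_height=20.0):
        print(cmd)
```

The point of the design is that the page-break rule is one visible line of your own code, so the same input always yields the same coordinates regardless of the environment it runs in.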
It's not cheap. The initial build cost is high because every layout is effectively a small programmatic CAD project. You can't just "throw HTML at it". But the payoff in reliability is immense. It becomes a set-and-forget system that produces identical documents every time, which stops the endless firefighting.
Yes, two years later it can be painful to update when the original developer is gone. But I would take that trade-off any day over constantly battling with imprecise, non-deterministic tools. In twenty years of building systems where documents are mission-critical, "close enough" rendering was almost never good enough.
Have you looked at something like LaTeX or Typst? They come with their own layout engines, so there is potentially less tedious work like specifying exact positions.
Diátaxis is a great way to structure documentation, but I think its real value is in simplifying how we think about writing docs.
It shifts the focus from trying to cram everything into one ‘perfect document’ to recognizing that different users have different needs.
Like, tutorials are for learning by doing, guides are for solving specific problems, reference is for quick lookups, and explanations dive into the ‘why.’
That clarity alone can make one write useful docs.
That being said, sticking too rigidly to any system can be a trap.
A great companion read is Martin Fowler’s “Accounting Patterns”[1]. Having built and maintained systems that manage financial events for over a decade, I wish I had read these patterns earlier.
I think Fowler's work is an underrated must-read for anyone who works in domains related to moving money. It makes all kinds of engineering practices and architecture principles seem logical.
Yes, and I was lucky enough to read his stuff on finance within 6 months of starting work! He has some very good design ideas for many things; just, as always, treat them like tools in your toolbox and not dogma.
More like "is it released after all those years". I've been very interested in the concept, but that's the last I've heard from them in the last 3-4 years.
Article | Software Engineer | Vancouver, BC | ONSITE, VISA | C$90 - C$160 | https://www.article.com
Article is on a mission to engineer remarkably better furniture experiences. To accomplish this goal, we manage our own relationships with factories and suppliers, ocean shipping, warehousing, customer service, quality assurance, operations, transportation network and final-mile delivery.
We are building software systems to make an impact on each and every one of the above-mentioned areas. We are a fast-growing startup and were recently named Canada's fastest-growing startup[0]. Come help us build remarkably better furniture experiences.
We are hiring for the following positions:
Software Engineer,
Senior Software Engineer,
Principal Software Engineer
Article | Software Engineer, Product Manager | Vancouver, BC | ONSITE, VISA | C$80 - C$150 | https://www.article.com
Article is on a mission to engineer remarkably better furniture experiences. To accomplish this goal, we manage our own relationships with factories and suppliers, ocean shipping, warehousing, customer service, quality assurance, operations, transportation network and final-mile delivery.
We are building software systems to make an impact on each and every one of the above-mentioned areas. We are a 5-year-old startup and we are growing at an exponential rate. Come help us build remarkably better furniture experiences.
Article | Software Engineer, Product Manager | Vancouver, BC | ONSITE, VISA | C$90 - C$140 | https://www.article.com
Article is on a mission to engineer remarkably better furniture experiences. To accomplish this goal, we manage our own factories, ocean shipping, warehousing, customer service, quality assurance, operations, transportation network and final-mile delivery.
We are building software systems to make an impact on each and every one of the above-mentioned areas. We are a 5-year-old startup and we are growing at an exponential rate. Come help us build remarkably better furniture experiences.
We are hiring for the following positions:
Software Engineer
Front End Engineer
Principal Software Engineer
Product Manager
If you really care about your customers, you should be worried about false positives. I hope that as a business you do not cancel customer orders just because your fraud detection system has flagged them.
Depending on your scale, you may be using third parties like Sift Science or Stripe Radar, or rolling your own fraud detection system.
Flagging orders as potential fraud is the easier part these days. The difficult part is coming up with a process to verify these flagged orders. This process needs to be simple and quick, because essentially you are saying to your customer "we think you are a fraud, can you prove that you're not?"
Banks' merchant checks for verifying flagged orders are extremely cumbersome. They require you to call a special phone number (which is different for each bank) and provide the customer's name, billing address, billing phone and credit card information. Then they can only tell you whether it is a match or not. They can't tell you whether the card has been reported stolen or anything else, for privacy reasons. At scale this is a very time-consuming process. It becomes even more cumbersome if you are a security-conscious business and do not store customer credit card information. In that case you have to contact the customer and ask them to call you to provide their credit card information again.
There are solutions like 3D Secure, but they are not widely supported and add their own problems. It is high time credit card companies started providing merchants with a second-factor check for transactions. For example, once a transaction is placed with a merchant, the merchant could trigger a second-factor check whereby the bank automatically sends a code to the customer's email or phone number on file. If the customer is able to provide the correct code, the merchant can proceed with the order.
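The second-factor idea described above could look roughly like this. It is only a sketch under loud assumptions: no card network or bank exposes this API, the SecondFactorChecks class and its method names are hypothetical, delivery by email/SMS is left out, and state is kept in memory purely for illustration.

```python
import hmac
import secrets
import time


class SecondFactorChecks:
    """In-memory stand-in for the bank side of the proposed flow (hypothetical)."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._pending = {}  # transaction id -> (code, expiry time)

    def start(self, transaction_id):
        """Create a one-time 6-digit code. A real bank would send it to the
        email/phone number on file instead of returning it to the caller."""
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[transaction_id] = (code, self._clock() + self._ttl)
        return code

    def verify(self, transaction_id, submitted_code):
        """Tell the merchant only whether the code matched. Each code is
        single use: any attempt, right or wrong, consumes it."""
        entry = self._pending.pop(transaction_id, None)
        if entry is None:
            return False
        code, expires_at = entry
        if self._clock() > expires_at:
            return False
        # Constant-time comparison so timing does not leak the code.
        return hmac.compare_digest(code.encode(), submitted_code.encode())


if __name__ == "__main__":
    checks = SecondFactorChecks()
    code = checks.start("order-1001")           # bank would text this to the customer
    print(checks.verify("order-1001", code))    # customer relays the code to the merchant
```

As with the phone checks described above, the merchant only learns match or no match, so nothing about the cardholder leaks, but the check takes seconds instead of a phone call.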
Fraud detection will always remain a point of contention between customers and businesses. I just hope businesses make sensible decisions based on their situation. For example, I have seen legitimate customers in every one of the cases mentioned in the article.
The OP has written extensively about this subject in the past, and I get the sense that he is intimately aware of the risk of false positives. However, catching a high volume of fraud could literally be the difference between staying in business and not for him. His fraud tolerance is going to be much, much lower than a large vendor's.
Reading all of these issues, I'm really flabbergasted that you have such problems. My bank offers me temporary non-physical credit cards with small limits for 1€/month/piece, and that's what I use for all my online purchases. Do US banks really not have that option? The second thing I often use (where possible) is wire transfer: it requires my ID card and the payment is done in seconds.
This thread has honestly made me really appreciate what I have available to me compared to some countries.
Very few banks have that option, and the ones that do are bordering on user hostile, and the temporary cards don't have usable/tolerable features for this.
A wire costs $50-100 (or more for international) per transaction, no matter what the amount.
> For example maybe once a transaction is placed with a merchant. They can trigger a 2nd factor check where by the bank automatically send a code to their email/phone number on file. If the customer is able to provide a correct code merchant can proceed with the order.
Is that not what 3D Secure provides? With 3D Secure, I receive a code from my bank through SMS, and I then transmit this code to the payment processor.
I've always had good luck with MaxMind's local database [1] offering. It bewilders me how many companies today create SaaS offerings and refuse to offer on-prem versions. It's like they intentionally want to avoid customers with serious needs (speed and security being the most common needs for on-prem) who are willing to pay serious amounts of money.
Hi, unfortunately we don't. However, performance is very important to us, which is why we have 11 endpoints around the world and average ~65 ms response times (see status.ipdata.co).
I understand what you mean, but in the tests I've done every other provider comes in at double our response times, some even over plain HTTP. We only serve requests over HTTPS.
More importantly the performance is consistent and you'd get the same performance wherever you were in the world.
Why not? I can load an IP database locally, so it's an option and a "competitor" to a 3rd party API. That's what I'm comparing it to, not against other APIs.
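For a sense of what "loading an IP database locally" means in practice, here is a minimal sketch using only the Python standard library. It is not MaxMind's file format or API: the LocalIpDatabase class is hypothetical, the ranges are the reserved documentation networks, the labels are invented, and it assumes non-overlapping IPv4 ranges.

```python
import bisect
import ipaddress


class LocalIpDatabase:
    """Sorted in-memory range table (hypothetical; IPv4, non-overlapping ranges)."""

    def __init__(self, records):
        # records: iterable of (cidr, label) pairs
        ranges = []
        for cidr, label in records:
            net = ipaddress.ip_network(cidr)
            ranges.append((int(net.network_address), int(net.broadcast_address), label))
        ranges.sort()
        self._ranges = ranges
        self._starts = [start for start, _end, _label in ranges]

    def lookup(self, ip):
        """Return the label of the range containing `ip`, or None."""
        value = int(ipaddress.ip_address(ip))
        # Binary search for the last range that starts at or before `value`.
        i = bisect.bisect_right(self._starts, value) - 1
        if i >= 0 and value <= self._ranges[i][1]:
            return self._ranges[i][2]
        return None


if __name__ == "__main__":
    db = LocalIpDatabase([
        ("192.0.2.0/24", "example-a"),     # made-up labels on documentation ranges
        ("198.51.100.0/24", "example-b"),
        ("203.0.113.0/24", "example-c"),
    ])
    print(db.lookup("198.51.100.7"))
    print(db.lookup("8.8.8.8"))
```

A lookup here is a binary search in process memory with no network round trip, which is why I compare it against a third-party API rather than against other APIs.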
Maybe in this case. But historically, exactly this decision point has oscillated. It's caused reversals in distributed computer design for decades. First the networks were slow and disks were (relatively) fast - so each machine had one. Then networks went Ethernet, and disks started disappearing. Disks got down to a few ms access time and they came back. Then Gigabit came around. Then SSD.
Today I'd say it depends upon exactly what your network data source latency measures out to. The answer could go either way.