How ZenPayroll processes billions of dollars in annual payroll

tonyhb · on April 17, 2015

It's really cool that they're very public about the things they've built (ACH formats, fixed-width file gems, etc.). They seem like a very friendly and open bunch of chaps who put their customers' needs first.

I'm surprised to see that they're solely Ruby-based, though, especially for an SOA setup. I tried building a payroll system in Rails one time and didn't like it. Having a dynamically typed language with strong metaprogramming handle people's money really didn't feel right. Then again, tools like Grape (which looks awesome) must definitely outweigh that non-issue.

panh29 · on April 18, 2015

Disclosure: I am an engineer here with ZenPayroll.

In terms of language choice, it is actually quite nice to see definitions of taxes, rules, and forms for different states and counties in a dynamic language like Ruby, without the verbosity of other languages like Java. As mentioned in another blog post[1], when launching in new states, we paused and built tools to help us automate the verbose tasks, and Ruby really helps us there. We have good test coverage, so we have a high level of confidence when refactoring (which we do frequently).

In terms of JRuby, we are only using it on one of the components, which needs to talk to an Oracle DB using JDBC.

We are early on the service-oriented architecture path and will be learning (hard) lessons along the way, from mistakes and from new talented engineers joining us. What impresses me most about this team of engineers is its collaborative culture and willingness to go the extra mile for customers and for ourselves.

That said, I wouldn't be surprised if one day ZenPayroll's technical stack became polyglot.

[1] http://engineering.zenpayroll.com/benefits-of-writing-a-dsl/

teacup50 · on April 18, 2015

... except that the choice isn't between Java or type-unsafe code handling people's money.

That's a silly (and false) dichotomy.

4ydx · on April 18, 2015

Amazing that you can do this with confidence. It seems like a huge amount of work though. When it comes to money, I would think that compile time guarantees would be advisable :)

tonyhb · on April 18, 2015

That's really awesome. You guys do seem like an actual family with the way you work together... the culture is refreshing to see!

Glad that the choice is working out for you. The DSL definitely seems like a solid and interesting approach.

joevandyk · on April 18, 2015

Doing SOA in Ruby isn't so bad. The nice thing about services is that you can keep the total lines of code low in each service. Ruby is better when you don't have giant apps.

wasd · on April 18, 2015

JRuby is also an amazing piece of technology. There are plenty of instances where you can change the engine and get all the performance enhancements of the JVM whilst keeping the flexibility of Ruby.

Note: I have no idea if ZenPayroll uses or has considered JRuby. This comment is specifically about improving the performance of Ruby.

Mister_Snuggles · on April 18, 2015

While not JRuby, the big advantage of a JVM-based language is access to the Java ecosystem.

I built a system where Python was a beautiful fit and ran it on Jython so that I could use JDBC drivers to access an Oracle database. From another comment, it sounds like ZenPayroll came to the same conclusion with Ruby/JRuby.

jim_h · on April 17, 2015

I also work on a payroll system (Rails/PostgreSQL) that's used internally at a company with hundreds of employees in multiple states. The hardest part is the taxes and benefits that are handled by external vendors.

sergiotapia · on April 17, 2015

I was also surprised. You hear finance and the last thing you imagine is Rails but there ya go!

mandeepj · on April 18, 2015

A few questions, if someone either from ZenPayroll or with similar experience can answer:

1. How do you handle maintenance, like upgrades, in applications hosted on multiple machines? Do you have a particular maintenance window where you bring down the whole application? For some applications, it is not possible to shut down the website completely. Even if you try to upgrade the application servers one by one, there might be active users on that server. How do you handle this situation?

2. The database tier is the hardest to scale. How do you do it? There is the option of going the master/slave route, but it is hard. For performance reasons we can't always talk to the database. How do you keep data in sync if you plan to use a cache?

Thanks.

volkadav · on April 18, 2015

(I'm not a ZP employee, but I have worked in environments of significant scale a fair amount. The below is just my opinion...)

#1: The usual approach is something like: remove app servers from the load balancers' rotation, upgrade them, run some sort of tests to ensure the upgrade is OK, then add them back into the LB rotation. Some environments might use user traffic as the test step (e.g. send 5% of your users to the newly upgraded machines, and watch monitoring to see how response times and error counts behave).
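To make the drain/upgrade/test/restore loop concrete, here is a minimal in-memory sketch in Python. Everything in it is an assumption for illustration: the `Server` class, the `rolling_upgrade` function and the `health_check` callback are hypothetical names, and a real setup would call the load balancer's and deploy tool's own APIs instead of flipping fields on an object.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Server:
    # Hypothetical stand-in for an app server behind a load balancer.
    name: str
    version: str
    in_rotation: bool = True


def rolling_upgrade(
    servers: List[Server],
    new_version: str,
    health_check: Callable[[Server], bool],
) -> List[str]:
    """Upgrade servers one at a time and return the names that succeeded.

    Stops at the first failed health check, leaving that server out of
    rotation so it never receives user traffic on a bad build.
    """
    upgraded = []
    for server in servers:
        server.in_rotation = False    # drain: the LB stops sending traffic here
        server.version = new_version  # stand-in for the real deploy step
        if not health_check(server):  # smoke test before taking traffic again
            break                     # halt; remaining servers keep the old version
        server.in_rotation = True     # healthy: put it back into rotation
        upgraded.append(server.name)
    return upgraded


pool = [Server("app1", "v1"), Server("app2", "v1"), Server("app3", "v1")]
# Simulate a bad deploy on app2 to show that the rollout halts there.
done = rolling_upgrade(pool, "v2", health_check=lambda s: s.name != "app2")
print(done)
```

Because only one server is out of rotation at a time, the rest of the pool keeps serving users, which is what lets you avoid a full maintenance window. Sessions pinned to the drained server still need handling (e.g. connection draining or shared session storage), which this sketch ignores.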

#2: The answer here really depends on what scale you're operating at. DB replication (e.g. read slaves if your traffic is more read-heavy than write-heavy) and sufficiently high-spec machines combined with intelligent use of caching can take you a fairly long way. I hesitate to give hard numbers, but off the cuff probably 99% of sites will never need more than this. There's not really a recipe to follow for that; the data needs and design/ops parameters of an organization tend to be fairly specific to that org. More or less it's good schema design, thinking hard about how fresh data has to be at the edge, caching what you can with careful management of expiry times, and having an ops team comfortable with managing replication. Beyond that scale you start getting into globe-spanning multiple-datacenter voodoo rocket science. :) Ask Amazon, Google or Facebook (some of their work has been mentioned publicly, things like Cassandra from FB and various papers published by Google/Amazon).
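As a rough illustration of "caching what you can with careful management of expiry times", here is a small cache-aside sketch in Python. The `TTLCache` class, the `load_from_db` function and the dict standing in for the database are all hypothetical; a production system would more likely put memcached or Redis in front of a real database.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache-aside helper: serve from memory until an entry expires."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                         # fresh hit: no database read
        value = loader(key)                         # miss or expired: go to the database
        self._store[key] = (now + self.ttl, value)  # remember it with a new expiry
        return value

    def invalidate(self, key: str) -> None:
        # Call this after a write so readers do not wait for the TTL to pass.
        self._store.pop(key, None)


fake_now = [0.0]                  # fake clock, advanced by hand below
db = {"employee:1": "Alice"}      # stand-in for the real database
reads = []                        # records every trip to the "database"


def load_from_db(key: str) -> str:
    reads.append(key)
    return db[key]


cache = TTLCache(ttl_seconds=30, clock=lambda: fake_now[0])
first = cache.get_or_load("employee:1", load_from_db)   # miss: reads the db
second = cache.get_or_load("employee:1", load_from_db)  # hit: served from cache
fake_now[0] = 31.0                                      # move past the 30s expiry
third = cache.get_or_load("employee:1", load_from_db)   # expired: reads the db again

db["employee:1"] = "Alicia"                             # a write to the database...
cache.invalidate("employee:1")                          # ...followed by invalidation
fourth = cache.get_or_load("employee:1", load_from_db)  # sees the new value
print(len(reads), fourth)
```

The TTL is the knob for "how fresh data has to be at the edge": a longer TTL means fewer database reads but staler values, and explicit invalidation on write covers the cases where stale data is not acceptable.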

G650 · on April 17, 2015

Just amazing.

pbreit · on April 18, 2015

Perhaps. But orders of magnitude more payroll has been conducted by supposedly prehistoric banks for the past century or two.

enraged_camel · on April 18, 2015

Yes, but how many of those banks openly and transparently discuss their technology stack and processes?

smackfu · on April 18, 2015

A bunch of old PL/I programs that run once every pay period is not very exciting.

zdw · on April 18, 2015

Ah, but can they generate http://ledger-cli.org files?