This is a great article and I love HAProxy. So much so that I wrote a REST-like wrapper for it a couple of years ago to turn HAProxy into an API gateway. It touched many of the points in the article. Maybe useful for people in the field: https://github.com/magneticio/vamp-router
Am I wrong, or is there still nothing that handles security in the API Gateway?
For example, suppose you want to do OAuth in the gateway instead of building it into every API. I couldn't see anything to handle a use case like that. The best I see is that it can probably handle one-way SSL, but I'm not clear on how it would handle mutual authentication or verification of client certs (pinning, CRLs, or OCSP).
It's a lot of work to build that into APIs separately and a lot of people who use API Gateways prefer to have the gateway itself handle security.
Marco, CTO of Kong here. Static mTLS has been supported for almost three years, while dynamic mTLS is supported as of Kong 1.0 (required for service mesh). Kong itself has been used in highly regulated industries, including financial, healthcare, and governmental (including military branches) institutions.
When it comes to enforcing strong security, OpenID Connect or JWT authentication is almost always the better option. Since Kong is built on top of NGINX, anything NGINX supports, Kong by extension also supports, plus Kong plugins.
Thanks for the response. Please correct me if I am wrong; I'm making a few deductions/assumptions based on previous research evaluating open-source gateways.
mTLS & cert pinning: it looks like GH is out of date? Could you send me a link to the docs for configuring Kong + mTLS, please?
I am sure that Kong is deployed in highly regulated industries. Are they using the open-source version of the product? Regardless of whether they are using the paid or open-source version: are they using Kong for user/identity management, or a third-party service hooked in with OIDC?
OIDC/JWT: yes, I agree that these options are typically better, but that means I need a third-party IdP to issue tokens rather than Kong handling user/token management? My understanding is that Kong does not issue JWTs, it simply validates the signature?
OIDC support, to my understanding, is only available in the Enterprise version rather than the CE version? So that means I need a community plugin if I want OIDC, and this community plugin will not be supported by Kong?
With JWT, I believe Kong simply validates the signature using the public key / shared secret. Are these secrets stored encrypted/securely within Kong?
If I simply wanted Kong to handle my user management, whether basic auth or API keys, or if I had a legacy system that still required support, would I then need to accept that Kong stores credentials such as usernames/passwords in plaintext?
> Kong itself has been used in highly regulated industries,
Was it used in any application actually covered by a regulation enforcing authentication and authorization, to the point that Kong could directly determine whether an implementation does or does not comply with the regulation?
Or was it simply used in highly regulated industries like car stickers are used in the highly regulated auto industry?
> Was it used in any application that actually was covered by any regulation enforcing authentication and authorization to the point that kong could directly determine if an implementation does or does not comply with the regulation?
Yes. Kong Enterprise, running on the execution path of in-flight information across distributed and open systems, must be audited in the context of enforced regulations, especially in banking and healthcare. We work regularly with our customers to make sure Kong is compliant with their specific use case and to make those audits successful.
You can do this and extend HAProxy's capabilities with Lua.
For example, authenticating via OAuth has already been done by an HAProxy contributor, who detailed it here: https://bl.duesterhus.eu/20180119/
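As a rough sketch of how a Lua check plugs into HAProxy: the action name, the `txn.auth_ok` variable, and the bare bearer-token check below are all illustrative, not a full OAuth implementation (the linked post covers real token verification).

```lua
-- validate_token.lua: registers an HTTP request action with HAProxy.
-- A real OAuth setup would verify the token's signature and claims;
-- this sketch only checks that a bearer token is present at all.
core.register_action("validate_token", { "http-req" }, function(txn)
    local hdrs = txn.http:req_get_headers()
    -- Header values are tables indexed from 0 in HAProxy's Lua API.
    local auth = hdrs["authorization"] and hdrs["authorization"][0]
    if auth and auth:find("^Bearer ") then
        txn:set_var("txn.auth_ok", true)
    else
        txn:set_var("txn.auth_ok", false)
    end
end)
```

Loaded with `lua-load` in the global section, it can then gate traffic with `http-request lua.validate_token` followed by `http-request deny unless { var(txn.auth_ok) -m bool }`.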
Just how pluggable is HAProxy, or is there anything else out there that allows higher-level async control over routing?
My need is some rather customized request routing, currently done in Python (fast enough to handle thousands of RPS in plain Python with PyPy), but it'd be nice to use a standard product.
For every incoming request, I want to decide where it gets routed on a per-request basis, but not necessarily immediately (so some callback / event-based / out-of-process interface is needed), and I want to be notified when the request finishes or fails.
Basically the request for /foo/bar may need to spawn a new foo-bar management process which can take a while to warm up, and appear at some unknown future destination.
I had a look at e.g. Traefik, Kong, Envoy but they didn't seem like quite the right fit.
Just a note: Kong is built on OpenResty and has a wide ecosystem of plugins already built in Lua, with its own development kit to speed up development of custom functionality.
Can't count how many times it's saved my ass. Not to mention it's very easy to scale!
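For a sense of scale, a Kong plugin's handler is just a Lua table with phase methods. This is a minimal sketch; the plugin name, header, and config field are made up, and details vary by Kong version:

```lua
-- handler.lua for a hypothetical "hello-header" Kong plugin.
local HelloHeader = {
  PRIORITY = 1000,   -- ordering relative to other plugins
  VERSION  = "0.1.0",
}

-- The access phase runs for every proxied request before it goes upstream.
function HelloHeader:access(conf)
  -- The `kong` global is the plugin development kit (PDK).
  kong.service.request.set_header("X-Hello", conf.message or "world")
end

return HelloHeader
```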
Kong has run well in production for us, but development can be slow due to the pure Lua scripting. I think some of the problems we hit are fixed in a newer version, but there is no upgrade path that doesn't involve rewriting all the custom code we have.
We're switching to OpenResty + Lua, proxying to a co-located Elixir server that will handle the access requests.
I'd be very interested to know what you've written that you have to rewrite _every_ release, especially since you can use the OpenResty API. You'd have to do it for every OpenResty release as well. Doesn't sound right to me!
However, I do say: whatever is easier for you, go for it. At the end of the day, it's business value that matters most! If the refactor increases your business value, do it! If not? Then... don't touch it!
Not sure if it would be useful for your case, but it's a new-ish API gateway that I've been playing around with, although I haven't done anything serious or production-level with it.
You can use the `auth_request` module in NGINX to create a subrequest to an HTTP service that does authentication and/or dynamic routing.
The subrequest can return response headers that are available in `$upstream_http_*` variables, which can be used to dynamically route the original request.
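A minimal sketch of that pattern; the `/auth` location, the port, and the `X-Upstream` header name are made-up examples:

```nginx
location / {
    auth_request /auth;                    # fire a subrequest per request
    # Copy a response header from the auth service into a variable...
    auth_request_set $route $upstream_http_x_upstream;
    # ...and route the original request wherever the service said.
    proxy_pass http://$route;
}

location = /auth {
    internal;
    proxy_pass http://127.0.0.1:9000/check;   # your auth/routing service
    proxy_pass_request_body off;              # headers are enough to decide
    proxy_set_header Content-Length "";
}
```

Note that `proxy_pass` with a variable is resolved at request time, so if the auth service returns hostnames rather than addresses you also need a `resolver` directive.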
But nginx does not provide basic load-balancing features such as health checking, persistence (some apps still need that), DNS service discovery, stats, observability, etc...
You can customize HAProxy (1.6+) routing with embedded Lua. Lua support may need to be enabled at compile time if not already provided by your OS package.
Here are some resources:
Marco here, CTO at Kong [1]. It seems like what you are trying to implement would be straightforward as a Kong plugin; may I ask what limitations/problems you ran into?
Tyk provides pluggable middleware written in Python; it's very likely you could get something like this going, because the plugins grant access to the gateway's Redis cache (so you can store data) as well as advanced rewrites that can do some pretty complex routing.
Well, this looks doable with Lua: an HTTP action probing an API that would let it know if the destination service is available and where.
One point about routing: do you mean that the server handling /foo/bar may not exist yet in HAProxy's configuration? If so, we could set the destination IP at runtime, based on the response provided by the step above.
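Something along these lines, where the registry lookup is a hypothetical stand-in for whatever probes (or first spawns) the destination service:

```lua
-- Registered as an HAProxy action; "pick_backend" and my_registry_lookup
-- are illustrative names, not part of HAProxy itself.
core.register_action("pick_backend", { "http-req" }, function(txn)
    local path = txn.sf:path()
    -- Ask an external registry where this path should go; in the
    -- /foo/bar case this is the step that may warm up a new process.
    local addr = my_registry_lookup(path)  -- hypothetical helper
    txn:set_var("txn.dst_addr", addr)
end)
```

The config side would then run `http-request lua.pick_backend` followed by `http-request set-dst var(txn.dst_addr)` to override the destination address at runtime.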
"If that weren’t enough, driving alone through such a place can seem like if you were to be buried, would anyone ever know?"
Well this was...unexpectedly dark. Jokes aside, I love how, for a software that gets so much shit for being overly complicated, your config samples look clean as hell. Nice work dude!
Not the OP, but a young greybeard with opinions :)
I've set up a couple of API Gateways. One was a hand-rolled Go server; this could have probably been done using an existing off-the-shelf server, but there were some odd-ball requirements and it was pretty straightforward to just build the functionality. It did service discovery through Consul, applied some routing rules and the "special" logic, and that was about it.
Every other time I've just used nginx. Why? Most of the API Gateway projects out in the wild are bloody complicated! I just looked at the docs for EnvoyProxy and the "Architecture Overview" pane is taller than my 1080p monitor! Yes, at massive scale with thousands of services, something like this is probably the right solution. When you've got a handful of services, an automated rolling deploy across a small cluster of nginx servers is A-OK.
I've looked at a bunch of different packages, and every time my conclusion is that they introduce a massive amount of complexity that could be beneficial in the large, but will likely require a dedicated team to understand all of the intricacies of the whole system. KISS.
Envoy Proxy is pretty complex. It is really designed for machine configuration. This is one reason projects like Ambassador API Gateway (https://www.getambassador.io) exist -- it translates decentralized declarative Kube config into Envoy configuration (non-trivial exercise).
That said, Envoy has some great features such as distributed tracing, a robust runtime API for dynamic configuration, gRPC load balancing, etc.
He works for HAProxy Technologies, so probably has not considered switching :)
Edit: Probably also worth mentioning that some of the net functionality gap between the two has been closed. HAProxy added service discovery and H2 support some time after this was published: https://www.envoyproxy.io/docs/envoy/latest/intro/comparison
Corporate firewall rule writer: "I can block all proxy access by not letting folks connect to a site with the word proxy in it." What could go wrong with that?
Yes, I use all of those features and more. The built-in stats are not as pretty as those shown, and you don't get graphs, but you can obviously do your own (and I do; the logs are very good).
I found it easier to create a centralized GraphQL server that stitches together all of my REST services with resolvers. I define all of the types, resolvers, and mutations for each of the services in the central GraphQL gateway, and clients don't need to know where they get data from. Why wouldn't you just use this approach?
The biggest problem with GraphQL in this regard is that it is a single endpoint. API gateways often define authorization rules, throttling rates, and caching times differently for each route. Consequently, you may need to write authorization, throttling, and caching logic in a separate layer or perhaps even in your microservices themselves.
Piggybacking on this thread: what would currently be the most secure frontend proxy? Preferably something implemented in a memory-safe language, with some credible story about a focus on security and being battle tested.
Hello, HAProxy has a long history of being secure [1]. In a standard configuration it also segregates itself and spawns within a chroot.
Booking.com [2] uses HAProxy for edge delivery over other software load balancers.
GitHub [3][4] has used it to mitigate DDoS attacks, and StackOverflow [5] has used it to detect and protect against bot threats.
Finally, phk, the author of Varnish states [6] (in regards to implementing SSL/TLS): “When I look at something like Willy Tarreau's HAProxy I have a hard time to see any significant opportunity for improvement.”
I'm still worried about the TLS implementation. I think in your [6], phk considered just incremental improvements, not switching away from C; after all, the whole post is about not wanting to implement TLS at all in his own C codebase, having anticipated problems like Heartbleed. (Also, at the time of writing in 2015, the Go TLS stack might have been too new to rely on?)
Traefik [1] is written in Go, which is technically (depending on your viewpoint) a memory-safe language. It has a pretty decent adoption rate, has a reasonable story for being built on battle-tested libraries, and has a pretty good reputation.
Haskell has Yesod [2], but I don't think there's a variant of that dedicated to nginx- or HAProxy-esque reverse-proxy duties.
Traefik's performance sucks... In my benchmark, it was about 6 times slower than HAProxy on an AWS instance.
It eats up a huge amount of memory and burns all the CPUs...
As well, Traefik's configuration is not flexible enough to match HAProxy's power...
[Full disclosure: I'm the CEO of Tyk] Tyk is an open-source API gateway written in Go; it's been battle tested in production PCI-compliant environments and offers a large selection of security features.
It's also customisable in languages other than Lua (Python, JS, and anything gRPC) ;-)
Tyk and Kong are similar in terms of what they do, though with Tyk we bake a lot of key functionality into the core instead of asking the community to build it for us.
We also don't have the concept of proprietary plugins. Where Kong is "open core" (for example, if you want to use OpenID Connect you need to buy Enterprise), with us it's just par for the course: we bake everything gateway-related into the open-source version and don't hide the ball. Our "value add" is in our dashboard GUI (proprietary) and multi-cloud/multi-DC server (also proprietary).
Also, in Tyk you can model your API routing as a file, whereas with Kong you need to specify all routes as API calls to the gateway, so backing up / version-controlling your APIs is difficult without a community-provided solution. (Though don't get me wrong: both the Tyk Gateway and Dashboard are entirely API-driven, so you can do everything programmatically or declaratively.)
Lastly, we have a compatibility promise of "no breaking changes within major versions". It's harder to do, but it makes our users happy :-)
Ah, and in terms of extensibility, we provide middleware and event hooks that can be hooked into from any gRPC-compatible language, and offer native binary (FFI) support in Python and Lua. We also have a baked-in ECMAScript interpreter which is fast (it's written in Go) but, being an interpreter, doesn't have the expressiveness of some of the other extension options.
In terms of other implementations, in open source there are not many that have quite the breadth of functionality we offer.
To be fair though, the other solutions out there (especially those coming out of the Go/K8s/CNCF communities) are very impressive.
I had to muck around a lot more to get HAProxy (2-3 days) to do what I could do with Kong out of the box (1-2 hours), and Kong was easier to scale for me (simple Terraform scripts).
To be honest, I'd say Kong is slightly harder to set up as it wants a database (PostgreSQL or Cassandra), but it offers more functionality out of the box. HAProxy is more focused on being … well, a proxy; for example, it can handle straight TCP too.
It's like PostgreSQL vs ${NoSQL du jour}. HAProxy works fine-to-great as a default choice unless you've got some requirement that makes Kong more attractive.
I've no experience with Kong, but HAProxy is one of the simplest and sturdiest daemon servers I have ever used. With just a couple of tweaks to the config file, I know exactly what it will do and what it can't do.
Has HAProxy cleaned up their dynamic reload issues? I am a pretty big fan having used HAProxy in production, but compared to the relative simplicity of "nginx -s reload" the restart complexity of HAProxy always gives me pause.
This is an improvement but it still seems like HAProxy is offloading a small (but annoying) amount of responsibility to users when handling restart requests. You still have to track the existing process' PID and give it to the new process, give the new process the same "-x [socket_file]" argument, and make sure the new process comes up before making any subsequent changes to the HAProxy config file.
Does HAProxy have any plans to wrap this process juggling to make things as stupidly easy as nginx's reload behavior?
The current method integrates with both init scripts and systemd. You shouldn't need to manually pass any of that information. Also, the -x option is not required if you are running HAProxy in master/worker mode in which case sending a SIGUSR2 to the master process would be sufficient. With that said, we do have further improvements planned for an upcoming release. Stay tuned :)
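For reference, the master/worker setup described here is one directive plus a signal; the pid-file path below is just a typical example, not a requirement:

```
# haproxy.cfg
global
    master-worker
    pidfile /run/haproxy.pid

# Reloading then means signalling the master, which re-executes itself,
# starts fresh workers, and lets the old ones drain:
#   kill -USR2 "$(cat /run/haproxy.pid)"
```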