This is a great article and I love HAProxy. So much so that I wrote a REST-like wrapper for it a couple of years ago to turn HAProxy into an API gateway. It touched many of the points in the article. Maybe useful for people in the field: https://github.com/magneticio/vamp-router
Am I wrong, or is there still nothing that handles security in the API Gateway?
For example, suppose you want to do OAuth in the gateway instead of building it into every API. I couldn't see anything to handle a use case like that. The best I see is that it can probably handle one-way SSL, but I'm not clear on how it would handle mutual authentication or verification of client certs (pinning, CRLs, or OCSP).
It's a lot of work to build that into APIs separately and a lot of people who use API Gateways prefer to have the gateway itself handle security.
Marco, CTO of Kong here. Static mTLS has been supported for almost three years, while dynamic mTLS is supported as of Kong 1.0 (required for service mesh). Kong itself has been used in highly regulated industries, including financial, healthcare, and governmental (including military branches) institutions.
When it comes to enforcing strong security, OpenID Connect or JWT authentication is almost always the better option. Since Kong is built on top of NGINX, anything NGINX supports, Kong by extension also supports, plus Kong plugins.
Thanks for the response. Please correct me if I am wrong; I'm making a few deductions/assumptions based on previous research evaluating open-source gateways.
mTLS & cert pinning: it looks like GH is out of date? Could you send me a link to the docs for configuring Kong + mTLS, please?
I am sure that Kong is deployed in highly regulated industries. Are they using the open-source version of the product? Regardless of whether they are using the paid or open-source version: are they using Kong for user/identity management, or a third-party service hooked in with OIDC?
OIDC/JWT: yes, I agree that these options are typically better, but that means I need a third-party IdP to issue tokens rather than Kong handling user/token management? My understanding is that Kong does not issue JWTs, it simply validates the signature?
OIDC support, to my understanding, is only available in the Enterprise version rather than the CE version? So that means I need a community plugin if I want OIDC, and this community plugin will not be supported by Kong?
With JWT, I believe Kong simply validates the signature using the public key / shared secret. Are these secrets stored encrypted/securely within Kong?
If I simply wanted Kong to handle my user management, whether basic auth or API keys, or if I had a legacy system that still required support, would I then need to accept that Kong stores credentials such as usernames/passwords in plaintext?
> Kong itself has been used in highly regulated industries,
Was it used in any application actually covered by a regulation enforcing authentication and authorization, to the point that Kong could directly determine whether an implementation does or does not comply with the regulation?
Or was it simply used in highly regulated industries like car stickers are used in the highly regulated auto industry?
> Was it used in any application that actually was covered by any regulation enforcing authentication and authorization to the point that kong could directly determine if an implementation does or does not comply with the regulation?
Yes. Kong Enterprise, running on the execution path of in-flight information across distributed and open systems, must be audited in the context of enforced regulations, especially in banking and healthcare. We work regularly with our customers to make sure Kong is compliant with their specific use case and to make those audits successful.
You can do this and extend HAProxy's capabilities with Lua.
For example, authenticating via OAuth has already been done by an HAProxy contributor, who detailed it here: https://bl.duesterhus.eu/20180119/
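As a rough sketch of how a Lua check plugs into HAProxy: the action name, the `txn.auth_ok` variable, and the bare bearer-token check below are all illustrative, not a full OAuth implementation (the linked post covers real token verification).

```lua
-- validate_token.lua: registers an HTTP request action with HAProxy.
-- A real OAuth setup would verify the token's signature and claims;
-- this sketch only checks that a bearer token is present at all.
core.register_action("validate_token", { "http-req" }, function(txn)
    local hdrs = txn.http:req_get_headers()
    -- Header values are tables indexed from 0 in HAProxy's Lua API.
    local auth = hdrs["authorization"] and hdrs["authorization"][0]
    if auth and auth:find("^Bearer ") then
        txn:set_var("txn.auth_ok", true)
    else
        txn:set_var("txn.auth_ok", false)
    end
end)
```

Loaded with `lua-load` in the global section, it can then gate traffic with `http-request lua.validate_token` followed by `http-request deny unless { var(txn.auth_ok) -m bool }`.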
Just how pluggable is HAProxy, or is there anything else out there that allows higher-level async control over routing?
My need is some rather customized request routing, currently done in Python (fast enough to handle thousands of RPS in plain Python with PyPy), but it'd be nice to use a standard product.
For every incoming request, I want to decide where it gets routed on a per-request basis, but not necessarily immediately (so some callback / event-based / out-of-process interface is needed), and I want to be notified when the request finishes or fails.
Basically the request for /foo/bar may need to spawn a new foo-bar management process which can take a while to warm up, and appear at some unknown future destination.
I had a look at e.g. Traefik, Kong, Envoy but they didn't seem like quite the right fit.
Just a note: Kong is built on OpenResty and has a wide ecosystem of plugins already built in Lua, with its own development kit to speed up development of custom functionality.
Can't count how many times it's saved my ass. Not to mention it's very easy to scale!
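For a sense of scale, a Kong plugin's handler is just a Lua table with phase methods. This is a minimal sketch; the plugin name, header, and config field are made up, and details vary by Kong version:

```lua
-- handler.lua for a hypothetical "hello-header" Kong plugin.
local HelloHeader = {
  PRIORITY = 1000,   -- ordering relative to other plugins
  VERSION  = "0.1.0",
}

-- The access phase runs for every proxied request before it goes upstream.
function HelloHeader:access(conf)
  -- The `kong` global is the plugin development kit (PDK).
  kong.service.request.set_header("X-Hello", conf.message or "world")
end

return HelloHeader
```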
Kong has run well in production for us, but development can be slow due to the pure Lua scripting. I think some of the problems we hit are fixed in a newer version, but there is no upgrade path that doesn't involve rewriting all the custom code we have.
We're switching to OpenResty + Lua, proxying to a co-located Elixir server that will handle the access requests.
I'd be very interested to know what you've written that you have to rewrite _every_ release, especially since you can use the OpenResty API. You'd have to do it for every OpenResty release as well. Doesn't sound right to me!
However, I do say: whatever is easier for you, go for it. At the end of the day, it's business value that matters most! If the refactor increases your business value, do it! If not? Then... don't touch it!
Not sure if it would be useful for your case, but it's a new-ish API gateway that I've been playing around with, although I haven't done anything serious or production-level with it.
You can use the `auth_request` module in NGINX to create a subrequest to an HTTP service that does authentication and/or dynamic routing.
The subrequest can return response headers that are available in `$upstream_http_*` variables, which can be used to dynamically route the original request.
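A minimal sketch of that pattern; the `/auth` location, the port, and the `X-Upstream` header name are made-up examples:

```nginx
location / {
    auth_request /auth;                    # fire a subrequest per request
    # Copy a response header from the auth service into a variable...
    auth_request_set $route $upstream_http_x_upstream;
    # ...and route the original request wherever the service said.
    proxy_pass http://$route;
}

location = /auth {
    internal;
    proxy_pass http://127.0.0.1:9000/check;   # your auth/routing service
    proxy_pass_request_body off;              # headers are enough to decide
    proxy_set_header Content-Length "";
}
```

Note that `proxy_pass` with a variable is resolved at request time, so if the auth service returns hostnames rather than addresses you also need a `resolver` directive.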
But nginx does not provide basic load-balancing features such as health checking, persistence (some apps still need that), DNS service discovery, stats, observability, etc...
You can customize HAProxy (1.6+) routing with embedded Lua. Lua support may need to be enabled at compile time if not already provided by your OS package.
Here are some resources:
Marco here, CTO at Kong [1]. It seems like what you are trying to implement would be straightforward as a Kong plugin; may I ask what limitations/problems you ran into?
Tyk provides pluggable middleware written in Python; it's very likely you could get something like this going, because the plugins grant access to the gateway's Redis cache (so you can store data) as well as advanced rewrites that can do some pretty complex routing.
Well, this looks doable with Lua: an HTTP action probing an API that would let it know if the destination service is available and where.
One point about routing: do you mean that the server handling /foo/bar may not exist yet in HAProxy's configuration? If so, we could set the destination IP at runtime, based on the response provided by the step above.
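Something along these lines, where the registry lookup is a hypothetical stand-in for whatever probes (or first spawns) the destination service:

```lua
-- Registered as an HAProxy action; "pick_backend" and my_registry_lookup
-- are illustrative names, not part of HAProxy itself.
core.register_action("pick_backend", { "http-req" }, function(txn)
    local path = txn.sf:path()
    -- Ask an external registry where this path should go; in the
    -- /foo/bar case this is the step that may warm up a new process.
    local addr = my_registry_lookup(path)  -- hypothetical helper
    txn:set_var("txn.dst_addr", addr)
end)
```

The config side would then run `http-request lua.pick_backend` followed by `http-request set-dst var(txn.dst_addr)` to override the destination address at runtime.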
"If that weren’t enough, driving alone through such a place can seem like if you were to be buried, would anyone ever know?"
Well this was...unexpectedly dark. Jokes aside, I love how, for a software that gets so much shit for being overly complicated, your config samples look clean as hell. Nice work dude!
Not the OP, but a young greybeard with opinions :)
I've set up a couple of API Gateways. One was a hand-rolled Go server; this could have probably been done using an existing off-the-shelf server, but there were some odd-ball requirements and it was pretty straightforward to just build the functionality. It did service discovery through Consul, applied some routing rules and the "special" logic, and that was about it.
Every other time I've just used nginx. Why? Most of the API Gateway projects out in the wild are bloody complicated! I just looked at the docs for EnvoyProxy and the "Architecture Overview" pane is taller than my 1080p monitor! Yes, at massive scale with thousands of services, something like this is probably the right solution. When you've got a handful of services, an automated rolling deploy across a small cluster of nginx servers is A-OK.
I've looked at a bunch of different packages, and every time my conclusion is that they introduce a massive amount of complexity that could be beneficial in the large, but will likely require a dedicated team to understand all of the intricacies of the whole system. KISS.
Envoy Proxy is pretty complex. It is really designed for machine configuration. This is one reason projects like Ambassador API Gateway (https://www.getambassador.io) exist -- it translates decentralized declarative Kube config into Envoy configuration (non-trivial exercise).
That said, Envoy has some great features such as distributed tracing, a robust runtime API for dynamic configuration, gRPC load balancing, etc.
He works for HAProxy Technologies, so probably has not considered switching :)
Edit: Probably also worth mentioning that some of the net functionality gap between the two has been closed. HAProxy added service discovery and H2 support some time after this was published: https://www.envoyproxy.io/docs/envoy/latest/intro/comparison
Corporate firewall rule writer: "I can block all proxy access by not letting folks connect to a site with the word proxy in it." What could go wrong with that?
Yes, I use all of those features and more. The built-in stats are not as pretty as those shown, and you don't get graphs, but you can obviously do your own (and I do; the logs are very good).
I found it easier to create a centralized GraphQL server that stitches together all of my REST services with resolvers. I define all of the types, resolvers, and mutations for each of the services in the central GraphQL gateway, and clients don't need to know where they get data from. Why wouldn't you just use this approach?
The biggest problem with GraphQL in this regard is that it is a single endpoint. API gateways often define authorization rules, throttling rates, and caching times differently for each route. Consequently, you may need to write authorization, throttling, and caching logic in a separate layer or perhaps even in your microservices themselves.
Piggybacking on this thread: what would currently be the most secure frontend proxy? Preferably something implemented in a memory-safe language, with some credible story about a focus on security and being battle tested.
Hello, HAProxy has a long history of being secure [1]. In a standard configuration it also segregates itself and spawns within a chroot.
Booking.com [2] uses HAProxy for edge delivery over other software load balancers.
GitHub [3][4] has used it to mitigate DDoS attacks, and StackOverflow [5] has used it to detect and protect against bot threats.
Finally, phk, the author of Varnish states [6] (in regards to implementing SSL/TLS): “When I look at something like Willy Tarreau's HAProxy I have a hard time to see any significant opportunity for improvement.”
I'm still worried about the TLS implementation. I think in your [6], phk considered just incremental improvements, not switching away from C; after all, the whole post is about not wanting to implement TLS at all in his own C codebase, having anticipated problems like Heartbleed. (Also, at the time of writing in 2015, the Go TLS stack might have been too new to rely on?)
Traefik [1] is written in Go, which is technically (depending on your viewpoint) a memory-safe language. It has a pretty decent adoption rate, has a reasonable story for being built on battle-tested libraries, and has a pretty good reputation.
Haskell has Yesod [2], but I don't think there's a variant of that dedicated to nginx- or HAProxy-esque reverse-proxy duties.
Traefik's performance sucks... In my benchmark, it was about 6 times slower than HAProxy on an AWS instance.
It eats up a huge amount of memory and burns all the CPUs...
As well, Traefik's configuration is not flexible enough to match HAProxy's power...
[Full disclosure: I'm the CEO of Tyk] Tyk is an open-source API gateway written in Go; it's been battle tested in production PCI-compliant environments and offers a large selection of security features.
It's also customisable in languages other than Lua (Python, JS, and anything gRPC) ;-)
Tyk and Kong are similar in terms of what they do, though with Tyk we bake a lot of key functionality into the core instead of asking the community to build it for us.
We also don't have the concept of proprietary plugins. Where Kong is "open core" (for example, if you want to use OpenID Connect you need to buy Enterprise), with us it's just par for the course: we bake everything gateway-related into the open-source version and don't hide the ball. Our "value add" is in our dashboard GUI (proprietary) and multi-cloud/multi-DC server (also proprietary).
Also, in Tyk you can model your API routing as a file, whereas with Kong you need to specify all routes as API calls to the gateway, so backing up / version-controlling your APIs is difficult without a community-provided solution. (Though don't get me wrong: both the Tyk Gateway and Dashboard are entirely API-driven, so you can do everything programmatically or declaratively.)
Lastly, we have a compatibility promise of "no breaking changes within major versions". It's harder to do, but it makes our users happy :-)
Ah, and in terms of extensibility, we provide middleware and event hooks that can be hooked into from any gRPC-compatible language, and offer native binary (FFI) support in Python and Lua. We also have a baked-in ECMAScript interpreter which is fast (it's written in Go) but, being an interpreter, doesn't have the expressiveness of some of the other extension options.
In terms of other implementations, in open source there are not many that have quite the breadth of functionality we offer.
To be fair though, the other solutions out there (especially those coming out of the Go/K8s/CNCF communities) are very impressive.
I had to muck around a lot more to get HAProxy (2-3 days) to do what I could do with Kong out of the box (1-2 hours), and Kong was easier to scale for me (simple Terraform scripts).
To be honest, I'd say Kong is slightly harder to set up as it wants a database (PostgreSQL or Cassandra), but it offers more functionality out of the box. HAProxy is more focused on being … well, a proxy; for example, it can handle straight TCP too.
It's like PostgreSQL vs ${NoSQL du jour}. HAProxy works fine-to-great as a default choice unless you've got some requirement that makes Kong more attractive.
I've no experience with Kong, but HAProxy is one of the simplest and sturdiest daemon servers I have ever used. With just a couple of tweaks to the config file, I know exactly what it will do and what it can't do.
Has HAProxy cleaned up their dynamic reload issues? I am a pretty big fan having used HAProxy in production, but compared to the relative simplicity of "nginx -s reload" the restart complexity of HAProxy always gives me pause.
This is an improvement but it still seems like HAProxy is offloading a small (but annoying) amount of responsibility to users when handling restart requests. You still have to track the existing process' PID and give it to the new process, give the new process the same "-x [socket_file]" argument, and make sure the new process comes up before making any subsequent changes to the HAProxy config file.
Does HAProxy have any plans to wrap this process juggling to make things as stupidly easy as nginx's reload behavior?
The current method integrates with both init scripts and systemd. You shouldn't need to manually pass any of that information. Also, the -x option is not required if you are running HAProxy in master/worker mode in which case sending a SIGUSR2 to the master process would be sufficient. With that said, we do have further improvements planned for an upcoming release. Stay tuned :)
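For reference, the master/worker setup described here is one directive plus a signal; the pid-file path below is just a typical example, not a requirement:

```
# haproxy.cfg
global
    master-worker
    pidfile /run/haproxy.pid

# Reloading then means signalling the master, which re-executes itself,
# starts fresh workers, and lets the old ones drain:
#   kill -USR2 "$(cat /run/haproxy.pid)"
```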