If you're interested in more of the technical details of how a CRS (automatic bug finding system) works, I recommend watching this presentation from my colleague Artem Dinaburg.

"Making A Scalable Automated Hacking System"

* https://www.youtube.com/watch?v=pOuO5m1ljRI

* https://github.com/trailofbits/presentations/blob/master/Cyb...

You should also keep your eye on https://github.com/trailofbits -- we are releasing the final component of our CRS as open-source in a very short time. Manticore, our symbolic execution framework, will be up there soon! I'm happy to give you early access if you get in touch with me on Twitter.
I'm dealing with a very old, large, somewhat rotting codebase. It's barely tested. Is fuzzing a good way for me to improve code quality, or are tests the lower-hanging fruit?
Disclaimer: I have not looked at your codebase, so this should only be taken as my 2 cents. I might have a different recommendation if I had greater familiarity with your exact problem.
I'm biased, but I would start with a fuzzer. Fuzzers have two key advantages for your scenario that make me lean toward them:
They provide the will to act. You mentioned the codebase is old and rotting, so a normal bug may not attract enough attention to get fixed. Finding security bugs may give greater justification to undertake an effort to start maintaining it properly.
One fuzzer exercises more than one test. I think you'll get more bang for your buck by integrating a fuzzer, whether it's libFuzzer, AFL, Radamsa, or anything else. Fuzzers are not targeted at a single unit and will, ideally, find bugs all over the place from a single, simple starting point.
That said, there are good arguments for diving into the codebase and writing tests too. In fact, writing tests may make your fuzzer more effective.
You only asked about fuzzing vs. tests, but I'd offer that you should update your build settings before fuzzing: new compilers have a lot more diagnostics than they used to. Then I'd set up libFuzzer or AFL.
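As a sketch of what that integration looks like, a libFuzzer harness is just one function. Here `parse_record` is a hypothetical stand-in for whatever entry point in your codebase consumes untrusted bytes; everything else is the real harness shape:

```cpp
// Build sketch: clang++ -g -O1 -fsanitize=fuzzer,address harness.cpp
#include <cstddef>
#include <cstdint>

// Hypothetical target from the codebase under test: parses a length-prefixed
// record and rejects malformed input.
static int parse_record(const uint8_t *data, size_t size) {
  if (size < 2) return -1;                       // need a 2-byte length prefix
  size_t len = (size_t(data[0]) << 8) | data[1];
  if (len > size - 2) return -1;                 // declared length exceeds payload
  return 0;                                      // well-formed record
}

// libFuzzer calls this entry point once per generated input; a crash or
// sanitizer report inside parse_record is a bug found.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  parse_record(data, size);
  return 0;
}
```

Point the resulting binary at a small seed corpus and let it run; libFuzzer writes each crashing input to a reproducer file you can replay later.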
> Is fuzzing something for me to improve code quality,
Is it C/C++, and does it include parsers? Then definitely yes. If not, it depends.
> or are tests the lower hanging fruit?
Again: is it C/C++? Then familiarize yourself with the sanitizer features of gcc and clang, primarily AddressSanitizer. Lots of "rotting" code shows memory-safety errors just from being run under ASan.
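A minimal sketch of the kind of latent bug ASan surfaces. The function and the build command are hypothetical illustrations, not from any real codebase:

```cpp
// Build sketch: clang++ -g -fsanitize=address legacy.cpp
#include <cstddef>
#include <cstring>

// Hypothetical legacy-style copy with an off-by-one: when strlen(src) == cap,
// strcpy also writes the NUL terminator, one byte past the end of dst.
void legacy_copy(char *dst, size_t cap, const char *src) {
  if (strlen(src) <= cap)  // bug: the check should be strlen(src) + 1 <= cap
    strcpy(dst, src);
}

// Without ASan, a call like legacy_copy(buf8, 8, "12345678") silently corrupts
// an adjacent byte; under ASan the same call aborts immediately with a
// detailed stack-buffer-overflow report pointing at the strcpy.
```

Code like this can run "fine" for years in production, which is exactly why running the existing test suite (or a fuzzer) under ASan pays off so quickly.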
When experimenting with libFuzzer to test an audio-processing library, I was impressed by the results and also the ease of setup. In-process fuzzing is really the best option for that use case, which is why I chose libFuzzer over AFL.
An open-source alternative to Microsoft's SAGE/Springfield would be cool. I'm sure there are things to come from the CRS efforts you mentioned.
Looking forward to where this goes and hope that your 2- and 5-year outlooks hold true.
> An open-source alternative to Microsoft's SAGE/Springfield would be cool
We're working on one (!) and I hope we can offer it for free to non-commercial projects. For now, there is Microsoft Springfield [1] for Windows software and Google OSS-Fuzz [2] for open-source software. It is extraordinarily hard not only to get the tech for something like that working but also to bring it to market.
As noted in the video, nearly all the individual pieces of our CRS are open-source, but you actually do not want a "CRS." The competition DARPA designed involved more than is necessary to provide value to a development team: you don't want something that writes IDS signatures, considers "gameplay" or resource contention, or attempts to write automatic patches. You want something that accurately finds and reproduces bugs. We open-sourced the tools we wrote to do that, or used tools that were already open-source, like Grr, Manticore, Radamsa, KLEE, and Z3.
> We're working on one (!) and I hope we can offer it for free to non-commercial projects.
That's good to hear. Hope you can find a way to monetize it for commercial projects.
Getting the tech right certainly seems to be a hard problem, with Google's Konstantin Serebryany calling the symbolic-execution route rocket science. In my view the problem is coming up with a solid solution instead of just heuristics (as with all multi-approach methods: when do you switch modes?) and making sure the tech is usable for testing arbitrarily complex pieces of software.
Does anyone know of a list of categorized and recommended fuzzers for different purposes, or more specifically for smart-fuzzing web APIs and how to get started with it? Search results for this kind of thing are hard to parse because they're either dated or cover very specific one-off use cases.
I realize answering my questions with the given broadly-defined tools may be the required manual expertise they refer to in the presentation. But I'm just looking for a foothold somewhere at the least.
Burp Suite is planning on adding native support for continuous integration... integration in the second half of 2017.
If you're reading between the lines: there are _very few_ security testing tools that are built well. So you're asking the wrong question. You don't need a huge list. There are only a small handful of fuzzers or analysis tools I would recommend at all, and Burp is it for web testing.
Most projects out there are hobby projects from people trying to learn something new and ignoring what has already been done. They don't serve a very useful purpose other than as a learning or teaching tool.
We used tried and true basics for our CRS: Radamsa, KLEE, our own open-source binary lifter, and a Python symbolic execution framework built around Z3. Nothing new, or hip, or magic.
Burp Suite does not do a good job of fuzzing APIs - not biased, just true. APIs require more structured fuzzers that expose application-level problems, unlike Burp's fuzzers, which work on raw HTTP requests. That was useful some time ago, when you had to find bugs in the actual server implementation, but it is no longer relevant in the web application security space outside of research - which is exactly what most web shops are not interested in doing. You can still use Burp for this, but the user has to do all the heavy lifting by hand. How does Burp do recursive XML or JSON fuzzing? It doesn't. You can write a plugin for that, but that defeats the purpose of using an off-the-shelf tool.
Yep, it's maybe a little more complicated than I let on. We went through the same process you described on a recent engagement and here was the outcome:
Big thanks for this reply. I was looking to cut through all the noise you mentioned. I'm finding all I was looking for in the Radamsa readme for starting and next steps. Use cases expressed in unix pipes > clear_as_day.txt. Fuzzing newb at this point but I'll be using your original and other linked presentations for context while working through basics.
There are some commercial efforts but not much in the open-source space, which in my opinion is a big problem, because APIs should be fuzzed properly - especially when they're written in dynamic scripting languages.
Thanks for clarifying this and your info on Burp's capabilities above. What are some of the commercial options you just mentioned? I'm dealing with a large app that isn't written in C but does have API endpoints over HTTP. For a start on some black-box fuzzing, I'm thinking I could use Radamsa's client/server capabilities.