What is CI and why use it?

jipiboily · on July 17, 2014

JP, author of the post here - let me know if you have any additions / questions! :)

enraged_camel · on July 17, 2014

I would like to have seen a section for when continuous integration is NOT needed, and would be overkill. For example, I have a solo project where my workflow is: make changes to code, run the test suite, fix any failures that come up, and then push to heroku. I tried CircleCI but I found that the only benefit it offered me was offloading the testing and deployment to another environment. In some ways it actually complicated my workflow, not simplified it.

exelius · on July 17, 2014

You're right; CI is pretty much a waste for a single person dev shop. But it's a godsend when working on larger projects: it centralizes processes that should ideally happen automatically, and increases developer confidence in the quality of the codebase by removing the deployment variables from the situation. Any team working on a reasonably large project should be using CI in some form.

For example, if you have a staging environment, and a developer pushes a change to that environment and wants you to make changes and test against his changes, you have to coordinate the deployment to make sure you don't overwrite his changes and vice versa. With CI, all check-ins would get automatically deployed to the staging environment using static scripts, so every deployment is exactly the same. It saves a lot of developer time, and removes the "which developer is a better sysadmin" variable from the equation.

ABS · on July 17, 2014

Maybe a link to the seminal article by Martin Fowler from 2006 would be useful for those who want to learn more? http://martinfowler.com/articles/continuousIntegration.html

Of course CI is older than 2006 but that article is the most referenced one, and for good reasons IMHO

shangxiao · on July 17, 2014

No mention of CloudBees?

jipiboily · on July 17, 2014

As mentioned in the post, there are TONS of options; I just mentioned a few :)

rubiquity · on July 17, 2014

Thanks for the article. I apologize in advance if this is a hijack, but I've really been trying to understand the craze of CI and feel this is a good place to hopefully get answers.

CI is something I have a constant struggle with. I primarily work in small teams and I tend to work with other developers that are diligent about running tests and know that "if the tests aren't green when I merge master into my topic branch, I don't push to master."

When I did work on a larger team, maybe CI was useful for the other programmers that weren't as diligent. Though, the CI box eventually got quite slow and you didn't get feedback until 15 minutes later (this was Jenkins) versus 2-3 minutes locally.

Unless I'm missing something, I guess when you distill CI down to what it does you get:

- Runs your test suite

- Lets you know if a failure happens

That's not a ton of value in my eyes, which is why I assume all of these hosted CI solutions are now also doing CD upon successful builds. You could also get CD without CI by using something like git hooks that kick off deploy scripts after certain events.
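Distilled that far, the two bullets above fit in a few lines. A rough sketch, assuming Python 3.7+ and a test suite that can be started from the command line; `run_suite`, `report`, and `DEFAULT_COMMAND` are made-up names for illustration, not any CI product's API:

```python
import subprocess
import sys

# Assumption: the suite is discoverable by the standard unittest runner.
# Swap in whatever starts your tests (rake, make, ...).
DEFAULT_COMMAND = [sys.executable, "-m", "unittest", "discover"]


def run_suite(command=DEFAULT_COMMAND):
    """Run the test command; return (passed, combined output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def report(passed, output, notify=print):
    """Tell someone about a failure; stay quiet on success."""
    if not passed:
        notify("BUILD FAILED\n" + output)
    return passed
```

Something like `report(*run_suite())` could be called from a git hook or a cron job; what a CI server adds on top is where this runs and who gets to see the result.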

GitHub showing the status of your CI on the PR screen is cool, but while I do use GitHub, I mostly use GitHub as a code repository. The most I do in the web UI is clone repos and open pull requests. If you merge a branch into master that correlates to an open PR, GitHub automatically closes the PR for you.

I know we're in the era of "Automate ALL the things!" and I'm good about that, but typing `rake|make|etc` into my console isn't a huge pain point in my life right now. Is CI a luxury we've been brainwashed to love by the programming hegemony?

Someone please let me know what I'm missing.

sjtgraham · on July 17, 2014

Essentially, no one but you cares that "it's green on my box". Your clone is not the "source of truth".

Also not everyone runs tests before pushing, shocking I know! CI SCM integration makes it immediately clear if a branch can be safely merged. This is such a win for open source projects too. Often I'll get a PR, after seeing the diff and the green build I know I can safely merge. Compare this with adding the contributor's fork as a remote, fetching their changes, running the test suite, merging, pushing. I know which one I prefer.

I'm even more excited about CD, there should be a process for deploying to production and I'm not talking about `git push heroku master`. Code deployed to production should be code reviewed and built by CI before it is automatically deployed. IMO this significantly reduces the probability of deploying bad code.

novum · on July 17, 2014

CI means knowing that your product is always in a releasable state. Green tests are only one small piece of that.

I have an iOS app or two that I work on independently, as side projects. Even though I am the sole developer on these projects, CI saves me dozens or hundreds of hours of time. My CI script[0] downloads provisioning profiles from Apple's Developer Center, installs them, runs my tests, builds my app, archives the resulting IPA and .dSYM.zip to Amazon S3, uploads the build to Testflight, notifies my testers, then sends my iOS devices a push notification with an ultimate success or failure message.

And it does all of this every day at 4am, while I'm sleeping, and I don't have to think about it.

How much longer would all of this take if I were doing it manually each time? Time is the only infinitely valuable scarce resource. I don't want to spend time mucking with codesigning and provisioning profiles; I want to spend more time coding.

[0] I've open-sourced my iOS build script! https://github.com/splinesoft/SSBuild

rubiquity · on July 17, 2014

If this is all happening at a set time every day, couldn't it just be a cron job? You do have the script created already, CI is just running it for you.

novum · on July 17, 2014

Sure it could be a cron job, but using Jenkins adds some conveniences: a nice UI to browse projects, plus it archives the output of every historical build, locally-archived build artifacts, lots of extensibility with plugins, and so on.

emsy · on July 18, 2014

Plus: once you've got everything up and running, you can take it a step further and deploy from the CI server. In my previous job, the Ops team loved it when I told them how to deploy from CI instead of going to the "Release Developer" (who was loathed by Ops, because she was not the friendliest person :D)

eddieroger · on July 17, 2014

It's worth adding that it does those things based on the output of previous jobs. Also, if you have many of these setups in parallel, it can create good dashboards and give you one nice (and free) GUI to take in the state of your build empire.

jipiboily · on July 17, 2014

You are also running your tests in a different environment (a different machine), which could catch some bugs IMHO. In addition, having the CI server deploy for you means that only a very limited part of your team needs actual production rights.

I never had any good experience with Jenkins, Hudson before that or CruiseControl.net (yes, I automated .Net deployment with that, years ago!). This is why I prefer hosted CI that can scale and won't struggle under the load of a few builds. Circle CI in addition will run your specs in parallel. It is very unlikely that you can be faster than them, I think?

As for GitHub PR status, I think it is useful when you do code review and go from one PR to the next and can see its status quickly. It's a nice-to-have, but we were able to move fast before that feature dropped a couple of years ago.

It might be a matter of taste, but waiting for the whole suite to run sometimes seems like a waste of time. Most of the time I run only the specs that I think could break, and I rarely see red builds on Circle CI.

My 2 cents :)

rubiquity · on July 17, 2014

Yeah, I don't run the entire suite either until the merge just before master. While I'm working on a branch I'm only running specific tests.

Having it build and run on a different machine could be valuable for certain applications.

benjiweber · on July 17, 2014

CI != CI Servers.

CI is about continually integrating your changes with the rest of the team, and ideally it's also continuous deployment (integrating with the production environment continually). The reason for integrating regularly is to tighten the feedback loop and minimise the integration pain by doing it in frequent, small steps.

It's perfectly possible to do continuous integration without a CI server, particularly when you're also doing continuous deployment.

It's also easy to do the opposite of continuous integration using a CI server. I have seen a lot of people using a CI server to test branches in version control. They have the continuous part but not the integration part.

rubiquity · on July 17, 2014

Good to know. So I'm actually practicing CI the way that I work, I'm just not paying some company monthly fees for a CI server. So the difference between CI the practice and "CI wink wink" is $$$.

xorcist · on July 17, 2014

Monthly fees? I don't know where you get that from. The state of the art CI tool, Jenkins, is actually free software.

What a complete CI process involves depends on your release process, not just the build process. There may be code signing, integration with external services, etc. These are steps that are not part of the individual developer's build process.

The letter I in CI stands for integration. Two checkins might very well look good on their own, but cause havoc with other systems.

jipiboily · on July 17, 2014

My 2 cents here, but I think hosting fees will be higher with Jenkins than with small CI plans... There are free offerings, and even the paid ones start at prices cheap enough that it's hard for me to justify even an hour of work configuring a server for that. That said, I know this is not how everyone thinks.

aytekin · on July 17, 2014

Just looked up what kinds of things we do on our CI & CD:

- Run unit and casperjs/selenium UI tests

- Minimize and optimize JS and CSS files if necessary

- Add revision numbers to prevent caching on highly cached files

- Process LESS files

- Closure compiler

- Grunt

- CDN cache clears or uploads

- Translation locale strings updates

- Deploying on servers

Tens of tasks/tests done in parallel and finished in 45 seconds.

The most important thing is that by automating things, you dramatically reduce the possibility of making mistakes or forgetting something.
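As one illustration of the "revision numbers to prevent caching" step in the list above, here is a minimal sketch. It swaps in a content hash for a revision number, which is an assumption of this sketch rather than what the setup described here does; `revisioned_name` is a hypothetical helper:

```python
import hashlib
from pathlib import PurePosixPath


def revisioned_name(filename, content):
    """Return the filename with a short content hash before its extension.

    The name changes only when the bytes change, so caches can keep the
    old file forever and still pick up a new build immediately.
    """
    digest = hashlib.sha256(content).hexdigest()[:8]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

A build step would write the file under the returned name and update the references in the HTML to match.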

rubiquity · on July 17, 2014

> - Run unit and casperjs/selenium UI tests

Of all of the steps you listed, this is the only step that actually pertains to CI. The rest are all part of deployment so they would fall under CD.

aytekin · on July 17, 2014

Don't agree. Except for unit tests, automated testing should be done on the very final version of the app/site/software, so that you can also catch integration/optimization related problems.

a-saleh · on July 17, 2014

It ... depends?

If your rake|make tests last a few minutes, you can probably do without CI just fine. I work in Q/A, and each night we run install tests on the latest repo, i.e. provision a clean machine, run the install script, check that all the daemons are running, run a sanity check. This can take about an hour. No dev will take an hour to check whether he messed up the installability of the system :) For a CI server it is not a problem.

ZeroGravitas · on July 18, 2014

Something not mentioned yet is running your tests in the same environment as they'll be deployed. If people dev on Mac OS X, it's good to run your tests on the exact deploy target.

For code that'll be distributed widely, e.g. an open source library, that might mean a test matrix of different language versions, different DBs, and so on.
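A test matrix like that is just the cross product of its axes. A minimal sketch, where `build_matrix` and the axis values in the usage note are hypothetical and not tied to any particular CI service's config format:

```python
from itertools import product


def build_matrix(**axes):
    """Expand named axes into one dict per combination to test."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]
```

For example, `build_matrix(language=["2.7", "3.4"], db=["postgres", "mysql", "sqlite"])` yields six environments, which is also a reminder of how quickly the matrix grows with each new axis.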

fredsters_s · on July 17, 2014

"brainwashed to love by the programming hegemony" I don't know what this means but I think I love you.

PhilipA · on July 17, 2014

The hosted CI solutions like CircleCI look good, but letting them control my code and do the deployment really requires quite a bit of trust. It is another link in the chain that can have a security breach, which could give intruders access to my code.

fredsters_s · on July 17, 2014

Like with all these things it's a trade-off. If security is important to you above everything else, then yeah, host it yourself and spend the extra time. But this is true for only a tiny % of startups.

Also, you're assuming that you're better able to secure this stuff than they are. Which doesn't seem obvious to me.

grosskur · on July 17, 2014

There's also Buildbox, which is a hosted CI engine and web interface. It uses agents that you run and control yourself:

https://buildbox.io/

PhilipA · on July 18, 2014

Thanks, will look into it!

akurilin · on July 17, 2014

1. Is there an easy way of integrating an existing large ansible project with CI to use it for deployment? 2. With continuous deployment, how do you coordinate multiple repositories being changed all at once and potentially numerous migrations that need to be done beforehand? e.g. say you're changing the schema and the web API and all of its clients. Can you CI the entire system at once rather than one git repo at a time?

matlock · on July 17, 2014

Do you want to test your ansible scripts during deployment, or do you just want to run ansible deployment after the tests are run?

The basic workflow that we at Codeship recommend is to test each repository by itself and, if that works, push to a staging environment.

Once the push into the staging environment is done you restart the last build on an integration test repository that will thoroughly integration test the whole staging system.

When you build a service-oriented architecture, keeping some backwards compatibility while systems are being updated is, in our opinion, the best way. If there are breaking changes, treat every part of the infrastructure as a separate API with specific guarantees and just build a v2 API, so you can update the clients while the old ones still work for a while until you can remove them.
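The v2 idea can be sketched roughly like this. Everything here (`get_user_v1`, `get_user_v2`, `HANDLERS`, `handle`, and the user fields) is made up for illustration and is not Codeship's implementation:

```python
def get_user_v1(user):
    """Original response shape: a single "name" field."""
    return {"name": f"{user['first']} {user['last']}"}


def get_user_v2(user):
    """Breaking change: the name is split into two fields."""
    return {"first_name": user["first"], "last_name": user["last"]}


# Both versions stay registered until the last v1 client is gone.
HANDLERS = {"v1": get_user_v1, "v2": get_user_v2}


def handle(version, user):
    """Dispatch to the handler for the requested API version."""
    handler = HANDLERS.get(version)
    if handler is None:
        raise ValueError(f"unsupported API version: {version}")
    return handler(user)
```

Because v1 keeps answering in its old shape, the API and its clients can be deployed independently and in any order; removing the `"v1"` entry is a separate, later deploy.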

jipiboily · on July 17, 2014

I think it's a hard question to answer as it could be so different depending on the project.

Ideally, your projects should be deployable independently from the other ones, to a certain extent. As an example, your API projects should get deployed, then the other ones using that API can be deployed later, be it a few minutes later or even days.

Hopefully I answer part of your question here?

For deployment, "it depends™" but you can certainly do whatever you need to do, always a matter of time and money! :) Your CI server could kick something on your servers. I encourage you to look at the hosted CIs and their documentation.

Anjin · on July 17, 2014

If you are using a CI, you might as well also use something like http://www.coveralls.io to get history and statistics on your source code's test coverage.

jipiboily · on July 17, 2014

I use it for my open source projects, but I personally think coverage % is not worth much for private projects. First, coverage doesn't mean well covered, it just means the code is being run. For newcomers, it could be confusing, and I find it more dangerous than anything to introduce that principle to beginners. My 2 cents :)
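A small sketch of what "covered" versus "well covered" can look like. The function and both checks are hypothetical, and the bug is deliberate:

```python
def apply_discount(price, percent):
    """Buggy on purpose: adds the discount instead of subtracting it."""
    return price + price * percent / 100


def exercise_only():
    """Runs every line of apply_discount but checks nothing."""
    apply_discount(100, 10)


def exercise_and_verify():
    """Runs the same lines and fails, because the result is wrong."""
    assert apply_discount(100, 10) == 90
```

A line-coverage tool would count `apply_discount` as fully covered by either check, yet only the second one notices the bug.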

Nice service though!

Dewie · on July 17, 2014

It's nice to send a pull request with only documentation changes and then having Travis CI fail.

fredsters_s · on July 17, 2014

This is why having a solid CI setup is crucial.

sergiotapia · on July 17, 2014

PSA: CodeShip just today announced free plans for up to 5 private repositories. Excellent and more than enough for most startups. Get started with CI -today- it's worth it.

https://www.codeship.io/

jipiboily · on July 17, 2014

Actually, I think they've had that plan for a few months; they just added continuous deployment to their free offering a few days ago?

matlock · on July 17, 2014

Continuous Deployment was also always part of it. Basically we upped from 50 free builds to 100 and announced it properly. But the basic feature set hasn't changed and you currently get all of the features in the free plan that you would get in a larger plan.

jipiboily · on July 17, 2014

Thanks for the clarification!

timboslice · on July 17, 2014

I think they upped the free builds/mo from 50 to 100

jipiboily · on July 17, 2014

Yeah, that, too.

joshpadnick · on July 17, 2014

In my opinion, this book[1] is the authority on continuous integration and continuous deployment.

Continuous Integration is fundamentally about creating a tight feedback loop between your developers and your code. When you program in your IDE, the instant you write uncompilable code, you get red squigglies, so you're getting an instant feedback loop on something you just wrote.

CI is the same thing, but at a higher level. The instant you commit your code, some automated process should take over and start analyzing / compiling / testing your code and look for things to give you feedback on. If your code doesn't even compile -- one of the first milestones of CI -- you should know that immediately.

Since you just committed it, making the fix is easy. This is compared to a developer who downloads your code the next day, can't compile, comes and bugs you about it, etc...

As for real-world use cases, we just set up Jenkins for a new Java project we're writing. It does an automated build test that compiles and executes all unit tests automatically on any commit to GitHub on any branch. It's a little slower than I like -- our still growing app takes a full 3 minutes to compile and give feedback.

But it's been great. For example, the GitHub client on Mac OS X doesn't recognize when I change uppercase letters to lowercase and vice versa, so while my local compiles worked fine, my repo actually had a failing build. Once I committed, I got an automated email within 5 minutes telling me the build failed, and I fixed it. Without CI, I may not have found out about that issue for weeks, making the change more difficult.
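That class of problem, a name that matches on a case-insensitive file system but not on a case-sensitive build box, can also be checked for directly. A minimal sketch, where `case_mismatches` is a hypothetical helper and the tracked list is assumed to come from something like the output of `git ls-files`:

```python
def case_mismatches(referenced, tracked):
    """Return (referenced, tracked) pairs that differ only by letter case."""
    by_lower = {path.lower(): path for path in tracked}
    mismatches = []
    for path in referenced:
        match = by_lower.get(path.lower())
        if match is not None and match != path:
            mismatches.append((path, match))
    return mismatches
```

An empty result means every referenced path either matches a tracked path exactly or does not exist at all under any casing.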

For production deployment, we're still in alpha, but we've got a 1-button push to deploy. Again, slower than it should be -- in this case 5 minutes -- but the automation is awesome and makes doing any deployment -- whether hot fixes or new releases -- that much more pleasant.

Regarding the performance, I see it as a win just to get anything automated, however slow it may be. Because once you're there, you can always look for ways to optimize it. For example, our current build process re-downloads dependencies every single time. This could clearly be cached. When it's a priority for us, we'll do it.
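One common way to do that kind of caching is to key the cache on a hash of the dependency manifest, so dependencies are fetched again only when the manifest changes. A rough sketch under that assumption; `cache_key`, `restore_or_download`, and the dict standing in for cache storage are all hypothetical, not a Jenkins feature:

```python
import hashlib


def cache_key(manifest_bytes, prefix="deps"):
    """Derive a cache key from the dependency manifest's contents."""
    digest = hashlib.sha256(manifest_bytes).hexdigest()[:16]
    return f"{prefix}-{digest}"


def restore_or_download(manifest_bytes, cache, download):
    """Reuse cached dependencies unless the manifest has changed."""
    key = cache_key(manifest_bytes)
    if key not in cache:
        cache[key] = download()  # the slow path: fetch everything
    return cache[key]
```

In a real build the `cache` would be a directory on the build machine or a storage bucket rather than an in-memory dict.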

[1] http://www.amazon.com/dp/0321601912?tag=contindelive-20

rubiquity · on July 17, 2014

> [1] http://www.amazon.com/dp/0321601912?tag=contindelive-20

Did you just put your Amazon associates referral code in a link on HN? lol

joshpadnick · on July 17, 2014

No, I didn't. I don't even have an Amazon associate's referral code, so I must have pasted someone else's. Hoping 1 person on Hacker News will buy a book I linked to in one comment so I could earn $0.25 does not strike me as a great earnings strategy.