MILA and the future of Theano (groups.google.com)
195 points by shagunsodhani on Sept 28, 2017 | hide | past | favorite | 60 comments


I find myself very impressed with the humility of the group taking a bow at the right time.

This is the crux of the matter it would seem: "Even with the increasing support of external contributions from industry and academia, maintaining an older code base and keeping up with competitors has come in the way of innovation."

Very mature move, Theano team. You all did a great job and raised the bar with solid innovations that went on to become standard. Best wishes.


Reminds me of Chrome. Lots of competing alternatives. Google came late yet still managed to take over the market organically and very quickly by producing a superior, more robust alternative.


It's something of a pattern for Google. Existing products (say, Theano and Torch) were revolutionary, but had a number of pain points that were difficult to remove[0]. TensorFlow is essentially Theano without those problems, and with the guarantee of continued relevance thanks to Google's backing[1]. And of course, they reap the benefits of making it "open-source" while keeping a closed version for private use, similar to Chrome and Google Ultron.

However, I still feel kinda nostalgic for those warm summer days, writing my first autoencoder in Theano and using it to make blurry versions of photos.

-----

0. Theano had some truly gnarly exception tracebacks, which made it difficult to identify if you'd made a truly incompatible model or if one of the layers/operations just needed a datatype specified.

1. I wish more companies did something like this. It's not necessarily innovation, so much as having a corps of people experienced with some wondrous but unwieldy technology and then giving those people the opportunity to reimplement it "right". Maybe there are many companies that attempt it, but instead produce a clone with problems of its own? Or maybe you need someone like Jeff Dean supervising the project to make sure everything comes together?


This is going to sound wildly ignorant, but is Google Ultron a real thing?!

I just remember it as a joke "super secret ultimate browser used by NASA" from that "Tales of IT" green text story [0].

[0]: https://imgur.com/gallery/iJD8f

edit: Whoops, it was mentioned in part 2 - https://imgur.com/gallery/AOz0d


It's as real as you want it to be: http://ultronbrowser.io/


Apple is arguably another example of this, but in a completely different arena.

They are really good at polishing cutting edge technology into nice hardware products.


" by producing a superior and more robust alternative."

In a commodity market - those with brand and distribution will win.

Chrome isn't really that much better than others. It may be in some areas, not in others.

But when the 'company that owns the web' is promoting it, they can get any number of downloads they choose, as long as the product is competitive.

And distribution is one thing, but brands have incredible power too, even with 'we HN readers' who should kind of know better.


Not now, but at the time of its introduction, Chrome was significantly better than everything else. Once they added extensions at least.

Chrome was the first web browser that didn't crash and didn't need to be frequently restarted. Chrome was immune to memory leaks, Flash was sandboxed, each tab was sandboxed, etc. It was a revolution.


> Chrome isn't really that much better than others.

The point is that it was much better than the others when it came out. It spurred the competition into actually competing. (If you remember, IE had >80% of the market at the time, and was horrendous.)


I don't know why this is being downvoted. I moved to Chrome because Firefox's performance was abysmal (and moved back when Chrome's V8 JIT kept taking 100% CPU and locking up my Linux laptop).


"We've noticed you're commenting in another browser. Would you like to try commenting in Chrome?" /header


Chrome's growth was bootstrapped with very pushy ads on the highest traffic website on the internet, that typically doesn't allow ads. They also paid to bundle it with unrelated installers.


Chrome also was faster and simpler than the alternatives. <s>That was relevant too</s>. If it was only about Google weight, G+ would be a success.


It really wasn't relevant. Browsers are fungible. Social networks are not. Comparing Chrome with Google+ is missing the point.


I'll be curious whether WebAssembly affords them an opportunity to leverage {large number of engineers} or instead enables new browser entrants by leveling the legacy "How good is your JS engine?" playing field.


> Chrome's growth was bootstrapped with very pushy ads on the highest traffic website on the internet, that typically doesn't allow ads.

It was (is) also a fantastic browser.

> They also paid to bundle it with unrelated installers.

So did other browsers.


Anybody else have the feeling that PyTorch is to TensorFlow what Chrome was to other browsers? I started using PyTorch about a month ago and was impressed by how effortless everything was compared to TF.


I'm doing research (not deployment) and have the same feeling. PyTorch inspired me to write a blog post [1]; Tensorflow didn't.

Briefly, the benefits of PyTorch are

* easy conversion to NumPy arrays (meaning the rest of Python can be used!). This is a bottleneck in Tensorflow; for reasonable sizes, PyTorch is 1000x faster.

* tracebacks are easy to follow (because the graph is defined by running the code)

* it's as fast as Tensorflow [2] (or at least Torch is, which calls the same C functions as PyTorch, and there's a tweet [4] by a core dev saying to expect the same speeds). Plus, on the web I've only found anecdotes claiming PyTorch is faster than Tensorflow.

* it’s easy to extend; everything is a simple Python class. e.g., see their different optimizers [3]

[1]: http://stsievert.com/blog/2017/09/07/pytorch/

[2]: https://github.com/soumith/convnet-benchmarks

[3]: https://github.com/pytorch/pytorch/tree/master/torch/optim

[4]: https://twitter.com/soumithchintala/status/83545486710789734...
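The NumPy interop from the first bullet can be sketched like this (my own toy example, not from the blog post; the array values are arbitrary):

```python
import numpy as np
import torch

# A NumPy array and a torch tensor can share the same memory buffer,
# so converting between them is essentially free.
a = np.arange(6, dtype=np.float32).reshape(2, 3)
t = torch.from_numpy(a)   # no copy: `t` is a view onto `a`
t.mul_(2)                 # in-place op on the tensor...
b = t.numpy()             # ...and back to NumPy, again without copying

print(np.array_equal(a, b))  # the original array saw the change
```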


Tensorflow's advantage is that once you build your model, you can run on everything from a massive cluster to a mobile GPU without significant modification. Because you're just writing a description of a computation graph, it's easy for backend systems to process that description and optimize the execution of your model.

PyTorch's imperative semantics (where the computation graph is implicitly defined at runtime by the execution of your Python code) definitely make it cleaner to do research prototyping. But AFAIK most PyTorch models need to be reimplemented in lower-level code, or maybe something like Caffe2, before they can be used in production. That's a fairly significant tradeoff, which makes it hard to see PyTorch totally replacing Tensorflow anytime in the near future. That said, PyTorch is obviously a great tool and it's exciting to see how it will develop and be used.
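To illustrate the define-by-run style, here's a toy example of my own (not from either library's docs): each line executes immediately, so ordinary Python control flow, print statements, and pdb all work mid-model.

```python
import torch

# Define-by-run: the "graph" is whatever the Python code actually executes.
x = torch.randn(3, 3, requires_grad=True)
y = x @ x.t()
for _ in range(2):      # plain Python loop; no graph-building API needed
    y = torch.relu(y)
loss = y.sum()
loss.backward()         # autograd replays exactly the ops that ran

print(x.grad.shape)     # gradients are available right away
```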

(disclaimer: I work for Google, opinions are my own)


PyTorch does have ONNX, a tool to convert from PyTorch models to Caffe2 models: http://pytorch.org/tutorials/advanced/super_resolution_with_...


> massive cluster

No, you can't, because Google doesn't release those functions in the open-source version of TF.

Also, PyTorch will soon be able to export directly to Caffe2 / CNTK.


PyTorch adoption is growing quickly among ML researchers. TensorFlow made a splash in 2016... Jeremy Howard gives his reasons for switching to PyTorch from TF here:

http://www.fast.ai/2017/09/08/introducing-pytorch-for-fastai...

Basically, some things were really hard in TF and in PyTorch they're easy. Time to insight while testing a new model is shorter.


Thoughts on Keras on top of tensorflow?

I have not yet committed to a deep learning framework; up until now, I was mostly either using scikit-learn or building neural networks from scratch in straight numpy (lol).

I've heard that a nice thing about Keras is that it forms more of an abstraction on top of other libraries, though I could be misunderstanding.
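That's my understanding too: Keras describes the model, and a backend (TensorFlow here, historically also Theano) does the actual tensor work. A minimal sketch, with arbitrary layer sizes:

```python
from tensorflow import keras  # Keras riding on the TensorFlow backend

# The model is declared at this abstraction level; the backend compiles
# and runs the underlying tensor operations.
model = keras.Sequential([
    keras.Input(shape=(16,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

print(model.count_params())  # (16*32 + 32) + (32*1 + 1) = 577
```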


I don't have a lot of experience, but I recently ported an RNN model from lasagne/theano to keras/tensorflow, and the latter combination was about 4x slower. Not sure whether it's keras or tensorflow causing the slowdown. The API is nice though.

I also tried to flip the keras switch to run on top of theano instead but it had issues that I didn't have time to fix, so I just stuck with the original lasagne/theano stack.


I guess this is quite natural. Once some key ideas have been laid out, it's easy for a big corp like Google (whose expertise is on software engineering & algorithms for data processing) to chip in and produce something better.

Research organizations like the University of Montreal cannot compete with that.


> Research organizations like the University of Toronto cannot compete with that.

This is true; note that in this case it is Université de Montréal.


Sorry, I tend to mix both. Edited.


Try Chrome on a Mac; it constantly revs up the CPU and eats way more memory than Safari.


I was wondering why these people were praising Chrome so much until you pointed out the problems on Mac. I mainly use a Mac for browsing, so these other people must be using non-Macs; you use both.


I think that the goals of MILA/Theano might be somewhat different from that of Mozilla/Firefox.

Mozilla aims to continuously steer the direction in which the web develops. As far as a software layer developed by an academic research group, I'm sure they will be extremely happy that they could set the agenda with early software, and now get to pass the baton to an industrially supported codebase so that they can focus on the next generation of innovative ideas.

While the dominating market share of Chrome over Firefox could be considered a battle loss for Mozilla, the maturation of Tensorflow (and other frameworks which leverage GPUs and provide a higher level interface for ML algorithms) may be considered a big win for MILA.


How does a ML Python library that's shutting down remind you of Chrome?


I think s/he meant (other browsers) : chrome :: theano : tensorflow.


Correct. When Tensorflow came out, it seemed that there were already too many good alternatives (Theano, Caffe, Torch). When Chrome came out, Firefox, Safari and the latest version of IE also seemed like decent options.


Chrome was so out of left field.

A desktop browser? Made by Google?! In an environment where people have celebratory launch parties for Firefox??

Of course, the business goals made sense, but I don't think anyone guessed it would take over the market so effortlessly.


Come to think of it, Google has pissed away so much user goodwill over the years. There was a time when they could do no wrong. I guess it's the same as MS in the early 90s. Now they're nearing a point where, with every product release, they have to prove they don't suck and aren't screwing the user. It's a very difficult regime to operate under.


One of the boldest design decisions they made was to merge the URL bar and search bar into one, whereas every other browser (can't remember what Opera did) separated the two. That immediately hooked me away from Safari/Firefox. That, and it felt faster.


It's the feature I hate the most. It made sense for them, because having all your history reside with Google fuels their engines. It makes no sense for the user and gives worse results when you're trying to find something again.


When google.com itself came out, the search engine market was practically sewn up by Yahoo, AltaVista and co.


organically? Please, Google promoted Chrome through ads for months! -- disclaimer, I love Chrome!


Hah, that sounded almost like "Embrace, Extend, Extinguish" :-)


I'm curious what will happen with PyMC... incorporate another library? Is there an easily swappable library that does what Theano does?


Tensorflow does everything Theano does, but with the backing of the big G. That's basically why Theano is being sunsetted.


As I understand it Tensorflow does not handle a large number of variables very well[0], which is the killer feature of NUTS in pymc3. This makes it a bit of a non-starter for pymc3.

[0] https://github.com/pymc-devs/pymc3/issues/1650


That comment states that TensorFlow does not handle a large number of operations well, not a large number of variables. There's a large difference between those statements :). But yes, the general point is correct, especially for models that do not use many matrix multiplications.


Oops! Right you are. I just recalled that there was some issue with a large number of <insert important item for MCMC here>. I should have reread the thread before linking it.


Keras is pretty cool and can already use Theano and Tensorflow as a backend.


Not sure how this comment relates to PyMC3 though. PyMC3 is for Bayesian statistical modeling with MCMC and uses Theano as a backend; Keras is specifically for neural networks.


Who needs anything else except deep learning? \s


I'm actually nostalgic despite not using Theano for all that long. Theano really was groundbreaking and excellent, though, like many, I've moved on. Thanks to the developers!


EDIT:

Theano inspired TensorFlow, PyTorch, etc., etc. That there are so many imitators is a compliment to Theano.


Can't say I'm not sad to hear this... I've always much preferred the Theano API. IMHO, the way you have to explicitly build the graph in Tensorflow is cumbersome compared to how it automagically happens behind the scenes in Theano. The code to simply multiply a couple of matrices, for example, reads much nicer in Theano than in TF.

Annyywhoo, better brush up on TensorFlow I guess.


You should give PyTorch a try


First time I've looked up what it is; looks like it's quite useful. I'd imagine the community will keep patching it.


RIP


I originally misread this as Theranos and thought well duh, but oddly enough Theranos seems to have outlived Theano!


Please use the original title ("MILA and the future of Theano").


Yes, this. The seemingly increasing HN practice of rewriting submission titles and headlines strikes me as the ultimate in IT support nerd "just move"-ism.


I disagree, the original title is too vague. Perhaps "MILA to stop working on Theano".


The original title is less precise, but it's not remotely inaccurate, or even "vague" by any reasonable definition of the term.

If you want to know about the future of Theano, that tells you about it.

If a title is misleading clickbait, then sure, fix it. But if you're making subjective judgment calls about whether you think the level of imprecision is appropriate or not, and so on, that's way over the line.

Again, the original title is 100% appropriate for the post, because it describes exactly what the post lays out. It may not be as precise as you personally want it, but this is how the author chose to package it and that should be respected.


It has been going on since the beginning, which is why there's a rule about it and also why people ignore the rule.

We changed the title from "Theano not to be developed further". Submitters: please don't editorialize like this. It breaks the rule.

https://news.ycombinator.com/newsguidelines.html



