Hacker News | mrmaximus's comments

+1. For the real-world business problems I most frequently encounter in consulting, it's hard to beat Random Forests and/or Gradient Boosting. Truth be told, most business problems I encounter turn out to be largely helped by good old linear models.
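The "good old linear model" point can be made concrete with a few lines of numpy. This is just an illustrative sketch with made-up data (the `spend`/`revenue` names and the true coefficients are invented for the example, not taken from the comment):

```python
import numpy as np

# Simulate a toy business dataset: revenue driven linearly by ad spend,
# plus noise. True relationship: revenue = 3.0 * spend + 50.
rng = np.random.default_rng(42)
spend = rng.uniform(0, 100, size=200)
revenue = 3.0 * spend + 50 + rng.normal(0, 5, size=200)

# Fit y = a*x + b with the closed-form least-squares solution.
A = np.column_stack([spend, np.ones_like(spend)])
(a, b), *_ = np.linalg.lstsq(A, revenue, rcond=None)
```

The recovered coefficients are the "rule of thumb" a manager can actually use: each extra dollar of spend is worth roughly `a` dollars of revenue, which is exactly the kind of simple, analysis-backed number the comments above are describing.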


Agreed! I have often done more sophisticated analysis and then stepped back and concluded that a simpler analysis was actually better for the business. It moved the business into a better place for informed decisions and gave them simple, analysis-backed rules of thumb that every manager/director/VP could understand and use by just checking a couple of numbers and doing a simple, easily remembered bit of math.

Understandable models with clear intervention points are what most businesses seem to need once you start digging around in their operations, customer, and sales data.


I'd settle for figuring out how to send an email to a previous point in time. I'd like to email myself in 2009. Something about buying a bunch of ASICs and mining something for a long-term hold.


No need to mine it. Just buy in bulk at $1 and hold.


Bitcoin market cap is still pretty small as far as global financial markets go. I'd have doubts at being able to extract serious amount of money from it. I'd bet you can find something much more conventional to invest in that'll produce big gains that can be extracted easily.


You know in that timeline you'd spend a fortune on ASIC mining hardware and the market would tank because Murphy's Law could not possibly resist showing up and ruining the party.


With a little luck, in about two months we'll finally get to watch Shazam again.


Mark is also the one who recently tweeted a picture indicating that he was working through an "Intro to Python for Machine Learning" book. He may have been successful in tech 25 years ago, but I don't place a premium on his opinions of where things are going. Advising someone who does not intend to go into academia to major in any of those is pretty poor career advice. Either you want to work in academia or you just love the subjects and will figure out money through other means... those are the only plausible reasons to pursue those majors. Of course, you don't have to worry about automation's impact on degrees that have little to no correlation to job placement anyway.


This guy is just trying to tout his own product. That aside, Kaggle can be quite useful for learning... at least it used to be. Kaggle got me into machine learning 2-3 years ago and now it is about 50% of my job in consulting. Problem is, Kaggle is running too many deep learning, image-based comps, and they are lucky to get a couple hundred competitors each. The wider user base engages much more heavily in transaction-based competitions, and these are much closer to the real-world problems you are likely to encounter as a professional.


I think this is because at its core, the math behind deep learning is pretty simple and there's not much of it: linear algebra, some simple activation functions, and gradient descent. Implementing a net from scratch in Python is pretty concise. The simplicity-to-effectiveness ratio is what I find so interesting about deep learning.


Short answer is that their employment contracts probably don't allow them to sue. Almost all employment contracts contain binding arbitration agreements now.


Interesting. They are not doing TTS as we are accustomed to it; they are replicating a specific person's voice with TTS. Listen to the ground-truth recordings at the bottom and then the synthesized versions above. "Fake News" is about to get a lot more compelling when you can make anyone say anything as long as you have some previous recordings of their voice.


> you can make anyone say anything as long as you have some previous recordings of their voice.

That's not what this is doing. They're simply resynthesizing exactly what the person said, in the same voice. It's essentially cheating because they can use the real person's inflection. Generating correct inflection is the hardest part of speech synthesis because doing it perfectly requires a complete understanding of the meaning of the text.

The top two are representative of what it sounds like when doing true text to speech. The middle five are just resynthesis of a clip saying the exact same thing. And even in that case, it doesn't always sound good. The fourth one is practically unintelligible. But it's interesting because it demonstrates an upper bound on the quality of the voice synthesis possible with their system given perfect inflection as input.

To clarify, this is cool work, the real-time aspect sounds great, and I'm sure it will lead to even more impressive results in the future. But I don't want people to think that all of the clips on this page represent their current text-to-speech quality.


Thank you for clarifying this! We tried fairly hard to make this clear, because as you say, the hard part is generating inflection and duration that sound natural. There's still a ton of work left to do in this direction – we're clearly nowhere near being able to generate human-level speech.

Our work is meant to make working with TTS easier for deep learning researchers by describing a complete system that can be trained entirely from data, and by demonstrating that the neural vocoder components can actually be deployed to streaming production servers. Future work (both by us and hopefully other groups) will make further progress on inflection synthesis!


My "Fake News" comment aside, I think what y'all are doing could be transformational for many reasons. Imagine a scenario where a person loses a loved one, and similar technology is able to allow them to "have conversations" with the deceased as a form of healing and closure. Not to mention, this could add a personal touch to assistant bots that will make them a pleasure to use.


>The top two are representative of what it sounds like when doing true text to speech. The middle five are just resynthesis of a clip saying the exact same thing.

Gotcha, now I understand.


>> They're simply resynthesizing exactly what the person said, in the same voice. It's essentially cheating because they can use the real person's inflection.

Yes, but imagine being able to take the voice from one person and the inflection from another. If you want to fake someone saying something, you don't need to do pure TTS; a human can be used to fake another person's inflection.


Based upon what little is posted there, I thought they were taking the original recording, then training the model on that recording against the text of the recording... reproducing the recording. I would think next step is to sample enough audio and text to be able to produce new outputs entirely. It should in theory even be able to learn when/where/how to use inflection.


> "Fake News" is about to get a lot more compelling hen you can make anyone say anything as long as you have some previous recordings of their voice.

Adobe has already developed that technology:

https://arstechnica.co.uk/information-technology/2016/11/ado...

Now imagine combining it with this:

Face2Face: Real-time Face Capture and Reenactment of RGB Videos https://www.youtube.com/watch?v=ohmajJTcpNk

Perhaps using the intonation from the face-actor's voice to guide the speech synthesis.


I agree and I've upvoted you, but I feel it's worth pointing out that Adobe's claim about their own progress in this field was fake news.

https://www.youtube.com/watch?v=I3l4XLZ59iw&t=2m34s

"Wife" sounds exactly the same in both places. All they did was copy the exact waveform from one point to another. Nothing is being synthesized.

https://www.youtube.com/watch?v=I3l4XLZ59iw&t=3m54s

The word "Jordan" is not being synthesized. The speaker was recorded saying "Jordan" beforehand for this insertion demo and they're trying to play it off as though it was synthesized on the fly. This is a scripted performance and Jordan is feigning surprise.

https://www.youtube.com/watch?v=I3l4XLZ59iw&t=4m40s

The phrase "three times" here was prerecorded.

This was a phony demonstration of a nonexistent product. Reporters parroted the claims and none questioned what they witnessed. Adobe falsely took credit and received endless free publicity for a breakthrough they had no hand in by staging this fake demo right on the heels of the genuine interest generated by Google WaveNet. I suppose they're hoping they'll have a real product ready by whatever deadline they've set for themselves.

To be clear, I like Adobe and I think it's a cunning move on their part.


Thanks for the detailed breakdown. The irony is not lost!


As already mentioned, they do have Windows VMs, but there are some caveats that indicate it's not fully baked yet. 1) They require that each VM MUST have a public IP address so that Windows can talk to an activation server every 30 days. 2) You cannot yet bring your own license.


czep, even though I don't agree with much of what you wrote (in the article), I do appreciate you taking the time to so thoroughly expound on the ideas. Paul Graham most likely fell victim to projection of a Libertarian ideal when he viewed "best programmers" from his perspective.

> I'm also frustrated to see a powerful man perpetuate self-serving ideologies without acknowledging the influences of power and luck.

I agree with the statement above, though maybe not for the same reasons you do.

The "Power Game", or even the lack of willingness or know-how to play it is the reason a lot of programmers think themselves superior to the sales guys who peddle the product of their labor.

The "Power Game" is also the same reason the good sales guys feel that despite being so technically smart, programmers can be damned idiotic fools.

As an introverted "programmer-type" myself, life would be way easier for me if there was no Power Game. But the Power Game is as human as eating, drinking and pissing.

Hell, I get irritated daily that I even have to eat, drink or piss. It feels like a waste of time when I am in the zone with something. Same thing with the Power Game.

Luck... hugely important. Heck, we've all played RPGs and know to fill up that luck skill ASAP.

Maybe what would really be helpful to programmers is something that can stir inspiration like Paul Graham, but that covers something akin to "The 48 Laws of Power" for the introverted modern day employees.

