More

gh1 · on June 4, 2022

I think the idea is not to replace learning the theory or math, but rather to just postpone it. Learning the practical aspects of an engineering discipline can provide the necessary motivation to study the theory/math, and this is the oft ignored factor. I also think there is some benefit in learning a topic using the tool that you will actually use in production (if that is possible without adding unnecessary complexity in the syllabus).

I personally learned DRL from David Silver's course and Sutton & Burto back in the days. They were the only good resources around and I liked them very much. But I think that with the advent of high-level frameworks in DRL, there are better learning paths.

I do intend to teach the theory/math in a later installment of this series, but I wanted to do it by showing students how to implement the various classes of algorithms e.g. Q-learning (DQN/Rainbow), policy gradients (PPO) and model-based (AlphaZero) using RLlib. This would kill two birds with one stone: you can simultaneously pick up the theory/math and the lower level API of the tool that you will be using in the future anyway.

gh1 · on June 3, 2022

Maybe this would help you differentiate: GPT-3, DALL-E 2 etc. uses transformers, while AlphaGo, OpenAI Five etc. uses Deep Reinforcement Learning. They are not mutually exclusive, but just different things.

rg111 · on June 3, 2022

Yrrk.

Transformers are being used in Deep RL for at least months.

Try these: https://scholar.google.com/scholar?q=transformer+deep+reinfo...

gh1 · on June 3, 2022

Good catch. I can imagine that this is annoying. I have put it in my todo.

I hope you enjoy the course over the weekend.

pciexpgpu · on June 3, 2022

Thank you for doing this!

I haven't looked deeply enough, but does this course use a higher-level 'package' such as OpenAI Gym or teach at a lower-level? (Is lower-level stuff even possible...)

gh1 · on June 3, 2022

I think the levels (high, low etc.) are relevant for the Deep RL algorithm, not the environment. The lower level version of OpenAI Gym canned environments would be custom Gym environments. I don't see much reason to go any lower than that.

The situation looks different for Deep RL algorithm. You can implement them from scratch yourself using Tensorflow or any other similar library. Otherwise, you could just use a higher-level library like RLlib which implements the algorithm using modular components and exposes hyperparameters as configuration parameters.

In many real world use cases, all one needs to do is to use RLlib's implementation and then tune the hyperparameters. In that way RLlib is to Deep RL what Keras is to Deep Learning.

This course uses RLlib. Does that answer your question?

pciexpgpu · on June 3, 2022

Great, yep, that is good to know.

gh1 · on June 3, 2022

Thanks. I learned it from an Udemy course [0]. Took just a couple of weeks to pick up. Regarding overlays, Sketchbook supports the idea of layers. I simply put different elements in the illustration in different layers. Sketchbook gives me PSD files that can be imported in GIMP. I then export many PNG files by progressively selecting more layers in GIMP. These PNG files go into Beamer like this:

    \begin{figure}
        \includegraphics<2>[width=0.35\textwidth]{images/shop/1.png}%
        \includegraphics<3>[width=0.35\textwidth]{images/shop/2.png}%
    \end{figure}

The % sign is important and it maintains the correct positioning of the images.

[0] https://www.udemy.com/course/drawing-for-trainers-leaders-an...

gh1 · on June 3, 2022

My experience matches yours. Recently, I was trying to solve an optimization problem using Deep RL. As usual, I had to run many experiments over several days using various tricks and hyperparameters. Finally, it turned out something related to the symmetry of the action space made a huge difference in learning.

Anyhow, the experimentation stage requires a certain discipline and feels tedious at times. But the moment when learning takes off, it feels great, and for me personally, compensates for the tedious phase before.

It's certainly not fun for everyone, but I guess it could be fun for the target audience of the course (ML engineers/Data Scientists).

Regarding frameworks, my experience has been different. I find RLlib to be more modular and adaptable than SB3. But the learning curve is certainly steeper. The biggest differentiating factor for me is production readiness. Assuming that we are learning something in order to actually use it, I would recommend RLlib over SB3. The equation for researchers may be different though.

InefficientRed · on June 3, 2022

Have you ever encountered a situation where RL solved a (IRL "people paid me non-research-grant money for this") problem for you faster than classical controls engineering and/or planning? I have not.

gh1 · on June 3, 2022

Depends on what you mean by faster. Do you mean "time to solution" or "time to inference"? I think there are also more factors to take into consideration when considering the merit of the method e.g. performance, robustness, ability to handle non-linearity, ability to solve the full online problem etc.

When all these factors are taken into account, I have encountered situations where Deep RL performed better.

There are also very public examples of this e.g. Google's data center cooling [0] and competitive sailing [1].

[0] https://www.technologyreview.com/2018/08/17/140987/google-ju... [1] https://www.mckinsey.com/business-functions/mckinsey-digital...

InefficientRed · on June 3, 2022

> Do you mean "time to solution" or "time to inference"?

I meant time to a real solution that works well enough to put into a product.

> There are also very public examples of this e.g. Google's data center cooling [0] and competitive sailing [1].

DeepMind really needed DRL wins on real problems.

McKinsy has a strong incentive to be able to say "we know all about the AI RL magic" (and all the better that it's in the context of an oligarchy's entry in a Rich Person Sport... such C-suite/investor class cred!)

In both cases, DRL was used because it was the right tool for the job. But, in both cases, proving DRL can be useful was the job! Go is a better example, but of course wasn't solving a real problem.

If you throw enough engineering time and compute at DRL, it can usually work well enough. (There is a real benefit to "just hack at it long enough" over "know the right bits of control theory".)

gh1 · on June 3, 2022

I am sorry about that. Unfortunately, the same thing happens in Firefox when the tracking protection is set to "strict".

This is apparently happening after Teachable updated their video player. Earlier, they used Wistia. Now they use Hotmart.

I have informed Teachable about this issue. They said they will look into it.

The current workaround would be to use Chrome or Firefox (with tracking protection set to a level below "strict").

gh1 · on June 3, 2022

I am glad you like it. The coding exercises don't require a GPU. Thankfully, most RL problems (and certainly the ones used in the course) require small neural nets which can be trained in reasonable time using a CPU.

gh1 · on Nov 14, 2019

Hi all, creator of PythonBooks here. PythonBooks is running since 2017. During launch, it was on the front page of HN [0]. Thanks to the HN community for showing so much interest in the website. There has been steady traffic ever since and I have tried my best to keep the website updated with the latest books. Starting this month, there's a new page which lists the best Python books published every month - based on popularity, topic and novelty. I thought that it could be an useful tool for Python developers to cut through the noise and keep abreast of the best recent books. Feedback is welcome :-)

[0] https://news.ycombinator.com/item?id=14769317

gh1 · on Jan 6, 2019

In my opinion, the best problem to solve are your own problems - both emotional and material.

gh1 · on June 17, 2018

The article claims that catching up on sleep during the weekend doesn't work. This is most likely wrong information, as shown in this very recent study [0].

[0] https://onlinelibrary.wiley.com/doi/pdf/10.1111/jsr.12712

101km · on June 17, 2018

As far as I can tell your linked study says that "possibly, long weekend sleep may compensate for short weekday sleep" in terms of mortality but says nothing about qualify of life.

You can catchup/reset but the days you don't sleep enough you'll do worse.

gh1 · on June 18, 2018

Yes, different priorities lead to different results. If you are looking at life satisfaction, then 8 hours a day is claimed to be optimal [0].

[0] https://link.springer.com/article/10.1007/s12232-016-0256-1

coldtea · on June 17, 2018

The "very recent study" is just a recent study. It needs to be repeated, critiqued by other researchers and studies, and so on to be verified. Until then it's worth nothing.

The processor, on the other hand, summarized our research knowledge so far, in other words, tons of study results and a large body of knowledge.