So you're saying that the one `Nothing` has a different meaning than the other `Nothing`, and that this is something that anybody would want? Or that we have a `Nothing` and a `Just Nothing`? I'm sorry, but I don't get how that would be considered good design in any context. Do you have any example where this would be considered sensible?
Having a situation where one handles both `Nothing` and `Just Nothing` in the same context should be rare.
But you might be writing functions on some abstract data structure with a type parameter `a` (say it's a graph and users can tag the vertices with values of type `a`). And (maybe just internally) there are situations in your algorithms where a value might be absent, so you use `Maybe a`. If `Nothing == Just Nothing`, your users can't use a `Maybe` type for `a` anymore, because your algorithm wouldn't be able to distinguish its own `Nothing`s from the user's.
In any concrete context, `Maybe (Maybe A)` could _probably_ be simplified to just `Maybe A` as we expect. Alternatively, we could be in a situation where there are two notions of failure (represented here as `Nothing` and `Just Nothing`), in which case we'd be better off simplifying to `Either Error A` where `Error` covers your multiplicity of error cases.
But while these are all obvious in a concrete context, what we're often doing instead is compositional. If I want to write code which works over `Maybe a` for some unknown, library-user-declared type `a`, then I may very well end up with a `Maybe (Maybe Int)` that I can't (and mustn't) collapse.
As a concrete example, consider the type of, say, a JSON parser:
`ParserOf a = Json -> Maybe a`
We hold the notion of failure internal to this type so that we can write
`fallback : ParserOf a -> ParserOf a -> ParserOf a`
which tries the first parser and falls back to the second if the first results in an error. We might also want to capture these errors "in user land" with a combinator like
`catch : ParserOf a -> ParserOf (Maybe a)`
If we unwrap the return type of `catch` we get a `Maybe (Maybe a)` out of a compositional context (we can't collapse it). Additionally, the two forms of failure are distinct: one is a "handled" failure and the other is an unhandled failure.
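To make that concrete, here's a minimal sketch of that parser type in Haskell. The `Json` stand-in, the newtype wrapper, and `runParser` are my own names for illustration, not any particular library's API:

```haskell
-- Json is left abstract here; a real library would define it properly.
data Json

newtype ParserOf a = ParserOf { runParser :: Json -> Maybe a }

-- Try the first parser; if it fails, fall back to the second.
fallback :: ParserOf a -> ParserOf a -> ParserOf a
fallback p q = ParserOf $ \j ->
  case runParser p j of
    Just x  -> Just x
    Nothing -> runParser q j

-- Reflect failure into "user land": catch always succeeds, returning
-- Nothing exactly when the wrapped parser failed.
catch :: ParserOf a -> ParserOf (Maybe a)
catch p = ParserOf $ \j -> Just (runParser p j)
```

Unwrapping `catch p` gives a `Json -> Maybe (Maybe a)`: the outer `Maybe` is the parser's own (unhandled) failure channel, and the inner one is the failure we captured for the user. Collapsing them would conflate the two.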
Similarly, a polymorphic function may want to put values of an unknown type in a `Map` or `MVar`. From the outside it may be completely reasonable to call that function on `Maybe Foo`, which would mean a `Maybe (Maybe Foo)` somewhere internally, and I would struggle to call that bad design.
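As a sketch of that last point (the function and its name are hypothetical, just to show where the nesting appears):

```haskell
import qualified Data.Map as Map

-- A polymorphic cache over values of some unknown type `a`.
lookupOrCompute :: Ord k => k -> (k -> a) -> Map.Map k a -> (a, Map.Map k a)
lookupOrCompute k f m =
  case Map.lookup k m of          -- Map.lookup :: k -> Map k a -> Maybe a
    Just v  -> (v, m)             -- already cached
    Nothing -> let v = f k        -- not cached yet: compute and insert
               in (v, Map.insert k v m)
```

If a caller instantiates `a` at `Maybe Foo`, then inside this function `Map.lookup k m :: Maybe (Maybe Foo)`, where the outer layer means "not cached yet" and the inner layer is a cached `Nothing`. Neither the caller nor the function is doing anything unreasonable.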
I think the obvious implementation is `maybe (Left False) (maybe (Left True) Right)`
If we have a value then we clearly have both layers. If we don't have a value then we need to distinguish `Nothing` from `Just Nothing` by way of the Bool.
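Spelled out (the name and type here are my guess at what's being implemented):

```haskell
-- Collapse the two layers into Either, recording which layer was
-- missing via the Bool.
flatten :: Maybe (Maybe a) -> Either Bool a
flatten = maybe (Left False) (maybe (Left True) Right)

-- flatten Nothing         == Left False   -- outer layer absent
-- flatten (Just Nothing)  == Left True    -- inner layer absent
-- flatten (Just (Just x)) == Right x      -- a value, so both layers present
```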
Imagine you're working on a compiler. You need to represent a compile-time computed value of type `Maybe Int` (e.g. you are precomputing nullable integers).
You see `1 + null`. So you have `add : Maybe Int -> Maybe Int -> Maybe Int`, which takes two precomputed values and returns a new precomputed value for the operation's result.
However, you can't precompute `Console.readInt()`.
For any given expression, you may or may not be able to compute its value at compile time.
What is the output type of `compileTimeCompute : Expr -> ???`
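A minimal sketch of what that could look like. `Expr` and its constructors are made up for illustration; the point is only the return type:

```haskell
data Expr
  = IntLit Int          -- e.g. 1
  | Null                -- e.g. null
  | Add Expr Expr       -- e.g. 1 + null
  | ReadInt             -- e.g. Console.readInt(), unknowable at compile time

-- Addition on nullable integers in the target language: null propagates.
add :: Maybe Int -> Maybe Int -> Maybe Int
add (Just x) (Just y) = Just (x + y)
add _        _        = Nothing

-- Outer Maybe: "could we evaluate this at compile time at all?"
-- Inner Maybe: the nullable integer value in the target language.
compileTimeCompute :: Expr -> Maybe (Maybe Int)
compileTimeCompute (IntLit n) = Just (Just n)   -- known, non-null
compileTimeCompute Null       = Just Nothing    -- known, and it's null
compileTimeCompute ReadInt    = Nothing         -- can't be precomputed
compileTimeCompute (Add a b)  = add <$> compileTimeCompute a <*> compileTimeCompute b
```

Here `Nothing` and `Just Nothing` mean genuinely different things, and collapsing them would lose the distinction between "we don't know the value" and "we know the value is null".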
I don't understand your example. What does compile-time computed stuff have to do with readInt()?
I get that it might be possible to do that, i.e. use a `Maybe (Maybe T)`. But it's like an `optional<bool>` in C++. It can be done, it's just not a good idea. So if you design your system not to allow that in the first place, nothing of value was lost.
If you have specific error cases that you want to communicate, like "what was read from the console didn't parse as an int" as opposed to "the computation didn't find a result", then using `Nothing` and `Just Nothing` as the two distinct values to encode that is not sound design. Either you have meaning, or you have Nothing. Nothing shouldn't have any meaning attached to it.
First I want to thank you for your honesty. It is so refreshing to read in this sea of self-promotional Medium posts that goes around here.
You may want to talk to a coach about this. I'm sure you can find someone who specializes in job searches. You make it sound like you somehow let yourself down, and if that comes through in an interview, you won't leave a good impression. I think you need somebody to help you find a way to frame this time in a better light. Not only for your next application, but also for yourself. What started as time off might have turned into involuntary unemployment, with all the negative weight that this brings.
My understanding is that this is only about memory access patterns, and has nothing to do with multithreading. The basic example is matrix multiplication, C=A∙B. If the matrices are too big to fit in the L2/L3 cache, you have to think about how to optimize memory access. So instead of having the compiler just generate instructions to read/write memory, you want it to interleave instructions that prefetch memory ahead of time. But how do you do that? What is the best point in time to prefetch?
Another optimization is locality: the basic algorithm is to loop over i and j, then multiply A's row i with B's column j. You do that with another loop over k to compute the sum over Aᵢₖ∙Bₖⱼ, then store the result in Cᵢⱼ. But what if one row or column is already too big for the cache? Also, once you've invested in loading the data into the cache, you want to make the best use of it: you don't want to load it again if you can avoid it. So you limit the loop ranges for i, j, and k (blocking/tiling) such that the overall memory accessed fits into the cache. But what are the optimal block sizes?
The answer depends on the CPU architecture (cache sizes) and probably also on the memory that you're using.
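To illustrate the blocking idea, here is a rough sketch in Haskell with mutable vectors. The names, the row-major layout, and the block size `bs` are assumptions; a real implementation would also pack/transpose tiles and add prefetching:

```haskell
import Control.Monad (forM_)
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as VM

-- C = A·B for square n×n matrices stored row-major, with block size bs
-- chosen so that three bs×bs tiles fit in cache.
matmulBlocked :: Int -> Int -> V.Vector Double -> V.Vector Double -> V.Vector Double
matmulBlocked n bs a b = V.create $ do
  c <- VM.replicate (n * n) 0
  forM_ [0, bs .. n - 1] $ \ii ->               -- block row of C
    forM_ [0, bs .. n - 1] $ \jj ->             -- block column of C
      forM_ [0, bs .. n - 1] $ \kk ->           -- block along the k (sum) axis
        -- within one block triple, the A and B tiles stay hot in cache
        forM_ [ii .. min (ii + bs) n - 1] $ \i ->
          forM_ [kk .. min (kk + bs) n - 1] $ \k -> do
            let aik = a V.! (i * n + k)
            forM_ [jj .. min (jj + bs) n - 1] $ \j ->
              VM.modify c (+ aik * (b V.! (k * n + j))) (i * n + j)
  return c
```

The interesting question is exactly what `bs` should be, and that depends on the cache sizes of the target CPU.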
I don't know about this specific example, but polyhedral optimization generally, at least, is useful for parallelization of such loops, e.g. http://pluto-compiler.sourceforge.net/. However, serial GEMM is typically most important in HPC, with parallelization at a higher level.
(GEMM shouldn't be main memory limited, if that's what you mean.)
> NOTE 2: Depending on the shape of your head the head band of the earmuffs can make the crown of your head hurt – though this is not limited to earmuffs, as many headphones have this undesirable feature.
I tied a piece of memory foam that I salvaged from an old mattress to the head band, using two strips of velcro tape. Looks ridiculous, works really well.
Because Starcraft is 1v1 and LoL is 5v5. Poker AIs also focus on 1v1 as opposed to full ring games. It's easier.
Also because Starcraft has three races, i.e. 9 matchups. But if you pick one race for your AI, you only have three matchups to worry about. LoL on the other hand has about 500 champions (give or take), whose strengths and weaknesses depend heavily on the current patch. LoL is less about skill or strategy, and more about picking the champion of the week.
Starcraft works because they have such a small number of matchups that balancing them is possible. Certainly not easy, and players keep figuring out ways to evolve the meta game. Maybe an AI could even find interesting new strategies. But LoL has so many matchups that their balancing team has an absolutely impossible task of trying to catch up. It's so complex that they can't just reason about it in theory; they have to look at win-rate statistics over different skill ranges. At this point it's more a problem of data science. And the number of champions keeps growing, because that's how they make money.
> Because Starcraft is 1v1 and LoL is 5v5
Starcraft can also be played 5v5
> Also because Starcraft has three races, i.e. 9 matchups
But each race has 20 different units. The units are more similar to LoL champions than the races are. There are also a tonne of different buildings and upgrades to choose from.
LoL is very similar to Dota, and given that OpenAI already beat *human* players in Dota, LoL is not that far-fetched. The current work, although impressive, still cannot beat *human* players in SC2. The complexity of designing an AI for a game depends heavily on the branching factor of decisions in the game. SC2 has a much higher branching factor than LoL or any other game played by n00bs.
> LoL on the other hand has about 500 champions (give or take)
LoL has 141 champions.
Still significantly higher than Dota's 116.
A common note I hear when people discuss OpenAI playing Dota is that it uses pre-made matchups, which reduces the number of matchups considerably.
Would it be that difficult to have the AI play the picking stage?
Compared to the complexity of the decisions you have to make during the game, the picks are straightforward - especially if you have performance data on matchups.
The number of practice games the AI would have to play to learn to handle all the heroes goes up drastically, but that's only a matter of time.
They had drafting working just fine in a previous version of the AI. (The one that played against semi-pros a month or so before TI.)
One theory I heard was that the pre-made matchups at TI were designed to prevent the human players from getting out-drafted. (The hero pool was pretty limited, which could result in a different meta from what the human players were used to.) They wanted to make the matches purely a test of in-game skill, not about the metagame. As far as I know though, the OpenAI team hasn't explicitly confirmed that.
I'm pretty sure OpenAI only plays the single simplest hero, mechanically. The other 115 would be varying degrees of "more difficult" to play competently.
Also, it only played mirror matches. If it had to play any hero against any hero, that's on the order of 2×10^16 possible team matchups (C(116,5) compositions for one side times C(111,5) for the other).
I don't think you need to learn all possible matchups to play any matchup, any more than a human does. You'd just need to learn how to play/counter each hero, plus some 2- or 3-hero combos (not all permutations), so on the order of a few hundred?
You publish the code review at 5pm on day 1; it gets reviewed, with a question added, at 5pm (local time) on day 2; you see it, answer the question/make the change, and publish at 5pm on day 2; it gets a "ship it" on day 3 (local time); and then you push the change on day 3.
Ben, could you go into a little more detail on what you had in mind with the term rewriting system? The sin/cos example seems a little underwhelming, since those should be evaluated at compile time in the first place.
The rules seem like a variant of C macros, but without being restricted to function-call syntax. Maybe one could implement Python's `with` statement in userland, which would be very nice.
It also seems to me that this could replace C++ templates, yet you do have generics. How come? What's the difference?
My intuition is that ambiguities are at least possible, so what about precedence: which AST term is evaluated first? I'm thinking of some situation where two subtrees match the rule, but you end up with different results depending on which is transformed first. E.g. if you were to define a cross product rule and apply it to a × b × c.
What about infinite loops in rule sets? What about recursion?
What about a situation where multiple rules could be applied? Is there any way to debug the transformations? Is that what the `using` scopes are for? If the compiler was to detect ambiguities, that seems pretty expensive, because it would need to match every rule to every subtree, right?
This thing seems very, very powerful; I just find it a bit hard to grasp what you can do with it in practice, and what the limitations are. I would be very interested to hear what you've run into so far.
>My intuition is that ambiguities are at least possible, so what about precedence: which AST term is evaluated first? I'm thinking of some situation where two subtrees match the rule, but you end up with different results depending on which is transformed first. E.g. if you were to define a cross product rule and apply it to a × b × c.
This is absolutely a potential issue, and the examples I have up now are quite bad. This is why it's important for rules to be strictly scoped. The only rules that can affect an expression are the ones you've explicitly brought in with a "using" block, or those that are defined for the types you're directly working with. Given the scoping, I think overlapping rule applications will be uncommon in practice.
>What about infinite loops in rule sets? What about recursion?
There's a sanity limit on the number of times an expression can be rewritten; if it's hit, the compiler shows an error which displays the transformations that triggered the limit (including the first, the last, and any non-repeating transformations going backward from the last), so it should be very clear.
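For what it's worth, here's a toy model of that kind of limit. This is just generic term rewriting in Haskell to illustrate the mechanism, not the actual language's implementation:

```haskell
import Data.Maybe (listToMaybe, mapMaybe)

data Term = Var String | App String [Term]
  deriving (Eq, Show)

-- A rule either fires on a term or it doesn't.
type Rule = Term -> Maybe Term

-- Apply the first rule that matches at the root (a real engine would
-- also descend into subterms and respect rule scoping).
step :: [Rule] -> Term -> Maybe Term
step rules t = listToMaybe (mapMaybe ($ t) rules)

-- Rewrite to a fixed point, but give up after `limit` steps and return
-- the trace of intermediate terms so it can be shown in an error.
rewriteWithLimit :: Int -> [Rule] -> Term -> Either [Term] Term
rewriteWithLimit limit rules = go limit []
  where
    go 0 trace t = Left (reverse (t : trace))   -- budget exhausted: report trace
    go n trace t = case step rules t of
      Nothing -> Right t                        -- no rule applies: done
      Just t' -> go (n - 1) (t : trace) t'
```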
>Is there any way to debug the transformations?
This is definitely an area for improvement. I'm planning to (1) add enforced "rewrite this" metadata that will fail to compile if the expression isn't rewritten, and (2) enable dumping not only the final AST but also all of the transformations that occurred.
It seems you've independently rediscovered extensible programming, which saw a lot of research in the past and again a few years back. There have been a number of languages that did the same thing as in your tweet (e.g. [1]), but most aren't developed anymore.
Neat, another one for my list. Seed7 allows the extension of the grammar [1]. The compiler understands a simple EBNF which doesn't distinguish between different non-terminals. The semantic checks and the transformation into a known AST are deferred to another stage.
Does the compiler build the initial AST based on a grammar, before the transformations happen? That would mean you can't expand the grammar with user-defined rules. The reason you can "implement for loops in user land" is because they're already part of the grammar, is that correct?
How abstract is that initial AST? Can you use these rewriting transformations to expand the grammar?
That is correct. I think there are tradeoffs in making the language too extensible; in this case it's changing some of the details but not the high-level semantics of how the language works. If every library looked completely different lexically, it would be a nightmare to reuse code. Users of Rust macros may already start to feel this; many libraries are implemented as essentially DSLs.
With that said, down the line I plan to add procedural macros, and those will likely be lexical (they take a series of tokens as input, which doesn't need to parse into a valid AST). If I do go that route, such macros would have to be invoked explicitly.
Working until you're 80 isn't healthy either. Given the choice, I'd rather do 10 years with no free time than 60 years with only a tiny bit of free time.