So you're saying that the one `Nothing` has a different meaning than the other `Nothing`, and that this is something that anybody would want? Or that we have a `Nothing` and a `Just Nothing`? I'm sorry, but I don't get how that would be considered good design in any context. Do you have any example where this would be considered sensible?
Having a situation where one handles both `Nothing` and `Just Nothing` in the same context should be rare.
But you might be writing functions on some abstract data structure with a type parameter `a` (say it's a graph and users can tag the vertices with values of type `a`). And (maybe just internally) there are situations in your algorithms where a value might be absent, so you use `Maybe a`. If `Nothing == Just Nothing`, your users can't use a `Maybe` type for `a` anymore, because your algorithm wouldn't be able to distinguish its own `Nothing`s from the user's.
In any concrete context, `Maybe (Maybe A)` could _probably_ be simplified to just `Maybe A` as we expect. Alternatively, we could be in a situation where there are two notions of failure (represented here as `Nothing` and `Just Nothing`), in which case we'd be better off simplifying to `Either Error A` where `Error` covers your multiplicity of error cases.
But while these are all obvious in a concrete context, what we're often doing instead is compositional. If I want to write code which works over `Maybe a` for some unknown, library-user-declared type `a`, then I may very well end up with a `Maybe (Maybe Int)` that I can't (and mustn't) collapse.
As a concrete example, consider the type of, say, a JSON parser:
`ParserOf a = Json -> Maybe a`
We hold the notion of failure internal to this type so that we can write
`fallback : ParserOf a -> ParserOf a -> ParserOf a`
which tries the first parser and falls back to the second if the first results in an error. We might also want to capture these errors "in user land" with a combinator like
`catch : ParserOf a -> ParserOf (Maybe a)`
If we unwrap the return type of `catch` we get a `Maybe (Maybe a)` out of a compositional context (we can't collapse it). Additionally, the two forms of failure are distinct: one is a "handled" failure and the other is an unhandled failure.
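To make that concrete, here's a minimal sketch of that parser type in Haskell. The `Json` stand-in, the newtype wrapper, and `runParser` are my own names for illustration, not any particular library's API:

```haskell
-- Json is left abstract here; a real library would define it properly.
data Json

newtype ParserOf a = ParserOf { runParser :: Json -> Maybe a }

-- Try the first parser; if it fails, fall back to the second.
fallback :: ParserOf a -> ParserOf a -> ParserOf a
fallback p q = ParserOf $ \j ->
  case runParser p j of
    Just x  -> Just x
    Nothing -> runParser q j

-- Reflect failure into "user land": catch always succeeds, returning
-- Nothing exactly when the wrapped parser failed.
catch :: ParserOf a -> ParserOf (Maybe a)
catch p = ParserOf $ \j -> Just (runParser p j)
```

Unwrapping `catch p` gives a `Json -> Maybe (Maybe a)`: the outer `Maybe` is the parser's own (unhandled) failure channel, and the inner one is the failure we captured for the user. Collapsing them would conflate the two.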
Similarly, a polymorphic function may want to put values of an unknown type in a `Map` or `MVar`. From the outside it may be completely reasonable to call that function on `Maybe Foo`, which would mean a `Maybe (Maybe Foo)` somewhere internally, and I would struggle to call that bad design.
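As a sketch of that last point (the function and its name are hypothetical, just to show where the nesting appears):

```haskell
import qualified Data.Map as Map

-- A polymorphic cache over values of some unknown type `a`.
lookupOrCompute :: Ord k => k -> (k -> a) -> Map.Map k a -> (a, Map.Map k a)
lookupOrCompute k f m =
  case Map.lookup k m of          -- Map.lookup :: k -> Map k a -> Maybe a
    Just v  -> (v, m)             -- already cached
    Nothing -> let v = f k        -- not cached yet: compute and insert
               in (v, Map.insert k v m)
```

If a caller instantiates `a` at `Maybe Foo`, then inside this function `Map.lookup k m :: Maybe (Maybe Foo)`, where the outer layer means "not cached yet" and the inner layer is a cached `Nothing`. Neither the caller nor the function is doing anything unreasonable.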
I think the obvious implementation is `maybe (Left False) (maybe (Left True) Right)`
If we have a value then we clearly have both layers. If we don't have a value then we need to distinguish `Nothing` from `Just Nothing` by way of the Bool.
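Spelled out (the name and type here are my guess at what's being implemented):

```haskell
-- Collapse the two layers into Either, recording which layer was
-- missing via the Bool.
flatten :: Maybe (Maybe a) -> Either Bool a
flatten = maybe (Left False) (maybe (Left True) Right)

-- flatten Nothing         == Left False   -- outer layer absent
-- flatten (Just Nothing)  == Left True    -- inner layer absent
-- flatten (Just (Just x)) == Right x      -- a value, so both layers present
```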
Imagine you're working on a compiler. You need to represent a compile-time computed value of type `Maybe Int` (e.g. you are precomputing nullable integers).
You see `1 + null`. So you have `add : Maybe Int -> Maybe Int -> Maybe Int`, which takes two precomputed values and returns a new precomputed value for the operation's result.
However, you can't precompute `Console.readInt()`.
For any given expression, you may or may not be able to compute its value at compile time.
What is the output type of `compileTimeCompute : Expr -> ???`
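A minimal sketch of what that could look like. `Expr` and its constructors are made up for illustration; the point is only the return type:

```haskell
data Expr
  = IntLit Int          -- e.g. 1
  | Null                -- e.g. null
  | Add Expr Expr       -- e.g. 1 + null
  | ReadInt             -- e.g. Console.readInt(), unknowable at compile time

-- Addition on nullable integers in the target language: null propagates.
add :: Maybe Int -> Maybe Int -> Maybe Int
add (Just x) (Just y) = Just (x + y)
add _        _        = Nothing

-- Outer Maybe: "could we evaluate this at compile time at all?"
-- Inner Maybe: the nullable integer value in the target language.
compileTimeCompute :: Expr -> Maybe (Maybe Int)
compileTimeCompute (IntLit n) = Just (Just n)   -- known, non-null
compileTimeCompute Null       = Just Nothing    -- known, and it's null
compileTimeCompute ReadInt    = Nothing         -- can't be precomputed
compileTimeCompute (Add a b)  = add <$> compileTimeCompute a <*> compileTimeCompute b
```

Here `Nothing` and `Just Nothing` mean genuinely different things, and collapsing them would lose the distinction between "we don't know the value" and "we know the value is null".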
I don't understand your example. What does compile-time computed stuff have to do with readInt()?
I get that it might be possible to do that, i.e. use a `Maybe (Maybe T)`. But it's like an `optional<bool>` in C++. It can be done, it's just not a good idea. So if you design your system not to allow that in the first place, nothing of value was lost.
If you have specific error cases that you want to communicate, like "what was read from the console didn't parse as an int" as opposed to "the computation didn't find a result", then using `Nothing` and `Just Nothing` as the two distinct values to encode that is not sound design. Either you have meaning, or you have Nothing. Nothing shouldn't have any meaning attached to it.
First I want to thank you for your honesty. It is so refreshing to read in this sea of self-promotional Medium posts that goes around here.
You may want to talk to a coach about this. I'm sure you can find someone who specializes in job searches. You make it sound like you somehow let yourself down, and if that comes through in an interview, you won't leave a good impression. I think you need somebody to help you find a way to frame this time in a better light. Not only for your next application, but also for yourself. What started as time off might have turned into involuntary unemployment, with all the negative weight that this brings.
My understanding is that this is only about memory access patterns, and has nothing to do with multithreading. The basic example is matrix multiplication, C=A∙B. If the matrices are too big to fit in the L2/L3 cache, you have to think about how to optimize memory access. So instead of having the compiler just generate instructions to read/write memory, you want it to interleave instructions that prefetch memory ahead of time. But how do you do that? What is the best point in time to prefetch?
Another optimization is locality: the basic algorithm is to loop over i and j, then multiply A's row i with B's column j. You do that with another loop over k to compute the sum over Aᵢₖ∙Bₖⱼ, then store the result in Cᵢⱼ. But what if one row or column is already too big for the cache? Also, once you've invested in loading the data into the cache, you want to make the best use of it: you don't want to load it again if you can avoid it. So you limit the loop ranges for i, j, and k (blocking/tiling) such that the overall memory accessed fits into the cache. But what are the optimal block sizes?
The answer depends on the CPU architecture (cache sizes) and probably also on the memory that you're using.
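To illustrate the blocking idea, here is a rough sketch in Haskell with mutable vectors. The names, the row-major layout, and the block size `bs` are assumptions; a real implementation would also pack/transpose tiles and add prefetching:

```haskell
import Control.Monad (forM_)
import qualified Data.Vector.Unboxed as V
import qualified Data.Vector.Unboxed.Mutable as VM

-- C = A·B for square n×n matrices stored row-major, with block size bs
-- chosen so that three bs×bs tiles fit in cache.
matmulBlocked :: Int -> Int -> V.Vector Double -> V.Vector Double -> V.Vector Double
matmulBlocked n bs a b = V.create $ do
  c <- VM.replicate (n * n) 0
  forM_ [0, bs .. n - 1] $ \ii ->               -- block row of C
    forM_ [0, bs .. n - 1] $ \jj ->             -- block column of C
      forM_ [0, bs .. n - 1] $ \kk ->           -- block along the k (sum) axis
        -- within one block triple, the A and B tiles stay hot in cache
        forM_ [ii .. min (ii + bs) n - 1] $ \i ->
          forM_ [kk .. min (kk + bs) n - 1] $ \k -> do
            let aik = a V.! (i * n + k)
            forM_ [jj .. min (jj + bs) n - 1] $ \j ->
              VM.modify c (+ aik * (b V.! (k * n + j))) (i * n + j)
  return c
```

The interesting question is exactly what `bs` should be, and that depends on the cache sizes of the target CPU.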
I don't know about this specific example, but polyhedral optimization generally, at least, is useful for parallelization of such loops, e.g. http://pluto-compiler.sourceforge.net/. However, serial GEMM is typically most important in HPC, with parallelization at a higher level.
(GEMM shouldn't be main memory limited, if that's what you mean.)
> NOTE 2: Depending on the shape of your head the head band of the earmuffs can make the crown of your head hurt – though this is not limited to earmuffs, as many headphones have this undesirable feature.
I tied a piece of memory foam that I salvaged from an old mattress to the head band, using two strips of velcro tape. Looks ridiculous, works really well.
Because Starcraft is 1v1 and LoL is 5v5. Poker AIs also focus on 1v1 as opposed to full ring games. It's easier.
Also because Starcraft has three races, i.e. 9 matchups. But if you pick one race for your AI, you only have three matchups to worry about. LoL on the other hand has about 500 champions (give or take), whose strengths and weaknesses depend heavily on the current patch. LoL is less about skill or strategy, and more about picking the champion of the week.
Starcraft works because they have such a small number of matchups that balancing them is possible. Certainly not easy, and players keep figuring out ways to evolve the meta game. Maybe an AI could even find interesting new strategies. But LoL has so many matchups that their balancing team has an absolutely impossible task of trying to catch up. It's so complex that they can't just reason about it in theory; they have to look at win-rate statistics over different skill ranges. At this point it's more a problem of data science. And the number of champions keeps growing, because that's how they make money.
> Because Starcraft is 1v1 and LoL is 5v5
Starcraft can also be played 5v5
> Also because Starcraft has three races, i.e. 9 matchups
But each race has 20 different units. The units are more similar to LoL champions than the races are. There are also a tonne of different buildings and upgrades to choose from.
LoL is very similar to Dota, and given that OpenAI already beat *human* players in Dota, LoL is not that far-fetched. The current work, although impressive, still cannot beat *human* players in SC2. The complexity of designing an AI for a game depends heavily on the branching factor of decisions in the game. SC2 has a much higher branching factor than LoL or any other game played by n00bs.
> LoL on the other hand has about 500 champions (give or take)
LoL has 141 champions.
Still significantly higher than Dota's 116.
A common note I hear when people discuss OpenAI playing Dota is that it uses pre-made matchups, which reduces the number of matchups considerably.
Would it be that difficult to have the AI play the picking stage?
Compared to the complexity of the decisions you have to make during the game, the picks are straightforward - especially if you have performance data on matchups.
The number of practice games the AI would have to play to learn to handle all the heroes goes up drastically, but that's only a matter of time.
They had drafting working just fine in a previous version of the AI. (The one that played against semi-pros a month or so before TI.)
One theory I heard was that the pre-made matchups at TI were designed to prevent the human players from getting out-drafted. (The hero pool was pretty limited, which could result in a different meta from what the human players were used to.) They wanted to make the matches purely a test of in-game skill, not about the metagame. As far as I know though, the OpenAI team hasn't explicitly confirmed that.
I'm pretty sure OpenAI only plays the single simplest hero, mechanically. The other 115 would be varying degrees of "more difficult" to play competently.
Also, it only played mirror matches. If it had to play any hero against any hero, that's on the order of 2×10^16 possible team matchups (C(116,5) compositions for one side times C(111,5) for the other).
I don't think you need to learn all possible matchups to play any matchup, any more than a human does. You'd just need to learn how to play/counter each hero, plus some 2- or 3-hero combos (not all permutations), so on the order of a few hundred?
You publish the code review at 5pm on day 1; it gets reviewed, with a question added, at 5pm (local time) on day 2; you see it, answer the question/make the change, and publish at 5pm on day 2; it gets a "ship it" on day 3 (local time); and then you push the change on day 3.
Ben, could you go into a little more detail on what you had in mind with the term rewriting system? The sin/cos example seems a little underwhelming, since those should be evaluated at compile time in the first place.
The rules seem like a variant of C macros, but without being restricted to function-call syntax. Maybe one could implement Python's `with` statement in userland, which would be very nice.
It also seems to me that this could replace C++ templates, yet you do have generics. How come? What's the difference?
My intuition is that ambiguities are at least possible, so what about precedence: which AST term is evaluated first? I'm thinking of some situation where two subtrees match the rule, but you end up with different results depending on which is transformed first. E.g. if you were to define a cross product rule and apply it to a × b × c.
What about infinite loops in rule sets? What about recursion?
What about a situation where multiple rules could be applied? Is there any way to debug the transformations? Is that what the `using` scopes are for? If the compiler was to detect ambiguities, that seems pretty expensive, because it would need to match every rule to every subtree, right?
This thing seems very, very powerful; I just find it a bit hard to grasp what you can do with it in practice, and what the limitations are. I would be very interested to hear what you've run into so far.
>My intuition is that ambiguities are at least possible, so what about precedence: which AST term is evaluated first? I'm thinking of some situation where two subtrees match the rule, but you end up with different results depending on which is transformed first. E.g. if you were to define a cross product rule and apply it to a × b × c.
This is absolutely a potential issue, and the examples I have up now are quite bad. This is why it's important for rules to be strictly scoped. The only rules that can affect an expression are the ones you've explicitly brought in with a "using" block, or those that are defined for the types you're directly working with. Given the scoping, I think overlapping rule applications will be uncommon in practice.
>What about infinite loops in rule sets? What about recursion?
There's a sanity limit on the number of times an expression can be rewritten; if it's hit, the compiler shows an error which displays the transformations that triggered the limit (including the first, the last, and any non-repeating transformations going backward from the last), so it should be very clear.
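For what it's worth, here's a toy model of that kind of limit. This is just generic term rewriting in Haskell to illustrate the mechanism, not the actual language's implementation:

```haskell
import Data.Maybe (listToMaybe, mapMaybe)

data Term = Var String | App String [Term]
  deriving (Eq, Show)

-- A rule either fires on a term or it doesn't.
type Rule = Term -> Maybe Term

-- Apply the first rule that matches at the root (a real engine would
-- also descend into subterms and respect rule scoping).
step :: [Rule] -> Term -> Maybe Term
step rules t = listToMaybe (mapMaybe ($ t) rules)

-- Rewrite to a fixed point, but give up after `limit` steps and return
-- the trace of intermediate terms so it can be shown in an error.
rewriteWithLimit :: Int -> [Rule] -> Term -> Either [Term] Term
rewriteWithLimit limit rules = go limit []
  where
    go 0 trace t = Left (reverse (t : trace))   -- budget exhausted: report trace
    go n trace t = case step rules t of
      Nothing -> Right t                        -- no rule applies: done
      Just t' -> go (n - 1) (t : trace) t'
```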
>Is there any way to debug the transformations?
This is definitely an area for improvement. I'm planning to (1) add enforced "rewrite this" metadata that will fail to compile if the expression isn't rewritten, and (2) enable dumping not only the final AST but also all of the transformations that occurred.
It seems you've independently rediscovered extensible programming, which saw a lot of research in the past and again a few years back. There have been a number of languages that did the same thing as in your tweet (e.g. [1]), but most aren't developed anymore.
Neat, another one for my list. Seed7 allows the extension of the grammar [1]. The compiler understands a simple EBNF which doesn't distinguish between different non-terminals. The semantic checks and the transformation into a known AST are deferred to another stage.
Does the compiler build the initial AST based on a grammar, before the transformations happen? That would mean you can't expand the grammar with user-defined rules. The reason you can "implement for loops in user land" is because they're already part of the grammar, is that correct?
How abstract is that initial AST? Can you use these rewriting transformations to expand the grammar?
That is correct. I think there are tradeoffs in making the language too extensible; in this case it's changing some of the details but not the high-level semantics of how the language works. If every library looked completely different lexically, it would be a nightmare to reuse code. Users of Rust macros may already start to feel this; many libraries are implemented as essentially DSLs.
With that said, down the line I plan to add procedural macros, and those will likely be lexical (they take a series of tokens as input, which doesn't need to parse into a valid AST). If I do go that route, such macros would have to be invoked explicitly.
Working until you're 80 isn't healthy either. Given the choice, I'd rather do 10 years with no free time than 60 years with only a tiny bit of free time.