market cap = book value + discounted future cash flows = share price * number of outstanding shares
When you do a buyback, the book value drops (the company loses cash), but the discounted future cash flows remain unchanged. The number of outstanding shares also drops.
The net result is that the stock price increases as a company accumulates cash and uses it for buybacks, because the number of outstanding shares drops.
Another way of thinking about it is that it's the same as dividends, but the dividend only goes to the sellers of the stock during a buyback.
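To make the arithmetic of that model concrete, here's a toy sketch with made-up numbers (not from the comment above). Under the book-value-plus-cash-flows model, a buyback at the model's fair price leaves the per-share value unchanged while retiring shares, which is exactly the sense in which it mirrors a dividend paid out to the sellers:

```python
# Toy numbers; the model is market cap = book value + discounted future cash flows.
book_value = 1_000          # cash/assets on the books
future_cash_flows = 4_000   # discounted future cash flows, unchanged by the buyback
shares = 100

price_before = (book_value + future_cash_flows) / shares      # $50.00 per share

buyback_spend = 500
shares_bought = buyback_spend / price_before                  # 10 shares retired

book_value -= buyback_spend        # cash leaves the company
shares -= shares_bought            # outstanding share count drops

price_after = (book_value + future_cash_flows) / shares       # still $50.00 per share
print(price_before, price_after)
```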
Market cap = whatever the market decides at that time, and is determined by supply and demand.
The whole book value + discounted future cash flows model does not reflect how the real stock market works at all. If that were the case, no investor would want share buybacks because, as you say, they would only reward the ones who sell...
When you take supply and demand into account, stock buybacks make more sense: the buyback increases demand for the stock. Supply will increase a bit (a few holders might sell), but not in the same proportion, because supply is not very elastic (most of your investors are in for the long ride and are not going to sell). So the price goes up, you get rid of some short-term investors, and the long-term investors see their shares increase in value. Everybody's happy, and the dividend ratios don't mean anything anymore.
There's nothing about PPO that helps it learn long-range strategies. It primarily lets you make multiple steps for a single batch so you can converge faster.
In fact, for a single step with no policy lag, it's equivalent to a standard policy gradient update.
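To spell out that equivalence, here's a minimal NumPy sketch of the clipped PPO surrogate (my own toy code, with made-up names): when there is no policy lag, the probability ratio is 1, the clip is inactive, and the gradient is the ordinary policy-gradient estimator.

```python
import numpy as np

def ppo_surrogate(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped PPO objective over a batch of (state, action) samples."""
    ratio = np.exp(logp_new - logp_old)                 # pi_new(a|s) / pi_old(a|s)
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return np.mean(np.minimum(ratio * advantages, clipped * advantages))

# With a single step and no policy lag, logp_new == logp_old, so ratio == 1 and
# the clip never triggers; differentiating ratio * advantages w.r.t. the policy
# parameters at that point gives E[A * grad log pi], i.e. the vanilla policy
# gradient update.
```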
I suspect the difference that allows you to train with reaction time is an RNN or compensating for the lag some other way. I'm testing that out right now with my own SSBM bot: https://www.twitch.tv/vomjom
The author of that paper has since built an agent that has human-level reaction time and is comparable against professional players:
http://youtube.com/vladfi1
People rarely train ML models on macOS for the reason you mentioned. Most machine learning work happens on Linux, so this should work well there.
TensorFlow supports a standalone server mode where it receives computation graphs and executes them. This is nice because then you can remotely execute on any accelerator (Cloud TPU, multi-worker multi-GPU) from your laptop.
In their demo, they did exactly that with a Cloud TPU: it connected to a TensorFlow server that executed the machine learning training part of the program.
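A rough sketch of what that looks like with the TF 1.x API (the host name, port, and job name below are placeholders I made up, not details from the demo): the graph is built locally, and a Session pointed at a remote tf.train.Server executes it.

```python
import tensorflow as tf

# On the remote machine (the one with the accelerators), start a standalone server:
#   server = tf.train.Server.create_local_server()  # or one built from a tf.train.ClusterSpec
#   print(server.target)                            # e.g. "grpc://0.0.0.0:2222"

# On the laptop: build the graph locally, execute it remotely.
# The device name depends on the remote ClusterSpec; "worker" is a placeholder job name.
with tf.device("/job:worker/task:0"):
    a = tf.random_normal([1000, 1000])
    b = tf.matmul(a, a)

# "grpc://remote-host:2222" is a placeholder address for the remote server.
with tf.Session("grpc://remote-host:2222") as sess:
    result = sess.run(b)   # the matmul runs on the remote worker, not on the laptop
```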
I agree; I just had in mind that Apple only just added/announced support for external GPUs. Besides image and video editing, I thought general computing tasks were a use case they had in mind. It's not like gaming is big on macOS.
>TensorFlow supports a standalone server mode where it receives computation graphs and executes them. This is nice because then you can remotely execute on any accelerator (Cloud TPU, multi-worker multi-GPU) from your laptop.
Where can I find more documentation on this? I’ve been looking for something exactly like this.
It seems like Lattner's part of the show is gone from that video now, though it had been there yesterday when I clicked the above link (or maybe I'm just doin' it wrong now?). Anyway, looks like Lattner's part is here now: https://www.youtube.com/watch?v=Yze693W4MaU.
Keep in mind that what you linked refers to TPUv1, which is built for quantized 8-bit inference. The TPUv2, which was announced in this blog post, is for general purpose training and uses 32-bit weights, activations, and gradients.
It will have very different performance characteristics.
I'll discuss this a bit during my talk at the dev summit.
The short answer is no.
The long answer is yes, but only if you create the model in Python, export it, and then feed training data in other languages. There are some people doing exactly that.
Long term, I'd like to give all languages equal footing, but there's quite a bit of work left.
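A rough sketch of the Python half of that workflow (the op names and file path are illustrative, not an official recipe): build the graph with explicitly named input, init, and train ops, serialize it, and another language's binding can then load the GraphDef and drive the training loop by feeding its own data and running the named ops.

```python
import tensorflow as tf

# A tiny linear model with named ops so other-language bindings can find them.
x = tf.placeholder(tf.float32, [None, 4], name="x")
y = tf.placeholder(tf.float32, [None, 1], name="y")
w = tf.Variable(tf.zeros([4, 1]), name="w")
loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y), name="loss")
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss, name="train")
init_op = tf.variables_initializer(tf.global_variables(), name="init")

# Serialize the graph. A C++/Java/Go program loads this file, runs "init" once,
# then repeatedly feeds batches into "x"/"y" and runs "train".
with open("graph.pb", "wb") as f:
    f.write(tf.get_default_graph().as_graph_def().SerializeToString())
```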
Forgive my ignorance, but why is it that it is Python-only?
Does Python have intrinsic qualities that other languages don't possess or is it that the huge initial investment in creating TensorFlow was based on Python and duplicating that effort somewhere else would require too much work?
Traditionally, most neural network architectures have been implemented in C/C++ for performance reasons. But ML researchers are not hackers, for the most part, and Python has the lowest impedance mismatch for interfacing with C/C++ of all the major languages. Julia was popular for a bit, but now Python is dominant. Programs tend to be very small and not modular, so static type checking is less important than it would be for picking up errors in larger systems.
It's not just that it has the lowest impedance mismatch; it's also a framework coming out of Google, where Python and Java were really the only two language choices for a high-level interface, and of the two, Python is the clear winner in prototyping / scientific-community acceptance. I think that's because of the ease of experimentation and the expressiveness of the language.
It's not clear what you're suggesting as an alternative. My understanding is that you're suggesting thread-per-request, which has many known flaws. There are three approaches to serving requests:
1. Thread-per-request. This is a simple model. You have a fixed-size thread pool of size N, and once you hit that limit, you can't serve any more requests. Thread-per-request has several sources of overhead, which is why people recommend against it: thread limits, per-thread stack memory usage, and context switching.
2. Coroutine style handling with cooperative scheduling at synchronization points (locks, I/O). This is how Go handles requests.
3. Asynchronous request handling. You still have a fixed-size thread pool handling requests, but you no longer limit the number of simultaneous requests with the size of that thread pool. There are several different styles of async request handling: callbacks, async/await, and futures.
#2 and #3 are more common these days because they don't suffer from the many drawbacks of the thread-per-request model, although both suffer from some understandability issues.
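For a concrete feel of models 1 and 3, here's a toy sketch with Python's standard library (my own example, not tied to any particular server): a fixed-size thread pool caps the number of in-flight requests at N, while an async handler can have thousands of requests pending on I/O inside a single thread.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Model 1: thread-per-request. With max_workers=4, at most 4 requests are
# actually in flight; everything else queues behind them.
pool = ThreadPoolExecutor(max_workers=4)

def handle_blocking(request_id):
    time.sleep(0.1)            # simulated blocking I/O ties up a whole thread
    return request_id

# Model 3: async handling. A request waiting on I/O doesn't occupy a thread,
# so one event-loop thread can have thousands of requests pending at once.
async def handle_async(request_id):
    await asyncio.sleep(0.1)   # simulated non-blocking I/O
    return request_id

async def main():
    loop = asyncio.get_running_loop()
    blocking = [loop.run_in_executor(pool, handle_blocking, i) for i in range(8)]
    non_blocking = [handle_async(i) for i in range(1000)]
    await asyncio.gather(*blocking, *non_blocking)

asyncio.run(main())
```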
Those options aren't as distinct as you might imagine. Would calling it fiber-per-request make you happy?
(By the way: most of the time, a plain-old-boring thread-per-request is just fine, because most of the time, you're not writing high-scale software. If you have at most two dozen concurrent tasks, you're wasting your time worrying about the overhead of plain old pthread_t.)
I'm using a much more expansive definition of "thread" than you are. Sure, in the right situation, maybe M:N threading, or full green threads, or whatever is the right implementation strategy. There's no reason that green threading has to involve the use of explicit "async" and "await" keywords, and it's these keywords that I consider silly.
(I agree that thread-per-request works just fine in the majority of cases, but it's still worthwhile to write about the cases where it doesn't work.)
Responding to your original post: you argue that async/await intends to solve the problem of data races. That's not why people use it, nor does it tackle that problem at all (you still need locks around shared data).
It only tries to solve the problem of highly concurrent servers, where requests are bound by some resource that request-handling threads have to wait on (typically I/O).
Coroutines/fibers are not an alternative to async servers, because they need primitives that are either baked into the language or the OS itself to work well.
Coroutines/fibers are completely orthogonal to async anything. The OP is arguing against poor man's coroutines, aka stackless coroutines, aka top-level-yield-only coroutines, which are significantly less expressive and composable than proper stackful coroutines (i.e. first-class one-shot continuations).
An alleged benefit of stackless coroutines is that yield points are explicit, so you know when your state can change. The OP is arguing that this is not really a benefit because it leads to fragile code. I happen to strongly agree.
Green threads / coroutines / fibers are isomorphic with async keyword transparently implemented as a continuation passing style transform, which is how async callbacks usually work. Actual CPU-style stacks in a green thread scenario are nested closure activation records in an explicit continuation passing style scenario, and are implicit closure activation records (but look like stacks) when using an 'async' compiler-implemented CPS.
Properly composed awaits (where each function entered is entered via an await) build a linked list of activation records in the continuations as they drill down. This linked list is the same as the stack (i.e. serves the same purpose and contains the same data in slightly different layout) in a green threads scenario.
What makes all these things different is how much they expose the underlying mechanics, and the metaphors they use in that exposition. But they're not orthogonal.
(If you meant 'async' as in async IO explicitly, rather than the async / await keyword with CPS transform as implemented in C#, Python, Javascript, etc., then apologies.)
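To make the "linked list of activation records" point concrete, here's a small sketch (my own, relying on CPython exposing the awaited-on object as cr_await): a chain of suspended coroutines is literally a linked list that plays the role of the call stack.

```python
class Suspend:
    """An awaitable that suspends once, so the paused chain can be inspected."""
    def __await__(self):
        yield  # suspension point

async def leaf():
    await Suspend()

async def middle():
    await leaf()

async def outer():
    await middle()

coro = outer()
coro.send(None)  # drive the chain until it suspends inside leaf()

# Walk the cr_await links: each suspended coroutine points at the thing it awaits.
frame = coro
while frame is not None and hasattr(frame, "cr_await"):
    print(frame.__qualname__)   # outer, middle, leaf
    frame = frame.cr_await

coro.close()
```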
As you said, you can of course recover stackful behaviour by using yield/await/async/whatever at every level of the call stack, but in addition to being a performance pitfall (you are in practice heap-allocating each frame separately, and yield is now O(N): your interpreter/compiler/JIT will need to work hard to remove the abstraction overhead), it leads to the green/red function problem.
Please correct me if I'm wrong, but doesn't asyncio in the form of async/await (or any other ways to explicitly denote context switches) solve the problem of data races in that per-thread data structures can be operated on atomically by different coroutines? My understanding is that unless data structures are shared with another thread, you don't usually need locks for shared data.
Async and threads are fundamentally different mechanisms: green threads (async) are scheduled by the runtime, while threads are scheduled by the OS.
In CPython, threads can (theoretically) be switched at every bytecode instruction. Since calls into extensions / the interpreter are a single instruction, many data structure updates (like dict[a] = b or list.append) will appear atomic from Python.
That being said it is rather rare to have multiple threads run an event loop and process requests in Python. If threads and async are combined in the same process in Python, then it's usually only one event loop thread, and a thread pool for background activity. Usually these will be synchronized through async (eg. tornado.concurrent.run_on_executor) -- but that has nothing to do with context switches.
Edit: Reread your post. I may have slightly missed the point :)
Yes. Often one will find/design that there is no shared state, or that shared state is modified completely between yield points, so no locks between coroutines needed.
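A tiny sketch of that (my own example): two coroutines on the same event loop do a read-modify-write on shared state without a lock, and since the only switch points are the explicit awaits, the update can never be interleaved.

```python
import asyncio

counter = {"n": 0}

async def bump(times):
    for _ in range(times):
        # No lock needed: nothing else on this event loop can run between these
        # two lines, because there is no await between them.
        current = counter["n"]
        counter["n"] = current + 1
        await asyncio.sleep(0)   # explicit yield point; state is consistent here

async def main():
    await asyncio.gather(bump(1000), bump(1000))
    print(counter["n"])          # always 2000 on a single event loop

asyncio.run(main())
```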
In Java, NIO is often slower than threads. And I'm saying this as somebody who has used >60k threads on 256-core machines vs. NIO on the same hardware for a highly available transaction system.
Why are they silly though? Don't you need them to specify that an operation is in fact async and handle it accordingly? Or in your solution (something about locking?) is that not necessary because everything is safe? How does that work out though? How do I say "No computer, wait for this before we do that"?
Don't be too focused on "requests". Requests (where most people mean HTTP requests) are one layer where you need concurrency, but in principle you need it at multiple layers.
E.g. at first you have a server that accepts multiple connections and each must be handled -> Thread per connection or one thread for all connections? If you go for threads you might even need multiples, e.g. a reader thread, a writer thread which processes a write queue and a third one which maintains the state for the connection and coordinates reads and writes.
Then on a higher layer you might have multiple streams per connection (e.g. in HTTP/2), where you again have to decide how these should be represented.
Depending on the protocol and application there might be even more or other layers that need concurrency and synchronization.
But the general approaches that you mention do still apply here: you can either use a thread for each concurrent entity and use blocking operations, or you can multiplex multiple concurrent entities on a single thread with async operations and callbacks. Coroutines are a mix which provides the API of the first approach with an implementation that looks more like the second.
> You have a fixed-size thread pool of size N, and once you hit that limit, you can't serve any more requests.
If your application acts as a stateless proxy between client machines and your persistence layer, can't you just spin up another instance and load balance them at any time? It's not the most efficient solution at scale, but lots of people use this strategy.
$10 is fully half the monthly fee for AT&T DSL. So they are charging quite a bit more for data if you go over the cap.
I'd be less ticked off if the rate above cap were more similar to the rate below cap, and if the cap allowed saturating your link more than 9% of the time, which is what this cap does for DSL users with a typical 6.5 Mbps download speed.
To buy a connection that can actually be used 24/7 with the new price structure would cost well over $200/month. That's not cool.
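Rough numbers behind that, as a sketch. The 150 GB cap and $10-per-50 GB overage block are my assumptions about the announced DSL terms, and the month is taken as 30 days; slightly different assumptions give the ~9% figure above, but the order of magnitude is the same.

```python
import math

link_mbps = 6.5
month_seconds = 30 * 24 * 3600
full_month_gb = link_mbps / 8 * month_seconds / 1000   # ~2,100 GB if saturated 24/7

cap_gb = 150       # assumed DSL cap
block_gb = 50      # assumed overage block size
block_price = 10   # assumed $10 per block over the cap

usable_fraction = cap_gb / full_month_gb                                     # ~7% of the month
overage_cost = math.ceil((full_month_gb - cap_gb) / block_gb) * block_price  # ~$400 on top of the base fee

print(round(usable_fraction * 100), overage_cost)
```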