> continuously monitoring the estimated model performance
This is the main point. Model performance can degrade for any number of reasons and at varying rates. As a starting point, focus on setting up anomaly monitoring for a robust set of model eval metrics tailored to your task: loss, calibration, model staleness, etc. Timely alerts give you enough time to dig in, root-cause the issue, roll back a model, and so on.
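To make that concrete, here's a minimal sketch of the kind of check an anomaly monitor could run on a single eval metric. The window size, threshold, and the example values are placeholder assumptions, not any particular monitoring tool's API:

```python
# Minimal sketch: threshold-based anomaly check on one model eval metric.
# The window, threshold, and values are illustrative assumptions.
from statistics import mean, stdev

def is_anomalous(history, latest, z_threshold=3.0):
    """Flag the latest value if it falls outside z_threshold standard
    deviations of the trailing window."""
    if len(history) < 10:
        return False  # not enough data to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold

# Example: a stable trailing window of eval loss, then a suspicious jump.
recent_loss = [0.41, 0.40, 0.42, 0.39, 0.41, 0.40, 0.43, 0.40, 0.42, 0.41]
print(is_anomalous(recent_loss, 0.68))  # True -> page someone / consider a rollback
```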
I recently listened to this on a road trip and can report that it kept me interested and awake :). The podcast quickly focuses on the problem of Boolean satisfiability and the history of SAT solvers, for those unfamiliar with those topics.
In theory, this benefits complex business processes running across corporations/agencies/governments/etc. that require a distributed ledger. For example, mineral mining/procurement/certification is a complicated lifecycle across many actors. I can't remember which podcast I heard it on, but the suggestion was that anywhere a "clearing house" is in use by multiple corporations for a particular process, there's an opportunity for blockchain/smart contract use.
In practice, I have yet to see anything concrete, but I haven't exactly been looking hard.
> In practice, I have yet to see anything concrete, but I haven't exactly been looking hard.
"In theory yes, in practice no" - sums up every explanation I've seen of whether blockchain technology could be applied to a particular problem. I still haven't seen a "killer app" that's not better suited to a traditional database.
Bitcoin became useful because it's an unregulated currency that has enough acceptance to be liquid and enough anonymity to be used for clandestine purposes. As soon as you try to use blockchain tech for "traditional" transactions you end up eating the computational cost of a distributed ledger for no apparent benefit.
Having built something similar with RabbitMQ in a high-volume industry, I think there are a lot of benefits that people in this thread are glossing over while they debate semantics. Yes, this is not "exactly once" -- there really is no such thing in a distributed system. The best you can hope for is that your edge consumers are idempotent.
There is a lot of value derived from de-duping near ingress of a heavy stream such as this. You're saving downstream consumers time (money) and potential headaches. You may be in an industry where duplicates can be handled by a legacy system, but it takes 5-10 minutes of manual checks and corrections by support staff. That was my exact use case and I can't count the number of times we were thankful our de-duping handled "most" cases.
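As a rough illustration of what best-effort de-duping near ingress can look like (the message-ID key and TTL here are assumptions; a real deployment would usually back this with a shared store such as Redis so every ingress worker sees the same keys):

```python
# Sketch of best-effort de-duplication near ingress. In-memory to stay
# self-contained; it narrows the window for duplicates, it does not close it,
# so downstream consumers still need to be idempotent.
import time

class Deduper:
    def __init__(self, ttl_seconds=600):
        self.ttl = ttl_seconds
        self.seen = {}  # message_id -> first-seen timestamp

    def is_duplicate(self, message_id):
        now = time.time()
        # Evict expired entries so the map doesn't grow without bound.
        self.seen = {k: t for k, t in self.seen.items() if now - t < self.ttl}
        if message_id in self.seen:
            return True
        self.seen[message_id] = now
        return False

dedupe = Deduper()
for msg_id in ["a1", "b2", "a1"]:
    print(msg_id, "duplicate" if dedupe.is_duplicate(msg_id) else "new")
# a1 new / b2 new / a1 duplicate
```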
> Instances get role data from the metadata service, but containers can't access that metadata and should access the local ECS agent instead (which has its own API).
Just a quick aside, but is this "can't" or "shouldn't"? I'm 100% positive you can use instance profile credentials from within a container (they're loaded from the instance metadata service).
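For reference, this is roughly what "loading credentials from the instance metadata service" amounts to; anything with network access to 169.254.169.254 (containers included, unless that route is blocked) can do it. This sketch assumes older IMDSv1-style access with no session token:

```python
# Pull instance profile credentials from the EC2 instance metadata service.
# Works from anywhere on the host that can reach 169.254.169.254.
import json
import urllib.request

METADATA = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

def instance_profile_credentials():
    role_name = urllib.request.urlopen(METADATA, timeout=2).read().decode().strip()
    creds_raw = urllib.request.urlopen(METADATA + role_name, timeout=2).read().decode()
    return json.loads(creds_raw)  # AccessKeyId, SecretAccessKey, Token, Expiration

if __name__ == "__main__":
    print(instance_profile_credentials()["AccessKeyId"])
```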
I think I agree that there's definitely a lot of depth to topics that should be covered here, and whether you want to go down the rabbit hole will vary based on org size and features you're using.
I'd personally prefer:
1. deep-dives into best practices for each feature, as opposed to a surface-level glance.
2. examples to back it up: include CloudFormation or Terraform scripts to set up each piece so that we actually build something. Documentation is important, but you can't learn without doing.
3. tests against the security you've put in place (see the sketch after this list).
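For point 3, here's a hedged example of one such test: a small boto3 script (the region and the choice of SSH as the port to check are assumptions) that flags security groups leaving SSH open to the world:

```python
# Example "test the security you've put in place" check: find security
# groups that allow SSH from 0.0.0.0/0. Assumes boto3 is installed and
# AWS credentials/permissions are configured in the environment.
import boto3

def world_open_ssh_groups(region="us-east-1"):
    ec2 = boto3.client("ec2", region_name=region)
    offenders = []
    for sg in ec2.describe_security_groups()["SecurityGroups"]:
        for rule in sg.get("IpPermissions", []):
            from_port = rule.get("FromPort")
            to_port = rule.get("ToPort")
            # Missing FromPort/ToPort means "all traffic" for that rule.
            covers_22 = from_port is None or from_port <= 22 <= (to_port or from_port)
            open_to_world = any(r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", []))
            if covers_22 and open_to_world:
                offenders.append(sg["GroupId"])
                break
    return offenders

if __name__ == "__main__":
    print("SSH open to 0.0.0.0/0:", world_open_ssh_groups())
```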
Technically, shouldn't. But in AWS's documentation for container roles, there's a note that explicitly suggests adding an iptables rule (and even provides the iptables command) to prevent access to the instance's metadata.
That said, this is another of those "more ink should be spilled" moments, since preventing access to the instance metadata is something that you SHOULD do from a security point of view.
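One cheap way to verify the block is actually in place is to run a check like this from inside a container; if the documented iptables rule is working, the request should fail instead of returning a role name:

```python
# Run from inside a container: confirm the instance metadata service is
# NOT reachable once the iptables rule from the AWS docs is applied.
import urllib.request
import urllib.error

IMDS = "http://169.254.169.254/latest/meta-data/iam/security-credentials/"

def imds_reachable(timeout=2):
    try:
        urllib.request.urlopen(IMDS, timeout=timeout)
        return True
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print("FAIL: container can reach instance metadata" if imds_reachable()
          else "OK: instance metadata blocked from this container")
```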
I don't recall Task Roles being a thing when I started using EC2 Container Service. For container security and isolation, that makes a whole lot of sense.
Yes, this will forever be the case, and engineers should accept that solutions will keep rotating and being reintroduced. It happens for several reasons: software ages, becomes unmaintained or too bloated, or the company behind it goes out of business.
There are also trends and fads to consider, such as social media.
There are also human habits to consider. Not everyone knows about Slack, is signed up for it, or actually uses it. They may eventually find out, or they may not.
That is, until a firm starts enforcing a tool like Slack as the default mode of communication organization-wide; and worse still if you're expected to be, or sort of expected to be, always on.
I would prefer the image to look identical. Each host is going to have a unique name anyway, and you're going to outsource your routing/discovery to the client or another service, right?
Generally, though, if there is custom config data that must be passed when BOOTSTRAPPING the node, you can use templated values for user-data on cloud-init based systems, or the remote-exec provisioner. This can even include the modified shard name (derived from count). If you're talking about updates AFTER the machine has been in service, I don't believe that's 'officially' supported by Terraform, but it can be done [0]. You may want to look into a proper CM tool with idempotency for that sort of thing (Chef, Salt, etc.).
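To illustrate what "templated values for user-data" boils down to, here's a hypothetical Python sketch of rendering a per-node cloud-init script with a shard name derived from an index; Terraform's template rendering does the equivalent at plan/apply time, and the service name and template keys here are made up:

```python
# Render per-node user-data by substituting values (e.g. a shard name based
# on an index, analogous to count) into a cloud-init template before boot.
from string import Template

USER_DATA_TEMPLATE = Template("""#cloud-config
runcmd:
  - echo "SHARD_NAME=${shard_name}" >> /etc/environment
  - systemctl restart my-app
""")

def render_user_data(index, shard_prefix="shard"):
    return USER_DATA_TEMPLATE.substitute(shard_name=f"{shard_prefix}-{index}")

for i in range(3):  # analogous to count = 3
    print(render_user_data(i))
```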
Oh, no, definitely the former approach. All right, that sounds perfectly reasonable to me. I was just wondering if there was a different way people did things.
Some basic conditional logic was added in 0.8 [0]. The reliance on 'interesting' declarative tricks, e.g. counts for conditionals and loops, is a little annoying, I agree. However, it looks like they're listening to the community and will build on these features in the future. There's a lot to work on :)
Great changes! State environments is the feature that stands out the most for me. Currently, this is done by managing different sets of TF VARs. That's not going away, but this should allow for more nuanced modules.