More

aisofteng · on Dec 23, 2023

Years ago my team was tasked with greenfield dev of a cloud native app while the platform/infrastructure was also evolving. We worked nights and weekends to get it done on time only to find out at the last second that the platform team had enforced controls on internal services being able to access the internet, requiring authentication to do so. This was news to us.

We were behind schedule and had, I think, three separately implemented/maintained/deployed services that needed to be able to access the internet to do their work. Rather than implementing the intended auth mechanism in each service, writing tests for it, going through code review, and redeploying, I instead added nginx to the base Docker image they all used, configured them to send requests to that nginx instead of as normal, and made that nginx instance man-in-the-middle our own services to attach a hardcoded HTTP header with the right creds.

I man-in-the-middled my own services as a hack - dumb but it worked. It was meant as a quick hack but stayed for I think a couple years. It did end up being eventually being the source of an outage that took a week to diagnose, but that's a different story.

aisofteng · on Dec 23, 2023

Not quite what was asked but a few of the stories here reminded me of this.

Years ago I was working on developing a new cloud native service. The particular microservice I was working on had to call out to multiple other services, depending on the user parameters. Java 8 had just come out and I implemented what I thought was an elegant way to spin up threads to make those downstream requests and then combine the results using these fancy new Java 8 stream APIs.

I realized at some point that there was a case where the user would want none of those downstream features, in which case my implementation would spin up a thread that would immediately exit because there was nothing to do. I spent a couple days trying to maintain (what I saw as) the elegance of the implementation while also trying to optimize this case to make it not create threads for no reason.

After a couple days I realized that I was spending my time to try to make the system sometimes do nothing. When I phrased it that way to myself, I had no problem moving on to more pressing issues - the implementation stayed as is because it worked and was easy to read/understand/maintain.

To this day, I avoid the trap of "sometimes make the system do nothing". One day, that performance optimization will be necessary, but that day has not yet arrived in the ~7 years since then.

aisofteng · on Nov 23, 2022

Nobody becomes a billionaire off salary. Generally speaking, billionaires come to be billionaires because they created something that has value in the billions.

Someone who has created their own worth saying that there are people being paid salaries without creating enough value to justify said salaries is not tone deaf. I bet most billionaires - at the very least, the self made ones, which is most of them in the US - have a better sense of value than most other people.

The comment isn’t about making too much money. It’s about being given too much money in exchange for what is produced.

aisofteng · on May 26, 2021

I see a lot of negativity here, and I do agree about the daily ping which I see you’re removing, which I think is a good move.

As someone that has done cloud development at a giant corporation that offers public cloud, I think that this tool absolutely has an enterprise use case. In my particular experience at my particular (tens of billion dollars plus yearly revenue) previous employer, our SREs would have massively benefitted from having this. I’ve been woken up at 3am while on call for L3 support just to step out SREs through tools we wrote to help them do their jobs that we failed to adequately train them on and/or that they didn’t have the bandwidth to absorb due to the volume of services they were tasked to support.

From that point of view, supporting macOS first is the right move in my opinion because at enterprises developers are standardizing on macOS, and so are SREs.

This can solve a real pain point. Best of luck! And if it may happen to be helpful, let me know if you’d like to chat about this use case in more detail.

aisofteng · on April 22, 2021

A strange comment to leave on a social media site.

jcpham2 · on April 26, 2021

Is that what this is? I thought it was news for hackers. Apparently everyone wants to make friends on random websites and talk to each other.

aisofteng · on Jan 8, 2021

As a fellow practitioner, I entirely agree. Actually, reading this article made something click for me regarding the oft discussed and denigrated “bias in AI” always brought up in discussions of the “ethics of AI”: there is no bias problem in the algorithms of AI.

AI algorithms _need_ bias to work. This is the bias-variance trade off: https://en.m.wikipedia.org/wiki/Bias–variance_tradeoff

The problem is having the _correct_ bias. If there are physiological differences in a disease between men and women and you have a good dataset, the bias in that dataset is the bias of “people with this disease”. If there is no such well-balanced dataset, what is being revealed is a pre-existing harmful bias in the medicinal field of sample bias in studies.

If anything, we should be thankful that the algorithms used in AI, based on statistical theory that has carefully been developed over decades to be objective, is revealing these problems in the datasets we have been using to frame our understanding of real issues.

Next up, the hard part: eliminating our dataset biases and letting statistical learning theory and friends do what they have been designed to do and can do well.

jjcon · on Jan 8, 2021

> AI algorithms _need_ bias to work. This is the bias-variance trade off: https://en.m.wikipedia.org/wiki/Bias–variance_tradeoff

To be clear, statistical bias is in fact distinct from the colloquial term ‘bias’ most people use - but they can be interpreted similarly if given the proper context (which you did)

YeGoblynQueenne · on Jan 8, 2021

In machine learning the "bias" that relates to the bias-variance tradeoff is inductive bias, i.e. the bias that a learning system has in selecting one generalisation over another. A good quick introduction to that concept is in the following article:

Why We Need Bias in Machine Learning Algorithms

https://towardsdatascience.com/why-we-need-bias-in-machine-l...

The article is a simplified discussion of an early influential paper on the need for bias in machine learning by Tom Mitchell:

The need for bias in learning generalizations

http://dml.cs.byu.edu/~cgc/docs/mldm_tools/Reading/Need%20fo...

The "dataset bias" that you and the other poster are discussing is better described in terms of sampling error: when sampling data for a training dataset, we are sampling from an unknown real distribution and our sampling distribution has some error with respect to the real one. This error manifests as generalisation error (with respect to real-world data, rather than a held-out test set), because the learning system learns the distribution of its training sample. Unfortunately this kind of error is difficult to measure and is masked by the powerful modelling abilities of systems like deep neural networks, who are very capable at modelling their training distribution (and whose accuracy is typically measured on a held-out test set, sampled with the same error as the rest of the training sample). It is this kind of statistical error that is the subject of articles discussing "bias in machine learning".

Inductive bias has nothing to do with such "dataset bias and is in fact independent from dataset bias. Rather, inductive bias is a property of the learning system (e.g. a neural net architecture). Consequently, it is not possible to "eliminate" inductive bias - machine learning is impossible without it! The two should absolutely not be confused, they are not similar in any context and should not be interpreted as in any way similar.

aisofteng · on Jan 7, 2021

This comment made me realize what my mentor did years ago. He would have the entire team list all possible solutions to a problem, including the obviously bad ones, then have us whittle down the list based on pros and cons of each until we reached consensus on what to do. It was a teaching exercise and I didn’t realize it.

I’ve repeated it that exercise with junior engineers to great effect. Some catch on over time and start intuitively considering the trade offs of a few reasonable solutions to a problem; some don’t.

I never reflected on what he was doing there; thanks.

jaggederest · on Jan 7, 2021

To add to that, some of the smartest "good" ideas come from "bad" ideas that people rejected out of hand, sometimes from unusual sources. It pays to do brainstorming thoroughly.

The best software engineers know when to solve a problem without using any code at all, like the classic "just do it manually" for a complicated task that is worth more than $x per task.

aisofteng · on Jan 6, 2021

> running anything related to AI involves GPU instances

This is not true. A _lot_ of AI applications use algorithms such as logistic regression or random forests and don’t need GPUs - partly, of course, because GPUs are so expensive and these approaches are good enough (or more than good enough) for many applications.

vsupalov · on Jan 6, 2021

Whoops, sloppy generalization on my part. You're completely right of course, thanks! I've been focusing on deep learning a lot lately, to the point where AI has become an alias for those exciting new GPU-heavy techniques.

aisofteng · on Dec 31, 2020

Why is “get more women into STEM” a worthy goal in and of itself?

whatshisface · on Dec 31, 2020

Well, it depends on whether women stay out of STEM because they don't care about programming, or whether they stay out of STEM because they're worried about being alone in a toxic environment. Ideally a diversity program would not influence group 1 while solving the problem for group 2.

aisofteng · on Jan 8, 2021

That’s a different problem. “Fix toxic environments that keep certain types of people out of an honest line of work” is a good problem to solve. That is not the same as “let’s increase the number of people of a certain demographic within an industry”, in and of itself, taken entirely on its own.

slashdev · on Dec 31, 2020

If you phrase the problem like that, there is no problem. Group 1 is tiny compared to group 2.

I think addressing group 2 is a desirable and valid goal.

Societal gender roles say nurses and teachers are women's work and STEM is man's work. That's the thing one would need to address, and at a young age. It's not entirely clear how one can do that. One "only" has to challenge the societal norms that define gender roles themselves. Not an easy problem.

whatshisface · on Dec 31, 2020

>Group 1 is tiny compared to group 2.

That is in my opinion an unjustified assumption. Huge numbers of men and women are uninterested in programming. In fact, virtually everyone I know of either gender is uninterested in programming. The number of people who don't program because they're not interested anecdotally dwarfs the number of people who don't program for any other reason.

slashdev · on Jan 1, 2021

I got the two groups mixed up. Just swap 1 for 2 in my comment, which I can't edit anymore. I fully agree with you.

aisofteng · on Dec 30, 2020

Watson is pretty racist himself, actually: https://www.vox.com/2019/1/15/18182530/james-watson-racist