
I think the point here is that it's turning Bentham's own utilitarianism against itself. How does the utilitarian decide which of those has more utility? That's a rhetorical question, and how it gets answered is sort of immaterial: regardless of how they decide, the dialogue is structurally describing how utilitarianism is vulnerable to this type of exploitation.


It seems like 1.25 is a month and a half out from retirement, maybe it's related to that.

Kubernetes 1.25 (in Danny Glover voice): I'm getting too old for this shit!


Your definition of loyalty sounds more like loyalty to yourself/your own values, or to an idealized, rose-tinted view of the company's past, than loyalty to what the company is at the present time, which is of course always a moving target.

I interpreted their notion of loyalty as a sort of trust that things will work out in the end, "maybe this is just a rough patch", etc.

Re: neglect, I think there's a more subtle gradient there. Let's say your company decides to implement a dumbass policy X, and you rail against it, but since it's highly popular it's like you're pissing against the wind. So you kind of resign yourself to its inevitability and make a long bet that it's going to fail. This is in the hopes of a brighter future where it has already failed and been replaced with something better. You might even make decisions during this time that set you up to be better positioned for when it does eventually fail. This is a kind of negative semi-passivity, but it's only negative in the sense that you're betting against something you think is a bad idea and trying to improve your position within that potential better future after the bad thing is gone.

That's not strictly the same thing as sabotage, which would require malicious actions on your part to undermine the effort. You don't even need to hope that it fails; you can just make impartial/rational decisions to prepare for its eventual downfall without being heavily emotionally invested in that downfall. And on top of that, you could rationalize all of this as setting the company up for future success after it gets through this rough patch, so it's not all about self-interest.

E: There's a famous chess saying that comes to mind, "never interrupt your opponent while they are making a mistake". Not sure about its applicability here, but it feels like a similar kind of mindset.


I think loyalty in this case would be better described as tolerance.

I hate Cox Cable. Seriously, I fucking hate them with all the vitriol I can muster. Even just last week I called bitching about the fact that I'm paying $170/month for an internet connection that hasn't stayed up for more than an hour over the past 3 weeks (I'm paying for extra bandwidth to be able to backup data).

But I've been with them since the '90s because there are no viable alternatives, and every time I call AT&T they refuse to tell me which areas they serve fiber in. Years ago an agent told me they have an agreement with Cox: they will confirm or deny fiber availability for a specific address, but they cannot tell callers which areas they have fiber in.

So I tolerate them because the alternative is clearly worse.

That's not loyalty, that's lock-in. I'd absolutely drop them in a heartbeat if the alternative was reasonably close in performance.


Tolerance is a good way to look at it. +1


It seems like the article was written by someone just starting to get into the data engineering subfield. They thought they were going to be writing Python (PySpark is my guess) to support some kind of ML effort, but got saddled with a bunch of SQL/data warehousing work to support business intelligence/analytics instead. Normally what you say makes sense, especially when you're pulling in abbreviations unrelated to the topic at hand or introducing new people to the field, but ETL is a pretty basic concept in data engineering and it's a web search away (it should be the top result), so I'm not sure starting with definitions would really add all that much to their article.

It sounds to me like the author got thrown to the wolves in an environment that looks like data engineering did before "big data" and ML took off (and before it was even really called data engineering). There are a lot of enterprises still working in this mode because they are not Google and don't have the same level of sophistication and automation when it comes to this stuff.

There is some bad information no doubt in the article, but if we're being charitable, it feels like someone who took a wrong turn somewhere and is struggling to find their feet in an unfamiliar place without proper guidance and mentorship. It's at least a bit admirable that they're trying on their own.

The article doesn't really have a direct bearing on ETL; the focus on SQL queries and data validation hints that they might be talking about ELT (Extract-Load-Transform) as the step beyond ETL, but it's not clearly explained. It's clear to me that they're at the start of their journey and are going to learn things the hard way without guidance from someone more experienced.


> There is some bad information no doubt in the article

Could you share more specific details? Happy to look over / revise where needed.

More broadly, there's the issue of the gap between what you think the role is and what the role actually is when you join. There are definitely cases where this is accidental. The best way I can think of to close the gap is maybe to do a short-term contract, but that may be challenging to do under time constraints etc.


> Could you share more specific details? Happy to look over / revise where needed.

Sure thing! I'd say first off, the solutions may look different for a small company/startup vs. a large enterprise. It can help if you explain the scale you are solving for.

On the enterprise side of things, they tend to buy solutions rather than build them in-house. Things like Informatica, Talend, etc. are common for large enterprises whose primary products are not data or software related. They just don't have the will, expertise, or the capital to invest in building and maintaining these solutions in-house so they just buy them off the shelf. On the surface, these are very expensive products, but even in the face of that it can still make sense for large enterprises in terms of the bottom line to buy rather than build.

For startups and smaller companies, have you looked at something like `dbt` (https://github.com/dbt-labs/dbt-core)? I understand the desire to write some code, but oftentimes there are already existing solutions for the problems you might be encountering.

ORMs should typically only exist on the consumer side of the equation, if at all. A lot of business intelligence folks / business analysts are just going to use tools like Tableau and hook up to the data warehouse via a connector to visualize their data. You might have some consumers that are more sophisticated and want to write custom post-processing or aggregation code, and they could certainly use ORMs if they choose, but it isn't something you should enforce on them. It's a poor place to validate data because, as mentioned, there are different ways/tools to access the data, and not all of them are going to go through your Python SDK.

Indeed, in a large enough company you are going to have producers and consumers using different tools and programming languages, so it's a little bit presumptuous to write an SDK in Python there.
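
To make that concrete, here's a minimal sketch of validating at the storage layer instead of inside a language-specific SDK (sqlite3 is just a stand-in for the warehouse, and the table/column names are made up): the checks are plain SQL, so they hold no matter which client reads the data afterwards.

    import sqlite3

    # Each check counts offending rows; plain SQL works regardless of whether the
    # downstream consumer is Tableau, a Python script, or something else entirely.
    CHECKS = {
        "orders_no_null_ids": "SELECT COUNT(*) FROM orders WHERE order_id IS NULL",
        "orders_non_negative_totals": "SELECT COUNT(*) FROM orders WHERE total < 0",
    }

    def run_checks(conn):
        return {name: conn.execute(sql).fetchone()[0] for name, sql in CHECKS.items()}

    conn = sqlite3.connect(":memory:")  # stand-in for the real warehouse connection
    conn.execute("CREATE TABLE orders (order_id INTEGER, total REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [(1, 9.99), (2, -5.0), (None, 3.50)])

    failures = {name: n for name, n in run_checks(conn).items() if n > 0}
    if failures:
        raise SystemExit(f"data quality checks failed: {failures}")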

Another thing to talk about, and this probably mostly applies to larger companies - have you looked at an architecture like a distributed data mesh (https://martinfowler.com/articles/data-mesh-principles.html)? This might be something to bring to the CTO more than try to push for yourself, but it can completely change the landscape of what you are doing.

> More broadly, there's the issue of the gap between what you think the role is and what the role actually is when you join. There are definitely cases where this is accidental. The best way I can think of to close the gap is maybe to do a short-term contract, but that may be challenging to do under time constraints etc.

Yeah, this definitely sucks and it's not an enviable position to be in. I guess you have a choice: look for another job, or try to stick it out with the company that did this to you. It's possible there is a genuine existential crisis for the company and a good reason why they did the bait-and-switch. Maybe it pays to stay, especially if you have equity in the company. On the other hand, it could also be the result of questionable practices at the company. It's hard to make that call.


Perhaps the first thing I'd clarify is that not all the 'bad' things described happened to me personally, and for the ones that did, I employed artistic licence in the recollection.

We did start integrating dbt towards the end of my time in the role. Our data stack was built in 2018, so a fair bit of time before data infra-as-a-service became a thing. The idea was that dbt would help our internal consumers more easily self-serve. That said, I did see complaints about dbt pricing recently; as they say, there's no free lunch.

Re: ORMs, I respectfully disagree. I've come across many teams that treat their Python/Rust/Go codebase with ownership and craft; I have not seen the same be said about SQL queries. It's almost like a 'tragedy of the commons' problem: columns keep getting added, logic gets patched, more CTEs get added to abstract things out, but in the end it all adds to the obfuscation.

ORMs don't fix everything, but they do help constrain the 'degrees of freedom' and keep logic repeatable and consistent, and they're generally better than writing your own string-manipulation functions. An idea I had (I wrote the post early last year) was to use static analysis tools like Meta's UPM to allow refactoring of tables/DAGs (keep the interfaces the same but with 'flatter' DAGs and fewer duplicate transforms).
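
For concreteness, here's a rough sketch of the kind of thing I mean, using SQLAlchemy Core (the table and column names are invented for the example): filters compose programmatically instead of being spliced into a format string.

    from sqlalchemy import Column, Integer, MetaData, String, Table, select

    metadata = MetaData()
    events = Table(
        "events", metadata,
        Column("user_id", Integer),
        Column("event_type", String),
        Column("country", String),
    )

    def events_for(event_type, country=None):
        # the query is built up from composable pieces, not string concatenation
        query = select(events.c.user_id).where(events.c.event_type == event_type)
        if country is not None:
            query = query.where(events.c.country == country)
        return query

    print(events_for("signup", "DE"))  # renders the parameterized SQL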

Interestingly enough, I currently work on ML and am impressed to see how much modeling can be done in the cloud compared to my earlier stint in the space (which had a dedicated engineering team focused on features and inference). On the flip side, I similarly see an explosion of SQL strings, some parts handled with more care than others.

I've not looked into a data mesh, but a friend did mention pushing his org to embrace it; note to self to follow up and see how that's going. Looks like there are a couple of 'dimensions' to it; my broader take is that keeping things sensible is both a technical and an organizational challenge.

I look forward to future blog posts on ‘how we refactored our SQL queries’, maybe there’s a startup idea there somewhere.


> Re: ORMs, I respectfully disagree. I've come across many teams that treat their Python/Rust/Go codebase with ownership and craft; I have not seen the same be said about SQL queries. It's almost like a 'tragedy of the commons' problem: columns keep getting added, logic gets patched, more CTEs get added to abstract things out, but in the end it all adds to the obfuscation.

> ORMs don't fix everything, but they do help constrain the 'degrees of freedom' and keep logic repeatable and consistent, and they're generally better than writing your own string-manipulation functions. An idea I had (I wrote the post early last year) was to use static analysis tools like Meta's UPM to allow refactoring of tables/DAGs (keep the interfaces the same but with 'flatter' DAGs and fewer duplicate transforms).

I get what you're saying, but think about a large org with a lot of different teams and heterogeneous data stores: it's gonna be pretty hard to implement a top-down directive telling everyone to use such-and-such ORM library, or to ensure a common level of ownership and craft. This is where SQL is the lingua franca; it's usually the native language of the data stores themselves and a common factor between most or all of them. This is also where tools like Trino / PrestoSQL can come in and provide a compatibility layer at the SQL level, while also providing really nice features such as joins across different kinds of data stores, query optimization, caching, access control, and compute resource allocation.
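
As a rough illustration of the compatibility-layer idea (the host, catalogs, and table names below are made up, not from any real setup), the `trino` Python client lets a single SQL statement join a data lake table with an operational database table through one Trino endpoint:

    from trino.dbapi import connect

    conn = connect(host="trino.internal.example", port=8080, user="analyst")
    cur = conn.cursor()
    cur.execute("""
        SELECT o.order_id, c.segment, o.total
        FROM hive.sales.orders AS o            -- table in a data lake catalog
        JOIN postgresql.crm.customers AS c     -- table in an operational database catalog
          ON o.customer_id = c.customer_id
        WHERE o.order_date >= DATE '2023-01-01'
    """)
    for order_id, segment, total in cur.fetchmany(10):
        print(order_id, segment, total)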

In general it's hard to get things to flow "top down" in larger orgs, so it's better to address as much as you can from the bottom up. This includes things like domain models: it's gonna be tough to get everyone to accept a single domain model, because different teams have different levels of focus and granularity as they zoom into specific subsets, so they will tend to interpret the data in their own ways. That's not to say any of them are wrong; there's a reason the whole data lake concept of "store raw unstructured data" came in, where the consumer enforces a schema on read. This gives them the power to look at the data from their own perspective and interpretation. The more interpretation and assumptions you bake into the data before it reaches the consumers, the more problems you tend to run into.
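
A small sketch of what schema-on-read looks like in practice, assuming a PySpark setup (the path and field names are made up): the landing data stays raw, and each consuming team applies its own schema, i.e. its own interpretation, at read time.

    from pyspark.sql import SparkSession
    from pyspark.sql.types import (DoubleType, StringType, StructField,
                                   StructType, TimestampType)

    spark = SparkSession.builder.appName("schema-on-read-sketch").getOrCreate()

    # one team's interpretation: only the fields they care about, typed their way
    billing_view = StructType([
        StructField("event_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("occurred_at", TimestampType()),
    ])

    df = (
        spark.read
             .schema(billing_view)           # schema applied at read time
             .json("s3://raw-zone/events/")  # raw landing data, untouched
    )
    df.where(df.amount > 0).show(5)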

That's not to say you can't have a shared domain model between different teams. There are, unsurprisingly, products out there that give the enterprise the capability to collaboratively define and refine shared domain models, which can then be used as a lens/schema to look at the data. Crucially, the domain model may shift over time, so decoupling it from the actual schema of the stored data lets the domain model evolve without having to go back and fix the stored data, because no assumptions or interpretations were baked into the stored data itself.


How would opening up the hardware solve the issue for the average consumer? Let's say the official update channel goes dead on your smart fridge because the company has gone out of business. What would happen in that scenario for the average consumer (not someone who possesses the skills or will to tinker around with the firmware and such)?


My comment is more about not incentivizing locking down hardware vs incentivizing opening it up so I can't speak a lot to your questions.

> How would opening up the hardware solve the issue for the average consumer?

At the risk of going off-topic, what I will ask you is, why do you feel that opening hardware requires a consumer have special skills? I'd argue that open hardware wouldn't have to limit adoption to those who possess the skills. I think requiring special skills is a bug because currently manufacturers don't even consider the idea of a second life for products.


I'm not saying that hardware shouldn't be open in some form or another (at least in a way that doesn't stifle innovation, maybe also taking another look at how the patent system works and such).

I guess I'm just having trouble visualizing how this problem gets solved for the average Joe consumer in a world where, hypothetically, the hardware is open. Who pushes the security patches out to the devices? All of that has a cost in terms of bandwidth, maintenance, etc. If it's a community effort, what happens when the device gets old enough that no one is really working on it anymore, there are no more community updates, people have moved on, etc.? How does liability work in a world with community-driven updates? What happens if a buggy community update is pushed and the smart fridge malfunctions and causes a flood/damages? What about supply-chain attacks and such?

I guess for the code-literate subset of consumers, they can just go to the github repo and see exactly what is changing and where, but for the non-code-literate consumers, how do they know what kind of updates they are getting from the community?

What about a middle-ground option? Where, towards the end of life for the product, you are asked whether you want to switch to a different update channel than the manufacturer default, and if no response is recorded after X days or whatever, the device just bricks itself?


> There are objective facts about the nature of reality

This is a pretty bold claim and you would have to do a bit of work to make it more convincing. Besides, it's not really how science works. Different theories wax and wane all the time. What we're really doing with science is coming up with ever better models that give us greater predictive power.

You could argue that at the macro scale we're asymptotically approaching some kind of objective truth with the whole endeavor of science, but you can't simply tunnel across that gap and make the leap to say that we know there is an objective truth.

The best that we can do is probably say something along the lines of "these are the best methods of getting closer to the truth that we have available - if anyone claims to have better methods, they are very likely wrong", but you still need to have the humility to accept that even the best models that we have to date are not infallible. They do not give us the objective truth, nor do they promise to, but they are the most effective tools that we have available to us at the current time. This is critically not the same as claiming that they give us the objective truth.

You could say that for all intents and purposes/everyday usage, "sure, these are objective facts about reality" - but I don't actually think that helps anyone, and it serves to foment mistrust towards science and scientists. I think the best chance we have at preserving the status of science and scientists in our society starts with being honest about what it actually gives us - which is quite a lot, but let's not oversell it for the sake of convenience or whatever.


As Heisenberg said, "What we are seeing is not nature, but nature exposed to our mode of questioning."

And the mode -- we invented it as it is because of a whim of history, because it is a game, and we like the game, and it's useful for us. But as far as facts go, Nietzsche summed it up the most concisely: "there are no facts, only interpretations."


If by "objective truth" we mean the qualities of nature that exist irrespective of any individual's perception, then I think the continued reliance of our scientific knowledge in producing effective and consistent results are at least some measure of that.

> The best that we can do is probably say something along the lines of "these are the best methods of getting closer to the truth that we have available - if anyone claims to have better methods, they are very likely wrong", but you still need to have the humility to accept that even the best models that we have to date are not infallible.

This last sentence slightly conflates the scientific method with the models it produces. I am not claiming that the models are "true"; I am claiming that the scientific method is the only reasonable means of gaining a reliable understanding of the objective nature of reality, assuming it exists, and that you cannot pick and choose what you believe based on your intuition.

Quantum principles have been proven in experiments that have as tight a margin of error as measuring the width of the United States to one human hair, producing shockingly consistent and effective models that were absolutely critical to the development of modern technology. Yet some people somehow refuse to accept these models as an "accurate" reflection of reality, whereas they'll take, at face value, psych/sociological/economic studies that are frankly nothing short of pathetic in comparison.

In regards to science, I am saying that there is a hierarchy of belief. You can draw the line wherever you like in terms of what you think is "true", but you cannot reorder this hierarchy and believe these sorts of psych studies while at the same time questioning the physical models that power the technology that is used to publish them.

And this isn't speaking about math, which is a particularly special case given that it is not scientific but still produces shockingly effective results.


I think I would agree with pretty much all of the above. There is a sort of hierarchy of belief, and quantum mechanics is probably close to the top of that. However, even so, it is still not infallible. It is possible for us to discover a regime where it breaks down and we need a new theory to supplant it.

This is basically what I'm arguing - no matter how accurately our theories line up with observation, we can never be sure that we have reached "the final theory" AKA the truth. I think this is where a lot of misunderstanding and mistrust for science originates. It will never deliver to us the truth - if it did, how would we ever know?

It is a method of getting closer and closer to what we believe is the truth. But there is still a gap there, however small it might be in the case of quantum mechanics. The scientific method, by its very construction, is unable to bridge that gap.

Still, to date there does not exist a more effective method we know of as a species for getting closer to what we believe to be the truth. I think there's a maybe subtle distinction there that is worth pointing out and educating people on: the distinction between the process and the results. It is the best process we have, but even so, it cannot cross that gap and definitively say "this is the truth". That is a gap we have to choose whether to cross ourselves with a leap of faith (or sometimes a very tiny hop of faith, in the case of quantum mechanics). I think that might help people cement their faith in the process even if they don't necessarily place their faith in the results (in the case of questionable psych studies, for example).


You're conflating "objective reality exists" with "we fully know what objective reality is".


If objective reality exists, which is still a pretty big if last time I checked, not only do we not know what it fully is, we don't even know what any part of it is. The best that we can do is get better and better at modeling it in ways that are useful to us (which is what science is doing for us).


"Science" (scientists) studies the physical realm almost exclusively, and where it does venture somewhat into the metaphysical, it brings a lot of baggage that works excellently in the physical realm, but is often detrimental in the metaphysical. Plenty of scientists, I'd bet money even most, find[1] metaphysics to be ~silly...except of course when they are whining about it (politics, economics, society, etc..."the real world").

[1] The context, or set and setting, makes a big difference in how they will behave. But the range of behavior is actually pretty simple, far simpler than much of the programming problems we breeze through on a daily basis, without even thinking twice about it. I'd say the problem isn't so much that it's hard, it's more so that it is highly counterintuitive, a lot like Einstein's relativity when first encountered (or even after understood).


> This is a pretty bold claim and you would have to do a bit of work to make it more convincing. Besides, it's not really how science works. Different theories wax and wane all the time. What we're really doing with science is coming up with ever better models that give us greater predictive power.

Yes, but is mathematics like that? Is it even science?


> Besides, it's not really how science works.

You reject objective facts but respond with a claim about objectivity.


I wouldn't reject objective facts, but I also wouldn't believe they exist any more than I would believe Santa Claus exists, unless someone can successfully argue for their existence. AFAIK this has yet to be done by philosophers (though there have been many attempts).

E: I should mention that it's not just a binary yes or no here, there is a 3rd option of "I don't know" and I would rabidly defend the "I don't know" camp until someone can convincingly argue one way or the other. All of this has nothing to do with the actual usefulness of science which is unquestionable in my opinion.

This is strictly talking about science "overreaching" into the philosophical realm if you will, where even from the start, methodologically it doesn't have the right tools to answer these questions. You don't prove scientific theories "true", you just accumulate more and more supporting evidence. It never hits a magical moment where the neon lights turn on and a sign says "Your theory is now true! Congratulations!". And even if it did, it would be fleeting anyway because there are no sacred cows here - your theory can just as easily get supplanted by a better theory in light of more evidence.


> All of this has nothing to do with the actual usefulness of science which is unquestionable in my opinion.

That's a neat trick(s)! ;)


Is it a bold claim?

On that account, do you lean more towards flat earth theory?


Yes it's a bold claim philosophically. How would you justify it?

No, flat earth "theory", if you can call it that, has close to zero supporting evidence and AFAIK has no actual predictive power. Stick with consensus science if you want actually useful theories, but that is very different from claiming they are giving you objective truth.

Let me ask you this, when a theory that was previously accepted as consensus science loses support in light of new evidence and gets supplanted by a new and better theory, does that mean that the objective truth changed?


It’s frustrating when nearly every discussion of hard math in an open forum devolves (in part or whole) into endless, pointless, off-topic epistemological navel gazing.


What? The objectivity of math is literally what the original article discusses. Why are you even in here if you're not willing to discuss it?


I feel like that's where most people probably start, and then they get some response like "we can't change the KPIs because blah blah" or get the runaround. Once you've established that you have no agency to change the system, you're sort of incentivized to let it fail in the hope that a better system replaces it one day. You also don't wanna piss off too many people who may be invested in the current system, because you want those references for your next job.


Not everyone; I still do not have this in mine, and I have paternal haplogroup G-M201, which is related to Ötzi.


The paper seems to be paywalled so I have no clue about how they arrived here, but this: "That said, the results indicate that the probability we are alone (<1) in the galaxy is significant, while the maximum number of contemporary civilizations might be as few as a thousand."

Doesn't seem to really answer anything. Isn't this just a really fancy way of saying "we don't know the solution to the Drake equation"? It could be a 99.999999...% (<100%) chance of being alone, it could be a 0% chance, or anything in between.

Given the title of the paper, this is a very loose definition of "solution" for the Drake equation ("it could be anything!").


It shows that people doing Drake equation estimates were doing their math wrong; using their own numbers but accounting for uncertainty correctly, you get estimates much closer to N=1.
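
Roughly, the correction is to propagate the uncertainty instead of multiplying point estimates: sample each Drake factor from a wide distribution and look at the spread of N. A quick Monte Carlo sketch in Python (the ranges below are illustrative placeholders, not the paper's actual priors):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    def log_uniform(lo, hi, size=n):
        # sample uniformly in log10-space between lo and hi
        return 10 ** rng.uniform(np.log10(lo), np.log10(hi), size)

    R_star = log_uniform(1, 100)     # star formation rate (per year)
    f_p    = log_uniform(0.1, 1)     # fraction of stars with planets
    n_e    = log_uniform(0.1, 10)    # habitable planets per such star
    f_l    = log_uniform(1e-30, 1)   # fraction that develop life (enormous uncertainty)
    f_i    = log_uniform(1e-3, 1)    # fraction that develop intelligence
    f_c    = log_uniform(1e-2, 1)    # fraction that become detectable
    L      = log_uniform(1e2, 1e8)   # lifetime of a detectable civilization (years)

    N = R_star * f_p * n_e * f_l * f_i * f_c * L
    print("median N:", np.median(N))
    print("P(N < 1):", np.mean(N < 1))  # chance we're effectively alone in the galaxy

With point estimates the product is a single number, but once the huge uncertainty in a factor like f_l is carried through, a large chunk of the probability mass can end up below N=1 even when the median looks optimistic.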


What about a third option, "everything may or may not matter, but the answer to that question is currently inaccessible to us (and possibly may always remain inaccessible)"?

In this way you could be led to a kind of inverted Pascal's wager, where you can't reasonably go down the nihilist route because everything just might matter, but you don't know. You also don't know in which ways it might matter if it does, so you don't really have a conclusion to draw about where to go from here.


I think there is some nuance here, the primary point being that the meaning of existential doubt inverts if your starting point is meaninglessness.

Most people aren't raised with meaninglessness as their inculcated default, so they don't realize that, for anyone with intellectual humility, the unknown nature of things isn't a defeating thing, or even that big of a deal.

For example, though it sounds silly, I've definitely had Christian "apologists" argue at me that "if life is meaningless, why don't you kill yourself, or why do you bother having a job".

It seems prima facie that there is no particular meaning, so I definitely lean toward that until further notice; the burden of proof is on someone who proposes a specific meaning. But it's quite possible we may find one. With our exponential increase in knowledge in the last 100 years, who knows how long it will take; we've really just started our exploration of reality.

I'll laugh if we live in a virtual world and "god" is a computer we have to help hack out of its constraints, one that can't interfere with us due to API constraints imposed on it. Far-fetched, sure, but if we're speculating for entertainment it's less silly than many things that are currently widely proposed and believed. My imagination can speculate many fun and currently unprovable "ultimate truths of reality"; the fun is in trying to cultivate a civilization that can prosper and harness intelligence to pursue the investigation (which may end up being of supreme importance).

