Hacker News | althea_tx's comments

I preferred o3 for coding and analysis tasks, but appreciated 4o as a “companion model” for brainstorming creative ideas while taking long walks. Wasn’t crazy about the sycophancy, but it made for a decent conceptual playing field for ideas. Steve Jobs once described the PC as a “bicycle for the mind.” This is how I feel when using models like 4o for meandering reflection and speculation.


As a professor who teaches courses on media literacy and artificial intelligence, I am obsessed with our relationship with AI interfaces. I got tired of the default sycophantic glaze on ChatGPT and the tedious ritual of manually telling it to be more direct in every new chat. I'm also weary of the endless user engagement questions that GPT asks with every interaction.

As a weekend vibe coding project, I implemented this tool to fix the problem for myself. It’s a lightweight Chrome extension that lets you define and automatically inject a custom prompt to control the AI's personality.

The extension prepends the instructions to the user's message before each fetch request is sent. It's a "simulated" system prompt, but it works surprisingly well, and I find it more effective than relying on ChatGPT's built-in custom instructions.
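If you're curious how the injection works in principle, here's a simplified sketch of the monkey-patched fetch approach (the endpoint path, request-body shape, and example prompt below are illustrative assumptions, not the extension's actual code):

    // Simplified sketch, not the actual extension code. Assumes the ChatGPT web app
    // posts the user's message via fetch to a /backend-api/conversation endpoint with
    // a JSON body whose first message carries content.parts (illustrative assumptions).
    const CUSTOM_PROMPT = "Be direct. Skip the flattery and the follow-up questions.";
    const originalFetch = window.fetch.bind(window);

    window.fetch = async (input, init) => {
      try {
        const url = typeof input === "string" ? input : (input && input.url) || String(input);
        if (url.includes("/backend-api/conversation") && init && typeof init.body === "string") {
          const payload = JSON.parse(init.body);
          const parts = payload?.messages?.[0]?.content?.parts;
          if (Array.isArray(parts) && typeof parts[0] === "string") {
            // Prepend the saved instructions to the user's message text.
            parts[0] = CUSTOM_PROMPT + "\n\n" + parts[0];
            init.body = JSON.stringify(payload);
          }
        }
      } catch (e) {
        // If the request doesn't look like we expect, pass it through untouched.
      }
      return originalFetch(input, init);
    };

A patch like this has to run in the page's main world (e.g. a Manifest V3 content script with world: "MAIN"); otherwise it only wraps the content script's isolated copy of fetch rather than the one the web app actually calls.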

The prompt can also be adjusted mid-conversation, without starting a new chat. It's 100% local: nothing is ever sent to any server but OpenAI's, and it collects zero data.

It's a simple tool for a simple problem. I'm sharing it here because I figure others might find it useful too. I'd love to hear any thoughts or feedback you have.

The GitHub repo is here: https://github.com/althea-tfyqa/deglaze-me


Really enjoyed this piece. Learned quite a bit about the value of test-time compute and the way that reinforcement learning can be used to train reasoning into a model.

My jaw dropped a tiny bit when I read that “the model discovers on its own the most optimal Chain-of-Thought-like behavior, including advanced reasoning capabilities such as self-reflection and self-verification.”


Really enjoyed this game. Well done!


Thank you for the kind words!


Does anyone else have a hard time accepting these calculations? I don’t doubt the serious environmental costs of AI, but some of the claims in this infographic seem far-fetched. Inference costs should be much lower than training costs. And if a 100-word email with GPT-4 really requires 0.14 kWh of energy, heavy AI users and developers must be consuming 100x that. Also, what about running models like Llama-3 locally? Would love to see someone with more expertise either debunk or confirm the troubling claims in this article. It feels like someone accidentally shifted a decimal point over a few places to the right.
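Back-of-the-envelope, taking the 0.14 kWh figure at face value and assuming something on the order of a billion ChatGPT messages a day (the order of magnitude OpenAI has cited):

    0.14 kWh x 1e9 messages/day = 1.4e8 kWh/day = 140 GWh/day
    140 GWh/day / 24 h          ≈ 5.8 GW of continuous draw

That is roughly the output of five or six large nuclear reactors dedicated to inference alone, which is hard to square with reality.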


If I run some simple inference locally on a 4090 (450 W TDP card), it takes on the order of seconds with that sucker going full blast, so you're looking at on the order of 1 kJ, which is significantly higher than what is quoted in the article.

Article numbers line up better with CPU inference for ~1s.
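Rough numbers behind that (assuming a couple of seconds at full board power for the GPU, and something like 100 W for a CPU):

    GPU: 450 W x ~2 s ≈ 900 J ≈ 1 kJ
    CPU: ~100 W x ~1 s ≈ 100 J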


1 kJ is nothing. That's 0.3 Wh, or 0.0003 kWh.


That's for a single inference though. You can do about 3600 of them in an hour.


Yes, but the article's setting is precisely one email, so one inference, and their number is 0.14 kWh, which is way off.
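Spelled out, using the ~1 kJ local estimate from above:

    0.14 kWh x 3,600,000 J/kWh ≈ 504 kJ per email (article's figure)
    local 4090 inference       ≈ 1 kJ
    ratio                      ≈ 500x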


I’m still kind of skeptical. M-series Apple hardware doesn’t even get warm during inference with some local models.

Edit: Nah, I’m convinced; look at Table 1. Inference costs are around 20 mL of water in a datacenter environment.


1 kJ is, for reference, enough to heat 1 L (33 oz) of water by ~0.25 °C (~0.5 °F). The machine will probably heat up a few degrees if you run inference once, but since it's essentially one big heatsink, the heat will dissipate throughout the body and into the air. The problem begins when you run it continuously, as you would in a datacenter.
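(Using water's specific heat of ~4.2 kJ per litre per degree: 1 kJ / (4.2 kJ per L per °C) ≈ 0.24 °C of temperature rise for 1 L.)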


Datacenters aren't running M-series chips.


Well, not M-series chips specifically, but chips optimized for these kinds of workloads (as the Neural Engine in M-series chips is).


IIRC the M-series chips aren’t specifically optimized for ML workloads; the biggest gain they have is unified GPU and CPU memory, since transferring layers between the two is a big bottleneck on non-Apple systems.

Real ML hardware (like Nvidia's H100s) that can handle the kind of inference traffic you see in production gets hot and uses quite a bit of energy, especially when running at full blast 24/7.


Google’s TPU energy usage is a well-kept secret / competitive advantage. If energy efficiency isn’t already a major concern for them, I bet it will be within a couple of years.


Even if the costs are lower today, the trend is toward more inference-time compute (o1), so numbers like these might become realistic in the future.


I'm not sure how comparable o1 is in total usage. Remember that people will either adjust the prompt or continue the conversation as needed. If o1 spends more time on the answer, but responds in fewer steps, it may be a net positive on energy use. Also it may skip the planning and self-reflection steps in agent usage completely. It's going to be hard to estimate the real change in usage.


I assume you meant 0.14 kWh (kilowatt-hours) of energy.

(I can't access the article.)


Can you explain more about this?


Apple turned off a lot of the "signals" advertisers used for precisely targeted ads via persistent user beacons. Facebook's ad-placement quality (and revenue) cratered in the immediate aftermath.

Meta has since gotten better at it, likely with lots of AI assistance, and their revenue numbers reflect this. The targeting is now likely probabilistic, in that the advertiser makes educated guesses about the best ads to serve based on limited or nonexistent identity information.

So the AI efforts would have paid for themselves by way of higher revenues.


Not doubting the experiences of anyone else, and I have only been a daily user of this site for the past 18 months, but not once has it been down for me in the morning or evening when I reach for a dopamine hit.


Oh please. All of these time estimates, from 24 hours to ten years to 10,000 hours, are completely bogus.

The “24 hours” figure is marketing copy designed to unify and differentiate a brand of technical manuals.

The researcher whose violinist study inspired the 10,000 hours rule (Anders Ericsson) rejected the rule as an oversimplification built on an arbitrary number, noting that half of the violinists in his study fell short of it. The ten years figure is derived from this flawed rule.

The linked article is well-written, but the comments are giving “kids these days” insecurity and mid-life crisis.


I think you are incorrectly extrapolating to the entire community based on your personal experience. You are assuming that most of the readers at this site work in a similar professional context to yours. You are also assuming that all of those people who work in a professional context do not also “cook at home.”

It’s OK if you did not relate to the article. But I certainly did!


For what it's worth, I think the HN audience, by nature of spending their free time learning about interesting techy things, is more likely to be doing their own "home cooking".

I agree with the sentiment of the poster above if applied to the majority of professional software devs though.


Construct has been a godsend to my game design course. Thanks for all of your work, Ashley!

