I also did this a few months ago using a custom MCP server I built for the Alpaca API, the yfinance MCP server, a Reddit MCP server, and the "sequential thinking" MCP server. I had Claude write a prompt that combined them all: start by checking r/pennystocks for any news, look up the individual ticker symbols with Alpaca and yfinance, check the account balance, and make a trade only if a very particular set of criteria was met. I used Claude Code instead of Desktop so that I could run it as a cron job, and it all works! I mostly built it to see if I could, not for any financial gain. I had it paper trading for a few months and it made a 2% profit on $100k. I really think someone who knows more about trading could do quite well with a setup like this, but it's not for me.
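Roughly, the wiring looks like this (a minimal sketch, not the exact setup; the server names, paths, and schedule are placeholders):

    # one-time: register the MCP servers with Claude Code (commands are placeholders)
    claude mcp add alpaca -- node /path/to/alpaca-mcp/index.js
    claude mcp add yfinance -- uvx yfinance-mcp

    # crontab: run the combined prompt non-interactively every weekday morning
    45 9 * * 1-5 cd ~/trader && claude -p "$(cat trade-prompt.md)" >> trade.log 2>&1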
they said their paper account did 2% over a few months, which is not beating the S&P 500, and is probably why they said "someone could make money off this, but not me"
I'm curious because 2% over the last few months while the S&P 500 is tanking might be interesting, but doing worse than the S&P 500 over the same period is less so.
It's good at taking the code of large, complex libraries and finding the best way to glue them together. Also, I gave it the code of several open source MudBlazor components and got great examples of how they should be used together to build what I want. Sure, Grok 3 and Sonnet 3.7 can do that, but the GPT 4.5 answer was slightly better.
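To give a sense of what "used together" means, the answers composed components along these lines (a minimal hand-written sketch, not actual model output; the component names are from MudBlazor's public API, the content is made up):

    <MudCard>
        <MudCardContent>
            <MudText Typo="Typo.h6">Inventory</MudText>
            <MudText Typo="Typo.body2">12 items low on stock</MudText>
        </MudCardContent>
        <MudCardActions>
            <MudButton Variant="Variant.Filled" Color="Color.Primary">Reorder</MudButton>
        </MudCardActions>
    </MudCard>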
there's a $1 widget and a slightly better $10 widget.
if you're only buying 1 widget, you're correct that the price difference doesn't matter a whole lot.
but if you're buying 10 widgets, the total cost of $10 vs $100 starts to matter a bit more.
say you run a factory that makes and sells whatchamacallits, and each whatchamacallit contains 3 widgets as sub-components. that line item on your bill of materials can either be $3, or $30. that's not an insignificant difference at all.
for one-off personal usage, as a toy or a hobby - "slightly better for 10x the price" isn't a huge deal, as you say. for business usage it's a complete non-starter.
if there was a cloud provider that was slightly better than AWS, for 10x the price, would you use it? would you build a company on top of it?
It's not really the beginning (1.0) of anything - more like the end given that OpenAI have said this'll be the last of their non-reasoning models - basically the last scale-up pre-training experiment.
As far as the version number goes, OpenAI's "Chief Research Officer" Mark Chen said, on Alex Kantrowitz's YouTube channel, that it "felt" like a 4.5 in terms of the level of improvement over 4.0.
That's a lot of other stuff, and you express disagreement.
I'm sure we both agree it's the first model at this scale, hence the price.
> It's not really the beginning (1.0) of anything
It is an LLM w/o reasoning training.
Thus, the public decision to make 5.0 = 4.5 + reasoning.
> "more like the end...the last scale-up pre-training experiment."
It won't be the last scaled-up pre-training model.
I assume you mean what I expect, and what you go on to articulate: it'll be the last scaled-up-pre-training-without-reasoning-training model released publicly.
As we observe, the value to benchmarks of (in your parlance) scaled-down pre-training with reasoning training is roughly the same as that of scaled-up pre-training without reasoning training.
At some point, I have to say to myself: "I do know things."
I'm not even sure what the alternative theory would be: no one stepped up to dispute OpenAI's claim that it is, and xAI is always eager to slap OpenAI around.
Let's say Grok is also a pretraining scale experiment. And they're scared to announce they're mogging OpenAI on inference cost because (some assertion X, which we give ourselves the charity of not having to state to make an argument).
What's your theory?
Steelmanning my guess: The price is high because OpenAI thinks they can drive people to Model A, 50x the cost of Model B.
Hmm...while publicly proclaiming it's not worth it, even providing benchmarks showing Model B gets the same scores 50x cheaper?
OpenAI have apparently said that GPT 4.5 has a knowledge cutoff date of October 2023, and their System Card for it says "GPT 4.5 is NOT a frontier model" (my emphasis).
It seems this may be an older model that they chose not to release at the time, and are only doing so now due to feeling pressure to release something after recent releases by DeepSeek, Grok, Google and Anthropic. Perhaps they did some post-training to "polish the turd" and give it the better personality that seems to be one of its few improvements.
Hard to say why it's so expensive - because it's big and expensive to serve, or for some marketing/PR reason. Many sources seem to be confirming that the benefits of scaling up pre-training (more data, bigger model) are falling off, so maybe this is what you get when you scale up GPT 4.0 by a factor of 10x: bigger, more expensive, and not significantly better. The cost to serve could also be high because, never having intended to release it, they never put in the effort to optimize it.
See, you get it: if we want to know nothing, we can know nothing.
For all we know, Beelzebub Herself is holding Sam Altman's consciousness captive at the behest of Nadella. The deal is Sam has to go "innie" and jack up OpenAI costs 100x over the next year so it can go under and Microsoft can get it all for free.
Have you seen anything to disprove that? Or even casting doubt on it?
Version numbers for LLMs don't mean anything consistent. They don't even publicly announce at this point which models are built from new base models and which aren't. I'm pretty sure Claude 3.5 was a new set of base models since Claude 3.
What do you mean by "it's a 1.0" and "3rd iteration"? I'm having trouble parsing those in context.
If Claude 3.5 was a base model*, 3.7 is a third iteration** of that model.
GPT-4.5 is a 1.0, or, the first iteration of that model.
* My thought process when writing: "When evaluating this, I should assume the least charitable position for GPT-4.5 having headroom. I should assume Claude 3.5 was a completely new model scale, and it was the same scale as GPT-4.5." (this is rather unlikely, can explain why I think that if you're interested)
** 3.5 is an iteration, 3.6 is an iteration, 3.7 is an iteration.
Ah, Operator. This synth is so deep. Not only is it a fantastic FM synth, it does subtractive synthesis well too. It's also impressive how the UI manages to fit all those parameters. I mostly use it for cool synth leads. Here's one of my favorite videos on Operator, by Robert Henke himself: https://youtu.be/rfeY0_k1ctk?si=s68Lr033cHf34a4M
Yeah, I can confirm that writing Windows GUI apps is not at all painful for me. I still use Windows Forms on .NET Framework 4.8 and my executables are under 1 MB. Visual Studio's form designer is very easy to use, and you can subclass all the .NET UI controls and customize them however you want. There's always been accessibility support, and even high-DPI support.
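Subclassing is about as simple as it sounds (a minimal sketch; the double-buffering tweak is just one example of a customization you can only reach from a subclass):

    // A ListView subclass that enables double buffering to eliminate flicker.
    public class SmoothListView : System.Windows.Forms.ListView
    {
        public SmoothListView()
        {
            // DoubleBuffered is a protected member, so subclassing is how you get at it.
            DoubleBuffered = true;
        }
    }

Build the project and it shows up in the designer's toolbox like any other control.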
.NET 4.8 is the last .NET Framework to be bundled with Windows. It's a legacy stack, but it exists on every Windows >= 10, so it's a legacy stack that makes deployment easy (you can just assume it's installed). (.NET 4.8 is the new VB6.)
With .NET 9 right around the corner, the legacy stack only falls further behind.
.NET >= 5 will never be installed out of the box on Windows PCs. The trade-offs for that concession, however, are cross-platform support, better container support, and easier side-by-side installs ("portable" installs). .NET >= 7 can do an admirable job of AOT-compiling single-file applications. For a GUI app you probably aren't going to easily get that single file under 40 MB yet today, but it will be truly self-contained and generally won't need specific OS versions or anything installed at the OS level. Each recent version of .NET has been improving single-file publishing, and there may be more advances to come.
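If you want to try that route, the basic shape is this (a sketch; PublishAot needs .NET 7+, and output size depends heavily on what the app pulls in):

    <!-- .csproj: opt in to Native AOT -->
    <PropertyGroup>
      <PublishAot>true</PublishAot>
      <InvariantGlobalization>true</InvariantGlobalization>
    </PropertyGroup>

    # publish a self-contained native binary for one runtime
    dotnet publish -c Release -r win-x64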
A nice thing about .NET Framework 4.8 is that they finally finished it! No more update treadmill, no more dicking around with which versions are installed or how to configure your application to use them. Just target it and forget about it.
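"Target it and forget about it" is literally one config entry (standard app.config syntax):

    <!-- app.config: pin the app to the 4.8 runtime -->
    <startup>
      <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.8" />
    </startup>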
I've tried all the popular yerba mate brands - smoked, flavored, Uruguayan, Argentinian - but I still prefer organic unsmoked yerba mate with stems. I brew 1/2 cup of mate with 2 cups of 150°F water and a splash of lemon juice for 30 minutes, then pour the whole thing through a Chemex coffee filter. It takes a few minutes to filter, but the result is a delicious, very caffeinated, slightly lemony tea.
I own this TV, and I had it connected to the internet for a short while. Not only does it have ads everywhere, but the menu is SLOW. Taking it off the internet not only got rid of the ads, but also made menu navigation faster.
There have been a few times when I presented a VB6 prototype along with a price and had my prototype purchased on the spot. Sometimes when a client says "We need this software now", they really mean it.
I don't know if this counts, but I built so much software in Access '97, mostly for small businesses and individuals. I could build a whole inventory management system in a weekend (a simple one, anyway). It was phenomenal. Once I learned Java and SQL (how to correctly use SQL, lol) I quit using it as much. But sometimes I still prototype software in old versions of Access just to model everything out.