I don't have an iPhone to try this, but I've been a long time time user of Tasks...

sburud · 2025-11-04T17:04:45 1762275885

That’s cool! Slight fear of replicating the Dropbox comment here, but all you really need to do is run whisper (or some other speech2text), then once the user stops talking jam the transcript through a LLM to force it into JSON or some other sensible structure.

raybb · 2025-11-04T18:02:59 1762279379

"once the user stops talking" is a key insight here for me. When using this I wasn't intentionally pausing to let it figure out an answer. It seemed to just pop up while I was talking. But upon experimenting some more it does seem to wait until here's a bit of a pause most of the time.

However it's still wild to me how fast and responsive it is. I can talk for 10 seconds and then in ~500ms I see the updates. Perhaps it doesn't even transcribe and rather feeds the audio to a multimodal llm along with whatever tasks it already knows about? Or maybe it's transcribing live as you talk and when you stop it sends it to the llm.

Anyone have a sense of what model they might be using?

makingstuffs · 2025-11-04T18:06:27 1762279587

I cannot remember off the top of my head the exact number and am clearly too lazy to google it but there is a specific length of time in which, if no new noises pass through, the human brain processes it as a pause/silence.

I want to say 300ms which would coincide with your 500ms example

wisemang · 2025-11-04T18:52:00 1762282320

This is definitely dependent on individuals. It’s a reason during some conversations people can never seem to get a word in edgewise, even if the person speaking may think they’re providing opportunities do so. A mismatch in “pause length” can make for frustrating communications.

I am also too lazy to google or AI it but it’s something I remember from when I taught ESL long ago.

makingstuffs · 2025-11-05T07:44:35 1762328675

That makes sense! To be honest I’m referring to my audio engineering degree and the pause was specific to noticing silence in audio so I’d 100% agree that in conversation it can vary between people as I know some many people who will not let you get a word in

SteveMorin · 2025-11-04T18:14:32 1762280072

https://boundaryml.com/

LLM to types and done

Cassandra99 · 2025-11-05T06:22:11 1762323731

For Android and Windows users, you can try my todo app “Hamsterbase Tasks”.

It's open-source and supports self-hosted. Available on web, Mac, Windows, Linux, iOS, and Android.

raybb · 2025-11-05T06:38:27 1762324707

Do you support calDAV? I looked briefly but didn't see it. A good calDAV client is something I'd pay for.

nashashmi · 2025-11-05T20:48:25 1762375705

standard AI meeting room note takers have been able to extract todo items for a while now.

raybb · 2025-11-05T21:24:05 1762377845

Is one well integrated with a project management tool and actually creates and updates the todos?

nashashmi · 2025-11-05T21:52:27 1762379547

All of the ones I come across only extracted the points for action items. I didn’t notice any of them submitting to task managers.

You asked “how they are able to do this” and I said it has been a standard feature for a while now in meeting rooms. The additional features in todoist fills in the right data in the right columns, which is notable, but things like have been done with 30boxes where natural language is used to create events.