
Yes, you are perfectly right. Our product pushes users to be selective about the tables they expose to a given agent for a given use-case :+1:

The tricky part is correctly supporting multiple systems, each with its own specificities, all the way to Salesforce, which is an entirely different beast in terms of query language. We're working on it right now and will likely follow up with a blog post there :+1:


Salesforce architect here (from a partner firm, not the mothership directly)--Salesforce's query language, SOQL, is definitely a different beast, as you say. I'd like to learn more about the issues you're having with the integration, specifically the permissions enforcement. I may be misunderstanding what you meant in the blog post, but if you're passing a SOQL query through the REST API, then the results will be scoped by default to the permissions of the user that went through the OAuth flow. My email is in my profile if you're open to connecting.
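To make the permissions point concrete, here's a rough sketch of the mechanism I mean, i.e. running SOQL through the REST API with a user's OAuth token. The instance URL, token, and query below are placeholders, not anything from Dust's integration:

```
# Minimal sketch: a SOQL query issued through Salesforce's REST API.
# The instance URL and access token are placeholders; the token would
# come from the OAuth flow for a specific user.
import requests

INSTANCE_URL = "https://example.my.salesforce.com"  # hypothetical org
ACCESS_TOKEN = "00D...XYZ"                          # hypothetical user token

def run_soql(query: str) -> dict:
    # Results are scoped to the permissions of the user behind the token:
    # objects, fields, and rows that user cannot see are filtered out.
    resp = requests.get(
        f"{INSTANCE_URL}/services/data/v59.0/query",
        headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
        params={"q": query},
    )
    resp.raise_for_status()
    return resp.json()

records = run_soql("SELECT Id, Name FROM Account LIMIT 10")["records"]
```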


Hi, you are right that things can go sideways fast. In practice, though, the data the typical employee needs is also quite simple. So there is definitely a very nice fit for this kind of product across a large number of use-cases; we see it provide a lot of value internally, both for employees (self-serve access to data) and for data scientists (reduced ad-hoc request load).

For complex queries/use-cases, we generally push our users instead to create agents that assist them in shaping SQL directly, rather than going straight from text to results/graphs. That pushes them to think more about correctness while still saving them tons of time (the agent has access to the table schemas, etc.), but it's not a good fit for non-technical people, of course.
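To make the pattern concrete, here's a rough sketch of what such an agent boils down to. This is not our actual implementation; it assumes the OpenAI Python client, and the model name and schemas are placeholders:

```
# Rough sketch (not Dust's actual code): an agent that drafts SQL for the
# user to review, with the table schemas injected into the prompt.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TABLE_SCHEMAS = """
employees(id INTEGER, name TEXT, department TEXT)
employee_to_location(employee_id INTEGER, location_id INTEGER)
locations(id INTEGER, city TEXT)
"""

def draft_sql(question: str) -> str:
    # The agent proposes SQL plus caveats; the user reviews and runs it
    # themselves, which keeps them thinking about correctness.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "You draft SQL queries against these tables:\n"
                        f"{TABLE_SCHEMAS}\n"
                        "Return a query for the user to review, with caveats."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```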


This works well until it doesn't, i.e. as long as someone is responsible for the data's correctness. E.g. the cardinality between two joined tables needs to be actually maintained, not just coincidentally true: if there's currently no one with two rows in the employee_to_location table, the query works right now, but the moment someone does have two, the query returns the wrong employee count.
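Here's that failure mode in miniature, using sqlite3 and hypothetical data for the table names above:

```
# The failure mode in miniature (sqlite3, hypothetical data).
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE employees (id INTEGER, name TEXT);
CREATE TABLE employee_to_location (employee_id INTEGER, location_id INTEGER);
INSERT INTO employees VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO employee_to_location VALUES (1, 10), (2, 20);
""")

count_sql = """
SELECT COUNT(*) FROM employees e
JOIN employee_to_location el ON el.employee_id = e.id
"""
print(db.execute(count_sql).fetchone()[0])  # 2 -- looks right today

# Nothing enforces one location per employee. Add a second location for
# employee 1 and the same query silently overcounts.
db.execute("INSERT INTO employee_to_location VALUES (1, 30)")
print(db.execute(count_sql).fetchone()[0])  # 3 -- wrong employee count

# COUNT(DISTINCT e.id) would survive the cardinality change.
```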


Hi, this is a fair concern. We're super early and are working on a proper privacy policy as we speak. But we also gave some color on how we handle your data on our Discord. Copying it here:

```
The privacy section on the landing README remains true. We just send your requests to OpenAI and store them for debugging purposes, but we don't fetch or store anything other than what is required to process your requests. XP1 being open source, you can also look at the code if needed, but happy to answer any questions.

In short:
- Requests (including the text dump of tabs you select) go to the Dust main platform
- They are processed as part of a Dust app whose Run object is stored
- The LLM query is sent to OpenAI (retention policy 30 days, not used for training)
- The response is stored as part of Dust's Run object
- The response is streamed back to the client
```


>store them for debugging purposes

This project looks great, but it's going to be a no from me until your formal policy clarifies this point. Do those requests still "include the text dump of tabs you select" when stored? It's not that I don't trust you folks; it's that I can't trust the entire wider world not to eventually break into or subpoena your debugging repository.

Further up in the thread, people asked your extension about privacy concerns, and at least one assumed that the remark about storing requests for debugging, included in the response, must have been an AI hallucination.


"store them for debugging purposes" is a bit concerning if they then become available if law enforcement requests data, or if you guys are hacked and everything leaks.


Completely agreed with that. We'll revisit if needed but we don't expect it to be too much of an issue at this stage.

Our goal is to gather marginally more usage so we can learn more about productivity use-cases for LLMs.

Here's the email we sent to our users as we removed the paywall: https://twitter.com/dust4ai/status/1633484243228585988


Completely agreed! I believe I’ll open source the app in the coming days.

This is, before anything else, an exploration at this stage.


I see you already open-sourced dust. I love this one too.

So is it dust as in "des grains de poussière qu'on assemble" ("grains of dust that you assemble")? ;-)

Anyway, you chose the most direct dataflow-programming approach. It is neat and flexible.

However, one of the amazing things about ChatGPT is its ability to remember and respond to incremental "conversational" prompts in a smooth way.

How to integrate this kind of manual, feedback-like "flow" within the other, more rigid automatic "dataflow" of dust is something to think about, I guess.

It is not just "time for AI-native products!" [1]; it is now time for "AI-prompt-native products", and yours are more like platforms than products. This is what I like about them.

A lot of the coming pseudo-apps are really going to be just simple prompts in disguise [2]. So open platforms are the way to go.

[1] https://aigrant.org/

[2] Example from HN today: https://news.ycombinator.com/item?id=33975805


Dust XP1 runs on a Dust app. You just need a "history" input parameter that gets injected at the right place in the prompt. Does that make sense?
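Roughly like this; the template and field names here are illustrative, not the actual XP1 prompt:

```
# Illustrative only -- not the actual XP1 prompt. Conversational state is
# just another input ("history") injected into the template each turn.
PROMPT_TEMPLATE = """You are an assistant embedded in the browser.

Conversation so far:
{history}

Current page content:
{page_text}

User request: {query}
Answer:"""

def build_prompt(history, page_text, query):
    # history: list of (speaker, message) tuples accumulated by the client
    rendered = "\n".join(f"{who}: {msg}" for who, msg in history)
    return PROMPT_TEMPLATE.format(history=rendered, page_text=page_text,
                                  query=query)

# Each turn: send the prompt, then append (user, query) and
# (assistant, answer) to history before the next call.
```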


It does.


The real value will accrue as the assistant becomes more personalised with use.


Some additional context here if interested: https://twitter.com/spolu/status/1602692992191676416


Copilot actually knows a bit of Lean and can be helpful when formalizing things. But it does not get the critical feedback we get from the formal system (as it's not designed to interact with formal systems).

It would be interesting to evaluate Copilot on the same benchmark, as I'm pretty sure it can still close some of the proofs.


This is definitely something that we'll be looking into more closely in the future. Our old and busted `gptf` tactic was a good start but we can do much better!


I may be misunderstanding, but it seems like gptf can only apply theorems from the set you guys trained it on... in which case I don't see how it could possibly help with the development of a theory beyond its first few statements. Have I got that right? And if so, is it something that might be addressed in future iterations?


How long (or how much compute) does it take to solve one of these IMO problems right now? My guess is that just the time to run tactics like linarith across many different cases would be significant enough to make it too slow to use interactively, but I'm curious to check that guess.
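For context, linarith closes goals that follow linearly from the hypotheses; something like this (Lean 4 / Mathlib syntax, just an illustration):

```
import Mathlib.Tactic

-- linarith closes goals that are linear-arithmetic consequences of the
-- hypotheses; searching for the right combination is what costs CPU.
example (x y : ℝ) (h₁ : x < y) (h₂ : y < 3) : x < 3 := by
  linarith
```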


It takes a lot of CPU (Lean side) on top of GPU (neural side), indeed. But technically, when properly parallelized, it takes no more than 1-2 hours to crack the hardest problem we've cracked so far.


Sorry, late reply. Indeed, settle.network expired and is now owned by another party. Sorry for the ensuing confusion.


Completely agreed.

