Also related: https://news.ycombinator.com/item?id=36049449
Kinda crazy how they didn't double-check GPT's responses in the video before marketing it. Reminded me of the Bing and Bard demo mistakes earlier this year.
The race between Delta Lake and Apache Iceberg is a throwback to the VHS/Betamax days.
Microsoft is betting on the former, while Snowflake, for example, now lets you back tables with the latter, making it possible to integrate your data directly with other systems.
I talked to a Microsoft rep who said that they would support Iceberg if there was a Rust-based client for it.
That's true - none of the three data lake formats has good C++ or Rust libraries. We have to support all of them (Iceberg, Hudi, and Delta Lake) in ClickHouse, but can afford to do so only for reading. The overall data lake infrastructure is fragmented, the language support is sketchy, and the details often change in incompatible ways.
Note that S3 does not support an atomic "create object or return an error if it exists" operation, which makes it questionable how the formats can work correctly with concurrent writes.
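To make the primitive concrete, here is a minimal sketch of "create or fail if exists" on a local filesystem, where O_CREAT | O_EXCL provides it. This is only an analogy and not how any of the three formats is actually implemented; commit_version and the log file naming are made up for illustration.

```python
import os
import tempfile


def commit_version(log_dir: str, version: int, payload: bytes) -> bool:
    """Try to claim log entry `version`; return False if it already exists.

    Hypothetical helper: the file naming here is illustrative only.
    """
    path = os.path.join(log_dir, f"{version:020d}.json")
    try:
        # O_CREAT | O_EXCL: create the file, or fail if it already exists,
        # as a single atomic step. This is the operation in question.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:
        # Another writer claimed this version first.
        return False
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    return True


if __name__ == "__main__":
    log_dir = tempfile.mkdtemp()
    # Two writers race for the same version; only the first one wins.
    print(commit_version(log_dir, 1, b'{"writer": "A"}'))
    print(commit_version(log_dir, 1, b'{"writer": "B"}'))
```

Without such a primitive, two writers can both do "check that the entry is missing, then write it", and the later write silently overwrites the earlier one. The losing writer here instead gets a clear signal and can re-read the log and retry with the next version.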