Editorial Summary:
Apache Iceberg is one of the three table formats currently available for organizing and tracking data files in data lakes; the other two are Delta Lake and Apache Hudi. Iceberg supports ACID transactions, which means you can run data warehouse-level operations such as INSERT, DELETE, and UPDATE directly on your data lake storage (Amazon S3, Microsoft ADLS, etc.). The article looks at some of the significant features that Iceberg provides out-of-the-box, many of which set it apart from Delta Lake and Apache Hudi. Apache Iceberg is a quintessential table format for data lake tables. Data compaction is supported out-of-the-box, and you can choose between rewrite strategies such as ‘bin-packing’ and ‘sorting’ to optimize file layout and size. A snapshot is the state of a table at a given point in time. Iceberg keeps a log of the table's previous snapshots, which allows ‘time travel’ queries against an earlier state of the table. The source article explains how to run such queries.
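To make the snapshot-log idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Iceberg's API: the `Snapshot` and `SnapshotLog` classes, their field names, and the file names are all hypothetical and chosen only for illustration. It shows how keeping an ordered log of snapshots lets a reader ask for the table state "as of" an earlier time.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical fields for illustration; not Iceberg's metadata layout.
    snapshot_id: int
    timestamp_ms: int
    data_files: tuple  # files that make up the table in this snapshot


class SnapshotLog:
    """Toy append-only log of table snapshots, ordered by commit time."""

    def __init__(self):
        self._snapshots = []

    def commit(self, timestamp_ms, data_files):
        # Each commit records a complete new table state and never edits old ones.
        if self._snapshots and timestamp_ms <= self._snapshots[-1].timestamp_ms:
            raise ValueError("commits must have increasing timestamps")
        snap = Snapshot(len(self._snapshots) + 1, timestamp_ms, tuple(data_files))
        self._snapshots.append(snap)
        return snap

    def as_of(self, timestamp_ms):
        """Return the latest snapshot committed at or before timestamp_ms."""
        times = [s.timestamp_ms for s in self._snapshots]
        idx = bisect_right(times, timestamp_ms)
        if idx == 0:
            raise LookupError("no snapshot exists at or before that time")
        return self._snapshots[idx - 1]


log = SnapshotLog()
log.commit(1000, ["a.parquet"])
log.commit(2000, ["a.parquet", "b.parquet"])
log.commit(3000, ["c.parquet"])  # e.g. a compaction rewrote a and b into c

# "Time travel": read the table as it looked at t=2500.
print(log.as_of(2500).data_files)  # ('a.parquet', 'b.parquet')
```

Because old snapshots are never modified, a query pinned to an earlier timestamp keeps seeing the files that were current then, even after a later compaction has replaced them.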
Key Highlights:
- Apache Iceberg is one of the three table formats available for organizing and tracking data files in data lakes.
- Apache Iceberg is a quintessential table format for data lake tables.
- It is imperative for a data lake table format like Apache Iceberg to offer similar abilities.
This editorial is based on content sourced from medium.com.