Editorial Summary:

Apache Iceberg is one of the three table formats currently available for organizing and tracking data files in data lakes, alongside Delta Lake and Apache Hudi. Iceberg supports ACID transactions, which means you can perform data warehouse-level operations such as INSERT, DELETE, and UPDATE directly on your data lake storage (Amazon S3, Microsoft ADLS, etc.). The article looks at some of the significant features that Iceberg provides out of the box, many of which set it apart from the other two formats and make it a quintessential table format for data lake tables.

Data compaction is supported out of the box, and you can choose between rewrite strategies such as bin-packing and sorting to optimize file layout and size. A snapshot is the state of a table at a given point in time. Iceberg keeps a log of the table's previous snapshots, which allows for time travel queries. The full article explains how to achieve time travel.
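The snapshot log behind time travel can be illustrated with a small toy model. This is only a sketch in plain Python and does not use any Iceberg API: the names `Snapshot`, `ToyTable`, `commit`, and `as_of` are hypothetical, the file names are made up, and commits are assumed to arrive in increasing timestamp order. Each commit records the full list of data files that make up the table, and a time travel read picks the latest snapshot at or before the requested time.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a table snapshot: the table state at one point in time."""
    snapshot_id: int
    timestamp_ms: int
    data_files: Tuple[str, ...]


@dataclass
class ToyTable:
    """Toy table that keeps a log of all previous snapshots (not the Iceberg API)."""
    snapshots: List[Snapshot] = field(default_factory=list)

    def commit(self, timestamp_ms: int,
               added: Sequence[str] = (),
               removed: Sequence[str] = ()) -> Snapshot:
        # Start from the current file list, drop removed files, append new ones.
        current = self.snapshots[-1].data_files if self.snapshots else ()
        dropped = set(removed)
        files = tuple(f for f in current if f not in dropped) + tuple(added)
        snap = Snapshot(len(self.snapshots) + 1, timestamp_ms, files)
        self.snapshots.append(snap)  # older snapshots stay in the log
        return snap

    def as_of(self, timestamp_ms: int) -> Optional[Snapshot]:
        # Time travel: latest snapshot committed at or before the given time.
        # Assumes the log is ordered by timestamp.
        candidates = [s for s in self.snapshots if s.timestamp_ms <= timestamp_ms]
        return candidates[-1] if candidates else None


table = ToyTable()
table.commit(1000, added=["a.parquet"])
table.commit(2000, added=["b.parquet"])
# A compaction-style commit: two small files are replaced by one larger file.
table.commit(3000, added=["ab-compacted.parquet"],
             removed=["a.parquet", "b.parquet"])

before_compaction = table.as_of(2500)
latest = table.as_of(3000)
```

Because the compaction commit only appends a new snapshot, a read as of time 2500 still sees the two original files, while the latest snapshot sees only the compacted one.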

Key Highlights:

  • Apache Iceberg is one of the three table formats available for organizing and tracking data files in data lakes.
  • Apache Iceberg is a quintessential table format for data lake tables.
  • It is imperative to have similar abilities in a data lake table format like Apache Iceberg.

The editorial is based on the content sourced from medium.com
