Recently I posted Getting Started with Apache Spark, Apache Kafka and Apache Iceberg. The original version had two shortcomings. The first was that it did not use the Nessie catalog, which I have since added. The second was that it depended on AWS S3, which I didn't like for a few reasons.
- So many examples use AWS that you often never see certain configurations or learn how the system you are using actually works. That leaves less room to learn outside the box.
- Not everyone uses AWS. Some people run in their own data center and like it that way.
So, I needed an object store and MinIO fit the bill. MinIO is an S3-compatible object store that can run on Kubernetes, and it runs fine without it too. It also spins up nicely in a Docker container (not that I couldn't have used LocalStack), so it works well for my example and is also something that can be used for real.
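To give a feel for the kind of configuration that AWS-only examples tend to hide, here is a minimal sketch of the Spark settings that typically change when you point S3A at MinIO instead of AWS. This is not copied from the repo. The helper name `minio_spark_conf`, the endpoint `http://localhost:9000`, the `minioadmin` credentials, the Nessie URI, and the `warehouse` bucket are all assumptions for a local Docker setup, so adjust them to match your own.

```python
def minio_spark_conf(
    endpoint="http://localhost:9000",        # assumed local MinIO address
    access_key="minioadmin",                 # assumed MinIO default credentials
    secret_key="minioadmin",
    catalog="nessie",                        # hypothetical catalog name
    nessie_uri="http://localhost:19120/api/v1",  # assumed local Nessie address
    warehouse="s3a://warehouse/",            # hypothetical bucket
):
    """Build Spark settings for an Iceberg + Nessie catalog stored on MinIO."""
    prefix = f"spark.sql.catalog.{catalog}"
    use_ssl = endpoint.startswith("https")
    return {
        # S3A must be told where the object store lives; with AWS this is implied.
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        # MinIO is usually addressed as host/bucket rather than bucket.host.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(use_ssl).lower(),
        # Iceberg catalog backed by Nessie.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "org.apache.iceberg.nessie.NessieCatalog",
        f"{prefix}.uri": nessie_uri,
        f"{prefix}.ref": "main",
        f"{prefix}.warehouse": warehouse,
    }


if __name__ == "__main__":
    # Print the settings as spark-submit style --conf flags.
    for key, value in minio_spark_conf().items():
        print(f"--conf {key}={value}")
```

The same dictionary can be fed to a `SparkSession` builder by calling `.config(key, value)` for each entry. The explicit endpoint and path-style access are the two settings you rarely see when an example talks to AWS directly.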
I updated the repo with how to get started.
Take a look at the code here.
Thanx =8^) Joe Stein
https://www.twitter.com/charmalloc
https://www.linkedin.com/in/charmalloc