Best way to get parquet data from AWS S3 bucket?

SeanPT
edited October 22 in Connectors

We have some parquet files being replicated to an AWS S3 bucket.

I've started looking into whether I can use AWS Glue to crawl the bucket, Athena to query the Glue table, and then Domo to pull the data from Athena. I'm running into a few issues (for example, Glue picks up the initial load file as a table and reports that it has rows, but Athena can't query any data from it), but I think I can get there.

However, before I go too far down the road, is there another approach that works?

Unfortunately, the S3 connector doesn't read Parquet files.

I could convert them to CSV and upload them directly to Domo using something like https://stackoverflow.com/questions/62275672/converting-parquet-files-in-s3-to-csv-and-store-back-in-s3, but that seems ... kludgy?
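For context, the conversion route would look roughly like the sketch below. It is only an illustration, assuming boto3 and pandas (with pyarrow) are available and that each file fits in memory. The bucket name, keys, the `csv/` destination prefix, and the function names are all hypothetical, not anything from our actual setup.

```python
from io import BytesIO, StringIO
from pathlib import PurePosixPath


def csv_key_for(parquet_key: str, dest_prefix: str = "csv/") -> str:
    """Map an S3 key such as 'raw/orders.parquet' to 'csv/orders.csv'."""
    stem = PurePosixPath(parquet_key).stem  # file name without its last suffix
    return f"{dest_prefix}{stem}.csv"


def convert_parquet_to_csv(bucket: str, parquet_key: str, dest_prefix: str = "csv/") -> str:
    """Download one Parquet object, rewrite it as CSV, and upload it to the same bucket."""
    # Third-party dependencies, imported here so csv_key_for works without them.
    import boto3
    import pandas as pd  # read_parquet needs pyarrow (or fastparquet) installed

    s3 = boto3.client("s3")

    # Read the whole Parquet object into memory (assumes the file is small enough).
    body = s3.get_object(Bucket=bucket, Key=parquet_key)["Body"].read()
    df = pd.read_parquet(BytesIO(body))

    # Serialize to CSV without the pandas index column.
    out = StringIO()
    df.to_csv(out, index=False)

    dest_key = csv_key_for(parquet_key, dest_prefix)
    s3.put_object(Bucket=bucket, Key=dest_key, Body=out.getvalue().encode("utf-8"))
    return dest_key
```

Something like `convert_parquet_to_csv("my-bucket", "raw/orders.parquet")` would write `csv/orders.csv` to the same bucket, which the S3 connector could then read. It would have to run on a schedule or an S3 event to keep up with replication, which is part of why it feels like a workaround.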

If anyone has any suggestions, I'm all ears.

Answers