Best way to get parquet data from AWS S3 bucket?
We have some parquet files being replicated to an AWS S3 bucket.
I've started looking at whether I can use AWS Glue to crawl the bucket, Athena to query the Glue table, and then Domo to pull data from Athena. I'm running into a few issues (for example, Glue picks up the initial load file as a table and reports that it has rows, but Athena can't query any data from it), but I think I can get there.
However, before I go too far down the road, is there another approach that works?
Unfortunately the S3 connector doesn't read parquet files.
I could convert them to CSV and upload directly to Domo using something like https://stackoverflow.com/questions/62275672/converting-parquet-files-in-s3-to-csv-and-store-back-in-s3 but that seems ... kludgy?
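For reference, here is a rough sketch of what that conversion route might look like in Python. It assumes boto3 and pandas (with pyarrow or fastparquet) are installed and that AWS credentials are already configured. The `csv/` destination prefix and the function names `csv_key_for` and `convert_object` are made up for illustration; none of this is taken from the linked answer.

```python
import io
import posixpath


def csv_key_for(parquet_key: str, dest_prefix: str = "csv/") -> str:
    """Build the destination key for the CSV copy (hypothetical layout)."""
    base = posixpath.basename(parquet_key)
    stem, _ext = posixpath.splitext(base)
    return posixpath.join(dest_prefix, stem + ".csv")


def convert_object(bucket: str, parquet_key: str, dest_prefix: str = "csv/") -> str:
    """Download one Parquet object, rewrite it as CSV, and upload it to the same bucket."""
    # Imported here so csv_key_for stays usable without these packages installed.
    import boto3
    import pandas as pd  # read_parquet needs pyarrow or fastparquet

    s3 = boto3.client("s3")

    # Read the whole object into memory; fine for small files, not for huge ones.
    body = s3.get_object(Bucket=bucket, Key=parquet_key)["Body"].read()
    frame = pd.read_parquet(io.BytesIO(body))

    out_key = csv_key_for(parquet_key, dest_prefix)
    s3.put_object(
        Bucket=bucket,
        Key=out_key,
        Body=frame.to_csv(index=False).encode("utf-8"),
    )
    return out_key
```

Something like `convert_object("my-bucket", "exports/orders.parquet")` (bucket and key are placeholders) would write `csv/orders.csv`, which the S3 connector could then read. It still means running and scheduling this somewhere, which is the part that feels kludgy.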
If anyone has any suggestions, I'm all ears.