2 days ago · Data Products: An Overview (Part II)
Introduction: In my last post, I discussed data products and some of the key characteristics that help make this a reality. Using Louise from Castor’s excellent overview as a starting point, I’ll cover the remaining characteristics with my thoughts as well. Addressable/Accessible: Data isn’t useful if it can’t be reached. Data should…
Data Engineering · 3 min read

3 days ago · Data Products: An Overview
Note: Many points in this post were taken from this excellent post from Louise at Castor. I have included my thoughts as well where useful. Introduction: There have been many references to data products or data-as-a-product recently. As data and software get closer and closer together, this isn’t much of a surprise…
Data Engineering · 3 min read

3 days ago · AWS Cost-Saving Gotchas
Introduction: Have you figured out by now that I’m on a major FinOps kick? I’ve been spending a lot of (too much, actually) time deep in the weeds of AWS seeing where our costs are coming from. …
Data Engineering · 3 min read

Mar 4 · Subsurface Day 2 Takeaways
Introduction: As a continuation of my takeaways from Subsurface (day 1 here), I’ve put together another summary post on day 2 of the action. Once again, thanks to Dremio for putting on such an insightful event. Arrow: Check this talk out for a good overview on Arrow. Apache Arrow has been one…
Data Engineering · 3 min read

Mar 4 · Subsurface Day 1 Takeaways
Introduction: Dremio hosted their wonderful Subsurface Conference earlier this week, so I’m just now making my way through all the on-demand sessions (shoutout to all the conferences who do this). …
Data Engineering · 3 min read

Mar 3 · Cost Optimizations With ECS
Introduction: A few months ago, we converted our daily batch processing into a file-based process. By doing so, we went from the safe confines of the EMR world into the unknown territory that’s ECS and, more specifically, Fargate. …
Amazon Web Services · 3 min read

Feb 26 · FinOps: Taking Initiative
Introduction: When you’re working in a larger organization, the task of saving on Cloud costs can seem daunting. That’s more often the result of having so many different resources out there, strewn across different accounts and managed through different processes. …
Data Engineering · 3 min read

Feb 24 · Data Observability For Semi-Structured Data
Introduction: I’m happy to be collaborating with the folks at Validio again, after a very successful series we did on data quality last year. …
Data Engineering · 4 min read

Feb 19 · Datanova Takeaways
Introduction: I’m a bit behind on this one, but I did finally catch up on all of the great sessions from Starburst’s Datanova Conference last week. There were plenty of insightful sessions and panels, so lots to summarize here. Curiosity Doesn’t Kill The Cat: I totally agreed with the premise of the talk “Teaching data engineers…
Data Engineering · 3 min read

Feb 17 · Continuous Improvement In The Cloud
Introduction: The Cloud is evolving and releasing new features by the second. The engineers who use it for their applications? Not so much. How can teams avoid drifting into tech debt and keep everything (most notably costs) optimal? Staying In The Know: The easiest way to keep yourself on top of the latest…
Data Engineering · 3 min read