Pies, Lies and AIs

Exploring the world of data and organisational intransigence

Documentation the easy way

So a slight departure from Spark (sort of) for this post, but I wanted to look at one of the most commonly overlooked aspects of building...

Why leave bad data to chance?

Something we often see as Spark jobs move into production is that the handling of bad data is either ignored, or a lot of effort...

Pivot, Step, Pivot, Twist, Un-pivot

Getting data into good shape is a key part of Data Engineering, and we often receive data in all sorts of shapes and levels of quality.

When in doubt, shell out

The command line is a powerful environment that lets you do a lot of work quickly, easily, and in a repeatable way.

 
 
