Pies, Lies and AIs

Exploring the world of data and organisational intransigence

The new new normal

Well, the number of people looking for work in the cloud and data sector has now shot up as everyone adjusts their forecast expectations...

Don't touch my *!$@%&* board

I had a great time last weekend at a garden party with my friend and one of my longstanding tech leads, David. It was great to spend the...

Data Smoosh

I was mulling over whether to derive a new jocular term for a Data Mesh. I pondered Data Mess but that seemed too obvious so I've opted...

The Loser CTO Cycle

I felt compelled to write about this because it's a phenomenon I'm seeing more and more. Cloud and data is revolutionary because it's...

What makes a data engineer?

Note: this post is mainly about Azure but it can apply to any cloud. You get to a stage when you hear enough definitions of what people...

Documentation the easy way

So a slight departure from Spark (sort of) for this post, but I wanted to look at one of the most commonly overlooked aspects of building...

Using Spark to read from Excel

People have data in Excel, so let's have a look at how we can read that data using Spark

Just one more column, what could go wrong?

Sometimes, when you go scanning through the documentation for Spark, you come across notes about certain functions. These tend to offer...

Why leave bad data to chance?

Something that we often see as Spark jobs are moved into production is that handling of bad data is either ignored, or a lot of effort...

Pivot, Step, Pivot, Twist, Un-pivot

Getting data into a good shape is a key part of Data Engineering, and we often get data in all sorts of shapes and quality

When in doubt, shell out

The command line is a powerful environment that lets you do a lot of work quickly, easily, and in a repeatable way
