In this post, I’m going to hit the Wikipedia API to get data into a database because so often, as data engineers, our job is to hit an arbitrary API and extract, clean, and store the data.
One of the hard parts of data engineering is that it is difficult to practice when you’re not at work: database management systems cost money, data infrastructure takes time to set up, and there are no stakeholders asking you interesting questions.
This is the third post in a series of Cool Things to Do in Snowflake, but these are things that are really cool when you don’t do them; they’re Uncool Things to Do in Snowflake.
We have to talk about NULLs.
This post won’t include anything revolutionary, just a summary of what NULLs are, what they aren’t, and why they are important in data warehousing.
This post is for my friend Mitch, who, despite his persistent belief that he annoys me with his questions, has reminded me why it’s important that we stay hungry for theoretical knowledge even as we get caught up in our day-to-day work.
Explain like I’m 5 (ELI5) comes from the popular concept that if you really know something well, you can explain it simply, so a five-year-old can understand.
There are two types of relational database management systems: columnar and row-based. You will also see them called column-wise and row-wise, column store and row store, column-based and row-based, or column-oriented and row-oriented.
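To make the distinction concrete, here is a minimal Python sketch of the same tiny table laid out both ways. Everything in it (the table, the column names, the helper functions) is made up for illustration, and plain Python lists stand in for what a real database does on disk, so treat it as a mental model rather than how any particular system is implemented.

```python
# Hypothetical three-row table, used only for illustration.
column_names = ("id", "name", "amount")
rows = [
    (1, "Ada", 120.0),
    (2, "Grace", 80.0),
    (3, "Edsger", 100.0),
]

# Row-oriented layout: all the values of one record sit together.
row_store = list(rows)

# Column-oriented layout: all the values of one column sit together.
column_store = {
    name: [row[i] for row in rows] for i, name in enumerate(column_names)
}


def total_amount_row_store(store):
    """Sum one column by walking every full record."""
    return sum(record[2] for record in store)


def total_amount_column_store(store):
    """Sum one column by reading only that column's values."""
    return sum(store["amount"])


def fetch_record_row_store(store, position):
    """Fetch one whole record with a single lookup."""
    return store[position]


def fetch_record_column_store(store, position):
    """Fetch one whole record by visiting every column."""
    return tuple(store[name][position] for name in column_names)
```

In this toy model, the aggregate only needs the `amount` list in the column layout, while rebuilding a single record means visiting every column. That trade-off is the usual reason column-oriented layouts are associated with analytical queries and row-oriented layouts with record-at-a-time lookups.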