How to practice data engineering when you're not data engineering
One of the hard parts of data engineering is that it is hard to practice when you’re not at work: database management systems cost money, data infrastructure takes time to set up, and there are no stakeholders asking you interesting questions. If you want to improve your data engineering skills at home, whether because you are between jobs or because you literally cannot leave your house and your brain is turning into mush from Say Yes to the Dress reruns, here are some ideas that have worked for me:
Use academic resources
Real-life examples may be lacking, but relational databases have existed since the 1970s, so there are some books and courses on the topic. My recommendations are:
- The Data Warehouse Toolkit
- SQL for Dummies - This book is huge and has more background information than a beginner may need, but what I’ve learned from it has come in handy over the years
- Database Systems - One of my first textbooks on the subject!
- To learn SQL, I did the Codecademy course. I still recommend it to people even though there may be others out there. Search around and find a course that works for you!
- From Adam: I Love Logs and Designing Data-Intensive Applications
Contribute to open source projects
I’ve had success finding open source projects on GitHub that need help with SQL. GitHub has pretty good search functionality. I’ve used it to find issues with help-wanted labels that include the word SQL, such as this search. There are also open source data engineering projects where you can find open issues. For example:
- Singer is an open source ETL tool with many repos. The tap-github repo, for instance, which extracts data from GitHub, has (at the time of writing) 11 open issues.
- Airflow is an open source ETL project, and it has (at the time of writing) 140 open issues.
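If you would rather script the issue search than click through the site, here is a minimal sketch of building that kind of query for GitHub's issue search API. The function name and defaults are my own illustration, not anything official. It assumes the label is called "help wanted" (GitHub's default name); some projects use "help-wanted" or something else entirely, so adjust it per repo.

```python
from urllib.parse import urlencode


def build_issue_search_url(keyword, label="help wanted", state="open"):
    """Build a GitHub issue-search API URL for a keyword and a label.

    The label default is an assumption; projects name their labels differently.
    """
    # Quote the label so a name containing spaces stays a single qualifier.
    query = f'{keyword} is:issue state:{state} label:"{label}"'
    params = {"q": query, "per_page": 10}
    return "https://api.github.com/search/issues?" + urlencode(params)


url = build_issue_search_url("SQL")
print(url)
```

Fetching that URL with any HTTP client should return JSON describing matching issues. The same query text (everything in the q parameter) can also be pasted into the search box on the GitHub site.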
Work on tangential skills
If you don’t find anything that you think will help your SQL or ETL skills, remember we use other skills every day at our job! Some other resources I’ve used throughout the years:
- Thanks for the Feedback - This book changed my perspective on feedback, and it has helped me in every facet of my career, from interviewing to one-on-ones to annual reviews.
- Gaining Git and version control skills - No matter what language we work in, we always end up using version control, specifically Git. Whether you use GitHub or Bitbucket, improving your Git skills can help you work faster, make fewer mistakes, and swear less. I practice Git with any project I do (including this site) and I also read about it, including this book. There are also courses on learning the Git command line.
That’s my advice for today! I also have advice on picking the perfect dress for your special day, if that is something you are looking for. Feel free to send me more things that worked for you and I’ll add them here!