what's under the hood

AUTHOR: Fiona Walter
PUBLISHED: 05-01-2025

The task at hand

My first step in building the repo is to understand the engineering. I followed this YouTube video to get my first lil set-up, so I now have a “shell” for a dbt repo in GitHub, i.e. my ‘workstation’ is ready. The video was relatively easy to follow, so I can definitely recommend it! It’s now time to start making decisions on how to set up the data connections, i.e. what people usually call ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform). Spoiler alert: a decision has to be made there too. This is where it gets interesting.

I do have some basic knowledge from my previous workplaces, but a lot of what I know doesn’t really apply because of my specific constraints. These are the parameters that matter for my repo:

everything needs to be free
anyone with basic data analysis or coding knowledge should be able to download the repo
anyone should be able to download the data I want to make available

Ideally the data will also be cleaned but I will decide how I approach that at a later stage. The focus for now is continuing with my basic set-up that meets the parameters.

I did some reading, some redditing, so much googling. I relented and got some ChatGPT help when things got too confusing, and with it I managed to clarify where I got stuck and create a plan for a basic set-up.

The plan

Why use many word when few word do trick?

[image]