Tasks Overview

🗓️ 5-Day Data Analyst Onboarding Overview

Welcome to your 5-day onboarding sprint! This program is designed to introduce you to our real-world data stack and workflows. Each day you’ll tackle a focused, hands-on challenge that simulates core parts of our pipeline—from raw data ingestion to analysis and integration.

In every daily_tasks/day_X/ folder, you’ll find:

  • 🔗 A link to the task’s GitHub Issue (in the README or markdown files)
  • 📂 All necessary datasets, database connection details, and starter materials

🧑‍🏫 A team lead will also give guided explanations and walkthroughs during the morning masterclass.

📊 Day 1 – Onboarding & Setup + Data Exploration & Submission

Get access to the GitHub repository and Slack, and meet the team. Then start with a small dataset and explore it using Google Sheets.
You’ll answer simple analytical questions and share your work.

🧰 Skills: Data loading, quick exploration, Google Sheets, basic markdown submission


🧹 Day 2 – Data Cleaning & Preparation with Python

Work with messy data. You’ll clean, transform, and prepare a dataset for analysis.
Handle duplicates, missing values, naming conventions, and outliers.

🧰 Skills: pandas, data types, cleaning strategies, CSV export
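
To give a feel for the Day 2 work, here is a minimal pandas sketch that touches each of the four problems above: naming conventions, duplicates, missing values, and outliers. The data, the column names, and the choices made (median fill, 1.5×IQR rule) are assumptions for illustration, not the actual task dataset or the required strategy.

```python
import numpy as np
import pandas as pd

# Hypothetical messy data; columns and values are illustrative only.
raw = pd.DataFrame({
    "Customer ID": [1, 2, 2, 3, 4, 5],
    " Order Total ": [20.0, 35.5, 35.5, np.nan, 18.0, 5000.0],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Naming conventions: trim whitespace, lowercase, use snake_case.
    out.columns = out.columns.str.strip().str.lower().str.replace(" ", "_")
    # Duplicates: drop rows that repeat an earlier row exactly.
    out = out.drop_duplicates()
    # Missing values: fill with the column median (one strategy among several).
    out["order_total"] = out["order_total"].fillna(out["order_total"].median())
    # Outliers: flag (rather than delete) values outside 1.5 * IQR.
    q1, q3 = out["order_total"].quantile([0.25, 0.75])
    iqr = q3 - q1
    out["is_outlier"] = ~out["order_total"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out.reset_index(drop=True)


cleaned = clean(raw)

# CSV export: with no path, to_csv returns the CSV text;
# pass a file path instead to write the file to disk.
csv_text = cleaned.to_csv(index=False)
```

Flagging outliers instead of dropping them keeps the decision visible and reversible, which tends to be easier to review.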


🗃️ Day 3 – SQL + Python Exploration

Connect to the PostgreSQL training DB and explore it with SQL inside Python notebooks.
Answer analytical questions using real queries and joins.

🧰 Skills: psycopg2 / sqlalchemy, pandas.read_sql, JOINs, filtering, logic in SQL
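
The Day 3 pattern of running a JOIN from Python and getting a DataFrame back can be sketched as follows. To stay runnable without credentials, this sketch uses an in-memory SQLite database as a stand-in for the PostgreSQL training DB; the tables, columns, and rows are hypothetical. Against PostgreSQL you would pass a psycopg2 or SQLAlchemy connection to `pandas.read_sql` instead, and note that psycopg2 uses `%s` placeholders rather than `?`.

```python
import sqlite3

import pandas as pd

# Stand-in database (assumption): SQLite in memory instead of PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 15.0), (12, 2, 50.0);
""")

# LEFT JOIN keeps customers with no orders; HAVING filters on the aggregate.
query = """
SELECT c.name,
       COUNT(o.order_id)         AS n_orders,
       COALESCE(SUM(o.total), 0) AS revenue
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
HAVING COALESCE(SUM(o.total), 0) >= ?
ORDER BY revenue DESC
"""

# Pass values through params instead of formatting them into the SQL string.
summary = pd.read_sql(query, conn, params=(0,))
```

Raising the threshold in `params`, for example to `(30,)`, would leave only the customers whose revenue is at least 30.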


🧮 Day 4 – Data Integration & Schema Design

Integrate a new dataset into the existing PostgreSQL schema.
Identify the keys, clean and prepare the data with Python, and write a script to append it to the database.

🧰 Skills: Data modeling, ETL scripting, foreign key relationships, INSERTs via Python
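
As a rough sketch of the Day 4 idea, the script below checks foreign keys before appending rows, then inserts the valid ones in a single transaction. It again uses in-memory SQLite as a stand-in for PostgreSQL, and the schema, the `append_orders` helper, and the sample rows are all hypothetical.

```python
import sqlite3

# Stand-in database (assumption): SQLite in memory instead of PostgreSQL.
conn = sqlite3.connect(":memory:")
# SQLite needs this pragma to enforce foreign keys; PostgreSQL enforces them by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    total       REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
""")

# New dataset to integrate: (order_id, customer_id, total).
# customer_id 99 has no matching row in customers.
new_orders = [(100, 1, 12.5), (101, 2, 40.0), (102, 99, 7.0)]


def append_orders(conn, rows):
    """Insert rows whose customer_id exists; return (inserted_count, rejected_rows)."""
    known = {row[0] for row in conn.execute("SELECT customer_id FROM customers")}
    valid = [r for r in rows if r[1] in known]
    rejected = [r for r in rows if r[1] not in known]
    # The connection as a context manager commits on success and rolls back on error.
    with conn:
        conn.executemany(
            "INSERT INTO orders (order_id, customer_id, total) VALUES (?, ?, ?)",
            valid,
        )
    return len(valid), rejected


inserted, rejected = append_orders(conn, new_orders)
```

Returning the rejected rows instead of silently dropping them makes it easier to report which records could not be linked to an existing key.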


🚀 Day 5 – Project Wrap-up & GitHub Deployment

Finalize your project by integrating all components from Days 1-4 into a cohesive whole. Prepare your codebase for public sharing, ensuring all scripts are well-documented and your repository is structured professionally. Push your complete project to your personal GitHub, ready to showcase your new skills!

🧰 Skills: Project management, code organization, Git & GitHub, documentation, data storytelling


We’re here to help — ask questions during the daily masterclass or on Slack!


Let’s build something great. 💪