BDSi Data Science Week

September 20, 2021 – September 27, 2021

What is it?

Over the course of one week, you will compete with other teams of behavioural data scientists to solve a real data science case. BDSi staff will organise several seminars and workshops throughout the week to introduce the various steps involved in data science, and coaches will be available to guide you when you run into problems.

The goal of the Data Science Week is to introduce interested students and staff to data science in a fun and cooperative way, and to help create a community of data scientists at BMS. After one week, the best teams will be asked to present their solutions, and the winners will be presented with a suitable prize.

Who can join?

Everyone related to the faculty of Behavioural and Management Sciences can join, and you are welcome to bring friends, colleagues or even family, as long as at least one member of your team has a University of Twente account to sign up with. The event is open to novices, experts, and everyone in between. Three (virtual) lunch workshops will introduce the main steps in all data science projects. You can join as a team or alone. If you join alone, you can choose to be assigned to a team with other data science enthusiasts.

What can I do to prepare?

Get a team

First off, get a team together, and sign up now - or just sign up on your own.

Sign up now!

Read the book (or at least pretend to)

The materials we will use are based on the freely available Introduction to Statistical Learning book. If you’re interested in data science, statistical learning or machine learning, this book would be a great place to start. BDSi also organizes a yearly reading club around this book.

Install R, RStudio, and tidyverse

As a faculty, BMS has decided to use R for statistical education. We will follow this example, and use R and the tidyverse packages in the workshops and seminars. If you do not already have a preferred programming language, you may want to install R and RStudio. ModernDive has published a good primer on installing R and RStudio that also covers the basics of working in R. If you’d like to go further, we recommend the free R for Data Science book by Hadley Wickham - a name you’ll encounter often in the R community.

Schedule

The Data Science Week will start and end with a group session, on Monday the 20th and Monday the 27th. You will be free to work on the case on your own schedule, and coaches will be available for questions and feedback throughout.

Kickoff

September 20th, 15:30 – 17:00

After a quick introduction to BDSi, we will introduce the topic and describe the dataset and the problem you will solve. We will also explain how to reach the coaches for help, and give a brief overview of the schedule.

Workshop Data Wrangling

September 21st, 12:45 – 13:30

A 45-minute guided introduction to data wrangling in R, using the ‘tidy’ data principles. Karel Kroeze will show how to prepare a ‘raw’ dataset for analysis, by cleaning, reshaping and mutating the data until it gives up all its secrets.

This workshop is also open for those who do not want to participate in the Data Science Week. You can sign up for just the Tidy Data Wrangling workshop here.

Workshop Machine Learning

September 23rd, 12:45 – 13:30

A 45-minute guided overview of basic machine learning techniques. Anna Machens will take you through the basics of model fitting, parameter selection and hyperparameter tuning, ending up with a simple but effective predictive model.

This workshop is also open for those who do not want to participate in the Data Science Week. You can sign up for just the Introduction to Machine Learning workshop here.

Workshop Data Visualization

September 24th, 12:45 – 13:30

A 45-minute guided overview of data visualization using the grammar of graphics. Karel Kroeze will explain the principles of creating and layering visualizations with ggplot in R, and give a quick introduction to interactive visualizations with plotly, shiny and beyond.

This workshop is also open for those who do not want to participate in the Data Science Week. You can sign up for just the Data Visualization workshop here.

Submission Deadline

September 26th, 23:59

After spending all weekend with your team fine-tuning your solutions, you will have to submit them before midnight on Sunday. That gives us a bit of time to check your models and pick a winner. In the meantime, you can practice your victory speech - or suddenly have a brilliant idea that it’s too late to implement before submission.

Closing Session

September 27th, 15:30 – 17:00

The teams that created the best and most creative solutions will give a short presentation about their approach, and there will be time to put questions to the winning teams as well as to BDSi staff and coaches.

References

header image adapted from upklyak