Women in Data Science Week

Register now!

You can sign up separately for the datathon, the (lunch) lectures, the talks, and the data science drinks using the links below.

Datathon and lectures

Talks and data science drinks

What is the Women in Data Science Week?

The goal of the data science week is to introduce interested students and staff to data science in a fun and cooperative way, and help create a community of data scientists at the University of Twente, the faculty of Behavioural and Management Sciences, and beyond. BDSi organizes various events during the week, including a datathon, lectures, workshops, and a networking drink.

The spring 2024 Women in Data Science Week is independently organized by BDSi and our partners as part of Women in Data Science Worldwide’s (WiDS) mission to increase the participation of women in data science and to feature outstanding women doing outstanding work. The datathon will use a curated dataset provided by WiDS, and women in data science will be in focus during the entire week.

Of course, the event isn’t limited to only women. People of all shapes and sizes are welcome to join - though we encourage you to bring your female friends and colleagues along!

Datathon

A datathon is an event in which teams collaborate and compete to create a solution to a shared problem. By learning from experts and peers and immediately applying your skills to a relevant and engaging real-world dataset, the BDSi datathons provide a great environment for students and staff, beginners and experts alike to hone their skills. For the spring 2024 edition, we will join the Women in Data Science Worldwide initiative and compete together in their datathon.

Women in Data Science datathon

The WiDS datathon is an opportunity for women worldwide to discover and hone their data science skills while solving an interesting and critical social impact challenge. WiDS provides a supportive environment for women to connect, share interests, learn from and help each other, and have a lot of fun! First launched in 2019, this annual event drew over 4,000 participants from 100 countries in 2023. Our plan for the WiDS 2024 Datathon is to provide more flexibility and attract a broader community of data scientists.

WiDS datathons are based on well-curated, real-world datasets that are not readily available in the public domain. Each year, the datathon tackles interesting, relevant, critical, and topical questions. Participating in a WiDS Datathon is a fantastic opportunity for students to gain experience and see data science applied to a real and critical challenge.

2024 Challenge Theme: Equity in Healthcare

We (WiDS) are thrilled to partner with Gilead Sciences and HealthVerity to provide a set of datathon challenges utilizing a real-world oncology dataset. It contains information about the demographics, diagnosis and treatment options, and insurance of patients who were diagnosed with breast cancer between 2015 and 2018.

Speakers

Stay tuned for updates!

We’re still coordinating with more inspiring women to come and speak during the Women in Data Science Week.

Prof. Dr. Sabine Siesling

Health Technology & Services Research, Netherlands Comprehensive Cancer Organisation

Sabine is a clinical epidemiologist, Professor of Outcomes Research and Personalized cancer care, and senior researcher at the Netherlands Comprehensive Cancer Organisation. She leads the Women's Health programme at the University of Twente, and is a former president of the Netherlands Epidemiological Society.

Dr. Marissa van Maaren

Health Technology & Services Research, Netherlands Comprehensive Cancer Organisation

Marissa is a clinical epidemiologist and Assistant Professor of Oncology care. Her research includes the long-term outcomes of various breast cancer treatments. As a board member of the Netherlands Epidemiology Society, she focuses on awareness and accessibility of advanced epidemiological methods.

Lectures & Practicals

Every lunch break (12:45 - 13:30, Tuesday - Friday) expert data scientists from BDSi and our partners will give a lecture on the most important tools in a data scientist’s toolbox: data wrangling, modelling, and communicating results. These lectures will be structured to support the datathon materials, but can be attended without participating in the datathon itself.

After a short coffee break (13:30 - 13:45), the lecture will be followed by a hands-on practical session (13:45 - 15:30). During this session, the lecturer - supported by a team of motivated coaches - will help participants apply the lecture materials to their datathon submissions. While these sessions are meant to accompany the day’s lecture, they can be attended by any datathon participant. Coaches will be on hand to answer any questions about the day’s lecture, the datathon, or data science in general.

Drinks

On Thursday afternoon, we invite all data science week participants, as well as anyone interested in data science at the University of Twente, to join us for networking drinks. This is a great opportunity to mingle with the other teams and create lasting connections with peers and data science experts!

Competition

The best solutions to the datathon challenge will compete on global, regional, and local leaderboards. The global and regional leaderboards are maintained by WiDS; you can sign up for the datathon on their website to join. The local leaderboard includes just those teams participating in the Women in Data Science Week at the University of Twente. Compete against your peers for the coveted BDSi Trophy!

Who can join?

Staff, students, family, and friends

Everyone connected to the University of Twente, along with their friends and family, can join. The event is open to novices, experts, and everyone in between. You can join the datathon as a team or alone, or skip it altogether and only participate in the workshops. If you do join alone, you can choose to be assigned to a team with other data science enthusiasts.

Men, women, and everyone in between are free to join - but only teams with 50% or more women are able to compete on the WiDS regional and global leaderboards.

A tutor giving advice to two participants at a previous Women in Data Science event.

Women and men

Both women and men are free to join. In order to compete on the regional or international WiDS datathon leaderboards, teams must be at least 50% women. This restriction does not apply to joining any of the data science week events. You’re welcome to join a lecture, practical, or the drinks on your own (regardless of whether you identify as male, female, or otherwise) or with your male friends and colleagues; you just won’t be able to compete on the WiDS leaderboards.

Some experience with R or Python

Some programming knowledge is required!

You'll need to have a basic idea of either R or Python in order to follow along with the lectures and practicals. Materials will be prepared for R by BDSi and WiDS, and for Python by WiDS.

While we will do our best to introduce data science topics in the various workshops without relying on code, a basic understanding of R and/or Python will make it much easier to follow along.

If you have some experience with other programming languages, you should be able to follow along with a little preparation. More information on installing and using R can be found in the What can I do to prepare? section.

If you're new to programming in general or would like a deeper understanding of R, and would rather learn from one of our colleagues, there are two options. The Cognition, Data and Education (CoDE) section provides courses and materials aimed at teaching staff, and Johannes Steinrücke teaches half-day workshops for PhDs (and EngDs) on introduction to R and data visualization in R.

If you’re confident you can participate in the datathon in another programming language, you’re more than welcome to do so (we challenge you to try in C, Fortran, Brainf***, or JavaScript). Just be aware that we probably can’t offer support if or when you get stuck.

What can I do to prepare?

Get a team

First off, get a team together. The datathon is meant to be a collaborative experience where you work alongside people with a variety of expertise. In order to compete on the regional and international leaderboards, your team should be at least 50% women.

Create a Kaggle.com account

The WiDS datathons are hosted on Kaggle.com, a platform that hosts competitions, datasets, courses, and other data science and machine learning content. It advertises the chance to build “skills in our competitions, co-hosted by world-class research organizations & companies”, to “learn cutting edge ML techniques and what worked and didn’t from the top Kaggle competitors”, and to join a diverse community of “16 million data scientists, ML engineers & enthusiasts from around the world”.

Set up your coding environment

If you’re new to data science, you’ll want to set up a working environment. We recommend working in R or Python, depending on your experience.

Install R and RStudio, and prepare a working environment - Our colleague Johannes Steinrücke has written a good guide on how to set up R and RStudio for your projects, including some practical advice not covered in many other sources. The guide was written for students starting with coursework with R, but is equally applicable for other data science projects.

Install Python - The Women in Data Science team maintains a set of tutorials on installing Python (using Anaconda to manage packages and environments), Jupyter notebooks and the basics of Python data structures: https://github.com/keikokamei/WiDS_Datathon_Tutorials.

Schedule

Stay tuned for updates!

There may still be some minor tweaks to the schedule as we coordinate with external lecturers and speakers.

The Women in Data Science Week runs from Monday the 15th to Monday the 22nd of April, opening and closing with a group session on those two Mondays. Lunch sessions and practicals will be organized from Tuesday the 16th through Friday the 19th. The deadline for submissions for the local leaderboard is Sunday the 21st at 23:59, and we’ll ask the team(s) with the best and most interesting submissions to present their work on Monday the 22nd.

Monday

April 15th

Opening and kickoff

12:45 - 13:00 - Location: Citadel T300

Lunch talk: Breast cancer epidemiology and the clinical use of prediction models

13:00 - 13:45 - Location: Citadel T300

Dr. Marissa van Maaren

Health Technology & Services Research, Netherlands Comprehensive Cancer Organisation

Marissa will talk about breast cancer, its risk factors, incidence, survival and the disease trajectory. She will show examples of prediction models that are used in breast cancer care, and discuss their relevance for clinical practice.

Hands-on session

13:45 - 15:30 - Location: Citadel T300

Getting started: introduction to the datathon, finding a team, using Kaggle, installing Python/R, setting up an environment.

Tuesday

April 16th

Lunch Lecture: Data Wrangling 101

12:45 - 13:30 - Location: Citadel T300

Exploring a dataset: where to start, finding patterns, visualizing for clarity, creating informative features.

Hands-on session: Data Wrangling 101

13:45 - 15:30 - Location: Citadel T300

Hands on: getting an overview, inspecting descriptives, visualizing distributions and relations, cleaning up and reshaping data, creating new features.
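As a rough illustration of the kind of steps this session covers, here is a minimal Python sketch using only the standard library. The toy records, the column names (`age`, `payer_type`), and the `age_group` feature are made up for illustration and are not taken from the WiDS dataset; in practice you would more likely use a data frame library.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical toy records; the real datathon data has different columns.
records = [
    {"age": "54", "payer_type": "COMMERCIAL"},
    {"age": "67", "payer_type": "medicare "},
    {"age": "", "payer_type": "MEDICAID"},
    {"age": "41", "payer_type": "commercial"},
]


def clean(record):
    """Return a cleaned copy: numeric age (or None) and a normalised payer type."""
    age = int(record["age"]) if record["age"].strip() else None
    payer = record["payer_type"].strip().upper()
    return {"age": age, "payer_type": payer}


def add_age_group(record):
    """Create a new feature: a coarse age group derived from age."""
    age = record["age"]
    if age is None:
        group = "unknown"
    elif age < 50:
        group = "under_50"
    else:
        group = "50_plus"
    return {**record, "age_group": group}


cleaned = [add_age_group(clean(r)) for r in records]

# Descriptives: summarise the numeric column, ignoring missing values.
ages = [r["age"] for r in cleaned if r["age"] is not None]
print("mean age:", mean(ages), "median age:", median(ages))

# Distribution of a categorical column after clean-up.
payer_counts = Counter(r["payer_type"] for r in cleaned)
print(payer_counts)
```

Cleaning before counting matters here: without normalising case and whitespace, "COMMERCIAL" and "commercial" would be tallied as two different categories.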

Network Analysis Community

16:00 - 17:00 - Location: TBA

The Network Analysis community is one of several peer communities that bring together researchers across disciplines who use similar methods, if not topics.

In this meeting, Doina will present her work on complex networked systems. Many networks are hiding in plain sight: words are connected into discourse, books are connected via the readers they have in common, concepts are connected into knowledge graphs, biological species into food webs, and stars into constellations. All these networks are intangible, but measuring and analysing them provides insight about how the mind and societies work. This talk runs through Network Science (a creative and very cross-disciplinary field), and demonstrates recent research, with diverse data sources, methods (from various areas of artificial intelligence), and case studies. (This talk is based on a conference keynote.)

Wednesday

April 17th

Lunch lecture: Introduction to model building and evaluating

12:45 - 13:30 - Location: Citadel T300

Introduction to modelling in R using the tidymodels framework.

Hands on: Introduction to model building and evaluating

13:45 - 14:30 - Location: Citadel T300

Practical modelling, evaluating models, creating features, working with the tidymodels framework.

Afternoon Talk: (in)Equity in breast cancer care: two examples (direct reconstruction and Gene expression profiles)

14:45 - 15:30 - Location: Citadel T300

Prof. Dr. Sabine Siesling

Health Technology & Services Research, Netherlands Comprehensive Cancer Organisation

In the Netherlands, breast cancer care is reimbursed by health insurers. Still, we see inequity in the use of, for instance, direct reconstruction after a mastectomy and gene expression profiles. The first is a treatment option proven to improve quality of life. The latter is used in diagnostics to estimate the possible benefit of chemotherapy for the patient, and supports the decision on whether to have chemotherapy.

Thursday

April 18th

Lunch Lecture: Advanced model building

12:45 - 13:30 - Location: Citadel T300

Hands-on session: Advanced model building

13:45 - 15:30 - Location: Citadel T300

Data Science Drinks

16:00 - 18:00 - Location: The Gallery Theatre

(Social) networking with other participants, and other University of Twente students and staff interested in data science.

Friday

April 19th

Lunch Talk & Lecture: Building Fair AI; Methods and Metrics for Reducing Bias in Machine Learning

12:45 - 13:30 - Location: Citadel T300

This talk focuses on the important issue of bias in AI systems, especially when they make decisions about people. We will discuss the need to find and lower bias in machine learning. Our main point is the methods and ways we can do this.

First, we look at where bias in machine learning models comes from and what effects it has. Then, we will talk about the different ways to make machine learning models that think about fairness.

We will go into detail about these methods, focusing on the new and effective techniques being used today. A big part of the talk will be about the methods and measurements we use to check how fair these models are.

We will look at the latest ways to measure bias and see how well we are doing at making things fairer. Finally, we will talk about the challenges and questions that we still face in this area, showing why we need to keep researching and developing better AI systems.

The goal of this talk is to give a clear understanding of how we can make machine learning decisions fairer and more responsible.

Practical: Measuring and modeling equity

13:45 - 14:30 - Location: Citadel T300

Putting into practice the theory and methods discussed in the lunch talk. Practical examples of measuring model biases, and how to ensure equity of the model by mitigating biases.
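To give a flavour of what measuring model bias can look like, the sketch below computes one common measure, the demographic parity difference: the gap in positive-prediction rates between groups. This is only an illustration and not necessarily a metric used in the session; the predictions, the group labels, and the function names are hypothetical.

```python
def rate(values):
    """Share of 1s in a list of 0/1 values; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def demographic_parity_difference(y_pred, groups):
    """Return the largest gap in positive-prediction rate between groups,
    together with the per-group rates."""
    by_group = {}
    for pred, group in zip(y_pred, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: rate(preds) for g, preds in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Hypothetical model predictions (1 = positive outcome) for two made-up groups.
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = demographic_parity_difference(y_pred, groups)
print(rates, gap)
```

A gap of 0 would mean every group receives positive predictions at the same rate. A large gap is a signal to investigate, not proof of unfairness on its own, since other fairness measures can point in a different direction.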

Hands-on session: Measuring and modeling equity

14:45 - 15:30 - Location: Citadel T300

Sunday

April 21st

Submission Deadline

23:59

Monday

April 22nd

Closing session

12:45 - 13:30 - Location: Citadel T300