Workshops & seminars

Data mining & algorithmic bias


Date & time
Wednesday, January 25, 2023
10 a.m. – 12 p.m.
Speaker(s)

Francisco Berrizbeitia

Cost

This event is free

Website

Webster Library

Where

J.W. McConnell Building
1400 De Maisonneuve W.
R. Howard Webster Library

Room LB-205

Wheelchair accessible

Yes

This workshop is open to all. Graduate students: please register through GradProSkills.

This workshop will guide participants through the first steps of data analysis, specifically text mining with Weka, an open-source machine learning tool. We will replicate the work of Mike Thelwall in his paper "Gender bias in machine learning for sentiment analysis" (https://wlv.openrepository.com/handle/2436/620690).

Before the hands-on text mining exercise, we will give a brief introduction to AI and machine learning and to the notion of algorithmic bias: what it is, how it is introduced, and what its repercussions are.

By the end of the workshop, participants will have applied a sentiment analysis technique to a gender-segregated data set and will be able to determine how the gender composition of the training data affects the resulting predictive model.

IMPORTANT NOTE

Before the workshop, students are strongly encouraged to install Weka and download the data.

Weka: https://www.cs.waikato.ac.nz/ml/weka/downloading.html

Datasets: http://labs.library.concordia.ca/workshops/textmining/datasets.zip

Learning Objectives

Participants in this workshop will:

• Understand basic notions of AI: machine learning and predictive models.

• Understand what algorithmic bias is, how it is introduced into machine learning models, and its possible repercussions.

• Transform text into word vectors (the bag-of-words approach) as a technique for performing text mining tasks.

• Create a model for sentiment prediction using a machine learning approach based on a training corpus of real-life textual data (Tripadvisor comments on hotels and restaurants).

• Evaluate the model and compare its performance across training corpora with different gender biases.
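
For readers who want a feel for the bag-of-words idea before the session, here is a minimal sketch in plain Python. It is not the workshop's Weka procedure: the function names are hypothetical and the two example reviews are invented for illustration, not taken from the workshop datasets. Each text becomes a vector of word counts over a shared vocabulary, which is the kind of input a sentiment classifier can be trained on.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(documents):
    """Return a sorted list of every distinct token in the corpus."""
    return sorted({token for doc in documents for token in tokenize(doc)})


def to_word_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Invented example reviews (assumption: NOT from the workshop datasets).
reviews = [
    "The room was clean and the staff was friendly",
    "The food was cold and the service was slow",
]

vocabulary = build_vocabulary(reviews)
vectors = [to_word_vector(review, vocabulary) for review in reviews]

for review, vector in zip(reviews, vectors):
    print(vector, "<-", review)
```

Word order is discarded in this representation; only the counts remain. Real corpora usually need further choices, such as stop-word removal or a cap on vocabulary size, which this sketch leaves out.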

© Concordia University