Seminar Abstract

Articulating how our NLP data and systems do and don't represent the world: Toward mitigating bias and enabling better science. (CLaS-CCD Research Colloquium Series)

Speaker : Professor Emily Bender, Department of Linguistics, University of Washington, USA.
Date : 5th of July 2018, 2:00PM until 3:00PM
Location : Wally's Walk (Building E6A), Level 3, Room 357, Macquarie University.

    As the technology we work on becomes broadly used, impacting the lives of both direct and indirect stakeholders, we as NLP technologists have a responsibility to think critically about the real-world effects of the design decisions we make as we build systems. In this talk, I look in particular at the problems that arise when there is a mismatch between the datasets used for training and testing NLP systems and the contexts in which those systems are deployed. I present a proposed professional practice, drawing on value sensitive design (Friedman et al. 2006), which should help us as a field engage with the ethical issues of exclusion, overgeneralization and underexposure (Hovy & Spruit 2016). This professional practice, called "data statements", will bring our datasets and the populations they represent into better focus and, as a result, position us to better understand and describe our results and to do better science and engineering.

    Further Information

    Contact Details

    Telephone: +61 2 9850 4127
