BIDS’ TextXD 2020 conference on December 10-12 will feature an all-new, 48-hour TextXD ‘Law and Society’ Hackathon event starting at 10:00 AM on Thursday, December 10. The deadline to apply for the TextXD Hackathon has been extended to December 5.
The event will conclude with a celebration and awards ceremony on Saturday, December 12, featuring a panel of distinguished judges. The judges include Janet Napolitano, Professor of Public Policy at Berkeley's Goldman School of Public Policy, former president of the University of California, and former Secretary of Homeland Security under President Obama; and David Barstow, head of investigative reporting at the UC Berkeley Graduate School of Journalism, four-time Pulitzer Prize-winning journalist, and a former senior writer at The New York Times.
Hackathon participants will work with a unique collection of recent datasets on police misconduct and associated public policy issues in California. Some of these datasets were originally obtained from the California Department of Justice by investigative journalists now affiliated with the UC Berkeley Graduate School of Journalism, who reported on their acquisition of the data in KQED News and the Cal Alumni Association’s California Magazine. The datasets have also been featured in the San Jose Mercury News and USA Today.
Four different data analysis challenges and learning tracks for all levels
Participants will have the option of organizing into groups, and each team may then select one of four data analysis challenges on which to focus its analysis: visualization (e.g. comparative analysis, fact-checking, and metrics validation); natural language processing (NLP) (e.g. sentiment analysis of news articles, topic modeling, and word embedding); meta-analysis (e.g. spatial analysis, name disambiguation, and linking records to convictions for comparison); and an “Open EDA” challenge track, in which participants will design a project that combines different methods using Exploratory Data Analysis (e.g. statistical analysis, classification methods, and machine learning).
Four learning tracks will offer specific challenges geared toward different levels of expertise, and the entire program is designed to facilitate learning and engagement across all experience levels, including non-programmers, new programmers, beginners, and intermediate and advanced programmers.
In addition to the specific challenges and learning tracks, hackathon mentors from Berkeley Law and the Graduate School of Journalism will be on hand to help participants work with these sensitive datasets in a reproducible and ethical manner.
Final presentations to feature data science narratives and policy recommendations
As part of the final presentation, teams will have the opportunity to present their analysis and describe their results to our online audience and distinguished judges. In addition to data analysis, teams will be invited to address specific public policy questions, and to recommend policy changes on the basis of their analysis.
Projects will be evaluated on three criteria: whether the individual or team completed their chosen challenge, how well they executed and implemented the analytical methods for their chosen track, and how well they presented an analysis-driven narrative with meaningful policy recommendations.
This event will bring together an interdisciplinary group of practitioners, researchers, learners, and entrepreneurs who work with text as a primary source of data. The goal of the hackathon is to help participants build responsible data-driven policy narratives that enable greater accountability, and to learn how to build representative models that reveal these narratives to a wide audience. Ultimately, this hackathon will be a catalyst for developing and cultivating a community around responsibly curated datasets.
Winners will be announced during a virtual ceremony on Saturday, December 12, 2020. Participants will be eligible for various prizes, research opportunities at BIDS, and publication of the highlights of their analysis on the TextXD website.