LT2, William Gates Building
More generally, a data scientist is someone who knows how to extract meaning from and interpret data, which requires both tools and methods from statistics and machine learning, as well as being human. She spends a lot of time in the process of collecting, cleaning, and munging data, because data is never clean. This process requires persistence, statistics, and software engineering skills—skills that are also necessary for understanding biases in the data, and for debugging logging output from code.
Cathy O’Neil and Rachel Schutt from O’Neil and Schutt (2013)
We don’t know what science we’ll want to do in five years’ time, but we won’t want slower experiments, we won’t want more expensive experiments and we won’t want a narrower selection of experiments.
Data Exploration
- Used the initial data analysis phase to understand the data, uncover patterns, and gain insights.
- Generated word clouds from the raw dataset.
- Performed topic modelling using Latent Dirichlet Allocation (LDA) to discover topics and similarities in Covid-19 data from social media.
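A word cloud is essentially a picture of term frequencies, so the counting step can be sketched with the standard library alone. This is a minimal illustration, not the project's actual pipeline: the function name `term_frequencies`, the tiny `STOPWORDS` set, and the tokenising regular expression are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def term_frequencies(documents, top_n=10):
    """Return the top_n most frequent non-stop-word tokens across documents."""
    counts = Counter()
    for doc in documents:
        # Lower-case and keep runs of letters, digits, apostrophes and hyphens,
        # so that a term such as "covid-19" survives as one token.
        tokens = re.findall(r"[a-z0-9'-]+", doc.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)


docs = ["Covid vaccine news", "the vaccine rollout"]
print(term_frequencies(docs, top_n=3))
```

The resulting (term, count) pairs are the kind of input a word-cloud renderer scales its font sizes by; the same token counts, arranged per document, are also the usual starting point for an LDA topic model.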
Cohen’s kappa was used to measure inter-annotator agreement.
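For two annotators, Cohen’s kappa compares the observed agreement p_o with the agreement p_e expected by chance from each annotator’s label frequencies: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a minimal two-annotator version; the function name `cohens_kappa` and the example labels are illustrative assumptions, not taken from the project.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("Both annotators must label the same non-empty set of items.")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability both pick the same label independently,
    # using each annotator's own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined (0/0), and returning 1.0 here is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


annotator_1 = ["pos", "pos", "neg", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(annotator_1, annotator_2))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.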
On Governors, James Clerk Maxwell 1868
Compare digital oligarchy with how Africa can benefit from the data revolution.
if, for, and procedures