The Data Science Landscape

Neil D. Lawrence

LT2, William Gates Building

Course Overview

Introduction

Neil Lawrence and Carl Henrik Ek

An Emerging Field

  • I’ve never seen data science done well, but I’ve seen places where it’s done better.
  • Reddit: https://www.reddit.com/r/CST_ADS/

Week 4

  1. The Data Science Landscape. Lecturer: Neil D. Lawrence
  2. The Challenges of Data Science. Lecturer: Neil D. Lawrence
  3. Introduction to Statistical Learning. Lecturer: Carl Henrik Ek

Lab Session One (Review and Refresher, Practical 1)

Week 5

  1. The Challenges of Data Science II. Lecturer: Neil D. Lawrence
  2. A Data Science Process. Lecturer: Neil D. Lawrence
  3. Generalized Linear Models. Lecturer: Carl Henrik Ek

Lab Session Two (Hand in Practical 1 and Practical 1 Check)

Week 6

  1. Inference. Lecturer: Carl Henrik Ek
  2. Remedial Lecture/Q&A. Lecturer: Carl Henrik Ek and Neil D. Lawrence
  3. Inference II. Lecturer: Carl Henrik Ek

Week 7

  1. Lab Session Four Questions
  2. Inference III. Lecturer: Carl Henrik Ek
  3. Summary. Lecturer: Carl Henrik Ek and Neil D. Lawrence
  4. Project Results Check
  5. Course Q&A. Lecturer: Carl Henrik Ek and Neil D. Lawrence

Assessment

  • Already available in Moodle.
  • You will be able to complete it as we teach.
  • After each lab session is complete, make progress on the assessment.
  • Revisit your questions to “refactor” your code before submission.

Henry Ford’s Faster Horse

There are three types of lies: lies, damned lies and statistics

Arthur Balfour 1848-1930

There are three types of lies: lies, damned lies and ‘big data’

Neil Lawrence 1972-?

Mathematical Statistics

‘Mathematical Data Science’

Embodiment Factors

                          Machine       Human
bits/min                  billions      2,000
billion calculations/s    ~100          a billion
embodiment                20 minutes    5 billion years

Heider and Simmel (1944)

For sale: baby shoes, never worn

Evolved Relationship with Information

New Flow of Information

Evolved Relationship

What Next?

  • Review notebook (covers pandas, probability and correlation)
  • Practical 1 (more pandas, setting up SQL on AWS, uploading data and performing joins with SQL).
  • Read through assignment to contextualise material.
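
As a rough preview of the kind of work the review notebook and Practical 1 involve, the sketch below computes a correlation with pandas and performs a join with SQL. It is only an illustration under stated assumptions: the tables, column names and values are invented for this example, and an in-memory SQLite database stands in for the SQL server on AWS that the practical sets up.

```python
import sqlite3

import pandas as pd

# Hypothetical toy data, invented for this sketch (not from the course materials).
houses = pd.DataFrame(
    {
        "postcode": ["CB1", "CB2", "CB3", "CB4"],
        "floor_area": [50.0, 80.0, 65.0, 120.0],
        "price": [200.0, 310.0, 260.0, 450.0],
    }
)
regions = pd.DataFrame(
    {
        "postcode": ["CB1", "CB2", "CB3"],
        "region": ["Central", "South", "West"],
    }
)

# Pearson correlation between two numeric columns.
correlation = houses["floor_area"].corr(houses["price"])

# Assumption: an in-memory SQLite database is used as a local stand-in
# for a hosted SQL server, so the example runs without any setup.
conn = sqlite3.connect(":memory:")
houses.to_sql("houses", conn, index=False)
regions.to_sql("regions", conn, index=False)

# Inner join: only postcodes present in both tables are returned.
joined = pd.read_sql_query(
    """
    SELECT h.postcode, h.price, r.region
    FROM houses AS h
    INNER JOIN regions AS r ON h.postcode = r.postcode
    ORDER BY h.postcode
    """,
    conn,
)
conn.close()

print(f"correlation: {correlation:.3f}")
print(joined)
```

The inner join drops the row for CB4 because it has no match in the regions table; checking which rows disappear in a join is a useful habit when combining data sets.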

Further Reading

  • Chapter 5 of Lawrence (2024)

  • Chapter 1 of Lawrence (2024)

  • Chapter 8 of Lawrence (2024)

Thanks!

References

Heider, F., Simmel, M., 1944. An experimental study of apparent behavior. The American Journal of Psychology 57, 243–259. https://doi.org/10.2307/1416950
Lawrence, N.D., 2024. The atomic human: Understanding ourselves in the age of AI. Allen Lane.
Lawrence, N.D., 2017. Living together: Mind and machine intelligence. arXiv.