One Crank or Two?

Data Science and Intro Stat
Kari Lock Morgan, Assistant Professor of Statistics, Penn State University
ECOTS, May 2018

Data Science in Intro
How should intro stat adapt in an era of abundant availability and use of data?

Data Science and Intro Stat
  • Computing
  • Statistics (Concepts, Methods, Theory)
  • Domain knowledge
Data Science? As needed to make sense of data
Intro Stat? Simple

How to Adapt?
Focus on making sense of data!

Focus on Making Sense of Data
[Formulas shown on this slide; the symbols were lost in extraction]

How to Adapt?
Focus on making sense of data! What kind?!?

Data Collection
Classical statistics: design, randomness. Inference!
  • Ask a question
  • Collect (small) data to answer it
Data science? Inference?
  • Obtain available (big) data
  • See what it tells you

Data Quality vs Quantity
Which provides a better estimate (lower MSE)?
a) A simple random sample of n = 100
b) A non-random sample of n = 50 million (!) (say, from the US population of 320 million) with a relatively small correlation of 0.05 between x and the probability of inclusion
Answer: the small random sample!!!

Meng, X.L. (2016). Discussion of "Perils and potentials of self-selected entry to epidemiological studies and surveys", Journal of the Royal Statistical Society: Series A (Statistics in Society), 179(2), 319-376.
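The comparison above can be explored with a small simulation. The sketch below is only an illustration under stated assumptions, not Meng's derivation: the population is scaled down to 32,000 standard-normal values (keeping the 50/320 sampling fraction), and the inclusion probability is assumed to be linear in x, with a slope chosen so that the correlation between x and inclusion is about 0.05. The function name `simulate_mse` and all parameter names are hypothetical.

```python
import random


def simulate_mse(pop_size=32_000, frac_big=50 / 320, rho=0.05,
                 n_srs=100, reps_srs=2000, reps_big=30, seed=1):
    """Estimate the MSE of the sample mean under two sampling schemes.

    Assumptions (not from the slides): a scaled-down normal population and
    an inclusion probability that is linear in x.
    """
    rng = random.Random(seed)
    pop = [rng.gauss(0.0, 1.0) for _ in range(pop_size)]
    truth = sum(pop) / pop_size

    # With sd(x) close to 1, corr(x, inclusion) = slope / sqrt(f * (1 - f)),
    # so this slope targets a correlation of about rho.
    slope = rho * (frac_big * (1 - frac_big)) ** 0.5
    probs = [min(1.0, max(0.0, frac_big + slope * (x - truth))) for x in pop]

    # a) small simple random sample, drawn without replacement
    srs_sq_err = 0.0
    for _ in range(reps_srs):
        sample = rng.sample(pop, n_srs)
        srs_sq_err += (sum(sample) / n_srs - truth) ** 2

    # b) large non-random sample: units with larger x are slightly more
    #    likely to be included
    big_sq_err = 0.0
    for _ in range(reps_big):
        sample = [x for x, p in zip(pop, probs) if rng.random() < p]
        big_sq_err += (sum(sample) / len(sample) - truth) ** 2

    return srs_sq_err / reps_srs, big_sq_err / reps_big


if __name__ == "__main__":
    mse_srs, mse_big = simulate_mse()
    print(f"MSE, random sample of 100:    {mse_srs:.4f}")
    print(f"MSE, large non-random sample: {mse_big:.4f}")
```

Under these assumptions the large sample's error is dominated by bias, which does not shrink as the sample grows, while the small random sample's error is pure sampling variability.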

Data Quality over Quantity
  • For population inference, a small random sample beats a large biased sample
  • For causality, a small randomized experiment beats a large observational study
  • (Statistics beats data science?)
  • Design (randomness) remains important
  • Inference remains important!
      How far might the estimate be from the truth?
      Is the effect more than might be seen by chance?

But
  • Random sampling/assignment is hard!
  • Non-random data are EVERYWHERE!!!
  • For intro stat to remain relevant, we have to acknowledge and embrace the abundance of available data.
  • AP Stat theme 2 (of 4): Data must be collected according to a well-developed plan if valid information is to be obtained.

How to Adapt?
  • Focus on making sense of data!
  • Keep some design and inference
  • Do more with available data

Design and Inference
  • Random sampling and assignment
  • Inferential concepts: sampling variability, interval estimation, hypothesis testing
How can we cover this more efficiently?
  • Simulation-based inference: more for less
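As one illustration of what simulation-based inference can look like, here is a minimal sketch of a randomization test for a difference in means and a bootstrap percentile interval, using only the Python standard library. The function names, default settings, and the example data are hypothetical and are not taken from the talk.

```python
import random


def mean(values):
    return sum(values) / len(values)


def randomization_test(group_a, group_b, reps=5000, seed=0):
    """Two-sided randomization test for a difference in means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # re-deal the group labels at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Count the observed arrangement itself so the p-value is never 0.
    return observed, (extreme + 1) / (reps + 1)


def bootstrap_interval(data, reps=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for a mean."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, same size as the original sample.
    means = sorted(mean(rng.choices(data, k=n)) for _ in range(reps))
    lower = means[int((1 - level) / 2 * reps)]
    upper = means[int((1 + level) / 2 * reps) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up example data for two groups.
    group_a = [12, 15, 14, 16, 18, 17, 15, 19]
    group_b = [9, 11, 10, 12, 8, 13, 10, 11]
    diff, p_value = randomization_test(group_a, group_b)
    print("observed difference:", diff, "p-value:", p_value)
    print("bootstrap interval for mean of group_a:", bootstrap_interval(group_a))
```

The same shuffle-or-resample loop covers hypothesis testing and interval estimation, which is one sense in which simulation can give "more for less".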

What can we cut? Let's prioritize the good stuff!

Available Data
  • Acknowledge that not all data come from question -> design -> inference
  • Data quality and limitations (e.g. sampling bias, confounding, missing data)
  • Inferential cautions (e.g. multiple testing, sample size, non-random)
  • Multivariable thinking
  • Highlight the abundance, diversity, and omnipresence of data

One Way to Start
www.gapminder.org/tools/
www.gapminder.org/data/

How to Adapt?
  • Focus on making sense of data!
  • Keep some design and inference
  • Do more with available data
  • Emphasize the overlap

Emphasize Overlap
  • EDA, especially data visualization
  • Choice of graph/stat/parameter/method
  • Modeling
  • Interpretation and communication
  • Context, background, real conclusions
  • Technology

Technology in Intro Stat
Use technology in a way that
  • engages students
  • eliminates tedious work
  • excites students
  • enhances conceptual understanding
  • empowers students to make sense of data
  • extendable or easy?

Data Science in Intro
Let's think about how to keep intro stat relevant in an era of data science!
My opinion:
  • Focus on making sense of data
  • Acknowledge that not all data analysis is question -> purposeful design -> inference
  • But that the above remains valuable!
  • Emphasize the overlap
What do you all think?!?
www.tricider.com/brainstorming/3R3ZmK3a02l
[email protected]
