**GG413: Introduction to Statistics and Data Analysis**

Instructor:
Garrett Apuzen-Ito

__Classes:__ **POST 708, Mon & Wed 8:30-9:45**

__Prerequisites:__ Math242 (2nd semester calculus), GG250
(scientific programming using Matlab), or instructor consent

__Textbook:__ Introduction to Statistics and Data Analysis, by Paul Wessel.
Recommended (optional) text: John C. Davis, Statistics and Data Analysis in
Geology, 3rd Edition

**Overview:**

Quantitative analysis of data and modelling have become the
norm in earth, planetary, and environmental sciences. Knowledge and
skills in such quantitative analysis enable one to objectively define the
extent and limits of one's interpretations and open the door to diverse ways of
using data. This course provides a foundational understanding of the basic
theory behind probability, statistics, and quantitative data analysis, as well
as practice in analyzing real data sets with computer software (Matlab, Octave, or
FreeMat). The course emphasizes solving problems, interactive class
discussions, and independent inquiry so that students:

·Learn how to explore and characterize their data, including defining the mean, median, uncertainties, and factors that contribute to variance

·Understand how to propagate errors in calculations of derived quantities

·Learn and gain practice in using principles in probability theory and statistics

·Perform formal hypothesis testing in interpreting data

·Use basic concepts of linear algebra and least squares formalism for curve fitting and regression

·Explore various ways to examine sequential or time-series data, including using spectral analysis

·Analyze directional data

The applications will use geoscience data sets, but the
course is relevant to all fields of science.

**STUDENT LEARNING OBJECTIVES**

This course emphasizes three __student learning objectives__ for __undergraduate__ and __graduate__
students:

·*Students can apply technical knowledge of computer applications and mathematics and physics to solving real-world problems in geology and geophysics*

·*Students use the scientific method to define, critically analyze, and solve a problem in earth science*

·*Students can communicate scientific knowledge in both oral presentations and in writing*

**Format and workload**

Lectures are to be viewed outside of class on YouTube (links
provided below). Class time is an interactive learning environment and is
largely dedicated to working problem sets. Problem sets will be assigned
approximately weekly and will involve using computer software to apply and
practice the techniques covered. There will be a midterm and a final exam.

**GRADING**

Data analysis is a very hands-on activity, and there will be weekly problem sets that
require a mix of mathematical and computational manipulations. __Homework must
be handed in at the beginning of class on THURSDAY__, unless you have made
prior arrangements with me. Otherwise,
unexcused late homework will receive 10% less credit for each day it is late.
If you anticipate a conflict for exams, you must reschedule the exam *prior* to the scheduled date. The final grade will be a weighted average of
grades for __homework (70%), midterm (15%), and the final exam (15%)__.
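
As a rough illustration of the weighting above, the following sketch computes a final grade from percentage scores. It is not an official grade calculator: the function names are hypothetical, and it assumes that all homework sets count equally and that the late penalty is a flat 10% of full credit per day, floored at zero. The syllabus does not spell out either detail.

```python
def homework_credit(score, days_late=0, penalty_per_day=0.10):
    """Return a homework score after the late penalty.

    Assumption: the penalty is linear (10% of the earned score per day
    late) and credit cannot drop below zero.
    """
    factor = max(0.0, 1.0 - penalty_per_day * days_late)
    return score * factor


def final_grade(homework_scores, midterm, final_exam):
    """Weighted average: homework 70%, midterm 15%, final exam 15%.

    Assumption: every homework set carries equal weight, and all
    scores are percentages between 0 and 100.
    """
    homework_average = sum(homework_scores) / len(homework_scores)
    return 0.70 * homework_average + 0.15 * midterm + 0.15 * final_exam


if __name__ == "__main__":
    # Two homework sets, the second handed in two days late.
    scores = [homework_credit(90), homework_credit(100, days_late=2)]
    print(final_grade(scores, midterm=70, final_exam=60))
```

Because homework carries 70% of the weight, a late or missed problem set affects the final grade far more than a comparable slip on either exam.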

**Working Course Syllabus**

__Chapters 1 & 2: Exploring Data & Error Analysis__

__Week 1: 8/22 & 8/24, Swan and Sandilands Handout and Wessel Ch 1 and 2__

1.1 Classification of data (see video #1 on Data
Types and Precision vs. accuracy)

1.2 Exploratory data analysis (see EDA_Lecture
files)

2 Error Analysis

Reporting uncertainties, significant
figures, & errors of sums & general functions

Uncertainties of products, quotients, and
example cases

Homework #1 and required datasets;
and >>>SOLUTIONS<<<

__Chapter 3: Basic Concepts in Statistics__

__Week 2: 8/29 & 8/31 (HW #1 due Wed 8/31)__

3.1 Probability Basics

Lecture Videos

#1: Permutations

#2: Combinations

#3: The Binomial probability
distribution (Davis Ch 2)

#4: The Hypergeometric
distribution, 3.1.3 Probability, 3.1.4 Some Rules of Probability

#5: 3.1.6 Additional
rules, 3.1.7 Conditional Probability

#6: 3.1.8 Conditional Probability and
Bayes' Theorem

Examples: Binomial &
Hypergeometric PDs (& Matlab scripts for examples 1 & 2), and Conditional Probability

Homework #2: Probability; and >>>SOLUTIONS<<<

__Week
3: 9/7 (watch #1-#3 for Wed, HW2 due Wed)__

3.2 The M&M’s of Statistics (Davis pages on Central Limit Theorem)

Lecture Videos:

#1: 3.2.1 Population and Samples, 3.2.2
Measure of central location (mean, median, mode)

#2: 3.2.3 Measure of variation

#2.5:
3.2.6 Covariance and Correlation

#3: 3.2.4 Robust Estimation (MAD)

HW3: Statistics and Probability Distributions
and (data for problems 2 & 3)

see SOLUTIONS

__Week
4: 9/12 & 9/14, (HW3 Due Wed)__

**Watch
#4-#7 for Mon**

#4: 3.2.5 Inference about the mean and
Central Limit Theorem

#5: 3.3.1-3.3.3 Probability
Distributions, Binomial and Normal Distributions

#6: 3.3.3 The Normal (Gaussian)
Probability Density Function

#7: 3.3.3-3.3.4 Applications of the
Normal Distribution & the Poisson Distribution

See example script for plotting the
binomial and normal distributions.

__Wed 9/14__: **Study videos #1-#5 below**

3.4. Inferences about means
of populations, Videos #1, #2, #3

__Chapter
4: Hypothesis Testing__

4.1 Null Hypothesis, Video #4

4.2 Parametric Tests
(Student's *t*, Chi-squared, *F* tests)

#5: One- and two-sample tests of means

Tables: normal
distribution, t-distribution, chi-squared, F-distribution

Hw4: Hypothesis Testing with Parametric Statistics; see SOLUTIONS

__Week
5: 9/19 & 9/21 (HW #4 due Wed)__

**Watch
#7-#9 for Mon 9/19**

#7:
4.2.3 estimating the variance of a population

#8: 4.2.4 one-sample, chi-square test of variance

#9: 4.2.5 two-sample F-test of variance

**Watch
#1-#4 for Wed 9/21**

4.2 Parametric Tests, videos...

#1:
general aspects of Chi-squared

#2: 4.2.6 Chi-squared test of a pdf

#3: 4.2.6 Chi-squared test of a pdf, example

#4: 4.2.7 test of linear correlation

Hw5: Hypothesis Testing II; datasets: “quakedays.txt”
and “rho.txt”

__Week
6: 9/26 & 9/28 (HW #5 due Wed)__

**Mon
9/26, work on HW5. The videos are #1-#4
above**

**For
Wed 9/28: study the 4 videos below** *(annotations will be added by Friday night)*

4.3 Non-Parametric Tests, see video

4.3 Parametric vs. Non-Parametric tests

4.3.1: Sign test of central value

__4.3.2__ videos #1 and #2: Mann-Whitney 2-sample U test of median

Tables: Mann-Whitney,
K-S (1-sample), K-S (2-sample)

Hw6:
Hypothesis Testing III, see Matlab script kolsmir.m

__Week
7: 10/3 & 10/5 (HW #6 due Wed)__

**For
Mon 10/3: study the two videos below**

4.3 Non-Parametric Tests

__4.3.3__: Kolmogorov-Smirnov goodness-of-fit
test (1 or 2 sample) to a pdf

__4.3.4__: Spearman’s non-parametric
test for correlation

__For Wed 10/5__ **study videos #1-#4 below. Also come to class with questions about HW 1-6.
Wed is our review before the exam.**

__Chapter
5: Linear (Matrix) Algebra and Least Squares Inversion__

5.1-5.2 #1
Matrices: General concepts and
definitions

5.3-5.4 #2 Matrix
Addition, Dot Product, and Matrix Multiplication

5.5 #3 Determinant of a
Matrix

5.7 #4 Matrix
Division: the Inverse Matrix

__Week
8: 10/10-10/12__

**>>> MIDTERM Monday 10/10 (covering material through HW #6) <<<**

__For Wed 10/12__ **study videos #5-#8 below**

5.9.1 #5 Simple
Regression and #6 RMS Misfit

5.9.2-5.9.3 General Least Squares Regression: #7
Part I and #8 Part II

Hw7: Least Squares Regression: see datasets
*Lanai_elev_faa_GG413.txt* and *hf.txt*

__Week
9: 10/17 & 10/19 (Hw #7 due)__

**For
Wed 10/19: study the first two videos below**

5.9.4 Video #1: Weighted
Least Squares

__Chapter
6: Regression__

6.1 #2:
Line Fitting Revisited:
Confidence Intervals on True Slope, Intercept, and Regression Line

Hw8: Least Squares
Regression II: see hawaii.txt, faultstep.txt, and heaviside.m

**Thu 10/20 is the day of the Great Shake Out (click link to find out what and how).**

Click to find out what you need to
know about earthquakes in Hawaii.

__Week
10: 10/24 & 10/26 (HW #8 due)__

**For
Mon 10/24: study the following video**

#3:
Derivation of Variances of True Slope, Intercept, and Regression Line

__Analysis
of Variance (ANOVA)__

**For
Wed 10/26: study this first video**

Video #1: Analysis of Variance (ANOVA) of Linear
Regression

See also Draper
& Smith excerpt

Hw9:
ANOVA, see Hw9_hf.txt, Hw9_Prob2_Chromium.txt, and Hw9_StudentPorosityMeasurements.txt

__Week 11: 10/31 & 11/2 (HW #9 due)__

**For
Mon 10/31 study the next two videos**

4.2.8 #2 One-way ANOVA

4.2.8 #3 Two-way ANOVA

__Chapter
7: Sequences and Time Series Analysis__

**For
Wed 11/2 study video #1 below**

7.1 Markov Chains:
videos #1 and #2

See detailed explanation
of Example 5-1

Hw10: Markov Chains and SOLUTIONS

__Week
12: 11/7 & 11/9 (HW #10 due)__

**For Mon 11/7 study video #2 on Markov Chains**

**For
Wed 11/9 study the following video**

7.5 Autocorrelation,
Video #1

Matlab script shown in
videos, with data for auto- and cross-correlation

HW11:
Autocorrelation and Cross-Correlation, data files: TEMPER.TXT, Chesapeake_salinity.txt

__Week 13: 11/14__

7.6 Cross-correlation, Video
#2

__Chapter
8: Spectral Analysis__

**For Wed 11/16 study videos #1-#2. (HW 11 due)**

8.1 Spectral Analysis: Basic Terminology

Video #1: Introduction to spectral
analysis

Video #2: Orthogonality of periodic
functions

__Hw 12: Spectral Analysis.__ See
data file honolulu_resampled.txt

__Week
14: For Mon 11/21 study videos #3 &
#4 below__

8.2 Spectral Analysis: Fitting the Fourier Series Video #3

**Wed
11/23 Continue working on HW12**

8.3 The Periodogram or Discrete Power Spectrum, Video #4

Happy Thanksgiving!

__Chapter
9: Analysis of Directional Data__

__Week 15: Mon 11/28 (HW 12 due), study videos #1 & #2__

Video #1: Polar histogram, computing means and variance

Video #2: Confidence intervals, One-sample tests of
means

Read Davis handout

**Hw13:
Analysis of Directional Data**

See data files Iceland_West.txt and Iceland_East.txt, as well as Matlab script polarhist.m

__For
Wed 11/30, study video #3__

Video #3: Two-sample F test of means

__Week 16: 12/5 & 12/7 (HW #13 due WED)__

Review for Final EXAM


__FINAL EXAM: Tuesday 12/13 8:10 to 11:30 10:30 a.m.__