| Week |
Course Content |
| Week 1 |
[W1]
Introduction to data analysis and AI applications.
Several examples of simple (but useful) data analyses.
Statistical and mathematical modeling.
Introduction to linear regression, logistic regression and unsupervised machine learning. |
| Week 2 |
[W2]
DATA: Spam email data [Project 1]
DATA: breast cancer and biomarkers (case-control study) [Project 1]
Statistical sample size calculation.
Linear regression with examples.
Logistic regression and unsupervised machine learning.
Pearson correlation and Spearman’s correlation; Kendall’s tau.
Principal component analysis (PCA) and linear discriminant analysis. |
| Week 3 |
[W3] Probability and distributions. Bayes’ theorem.
Parametric and nonparametric methods.
Rank-based Mann-Whitney and Wilcoxon procedures.
ROC curve analysis with examples. |
| Week 4 |
[W4] ROC curve versus logistic regression.
Unsupervised learning: K-means method; hierarchical cluster analysis and DIANA.
Minimum entropy clustering.
Bootstrapping and Random Forest (RF) clustering. |
| Week 5 |
[W5] Presentation of Project 1.
Small topic: questionnaire data, internal reliability (Cronbach’s alpha) and inter-rater reliability (Cohen’s kappa). |
| Week 6 |
[W6] DATA: Taiwan PM2.5 data [Project 2]
DATA: currencies of 19 countries over 6 months [Project 2]
Gradient Boosting and AdaBoost.
Functional data (regularly sampled) analysis. |
| Week 7 |
[W7] DATA: alpine insect fauna; cluster analysis after data transformation
Special topic: Response surface with 2nd-order model and experimental design |
| Week 8 |
[W8] Generalized linear model (GLM): an introduction.
Longitudinal data analysis, Poisson regression.
Generalized estimating equation (GEE) model, robust inference and generalized linear mixed model (GLMM). |
| Week 9 |
[W9] Presentation of Project 2.
Probability distributions and stochastic processes: a review
(including Markov chains, Brownian motion and the Brownian bridge process, with applications in data analysis)
Assignment for online review: Introduction to the fundamentals of speech recognition. |
| Week 10 |
[W10] DATA: Taiwan chickenpox and herpes zoster. [Project 3]
DATA: Lead-exposed workers. [Project 3]
Two-by-two contingency tables. Epidemiological study design; odds ratio and relative risk. Estimating a common effect in multilevel/multicenter studies, conditional logistic regression and risk-set sampling. |
| Week 11 |
[W11] Matching, matched-pair design, McNemar procedure. Propensity score matching (PSM). Counter-matching. Stratified analysis and interaction. Basic ideas of modeling. |
| Week 12 |
[W12] DATA: Lung cancer data analysis [Project 3]
DATA: LOS (length of hospital stay) data analysis [Project 3]
Introduction to clinical (medical) data analysis and survival analysis.
Kaplan-Meier estimate, survival models, log-rank and weighted log-rank tests.
Weibull regression.
Cox proportional-hazards regression with applications.
|
| Week 13 |
[W13] Generalized additive model (GAM)
Special topic: seeking maximal association between Y and the Xs.
DATA: Taiwan air pollutant data versus health-insurance data bank (several diseases) |
| Week 14 |
[W14] Presentation of Project 3.
Panel discussion.
Project 4: proposals and discussion.
General considerations for model-building strategy.
Large-P-small-N question.
|
| Week 15 |
[W15] Time-dependent clustering.
Spatio-temporal data analysis and detection of spatial clustering.
DATA: eBird data and avian influenza outbreaks in poultry farms
DATA: Dengue outbreaks
Self-directed learning: homework and exercises |
| Week 16 |
[W16] Functional clustering; noncentral chi-square, noncentral t, and noncentral F distributions; regression trees, CART, random forest (revisited)
DATA: Taiwan’s PM2.5 data
Self-directed learning: homework and exercises
DATA: Ginseng (人蔘) 1H-NMR data
DATA: metabolomic (NMR spectroscopic) data of salmon smolts with integrated ANOVAs.
Project 4: final presentation. |
Self-directed learning |
   03. Preparing presentations or reports related to industry and academia.
|