From a67da2ab11177d1a64c067214a9b54c13193fea1 Mon Sep 17 00:00:00 2001
From: scdavis50
Date: Wed, 23 Mar 2016 16:42:50 -0500
Subject: [PATCH 01/38] Transcript of my data science studies plan.

---
 transcripts/scott-davis-transcript.md | 130 ++++++++++++++++++++++++++
 1 file changed, 130 insertions(+)
 create mode 100644 transcripts/scott-davis-transcript.md

diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md
new file mode 100644
index 00000000..452fa4c7
--- /dev/null
+++ b/transcripts/scott-davis-transcript.md
@@ -0,0 +1,130 @@
+

Scott Davis Transcript

+

Open Source Data Science Masters

+ +
I'm going to have some time for independent study this year, so I plan to work through as much as possible. I work in the real estate industry and we have so much data that isn't used for meaningful analysis, and the tools, though readily available, haven't caught up with the needs of real estate users. That's what I'm interested in working on. I use a lot of GIS and R, so my curriculum is tailored to follow [R](https://www.r-project.org/)/[Python](www.python.org) and [QGIS](www.qgis.org). I'm a bit of an open-source nut so I like learning much better this way. I'm looking for people to connect with, and possibly to work on projects.
+
+Want to collaborate? Get in touch:
+ * [linkedin](http://www.linkedin.com/in/scottcdavis);
+ * [twitter](http://www.twitter.com/scottdavisCRE); or
+ * [email](mailto:scott@tisonadevelopment.com)
+
+
+

Open Source Curriculum

+

Base Introduction

+Data Science Introductions + - [ ] Intro to Data Science by UW / Coursera, online course + - [ ] Data Science Specialization by Johns Hopkins / Coursera + - [X] [Data Scientists Toolbox](https://www.coursera.org/account/accomplishments/certificate/UY4EBM46HL) + - [X] [R Programming](https://www.coursera.org/account/accomplishments/records/Va5vuEvGKyr7UyHEL) + - [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW) + - [X] [Exploratory Data Analysis](https://www.coursera.org/account/accomplishments/records/2PPsRu2Us3sUehBQ) + - [ ] [Reproducible Research] (in progress) + - [ ] [Statistical Inference] + - [ ] [Regression Models] + - [ ] [Practical Machine Learning] (in progress) + - [ ] [Developing Data Products] + - [ ] [Data Science Capstone] +- [ ] [Data Science by Harvard](http://cs109.github.io/2015/) (online course) +- [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) +- [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) +- [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - in Excel, but also works in LibreOffice and so much of business analytics is still in Excel. + + +

Mathematics/Statistics

+ - [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html) + - [ ] [Statistics for Spatio-Temporal Data](http://www.wiley.com/WileyCDA/WileyTitle/productCd-EHEP002348.html) + - [ ] [Linear Algebra](http://www.amazon.com/Linear-Algebra-Dover-Books-Mathematics/dp/048663518X) + - [ ] Problem-Solving Heuristics: [How to Solve It](http://www.amazon.com/How-Solve-It-Mathematical-Princeton/dp/069111966X) + +

Computing

+R: + - [ ] [R in Action](https://www.manning.com/books/r-in-action-second-edition?a_bid=5c2b1e1d&a_aid=RiA2ed) + - [ ] [R Cookbook](http://shop.oreilly.com/product/9780596809164.do) + - [ ] [Forecasting: Principles and Practice](http://otexts.com/fpp/) + +R Libraries/Task Views + * [ProjectTemplate](http://projecttemplate.net/index.html) + * Spatial Data [CRAN Task View: Analysis of Spatial Data](https://cran.r-project.org/web/views/Spatial.html) + * Spatio-Temporal Data [CRAN Task View: Handling and Analyzing Spatio-Temporal Data](https://cran.r-project.org/web/views/SpatioTemporal.html) + * Optimization [CRAN Task View: Optimization and Mathematical Programming](https://cran.r-project.org/web/views/Optimization.html) + * Finance [CRAN Task View: Empirical Finance](https://cran.r-project.org/web/views/Finance.html) + +Python: + - [ ] [Dive Into Python](http://www.diveintopython.net/) + - [ ] [Google's Python Class](code.google.com/edu/languages/google-python-class/) + - [ ] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) + - [ ] [Webscraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python) + +QGIS: + - [ ] [QGIS Tutorials and Tips](http://www.qgistutorials.com/en/) + - [ ] [Mastering QGIS](https://www.packtpub.com/application-development/mastering-qgis) + - [ ] [Building Mapping Applications with QGIS](https://www.packtpub.com/application-development/building-mapping-applications-qgis) + +MySQL: + - [ ] [Learn MySQL in One Video](https://www.youtube.com/watch?v=yPu6qV5byu4) + - [ ] [MySQL Workbench Starter](code.google.com/edu/languages/google-python-class/) + +Octave: + - [ ] [GNU Octave Beginners Guide](https://www.packtpub.com/big-data-and-business-intelligence/gnu-octave-beginners-guide) + - +PostGIS/PostGRESQL: + - [ ] [PostGIS Essentials](https://www.packtpub.com/big-data-and-business-intelligence/postgis-essentials) + - [ ] [PostGRESQL Tutorial](http://www.postgresqltutorial.com/) + 
- [ ] [PostgreSQL: Up and Running: A Practical Introduction to the Advanced Open Source Database](http://shop.oreilly.com/product/0636920032144.do) + +

Algorithms

+ - [ ] [Algorithms Design & Analysis](http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=IntroToAlgorithms) Stanford openclassroom + +

Distributed Computing Paradigms

+ - [ ] Intro to Hadoop and MapReduce by Cloudera and Udacity +*Note: I might swap the above course with an EdX course on Apache Spark and distributed computing* + +

Data Mining

+ - [ ] Mining Massive Data Sets, by Stanford and Coursera + - [ ] [Clean Data] (https://www.packtpub.com/big-data-and-business-intelligence/clean-data) + +

Machine Learning/Predictive Analytics - Foundational/Theoretical/Practical

+ - [ ] Machine Learning, by Ng Stanford and Coursera (NB this class requires a lot of higher level math) + - [ ] [An Introduction to Statistical Learning with Applications in R](http://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/) (by the authors of The Elements of Statistical Learning at Stanford.) + - [ ] [Machine Learning with R](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-r-second-edition) + - [ ] [Building a Recommendation System in R](https://www.packtpub.com/big-data-and-business-intelligence/building-recommendation-system-r) + - [ ] [Mastering Predictive Analytics in R](https://www.packtpub.com/application-development/mastering-predictive-analytics-r) + - [ ] [Bootstrapping Machine Learning](http://www.louisdorard.com/machine-learning-book/) + - [ ] [Applied Predictive Modeling] (http://www.amazon.com/gp/product/1461468485?psc=1&redirect=true&ref_=oh_aui_detailpage_o08_s00) + +

Analysis

+ - [ ] [Practical Data Science Cookbook](http://www.diveintopython.net/) + - [ ] [R Data Analysis Cookbook](code.google.com/edu/languages/google-python-class/) + +

Spatial Analysis

+ - [ ] [An Introduction to R for Spatial Analysis and Mapping](http://www.edwardtufte.com/tufte/books_be) + - [ ] [Applied Spatial Data Analysis with R](http://www.springer.com/us/book/9781461476177) + +

Land Use/Transport/Gravity Modeling

+ - [ ] [Integrated Land Use and Transport Modelling: Decision Chains and Hierarchies](http://www.amazon.com/gp/product/0521022177?psc=1&redirect=true&ref_=oh_aui_detailpage_o03_s00) + - [ ] [Gravity and Spatial Interaction Models (Scientific Geography Series)](http://www.amazon.com/gp/product/0803925441?psc=1&redirect=true&ref_=oh_aui_detailpage_o06_s00) + - [ ] [TRANUS Model](http://www.tranus.com/tranus-english) + - [ ] [Urban Sim](https://pypi.python.org/pypi/urbansim) + - [ ] [Huff-tools Package in R] (http://rstudio-pubs-static.s3.amazonaws.com/42357_1e6fcc5bcfec439096eb86a106ebf22e.html) + - +

Data Design/Data Viz

+ - [ ] [Beautiful Evidence](http://www.edwardtufte.com/tufte/books_be) + - [ ] [Semiology of Graphics](http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/1589482611) + - [ ] [Visual Complexity Mapping Patterns of Information](hhttp://www.visualcomplexity.com/vc/book/) + - [ ] [The Visual Display of Quantitative Information](http://www.edwardtufte.com/tufte/books_vdqi) + - [ ] [Design for Information](http://isabelmeirelles.com/book-design-for-information/) + - [ ] [Design Elements: A Graphical Style Manual](http://www.amazon.com/Design-Elements-Graphic-Style-Manual/dp/1592532616) + - [ ] [Storytelling with Data] (http://www.amazon.com/gp/product/1119002257?psc=1&redirect=true&ref_=oh_aui_detailpage_o09_s00) + - [ ] [Mastering Python Data Visualization](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization) + - [ ] [The Grammar of Graphics](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization) + - [ ] [R Graphics Cookbook](http://shop.oreilly.com/product/9780596809164.do) + +

Relevant prior studies

+ - [X] MS in Community and Regional Planning, UT-Austin + - [X] BA in Liberal Arts, concentration in geography, UT-Austin + +

OpenSource Data Science Masters Capstone Project

+I'm interested in using data science approaches for better intelligence behind real estate decisions, specifically evaluating population growth, transactions and location decisions. I'd also like to evaluate statistical learning techniques to make better pricing decisions. Finally, I'd like to develop a model to optimize real estate portfolios.
+
+If you'd like to pair up for the capstone, [let me know](http://www.twitter.com/scottdavisCRE)
+

From 244c3c6718fcdbdc5fa1cdcccca76d4471c18ef3 Mon Sep 17 00:00:00 2001
From: Scott Davis
Date: Wed, 23 Mar 2016 16:46:33 -0500
Subject: [PATCH 02/38] Updated transcript fixing the formatting.

---
 transcripts/scott-davis-transcript.md | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md
index 452fa4c7..70ec69f4 100644
--- a/transcripts/scott-davis-transcript.md
+++ b/transcripts/scott-davis-transcript.md
@@ -1,7 +1,7 @@
-

Scott Davis Transcript

-

Open Source Data Science Masters

+

Scott Davis Transcript

+

Open Source Data Science Masters

-
I'm going to have some time for independent study this year, so I plan to work through as much as possible. I work in the real estate industry and we have so much data that isn't used for meaningful analysis, and the tools, though readily available, haven't caught up with the needs of real estate users. That's what I'm interested in working on. I use a lot of GIS and R, so my curriculum is tailored to follow [R](https://www.r-project.org/)/[Python](www.python.org) and [QGIS](www.qgis.org). I'm a bit of an open-source nut so I like learning much better this way. I'm looking for people to connect with, and possibly to work on projects.
+I'm going to have some time for independent study this year, so I plan to work through as much as possible. I work in the real estate industry and we have so much data that isn't used for meaningful analysis, and the tools, though readily available, haven't caught up with the needs of real estate users. That's what I'm interested in working on. I use a lot of GIS and R, so my curriculum is tailored to follow [R](https://www.r-project.org/)/[Python](www.python.org) and [QGIS](www.qgis.org). I'm a bit of an open-source nut so I like learning much better this way. I'm looking for people to connect with, and possibly to work on projects.

 Want to collaborate? Get in touch:
  * [linkedin](http://www.linkedin.com/in/scottcdavis);
@@ -29,7 +29,6 @@ Data Science Introductions
 - [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf)
 - [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - in Excel, but also works in LibreOffice and so much of business analytics is still in Excel.
-

Mathematics/Statistics

- [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html) - [ ] [Statistics for Spatio-Temporal Data](http://www.wiley.com/WileyCDA/WileyTitle/productCd-EHEP002348.html) @@ -123,7 +122,7 @@ PostGIS/PostGRESQL: - [X] MS in Community and Regional Planning, UT-Austin - [X] BA in Liberal Arts, concentration in geography, UT-Austin -

OpenSource Data Science Masters Capstone Project

+

OpenSource Data Science Masters Capstone Project

 I'm interested in using data science approaches for better intelligence behind real estate decisions, specifically evaluating population growth, transactions and location decisions. I'd also like to evaluate statistical learning techniques to make better pricing decisions. Finally, I'd like to develop a model to optimize real estate portfolios.

 If you'd like to pair up for the capstone, [let me know](http://www.twitter.com/scottdavisCRE)

From bcbc0f4e8bb12b659036aba700abd0dfeff65904 Mon Sep 17 00:00:00 2001
From: Scott Davis
Date: Wed, 23 Mar 2016 16:49:07 -0500
Subject: [PATCH 03/38] Update scott-davis-transcript.md

---
 transcripts/scott-davis-transcript.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md
index 70ec69f4..31f89e44 100644
--- a/transcripts/scott-davis-transcript.md
+++ b/transcripts/scott-davis-transcript.md
@@ -1,5 +1,5 @@

Scott Davis Transcript

-

Open Source Data Science Masters

+

Open Source Data Science Masters

 I'm going to have some time for independent study this year, so I plan to work through as much as possible. I work in the real estate industry and we have so much data that isn't used for meaningful analysis, and the tools, though readily available, haven't caught up with the needs of real estate users. That's what I'm interested in working on. I use a lot of GIS and R, so my curriculum is tailored to follow [R](https://www.r-project.org/)/[Python](www.python.org) and [QGIS](www.qgis.org). I'm a bit of an open-source nut so I like learning much better this way. I'm looking for people to connect with, and possibly to work on projects.
@@ -29,7 +29,7 @@ Data Science Introductions
 - [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf)
 - [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - in Excel, but also works in LibreOffice and so much of business analytics is still in Excel.
-

Mathematics/Statistics

+

Mathematics/Statistics

- [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html) - [ ] [Statistics for Spatio-Temporal Data](http://www.wiley.com/WileyCDA/WileyTitle/productCd-EHEP002348.html) - [ ] [Linear Algebra](http://www.amazon.com/Linear-Algebra-Dover-Books-Mathematics/dp/048663518X) @@ -71,18 +71,18 @@ PostGIS/PostGRESQL: - [ ] [PostGRESQL Tutorial](http://www.postgresqltutorial.com/) - [ ] [PostgreSQL: Up and Running: A Practical Introduction to the Advanced Open Source Database](http://shop.oreilly.com/product/0636920032144.do) -

Algorithms

+

Algorithms

- [ ] [Algorithms Design & Analysis](http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=IntroToAlgorithms) Stanford openclassroom

Distributed Computing Paradigms

- [ ] Intro to Hadoop and MapReduce by Cloudera and Udacity *Note: I might swap the above course with an EdX course on Apache Spark and distributed computing* -

Data Mining

+

Data Mining

- [ ] Mining Massive Data Sets, by Stanford and Coursera - [ ] [Clean Data] (https://www.packtpub.com/big-data-and-business-intelligence/clean-data) -

Machine Learning/Predictive Analytics - Foundational/Theoretical/Practical

+

Machine Learning/Predictive Analytics - Foundational/Theoretical/Practical

- [ ] Machine Learning, by Ng Stanford and Coursera (NB this class requires a lot of higher level math) - [ ] [An Introduction to Statistical Learning with Applications in R](http://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/) (by the authors of The Elements of Statistical Learning at Stanford.) - [ ] [Machine Learning with R](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-r-second-edition) @@ -91,7 +91,7 @@ PostGIS/PostGRESQL: - [ ] [Bootstrapping Machine Learning](http://www.louisdorard.com/machine-learning-book/) - [ ] [Applied Predictive Modeling] (http://www.amazon.com/gp/product/1461468485?psc=1&redirect=true&ref_=oh_aui_detailpage_o08_s00) -

Analysis

+

Analysis

- [ ] [Practical Data Science Cookbook](http://www.diveintopython.net/) - [ ] [R Data Analysis Cookbook](code.google.com/edu/languages/google-python-class/) @@ -122,7 +122,7 @@ PostGIS/PostGRESQL: - [X] MS in Community and Regional Planning, UT-Austin - [X] BA in Liberal Arts, concentration in geography, UT-Austin -

OpenSource Data Science Masters Capstone Project

+

OpenSource Data Science Masters Capstone Project

 I'm interested in using data science approaches for better intelligence behind real estate decisions, specifically evaluating population growth, transactions and location decisions. I'd also like to evaluate statistical learning techniques to make better pricing decisions. Finally, I'd like to develop a model to optimize real estate portfolios.

 If you'd like to pair up for the capstone, [let me know](http://www.twitter.com/scottdavisCRE)

From f306f389f60dfccc9ecf5edc5163e4232f451091 Mon Sep 17 00:00:00 2001
From: Scott Davis
Date: Wed, 23 Mar 2016 16:50:27 -0500
Subject: [PATCH 04/38] Update scott-davis-transcript.md

---
 transcripts/scott-davis-transcript.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md
index 31f89e44..a99cd68d 100644
--- a/transcripts/scott-davis-transcript.md
+++ b/transcripts/scott-davis-transcript.md
@@ -65,7 +65,7 @@ MySQL:
 Octave:
  - [ ] [GNU Octave Beginners Guide](https://www.packtpub.com/big-data-and-business-intelligence/gnu-octave-beginners-guide)
- -
+
 PostGIS/PostGRESQL:
  - [ ] [PostGIS Essentials](https://www.packtpub.com/big-data-and-business-intelligence/postgis-essentials)
  - [ ] [PostGRESQL Tutorial](http://www.postgresqltutorial.com/)
@@ -74,7 +74,7 @@ PostGIS/PostGRESQL:

Algorithms

- [ ] [Algorithms Design & Analysis](http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=IntroToAlgorithms) Stanford openclassroom -

Distributed Computing Paradigms

+

Distributed Computing Paradigms

  - [ ] Intro to Hadoop and MapReduce by Cloudera and Udacity
 *Note: I might swap the above course with an EdX course on Apache Spark and distributed computing*

From 4464351e4392af295c94723fae98a11b269c5ecc Mon Sep 17 00:00:00 2001
From: Scott Davis
Date: Wed, 23 Mar 2016 16:51:28 -0500
Subject: [PATCH 05/38] Update scott-davis-transcript.md

---
 transcripts/scott-davis-transcript.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md
index a99cd68d..e0d2002c 100644
--- a/transcripts/scott-davis-transcript.md
+++ b/transcripts/scott-davis-transcript.md
@@ -109,7 +109,7 @@ PostGIS/PostGRESQL:

Data Design/Data Viz

  - [ ] [Beautiful Evidence](http://www.edwardtufte.com/tufte/books_be)
  - [ ] [Semiology of Graphics](http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/1589482611)
- - [ ] [Visual Complexity Mapping Patterns of Information](hhttp://www.visualcomplexity.com/vc/book/)
+ - [ ] [Visual Complexity Mapping Patterns of Information](http://www.visualcomplexity.com/vc/book/)
  - [ ] [The Visual Display of Quantitative Information](http://www.edwardtufte.com/tufte/books_vdqi)
  - [ ] [Design for Information](http://isabelmeirelles.com/book-design-for-information/)
  - [ ] [Design Elements: A Graphical Style Manual](http://www.amazon.com/Design-Elements-Graphic-Style-Manual/dp/1592532616)

From 7e8245feaf3d7aa4b5c640eacbadf092b9b8988a Mon Sep 17 00:00:00 2001
From: Scott Davis
Date: Sat, 16 Apr 2016 20:46:52 -0600
Subject: [PATCH 06/38] Added a couple of resources, fixed tags

---
 scott-davis-transcript.md | 133 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 133 insertions(+)
 create mode 100644 scott-davis-transcript.md

diff --git a/scott-davis-transcript.md b/scott-davis-transcript.md
new file mode 100644
index 00000000..e054318a
--- /dev/null
+++ b/scott-davis-transcript.md
@@ -0,0 +1,133 @@
+

Scott Davis Transcript

+

Open Source Data Science Masters

+ +
I'm going to have some time for independent study this year, so I plan to work through as much as possible. I work in the real estate industry and we have so much data that isn't used for meaningful analysis, and the tools, though readily available, haven't caught up with the needs of real estate users. That's what I'm interested in working on. I use a lot of GIS and R, so my curriculum is tailored to follow [R](https://www.r-project.org/)/[Python](www.python.org) and [QGIS](www.qgis.org). I'm a bit of an open-source nut so I like learning much better this way. I'm looking for people to connect with, and possibly to work on projects.
+ +Want to collaborate? Get in touch: + * [linkedin](http://www.linkedin.com/in/scottcdavis); + * [twitter](http://www.twitter.com/scottdavisCRE); or + * [email](mailto:scott@tisonadevelopment.com) + + +

Open Source Curriculum

+

Base Introduction

+Data Science Introductions + - [ ] Intro to Data Science by UW / Coursera, online course + - [ ] Data Science Specialization by Johns Hopkins / Coursera + - [X] [Data Scientists Toolbox](https://www.coursera.org/account/accomplishments/certificate/UY4EBM46HL) + - [X] [R Programming](https://www.coursera.org/account/accomplishments/records/Va5vuEvGKyr7UyHEL) + - [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW) + - [X] [Exploratory Data Analysis](https://www.coursera.org/account/accomplishments/records/2PPsRu2Us3sUehBQ) + - [X] [Reproducible Research] + - [ ] [Statistical Inference] (in progress) + - [ ] [Regression Models] (in progress) + - [X] [Practical Machine Learning] + - [ ] [Developing Data Products] + - [ ] [Data Science Capstone] +- [ ] [Data Science by Harvard](http://cs109.github.io/2015/) (online course) +- [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) +- [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) +- [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - in Excel, but also works in LibreOffice and so much of business analytics is still in Excel. + + +

Mathematics/Statistics

+ - [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html) + - [ ] [Statistics for Spatio-Temporal Data](http://www.wiley.com/WileyCDA/WileyTitle/productCd-EHEP002348.html) + - [ ] [Linear Algebra](http://www.amazon.com/Linear-Algebra-Dover-Books-Mathematics/dp/048663518X) + - [ ] Problem-Solving Heuristics: [How to Solve It](http://www.amazon.com/How-Solve-It-Mathematical-Princeton/dp/069111966X) + +

Computing

+R: + - [ ] [R in Action](https://www.manning.com/books/r-in-action-second-edition?a_bid=5c2b1e1d&a_aid=RiA2ed) + - [ ] [R Cookbook](http://shop.oreilly.com/product/9780596809164.do) + - [ ] [Forecasting: Principles and Practice](http://otexts.com/fpp/) + +R Libraries/Task Views + * [ProjectTemplate](http://projecttemplate.net/index.html) + * Spatial Data [CRAN Task View: Analysis of Spatial Data](https://cran.r-project.org/web/views/Spatial.html) + * Spatio-Temporal Data [CRAN Task View: Handling and Analyzing Spatio-Temporal Data](https://cran.r-project.org/web/views/SpatioTemporal.html) + * Optimization [CRAN Task View: Optimization and Mathematical Programming](https://cran.r-project.org/web/views/Optimization.html) + * Finance [CRAN Task View: Empirical Finance](https://cran.r-project.org/web/views/Finance.html) + +Python: + - [ ] [Dive Into Python](http://www.diveintopython.net/) + - [ ] [Google's Python Class](code.google.com/edu/languages/google-python-class/) + - [ ] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) + - [ ] [Webscraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python) + +QGIS: + - [X] [QGIS Tutorials and Tips](http://www.qgistutorials.com/en/) + - [X] [Mastering QGIS](https://www.packtpub.com/application-development/mastering-qgis) + - [ ] [Building Mapping Applications with QGIS](https://www.packtpub.com/application-development/building-mapping-applications-qgis) + - [ ] [GIS Tutorial Workbook 1](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=232&moduleID=1) This is for ArcView, but you can work the examples in QGIS too + - [ ] [GIS Tutorial Workbook 2: Spatial Analysis](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=230&moduleID=0) This is for ArcView, but you can work the examples in QGIS too + - [ ] [QGIS Map Design](https://locatepress.com/qmd) I've just thumbed through this, but it's beautiful and belongs on 
any list of GIS books. + +MySQL: + - [ ] [Learn MySQL in One Video](https://www.youtube.com/watch?v=yPu6qV5byu4) + - [ ] [MySQL Workbench Starter](code.google.com/edu/languages/google-python-class/) + +Octave: + - [ ] [GNU Octave Beginners Guide](https://www.packtpub.com/big-data-and-business-intelligence/gnu-octave-beginners-guide) + - +PostGIS/PostGRESQL: + - [ ] [PostGIS Essentials](https://www.packtpub.com/big-data-and-business-intelligence/postgis-essentials) + - [ ] [PostGRESQL Tutorial](http://www.postgresqltutorial.com/) + - [ ] [PostgreSQL: Up and Running: A Practical Introduction to the Advanced Open Source Database](http://shop.oreilly.com/product/0636920032144.do) + +

Algorithms

+ - [ ] [Algorithms Design & Analysis](http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=IntroToAlgorithms) Stanford openclassroom + +

Distributed Computing Paradigms

+ - [ ] Intro to Hadoop and MapReduce by Cloudera and Udacity +*Note: I might swap the above course with an EdX course on Apache Spark and distributed computing* + +

Data Mining

+ - [ ] Mining Massive Data Sets, by Stanford and Coursera + - [ ] [Clean Data](https://www.packtpub.com/big-data-and-business-intelligence/clean-data) + +

Machine Learning/Predictive Analytics - Foundational/Theoretical/Practical

+ - [ ] Machine Learning, by Ng Stanford and Coursera (NB this class requires a lot of higher level math) + - [ ] [An Introduction to Statistical Learning with Applications in R](http://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/) (by the authors of The Elements of Statistical Learning at Stanford.) + - [ ] [Machine Learning with R](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-r-second-edition) + - [ ] [Building a Recommendation System in R](https://www.packtpub.com/big-data-and-business-intelligence/building-recommendation-system-r) + - [ ] [Mastering Predictive Analytics in R](https://www.packtpub.com/application-development/mastering-predictive-analytics-r) + - [ ] [Bootstrapping Machine Learning](http://www.louisdorard.com/machine-learning-book/) + - [ ] [Applied Predictive Modeling](http://www.amazon.com/gp/product/1461468485?psc=1&redirect=true&ref_=oh_aui_detailpage_o08_s00) + +

Analysis

+ - [ ] [Practical Data Science Cookbook](http://www.diveintopython.net/) + - [ ] [R Data Analysis Cookbook](code.google.com/edu/languages/google-python-class/) + +

Spatial Analysis

+ - [ ] [An Introduction to R for Spatial Analysis and Mapping](https://us.sagepub.com/en-us/nam/an-introduction-to-r-for-spatial-analysis-and-mapping/book241031) + - [ ] [Applied Spatial Data Analysis with R](http://www.springer.com/us/book/9781461476177) + +

Land Use/Transport/Gravity Modeling

+ - [ ] [Integrated Land Use and Transport Modelling: Decision Chains and Hierarchies](http://www.amazon.com/gp/product/0521022177?psc=1&redirect=true&ref_=oh_aui_detailpage_o03_s00) + - [ ] [Gravity and Spatial Interaction Models (Scientific Geography Series)](http://www.amazon.com/gp/product/0803925441?psc=1&redirect=true&ref_=oh_aui_detailpage_o06_s00) + - [ ] [TRANUS Model](http://www.tranus.com/tranus-english) + - [ ] [Urban Sim](https://pypi.python.org/pypi/urbansim) + - [ ] [Huff-tools Package in R](http://rstudio-pubs-static.s3.amazonaws.com/42357_1e6fcc5bcfec439096eb86a106ebf22e.html) + - +

Data Design/Data Viz

+ - [ ] [Beautiful Evidence](http://www.edwardtufte.com/tufte/books_be) + - [ ] [Semiology of Graphics](http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/1589482611) + - [ ] [Visual Complexity Mapping Patterns of Information](hhttp://www.visualcomplexity.com/vc/book/) + - [ ] [The Visual Display of Quantitative Information](http://www.edwardtufte.com/tufte/books_vdqi) + - [ ] [Design for Information](http://isabelmeirelles.com/book-design-for-information/) + - [ ] [Design Elements: A Graphical Style Manual](http://www.amazon.com/Design-Elements-Graphic-Style-Manual/dp/1592532616) + - [ ] [Storytelling with Data](http://www.amazon.com/gp/product/1119002257?psc=1&redirect=true&ref_=oh_aui_detailpage_o09_s00) + - [ ] [Mastering Python Data Visualization](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization) + - [ ] [The Grammar of Graphics](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization) + - [ ] [R Graphics Cookbook](http://shop.oreilly.com/product/9780596809164.do) + +

Relevant prior studies

+ - [X] MS in Community and Regional Planning, UT-Austin + - [X] BA in Liberal Arts, concentration in geography, UT-Austin + +

OpenSource Data Science Masters Capstone Project

+I'm interested in using data science approaches for better intelligence behind real estate decisions, specifically evaluating population growth, transactions and location decisions. I'd also like to evaluate statistical learning techniques to make better pricing decisions. Finally, I'd like to develop a model to optimize real estate portfolios.
+
+If you'd like to pair up for the capstone, [let me know](http://www.twitter.com/scottdavisCRE)
+

From 70ca8069450152b45d44bd46a54362d8f82d6439 Mon Sep 17 00:00:00 2001
From: Scott Davis
Date: Mon, 18 Apr 2016 18:17:45 -0500
Subject: [PATCH 07/38] Updates to transcript

---
 transcripts/scott-davis-transcript.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md
index e0d2002c..34ca55a9 100644
--- a/transcripts/scott-davis-transcript.md
+++ b/transcripts/scott-davis-transcript.md
@@ -18,22 +18,22 @@ Data Science Introductions
  - [X] [R Programming](https://www.coursera.org/account/accomplishments/records/Va5vuEvGKyr7UyHEL)
  - [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW)
  - [X] [Exploratory Data Analysis](https://www.coursera.org/account/accomplishments/records/2PPsRu2Us3sUehBQ)
- - [ ] [Reproducible Research] (in progress)
+ - [X] [Reproducible Research]
  - [ ] [Statistical Inference]
  - [ ] [Regression Models]
- - [ ] [Practical Machine Learning] (in progress)
+ - [X] [Practical Machine Learning]
  - [ ] [Developing Data Products]
  - [ ] [Data Science Capstone]
 - [ ] [Data Science by Harvard](http://cs109.github.io/2015/) (online course)
 - [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do)
-- [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf)
+- [ ] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf)
 - [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - in Excel, but also works in LibreOffice and so much of business analytics is still in Excel.

Mathematics/Statistics

- [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html) - [ ] [Statistics for Spatio-Temporal Data](http://www.wiley.com/WileyCDA/WileyTitle/productCd-EHEP002348.html) - [ ] [Linear Algebra](http://www.amazon.com/Linear-Algebra-Dover-Books-Mathematics/dp/048663518X) - - [ ] Problem-Solving Heuristics: [How to Solve It](http://www.amazon.com/How-Solve-It-Mathematical-Princeton/dp/069111966X) + - [X] Problem-Solving Heuristics: [How to Solve It](http://www.amazon.com/How-Solve-It-Mathematical-Princeton/dp/069111966X)

Computing

R: @@ -55,8 +55,8 @@ Python: - [ ] [Webscraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python) QGIS: - - [ ] [QGIS Tutorials and Tips](http://www.qgistutorials.com/en/) - - [ ] [Mastering QGIS](https://www.packtpub.com/application-development/mastering-qgis) + - [X] [QGIS Tutorials and Tips](http://www.qgistutorials.com/en/) + - [X] [Mastering QGIS](https://www.packtpub.com/application-development/mastering-qgis) - [ ] [Building Mapping Applications with QGIS](https://www.packtpub.com/application-development/building-mapping-applications-qgis) MySQL: From d22bab895c23096bcf15e8f171386e4dcc8d7981 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Mon, 18 Apr 2016 18:39:45 -0500 Subject: [PATCH 08/38] additional updates, fixed some formatting --- transcripts/scott-davis-transcript.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 34ca55a9..35a0e0b9 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -27,12 +27,12 @@ Data Science Introductions - [ ] [Data Science by Harvard](http://cs109.github.io/2015/) (online course) - [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) - [ ] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) -- [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - in Excel, but also works in LibreOffice and so much of business analytics is still in Excel. +- [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel for data science tools. 
Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses.

Mathematics/Statistics

- [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html) - [ ] [Statistics for Spatio-Temporal Data](http://www.wiley.com/WileyCDA/WileyTitle/productCd-EHEP002348.html) - - [ ] [Linear Algebra](http://www.amazon.com/Linear-Algebra-Dover-Books-Mathematics/dp/048663518X) + - [ ] [Linear Programming: An Introduction With Applications (Second Edition)](http://www.amazon.com/Linear-Programming-Introduction-Applications-Edition/dp/1463543670?ie=UTF8&psc=1&redirect=true&ref_=oh_aui_detailpage_o01_s00) - [X] Problem-Solving Heuristics: [How to Solve It](http://www.amazon.com/How-Solve-It-Mathematical-Princeton/dp/069111966X)

Computing

@@ -58,6 +58,9 @@ QGIS: - [X] [QGIS Tutorials and Tips](http://www.qgistutorials.com/en/) - [X] [Mastering QGIS](https://www.packtpub.com/application-development/mastering-qgis) - [ ] [Building Mapping Applications with QGIS](https://www.packtpub.com/application-development/building-mapping-applications-qgis) + - [X] [GIS Tutorial Workbook 1](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=232&moduleID=1) This is for ArcView, but you can work the examples in QGIS too + - [ ] [GIS Tutorial Workbook 2: Spatial Analysis](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=230&moduleID=0) This is for ArcView, but you can work the examples in QGIS too + - [ ] [QGIS Map Design](https://locatepress.com/qmd) I've just thumbed through this, but it's beautiful and belongs on any list of GIS books. MySQL: - [ ] [Learn MySQL in One Video](https://www.youtube.com/watch?v=yPu6qV5byu4) From de7679edd89294354ccc2f4aec30ae73404e4dde Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Mon, 18 Apr 2016 18:41:31 -0500 Subject: [PATCH 09/38] updates --- transcripts/scott-davis-transcript.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 35a0e0b9..157b4285 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -1,7 +1,7 @@

Scott Davis Transcript

Open Source Data Science Masters

-I'm going to have some time for indepedent study this year so I plan to work through as much as possible. I work in the real estate industry and we have so much data that isn't used for meaningful analysis and the tools, though readily available, haven't caught up for the needs of real estate users. That's what I'm interested in working on. I use a lot of GIS and R, so my curriculum is tailored to follow [R](https://www.r-project.org/)/[Python](www.python.org) and [QGIS](www.qgis.org). I'm a bit of an open-source nut so I like learning much better this way. I'm looking for people to connect with, and possibly to work on projects. +I'm going to have some time for independent study this year so I plan to work through as much as possible. I work in the real estate industry and we have so much data that isn't used for meaningful analysis and the tools, though readily available, haven't caught up for the needs of real estate users. That's what I'm interested in working on. I use a lot of GIS and R, so my curriculum is tailored to follow [R](https://www.r-project.org/)/[Python](www.python.org) and [QGIS](www.qgis.org). I'm a bit of an open-source nut so I like learning much better this way. I'm looking for people to connect with, and possibly to work on projects. Also, maybe not technically purely open source as I've used a lot of books - which I've linked to here. Want to collaborate?
Get in touch: * [linkedin](http://www.linkedin.com/in/scottcdavis); From 386dce9f2761d09d5697b697aa82e0a2074595dd Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Tue, 19 Apr 2016 08:44:11 -0500 Subject: [PATCH 10/38] corrected a link --- transcripts/scott-davis-transcript.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 157b4285..854444e0 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -19,14 +19,14 @@ Data Science Introductions - [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW) - [X] [Exploratory Data Analysis](https://www.coursera.org/account/accomplishments/records/2PPsRu2Us3sUehBQ) - [X] [Reproducible Research] - - [ ] [Statistical Inference] - - [ ] [Regression Models] + - [ ] [Statistical Inference] in progress + - [ ] [Regression Models] in progress - [X] [Practical Machine Learning] - [ ] [Developing Data Products] - [ ] [Data Science Capstone] - [ ] [Data Science by Harvard](http://cs109.github.io/2015/) (online course) - [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) -- [ ] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) +- [X] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) - [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel for data science tools. Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses.

Mathematics/Statistics

@@ -118,7 +118,7 @@ PostGIS/PostGRESQL: - [ ] [Design Elements: A Graphical Style Manual](http://www.amazon.com/Design-Elements-Graphic-Style-Manual/dp/1592532616) - [ ] [Storytelling with Data] (http://www.amazon.com/gp/product/1119002257?psc=1&redirect=true&ref_=oh_aui_detailpage_o09_s00) - [ ] [Mastering Python Data Visualization](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization) - - [ ] [The Grammar of Graphics](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization) + - [ ] [The Grammar of Graphics](http://www.springer.com/us/book/9780387245447) - [ ] [R Graphics Cookbook](http://shop.oreilly.com/product/9780596809164.do)

Relevant prior studies

From aef621d45ef3a5305e180ac4d16df40305ecb82b Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Fri, 22 Apr 2016 18:15:09 -0500 Subject: [PATCH 11/38] Updated with new algorithm certification --- transcripts/scott-davis-transcript.md | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 854444e0..218c0a7e 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -18,16 +18,15 @@ Data Science Introductions - [X] [R Programming](https://www.coursera.org/account/accomplishments/records/Va5vuEvGKyr7UyHEL) - [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW) - [X] [Exploratory Data Analysis](https://www.coursera.org/account/accomplishments/records/2PPsRu2Us3sUehBQ) - - [X] [Reproducible Research] + - [X] [Reproducible Research](https://www.coursera.org/account/accomplishments/certificate/YRP8NLFYPCV9) - [ ] [Statistical Inference] in progress - [ ] [Regression Models] in progress - - [X] [Practical Machine Learning] + - [X] [Practical Machine Learning](https://www.coursera.org/account/accomplishments/certificate/AJJS85KTU6GZ) - [ ] [Developing Data Products] - [ ] [Data Science Capstone] -- [ ] [Data Science by Harvard](http://cs109.github.io/2015/) (online course) - [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) - [X] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) -- [ ] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel for data science tools.
Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses. +- [X] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel for data science tools. Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses.

Mathematics/Statistics

- [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html) @@ -75,7 +74,13 @@ PostGIS/PostGRESQL: - [ ] [PostgreSQL: Up and Running: A Practical Introduction to the Advanced Open Source Database](http://shop.oreilly.com/product/0636920032144.do)

Algorithms

- - [ ] [Algorithms Design & Analysis](http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=IntroToAlgorithms) Stanford openclassroom +- [ ] Data Structures and Algorithms by UCSD / Coursera + - [ ] [Algorithmic Toolbox] in progress + - [ ] [Data Structures] + - [ ] [Algorithms on Graphs and Trees] + - [ ] [Algorithms on Strings] + - [ ] [Advanced Algorithms and Complexity] + - [ ] [Assembling Genomes and Finding Disease-Causing Mutations]

Distributed Computing Paradigms

- [ ] Intro to Hadoop and MapReduce by Cloudera and Udacity From 5590047bcfe3f68e1221f5826e192d33ff9cdc6a Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Thu, 28 Apr 2016 22:35:20 -0500 Subject: [PATCH 12/38] Update with edx classes --- transcripts/scott-davis-transcript.md | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 218c0a7e..2614e5ed 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -12,7 +12,10 @@ Want to collaborate? Get in touch:

Open Source Curriculum

Base Introduction

Data Science Introductions - - [ ] Intro to Data Science by UW / Coursera, online course +- [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) +- [ ] [Data Science from Scratch](http://shop.oreilly.com/product/0636920033400.do) +- [X] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) +- [X] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel for data science tools. Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses. - [ ] Data Science Specialization by Johns Hopkins / Coursera - [X] [Data Scientists Toolbox](https://www.coursera.org/account/accomplishments/certificate/UY4EBM46HL) - [X] [R Programming](https://www.coursera.org/account/accomplishments/records/Va5vuEvGKyr7UyHEL) @@ -24,9 +27,7 @@ Data Science Introductions - [X] [Practical Machine Learning](https://www.coursera.org/account/accomplishments/certificate/AJJS85KTU6GZ) - [ ] [Developing Data Products] - [ ] [Data Science Capstone] -- [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) -- [X] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) -- [X] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel for data science tools. Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses. +

Mathematics/Statistics

- [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html) @@ -50,6 +51,7 @@ R Libraries/Task Views Python: - [ ] [Dive Into Python](http://www.diveintopython.net/) - [ ] [Google's Python Class](code.google.com/edu/languages/google-python-class/) + - [ ] [Introduction to Python for Data Science - edx](https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+2T2016/info) - [ ] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) - [ ] [Webscraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python) @@ -82,22 +84,22 @@ PostGIS/PostGRESQL: - [ ] [Advanced Algorithms and Complexity] - [ ] [Assembling Genomes and Finding Disease-Causing Mutations] -

Distributed Computing Paradigms

- - [ ] Intro to Hadoop and MapReduce by Cloudera and Udacity -*Note: I might swap the above course with an EdX course on Apache Spark and distributed computing* +

Distributed Computing

+ - [ ] Introduction to Spark, edx + - [ ] Machine Learning with Spark, edx

Data Mining

- [ ] Mining Massive Data Sets, by Stanford and Coursera - [ ] [Clean Data] (https://www.packtpub.com/big-data-and-business-intelligence/clean-data)

Machine Learning/Predictive Analytics - Foundational/Theoretical/Practical

- - [ ] Machine Learning, by Ng Stanford and Coursera (NB this class requires a lot of higher level math) + - [ ] [Statistical Learning with Trevor Hastie and Robert Tibshirani](http://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/) - [ ] [An Introduction to Statistical Learning with Applications in R](http://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/) (by the authors of The Elements of Statistical Learning at Stanford.) - [ ] [Machine Learning with R](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-r-second-edition) - [ ] [Building a Recommendation System in R](https://www.packtpub.com/big-data-and-business-intelligence/building-recommendation-system-r) - [ ] [Mastering Predictive Analytics in R](https://www.packtpub.com/application-development/mastering-predictive-analytics-r) - [ ] [Bootstrapping Machine Learning](http://www.louisdorard.com/machine-learning-book/) - - [ ] [Applied Predictive Modeling] (http://www.amazon.com/gp/product/1461468485?psc=1&redirect=true&ref_=oh_aui_detailpage_o08_s00) + - [ ] [Applied Predictive Modeling](http://www.amazon.com/gp/product/1461468485?psc=1&redirect=true&ref_=oh_aui_detailpage_o08_s00)

Analysis

- [ ] [Practical Data Science Cookbook](http://www.diveintopython.net/) @@ -112,7 +114,7 @@ PostGIS/PostGRESQL: - [ ] [Gravity and Spatial Interaction Models (Scientific Geography Series)](http://www.amazon.com/gp/product/0803925441?psc=1&redirect=true&ref_=oh_aui_detailpage_o06_s00) - [ ] [TRANUS Model](http://www.tranus.com/tranus-english) - [ ] [Urban Sim](https://pypi.python.org/pypi/urbansim) - - [ ] [Huff-tools Package in R] (http://rstudio-pubs-static.s3.amazonaws.com/42357_1e6fcc5bcfec439096eb86a106ebf22e.html) + - [ ] [Huff-tools Package in R](http://rstudio-pubs-static.s3.amazonaws.com/42357_1e6fcc5bcfec439096eb86a106ebf22e.html) -

Data Design/Data Viz

- [ ] [Beautiful Evidence](http://www.edwardtufte.com/tufte/books_be) From 7be0b77b0de3fafe7905fbc8311c39cb6613e0d0 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Mon, 2 May 2016 16:08:09 -0500 Subject: [PATCH 13/38] Updated with websites, along with completions to date --- transcripts/scott-davis-transcript.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 2614e5ed..8baa8b60 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -51,7 +51,7 @@ R Libraries/Task Views Python: - [ ] [Dive Into Python](http://www.diveintopython.net/) - [ ] [Google's Python Class](code.google.com/edu/languages/google-python-class/) - - [ ] [Introduction to Python for Data Science - edx](https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+2T2016/info) + - [X] [Introduction to Python for Data Science - edx](https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+2T2016/info) - [ ] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) - [ ] [Webscraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python) @@ -98,16 +98,17 @@ PostGIS/PostGRESQL: - [ ] [Machine Learning with R](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-r-second-edition) - [ ] [Building a Recommendation System in R](https://www.packtpub.com/big-data-and-business-intelligence/building-recommendation-system-r) - [ ] [Mastering Predictive Analytics in R](https://www.packtpub.com/application-development/mastering-predictive-analytics-r) - - [ ] [Bootstrapping Machine Learning](http://www.louisdorard.com/machine-learning-book/) + - [X] [Bootstrapping Machine Learning](http://www.louisdorard.com/machine-learning-book/) - [ ] [Applied Predictive Modeling](http://www.amazon.com/gp/product/1461468485?psc=1&redirect=true&ref_=oh_aui_detailpage_o08_s00)

Analysis

- - [ ] [Practical Data Science Cookbook](http://www.diveintopython.net/) - - [ ] [R Data Analysis Cookbook](code.google.com/edu/languages/google-python-class/) + - [ ] [Practical Data Science Cookbook](https://www.packtpub.com/big-data-and-business-intelligence/practical-data-science-cookbook) + - [ ] [R Data Analysis Cookbook](http://www.amazon.com/Data-Analysis-Cookbook-Recipes-Deliver/dp/1783989068)

Spatial Analysis

- - [ ] [An Introduction to R for Spatial Analysis and Mapping](http://www.edwardtufte.com/tufte/books_be) + - [ ] [An Introduction to R for Spatial Analysis and Mapping](https://uk.sagepub.com/en-gb/eur/an-introduction-to-r-for-spatial-analysis-and-mapping/book241031) - [ ] [Applied Spatial Data Analysis with R](http://www.springer.com/us/book/9781461476177) + - [ ] [Geospatial Analysis - 5th Edition, 2015 - de Smith, Goodchild, Longley](http://www.spatialanalysisonline.com/HTML/index.html)

Land Use/Transport/Gravity Modeling

- [ ] [Integrated Land Use and Transport Modelling: Decision Chains and Hierarchies](http://www.amazon.com/gp/product/0521022177?psc=1&redirect=true&ref_=oh_aui_detailpage_o03_s00) @@ -123,7 +124,7 @@ PostGIS/PostGRESQL: - [ ] [The Visual Display of Quantitative Information](http://www.edwardtufte.com/tufte/books_vdqi) - [ ] [Design for Information](http://isabelmeirelles.com/book-design-for-information/) - [ ] [Design Elements: A Graphical Style Manual](http://www.amazon.com/Design-Elements-Graphic-Style-Manual/dp/1592532616) - - [ ] [Storytelling with Data] (http://www.amazon.com/gp/product/1119002257?psc=1&redirect=true&ref_=oh_aui_detailpage_o09_s00) + - [X] [Storytelling with Data](http://www.amazon.com/gp/product/1119002257?psc=1&redirect=true&ref_=oh_aui_detailpage_o09_s00) - [ ] [Mastering Python Data Visualization](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization) - [ ] [The Grammar of Graphics](http://www.springer.com/us/book/9780387245447) - [ ] [R Graphics Cookbook](http://shop.oreilly.com/product/9780596809164.do) From ee559df5eadd2cede68c7c1463360170ca527eb9 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Sat, 7 May 2016 08:22:08 -0500 Subject: [PATCH 14/38] Updated with course completions --- transcripts/scott-davis-transcript.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 8baa8b60..27fc7a96 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -22,10 +22,10 @@ Data Science Introductions - [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW) - [X] [Exploratory Data Analysis](https://www.coursera.org/account/accomplishments/records/2PPsRu2Us3sUehBQ) - [X] [Reproducible Research](https://www.coursera.org/account/accomplishments/certificate/YRP8NLFYPCV9) - - [ ] [Statistical Inference] in progress - 
- [ ] [Regression Models] in progress + - [X] [Statistical Inference](https://www.coursera.org/account/accomplishments/records/9733QCP94GEF) + - [X] [Regression Models](https://www.coursera.org/account/accomplishments/records/PP8SKS7CPSDC) - [X] [Practical Machine Learning](https://www.coursera.org/account/accomplishments/certificate/AJJS85KTU6GZ) - - [ ] [Developing Data Products] + - [ ] [Developing Data Products] in progress - [ ] [Data Science Capstone] From 2c84ef2ef5b8238097a8f91ffbfd1054d9693a41 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Wed, 11 May 2016 19:33:43 -0500 Subject: [PATCH 15/38] Updated completions --- transcripts/scott-davis-transcript.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 27fc7a96..d35e270c 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -50,7 +50,7 @@ R Libraries/Task Views Python: - [ ] [Dive Into Python](http://www.diveintopython.net/) - - [ ] [Google's Python Class](code.google.com/edu/languages/google-python-class/) + - [X] [Google's Python Class](code.google.com/edu/languages/google-python-class/) - [X] [Introduction to Python for Data Science - edx](https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+2T2016/info) - [ ] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) - [ ] [Webscraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python) From 208dd78ab0b12cd162a8f3770577103d2dcb5be1 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Mon, 23 May 2016 15:46:04 -0500 Subject: [PATCH 16/38] Completed Developing Data products class --- transcripts/scott-davis-transcript.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index d35e270c..dd961653 100644 --- a/transcripts/scott-davis-transcript.md +++ 
b/transcripts/scott-davis-transcript.md @@ -25,7 +25,7 @@ Data Science Introductions - [X] [Statistical Inference](https://www.coursera.org/account/accomplishments/records/9733QCP94GEF) - [X] [Regression Models](https://www.coursera.org/account/accomplishments/records/PP8SKS7CPSDC) - [X] [Practical Machine Learning](https://www.coursera.org/account/accomplishments/certificate/AJJS85KTU6GZ) - - [ ] [Developing Data Products] in progress + - [X] [Developing Data Products](https://www.coursera.org/account/accomplishments/certificate/6QREL457PPKE) - [ ] [Data Science Capstone] From e47f6998952591e75f536c3ce47a13b43244b10a Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Mon, 23 May 2016 15:53:40 -0500 Subject: [PATCH 17/38] updated with edx materials --- transcripts/scott-davis-transcript.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index dd961653..62733735 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -86,7 +86,7 @@ PostGIS/PostGRESQL:

Distributed Computing

- [ ] Introduction to Spark, edx - - [ ] Machine Learning with Spark, edx + - [ ] Distributed Machine Learning with Apache Spark, edx

Data Mining

- [ ] Mining Massive Data Sets, by Stanford and Coursera @@ -116,7 +116,8 @@ PostGIS/PostGRESQL: - [ ] [TRANUS Model](http://www.tranus.com/tranus-english) - [ ] [Urban Sim](https://pypi.python.org/pypi/urbansim) - [ ] [Huff-tools Package in R](http://rstudio-pubs-static.s3.amazonaws.com/42357_1e6fcc5bcfec439096eb86a106ebf22e.html) - - + - [ ] [Big Data for Smart Cities](https://courses.edx.org/courses/course-v1:IEEEx+IntroData.x+2016_T3/info) +

Data Design/Data Viz

- [ ] [Beautiful Evidence](http://www.edwardtufte.com/tufte/books_be) - [ ] [Semiology of Graphics](http://www.amazon.com/Semiology-Graphics-Diagrams-Networks-Maps/dp/1589482611) From 791674ea3e83fb5d07a384f7795a23f7223c38cd Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Mon, 23 May 2016 19:47:48 -0500 Subject: [PATCH 18/38] Finished webscraping with python --- transcripts/scott-davis-transcript.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 62733735..20c2d405 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -53,7 +53,7 @@ Python: - [X] [Google's Python Class](code.google.com/edu/languages/google-python-class/) - [X] [Introduction to Python for Data Science - edx](https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+2T2016/info) - [ ] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) - - [ ] [Webscraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python) + - [X] [Webscraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python) QGIS: - [X] [QGIS Tutorials and Tips](http://www.qgistutorials.com/en/) From 5cf800e872bcaccecda901b03cbf9da2e82f4dae Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Tue, 24 May 2016 14:09:15 -0500 Subject: [PATCH 19/38] finished data science from scratch --- transcripts/scott-davis-transcript.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 20c2d405..9ce60026 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -13,7 +13,7 @@ Want to collaborate? Get in touch:

Base Introduction

Data Science Introductions - [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) -- [ ] [Data Science from Scratch](http://shop.oreilly.com/product/0636920033400.do) +- [X] [Data Science from Scratch](http://shop.oreilly.com/product/0636920033400.do) - [X] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) - [X] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel for data science tools. Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses. - [ ] Data Science Specialization by Johns Hopkins / Coursera From bd079aa12611bbaa8936a764bc9ae3b1b7fcf4c2 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Mon, 13 Jun 2016 14:08:45 -0500 Subject: [PATCH 20/38] Updates for completions Completions for data science specialization Some Python completions Deleted some algorithm classes and added more geospatial --- transcripts/scott-davis-transcript.md | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 9ce60026..ea4b82b5 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -16,7 +16,7 @@ Data Science Introductions - [X] [Data Science from Scratch](http://shop.oreilly.com/product/0636920033400.do) - [X] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) - [X] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel 
for data science tools. Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses. - - [ ] Data Science Specialization by Johns Hopkins / Coursera + - [X] Data Science Specialization by Johns Hopkins / Coursera - [X] [Data Scientists Toolbox](https://www.coursera.org/account/accomplishments/certificate/UY4EBM46HL) - [X] [R Programming](https://www.coursera.org/account/accomplishments/records/Va5vuEvGKyr7UyHEL) - [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW) @@ -26,13 +26,13 @@ Data Science Introductions - [X] [Regression Models](https://www.coursera.org/account/accomplishments/records/PP8SKS7CPSDC) - [X] [Practical Machine Learning](https://www.coursera.org/account/accomplishments/certificate/AJJS85KTU6GZ) - [X] [Developing Data Products](https://www.coursera.org/account/accomplishments/certificate/6QREL457PPKE) - - [ ] [Data Science Capstone] + - [X] [Data Science Capstone]

Mathematics/Statistics

- [ ] [Statistics for Spatial Data, Revised Edition](http://www.wiley.com/WileyCDA/WileyTitle/productCd-1119114616.html) - [ ] [Statistics for Spatio-Temporal Data](http://www.wiley.com/WileyCDA/WileyTitle/productCd-EHEP002348.html) - - [ ] [Linear Programming: An Introduction With Applications (Second Edition)](http://www.amazon.com/Linear-Programming-Introduction-Applications-Edition/dp/1463543670?ie=UTF8&psc=1&redirect=true&ref_=oh_aui_detailpage_o01_s00) + - [X] [Linear Programming: An Introduction With Applications (Second Edition)](http://www.amazon.com/Linear-Programming-Introduction-Applications-Edition/dp/1463543670?ie=UTF8&psc=1&redirect=true&ref_=oh_aui_detailpage_o01_s00) - [X] Problem-Solving Heuristics: [How to Solve It](http://www.amazon.com/How-Solve-It-Mathematical-Princeton/dp/069111966X)

Computing

@@ -49,7 +49,7 @@ R Libraries/Task Views * Finance [CRAN Task View: Empirical Finance](https://cran.r-project.org/web/views/Finance.html) Python: - - [ ] [Dive Into Python](http://www.diveintopython.net/) + - [X] [Dive Into Python](http://www.diveintopython.net/) - [X] [Google's Python Class](code.google.com/edu/languages/google-python-class/) - [X] [Introduction to Python for Data Science - edx](https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+2T2016/info) - [ ] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) @@ -64,7 +64,7 @@ QGIS: - [ ] [QGIS Map Design](https://locatepress.com/qmd) I've just thumbed through this, but it's beautiful and belongs on any list of GIS books. MySQL: - - [ ] [Learn MySQL in One Video](https://www.youtube.com/watch?v=yPu6qV5byu4) + - [X] [Learn MySQL in One Video](https://www.youtube.com/watch?v=yPu6qV5byu4) - [ ] [MySQL Workbench Starter](code.google.com/edu/languages/google-python-class/) Octave: @@ -78,15 +78,12 @@ PostGIS/PostGRESQL:

Algorithms

- [ ] Data Structures and Algorithms by UCSD / Coursera - [ ] [Algorithmic Toolbox] in progress - - [ ] [Data Structures] - - [ ] [Algorithms on Graphs and Trees] - - [ ] [Algorithms on Strings] - - [ ] [Advanced Algorithms and Complexity] - - [ ] [Assembling Genomes and Finding Disease-Causing Mutations] +

Distributed Computing

- [ ] Introduction to Spark, edx - [ ] Distributed Machine Learning with Apache Spark, edx + - [ ] [Big Data for Smart Cities](https://courses.edx.org/courses/course-v1:IEEEx+IntroData.x+2016_T3/info)

Data Mining

- [ ] Mining Massive Data Sets, by Stanford and Coursera @@ -104,11 +101,14 @@ PostGIS/PostGRESQL:

Analysis

- [ ] [Practical Data Science Cookbook](https://www.packtpub.com/big-data-and-business-intelligence/practical-data-science-cookbook) - [ ] [R Data Analysis Cookbook](http://www.amazon.com/Data-Analysis-Cookbook-Recipes-Deliver/dp/1783989068) + - [ ] [Python Data Science Essentials](https://www.packtpub.com/big-data-and-business-intelligence/python-data-science-essentials)

Spatial Analysis

- [ ] [An Introduction to R for Spatial Analysis and Mapping](https://uk.sagepub.com/en-gb/eur/an-introduction-to-r-for-spatial-analysis-and-mapping/book241031) - [ ] [Applied Spatial Data Analysis with R](http://www.springer.com/us/book/9781461476177) - [ ] [Geospatial Analysis - 5th Edition, 2015 - de Smith, Goodchild, Longley](http://www.spatialanalysisonline.com/HTML/index.html) + - [ ] [Learning Geospatial Analysis with Python](https://www.packtpub.com/application-development/learning-geospatial-analysis-python) + - [ ] [Python Geospatial Development - Second Edition](https://www.packtpub.com/application-development/python-geospatial-development-second-edition)

Land Use/Transport/Gravity Modeling

- [ ] [Integrated Land Use and Transport Modelling: Decision Chains and Hierarchies](http://www.amazon.com/gp/product/0521022177?psc=1&redirect=true&ref_=oh_aui_detailpage_o03_s00) @@ -116,7 +116,7 @@ PostGIS/PostGRESQL: - [ ] [TRANUS Model](http://www.tranus.com/tranus-english) - [ ] [Urban Sim](https://pypi.python.org/pypi/urbansim) - [ ] [Huff-tools Package in R](http://rstudio-pubs-static.s3.amazonaws.com/42357_1e6fcc5bcfec439096eb86a106ebf22e.html) - - [ ] [Big Data for Smart Cities](https://courses.edx.org/courses/course-v1:IEEEx+IntroData.x+2016_T3/info) +

Data Design/Data Viz

- [ ] [Beautiful Evidence](http://www.edwardtufte.com/tufte/books_be) @@ -128,7 +128,7 @@ PostGIS/PostGRESQL: - [X] [Storytelling with Data](http://www.amazon.com/gp/product/1119002257?psc=1&redirect=true&ref_=oh_aui_detailpage_o09_s00) - [ ] [Mastering Python Data Visualization](https://www.packtpub.com/big-data-and-business-intelligence/mastering-python-data-visualization) - [ ] [The Grammar of Graphics](http://www.springer.com/us/book/9780387245447) - - [ ] [R Graphics Cookbook](http://shop.oreilly.com/product/9780596809164.do) + - [X] [R Graphics Cookbook](http://shop.oreilly.com/product/9780596809164.do)

Relevant prior studies

- [X] MS in Community and Regional Planning, UT-Austin From 5c4ef33ec086d907cfc15d6a4218ae4a9a185832 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Sun, 17 Jul 2016 15:42:08 -0500 Subject: [PATCH 21/38] Added completion of coursera data science --- transcripts/scott-davis-transcript.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index ea4b82b5..71d51a2d 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -16,7 +16,7 @@ Data Science Introductions - [X] [Data Science from Scratch](http://shop.oreilly.com/product/0636920033400.do) - [X] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) - [X] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel for data science tools. Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses. 
- - [X] Data Science Specialization by Johns Hopkins / Coursera + - [X] [Data Science Specialization by Johns Hopkins / Coursera](https://www.coursera.org/account/accomplishments/specialization/3WN77YYQ7QK7) - [X] [Data Scientists Toolbox](https://www.coursera.org/account/accomplishments/certificate/UY4EBM46HL) - [X] [R Programming](https://www.coursera.org/account/accomplishments/records/Va5vuEvGKyr7UyHEL) - [X] [Getting and Cleaning Data](https://www.coursera.org/account/accomplishments/records/ENSGmvNfx24sANRW) @@ -26,7 +26,7 @@ Data Science Introductions - [X] [Regression Models](https://www.coursera.org/account/accomplishments/records/PP8SKS7CPSDC) - [X] [Practical Machine Learning](https://www.coursera.org/account/accomplishments/certificate/AJJS85KTU6GZ) - [X] [Developing Data Products](https://www.coursera.org/account/accomplishments/certificate/6QREL457PPKE) - - [X] [Data Science Capstone] + - [X] [Data Science Capstone](https://www.coursera.org/account/accomplishments/certificate/A9M48VWHBAMT)

Mathematics/Statistics

From ef353f9685115ca1c59adc8e98374222c5c91415 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Sat, 20 Aug 2016 14:29:33 -0500 Subject: [PATCH 22/38] updated with algorithmic toolbox completion --- transcripts/scott-davis-transcript.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 71d51a2d..f8625e2f 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -76,8 +76,8 @@ PostGIS/PostGRESQL: - [ ] [PostgreSQL: Up and Running: A Practical Introduction to the Advanced Open Source Database](http://shop.oreilly.com/product/0636920032144.do)

Algorithms

-- [ ] Data Structures and Algorithms by UCSD / Coursera - [ ] [Algorithmic Toolbox] in progress +- [ ] Data Structures and Algorithms by UCSD / Coursera (Decided not to take the balance of the specialization) - [X] [Algorithmic Toolbox] in progress (https://www.coursera.org/account/accomplishments/certificate/RUKKXTCFDAPV)

Distributed Computing

@@ -100,8 +100,8 @@ PostGIS/PostGRESQL:

Analysis

- [ ] [Practical Data Science Cookbook](https://www.packtpub.com/big-data-and-business-intelligence/practical-data-science-cookbook) - - [ ] [R Data Analysis Cookbook](http://www.amazon.com/Data-Analysis-Cookbook-Recipes-Deliver/dp/1783989068) - - [ ] [Python Data Science Essentials](https://www.packtpub.com/big-data-and-business-intelligence/python-data-science-essentials) + - [X] [R Data Analysis Cookbook](http://www.amazon.com/Data-Analysis-Cookbook-Recipes-Deliver/dp/1783989068) + - [X] [Python Data Science Essentials](https://www.packtpub.com/big-data-and-business-intelligence/python-data-science-essentials)

Spatial Analysis

- [ ] [An Introduction to R for Spatial Analysis and Mapping](https://uk.sagepub.com/en-gb/eur/an-introduction-to-r-for-spatial-analysis-and-mapping/book241031) From 8397e0ebd33a581cc8ccf91ca57ad68a602c9067 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Tue, 30 Aug 2016 22:38:08 -0500 Subject: [PATCH 23/38] updated with some additional books finished. --- transcripts/scott-davis-transcript.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index f8625e2f..5a634562 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -52,7 +52,7 @@ Python: - [X] [Dive Into Python](http://www.diveintopython.net/) - [X] [Google's Python Class](code.google.com/edu/languages/google-python-class/) - [X] [Introduction to Python for Data Science - edx](https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+2T2016/info) - - [ ] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) + - [X] [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) - [X] [Webscraping with Python](https://www.packtpub.com/big-data-and-business-intelligence/web-scraping-python) QGIS: @@ -107,8 +107,8 @@ PostGIS/PostGRESQL: - [ ] [An Introduction to R for Spatial Analysis and Mapping](https://uk.sagepub.com/en-gb/eur/an-introduction-to-r-for-spatial-analysis-and-mapping/book241031) - [ ] [Applied Spatial Data Analysis with R](http://www.springer.com/us/book/9781461476177) - [ ] [Geospatial Analysis - 5th Edition, 2015 - de Smith, Goodchild, Longley](http://www.spatialanalysisonline.com/HTML/index.html) - - [ ] [Learning Geospatial Analysis with Python](https://www.packtpub.com/application-development/learning-geospatial-analysis-python) - - [ ] [Python Geospatial Development - Second Edition](https://www.packtpub.com/application-development/python-geospatial-development-second-edition) + - [X] [Learning Geospatial 
Analysis with Python](https://www.packtpub.com/application-development/learning-geospatial-analysis-python) + - [X] [Python Geospatial Development - Second Edition](https://www.packtpub.com/application-development/python-geospatial-development-second-edition)

Land Use/Transport/Gravity Modeling

- [ ] [Integrated Land Use and Transport Modelling: Decision Chains and Hierarchies](http://www.amazon.com/gp/product/0521022177?psc=1&redirect=true&ref_=oh_aui_detailpage_o03_s00) From 732cc156e4a11602e0889275d723d0f9ebaabd27 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Thu, 8 Sep 2016 08:51:01 -0500 Subject: [PATCH 24/38] updated with book completions --- transcripts/scott-davis-transcript.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index 5a634562..ad8e379b 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -60,7 +60,7 @@ QGIS: - [X] [Mastering QGIS](https://www.packtpub.com/application-development/mastering-qgis) - [ ] [Building Mapping Applications with QGIS](https://www.packtpub.com/application-development/building-mapping-applications-qgis) - [X] [GIS Tutorial Workbook 1](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=232&moduleID=1) This is for ArcView, but you can work the examples in QGIS too - - [ ] [GIS Tutorial Workbook 2: Spatial Analysis](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=230&moduleID=0) This is for ArcView, but you can work the examples in QGIS too + - [X] [GIS Tutorial Workbook 2: Spatial Analysis](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=230&moduleID=0) This is for ArcView, but you can work the examples in QGIS too - [ ] [QGIS Map Design](https://locatepress.com/qmd) I've just thumbed through this, but it's beautiful and belongs on any list of GIS books. 
MySQL: From 29b5eefd07af1cb8db13299e38fabd45d92e88e4 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Thu, 15 Sep 2016 14:49:25 -0500 Subject: [PATCH 25/38] updated with book completions --- transcripts/scott-davis-transcript.md | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index ad8e379b..b72b5dbe 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -12,7 +12,7 @@ Want to collaborate? Get in touch:

Open Source Curriculum

Base Introduction

Data Science Introductions -- [ ] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) +- [X] [Data Science with Open Source Tools](http://shop.oreilly.com/product/9780596802363.do) - [X] [Data Science from Scratch](http://shop.oreilly.com/product/0636920033400.do) - [X] [50 Years of Data Science](http://pages.cs.wisc.edu/~anhai/courses/784-fall15/50YearsDataScience.pdf) - [X] [Datasmart](http://www.amazon.com/Data-Smart-Science-Transform-Information/dp/111866146X/ref=sr_1_1?s=books&ie=UTF8&qid=1458768727&sr=1-1&keywords=datasmart) - This book is a thorough review of using Excel for data science tools. Every aspiring data scientist should work through this book because (1) you'll learn a lot because Excel makes you do every step and (2) you'll realize you need to learn R or python or some other way to do these analyses. @@ -39,7 +39,7 @@ Data Science Introductions R: - [ ] [R in Action](https://www.manning.com/books/r-in-action-second-edition?a_bid=5c2b1e1d&a_aid=RiA2ed) - [ ] [R Cookbook](http://shop.oreilly.com/product/9780596809164.do) - - [ ] [Forecasting: Principles and Practice](http://otexts.com/fpp/) + - [X] [Forecasting: Principles and Practice](http://otexts.com/fpp/) R Libraries/Task Views * [ProjectTemplate](http://projecttemplate.net/index.html) @@ -58,14 +58,14 @@ Python: QGIS: - [X] [QGIS Tutorials and Tips](http://www.qgistutorials.com/en/) - [X] [Mastering QGIS](https://www.packtpub.com/application-development/mastering-qgis) - - [ ] [Building Mapping Applications with QGIS](https://www.packtpub.com/application-development/building-mapping-applications-qgis) + - [X] [Building Mapping Applications with QGIS](https://www.packtpub.com/application-development/building-mapping-applications-qgis) - [X] [GIS Tutorial Workbook 1](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=232&moduleID=1) This is for ArcView, but you can work the examples in QGIS too - [X] [GIS Tutorial Workbook 2: Spatial 
Analysis](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=230&moduleID=0) This is for ArcView, but you can work the examples in QGIS too - [ ] [QGIS Map Design](https://locatepress.com/qmd) I've just thumbed through this, but it's beautiful and belongs on any list of GIS books. MySQL: - [X] [Learn MySQL in One Video](https://www.youtube.com/watch?v=yPu6qV5byu4) - - [ ] [MySQL Workbench Starter](code.google.com/edu/languages/google-python-class/) + - [X] [MySQL Workbench Starter](code.google.com/edu/languages/google-python-class/) Octave: - [ ] [GNU Octave Beginners Guide](https://www.packtpub.com/big-data-and-business-intelligence/gnu-octave-beginners-guide) @@ -79,12 +79,6 @@ PostGIS/PostGRESQL: - [ ] Data Structures and Algorithms by UCSD / Coursera [Decided not to take the balance of the specialization) - [X] [Algorithmic Toolbox] in progress (https://www.coursera.org/account/accomplishments/certificate/RUKKXTCFDAPV) - -

Distributed Computing

- - [ ] Introduction to Spark, edx - - [ ] Distributed Machine Learning with Apache Spark, edx - - [ ] [Big Data for Smart Cities](https://courses.edx.org/courses/course-v1:IEEEx+IntroData.x+2016_T3/info) -

Data Mining

- [ ] Mining Massive Data Sets, by Stanford and Coursera - [ ] [Clean Data] (https://www.packtpub.com/big-data-and-business-intelligence/clean-data) From f06e81c15c7ae33d2681f13d0e4cf46272c8e254 Mon Sep 17 00:00:00 2001 From: Scott Davis Date: Sat, 17 Sep 2016 08:44:47 -0500 Subject: [PATCH 26/38] updated with some additional books finished. --- transcripts/scott-davis-transcript.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/transcripts/scott-davis-transcript.md b/transcripts/scott-davis-transcript.md index b72b5dbe..d6ba2428 100644 --- a/transcripts/scott-davis-transcript.md +++ b/transcripts/scott-davis-transcript.md @@ -49,6 +49,7 @@ R Libraries/Task Views * Finance [CRAN Task View: Empirical Finance](https://cran.r-project.org/web/views/Finance.html) Python: + - [X] [Jumpstart Python by Building 10 Apps](https://training.talkpython.fm/courses/details/python-language-jumpstart-building-10-apps) This is probably the best introduction to Python that I have seen. - [X] [Dive Into Python](http://www.diveintopython.net/) - [X] [Google's Python Class](code.google.com/edu/languages/google-python-class/) - [X] [Introduction to Python for Data Science - edx](https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+2T2016/info) @@ -58,14 +59,17 @@ Python: QGIS: - [X] [QGIS Tutorials and Tips](http://www.qgistutorials.com/en/) - [X] [Mastering QGIS](https://www.packtpub.com/application-development/mastering-qgis) + - [X] [QGIS 2.0 Cookbook](https://www.packtpub.com/application-development/qgis-2-cookbook) Advanced data management, data visualization and spatial analysis techniques with QGIS. 
- [X] [Building Mapping Applications with QGIS](https://www.packtpub.com/application-development/building-mapping-applications-qgis) - [X] [GIS Tutorial Workbook 1](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=232&moduleID=1) This is for ArcView, but you can work the examples in QGIS too - [X] [GIS Tutorial Workbook 2: Spatial Analysis](https://esripress.esri.com/display/index.cfm?fuseaction=display&websiteID=230&moduleID=0) This is for ArcView, but you can work the examples in QGIS too + - [ ] QGIS Python Programming Cookbook (https://www.packtpub.com/application-development/qgis-python-programming-cookbook) Automated desktop QGIS processing. - [ ] [QGIS Map Design](https://locatepress.com/qmd) I've just thumbed through this, but it's beautiful and belongs on any list of GIS books. + MySQL: - [X] [Learn MySQL in One Video](https://www.youtube.com/watch?v=yPu6qV5byu4) - - [X] [MySQL Workbench Starter](code.google.com/edu/languages/google-python-class/) + - [X] [MySQL Explained](https://www.ostraining.com/books/mysql/about/) Octave: - [ ] [GNU Octave Beginners Guide](https://www.packtpub.com/big-data-and-business-intelligence/gnu-octave-beginners-guide) @@ -80,7 +84,6 @@ PostGIS/PostGRESQL: - [X] [Algorithmic Toolbox] in progress (https://www.coursera.org/account/accomplishments/certificate/RUKKXTCFDAPV)

Data Mining

- - [ ] Mining Massive Data Sets, by Stanford and Coursera - [ ] [Clean Data] (https://www.packtpub.com/big-data-and-business-intelligence/clean-data)

Machine Learning/Predictive Analytics - Foundational/Theoretical/Practical

From e836ef1260b8ac49275552d17d4c6386261d17f6 Mon Sep 17 00:00:00 2001 From: Clare Date: Tue, 15 Nov 2022 14:36:49 -0800 Subject: [PATCH 27/38] Create jekyll-gh-pages.yml --- .github/workflows/jekyll-gh-pages.yml | 50 +++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) create mode 100644 .github/workflows/jekyll-gh-pages.yml diff --git a/.github/workflows/jekyll-gh-pages.yml b/.github/workflows/jekyll-gh-pages.yml new file mode 100644 index 00000000..8d7c7590 --- /dev/null +++ b/.github/workflows/jekyll-gh-pages.yml @@ -0,0 +1,50 @@ +# Sample workflow for building and deploying a Jekyll site to GitHub Pages +name: Deploy Jekyll with GitHub Pages dependencies preinstalled + +on: + # Runs on pushes targeting the default branch + push: + branches: ["master"] + + # Allows you to run this workflow manually from the Actions tab + workflow_dispatch: + +# Sets permissions of the GITHUB_TOKEN to allow deployment to GitHub Pages +permissions: + contents: read + pages: write + id-token: write + +# Allow one concurrent deployment +concurrency: + group: "pages" + cancel-in-progress: true + +jobs: + # Build job + build: + runs-on: ubuntu-latest + steps: + - name: Checkout + uses: actions/checkout@v3 + - name: Setup Pages + uses: actions/configure-pages@v2 + - name: Build with Jekyll + uses: actions/jekyll-build-pages@v1 + with: + source: ./ + destination: ./_site + - name: Upload artifact + uses: actions/upload-pages-artifact@v1 + + # Deployment job + deploy: + environment: + name: github-pages + url: ${{ steps.deployment.outputs.page_url }} + runs-on: ubuntu-latest + needs: build + steps: + - name: Deploy to GitHub Pages + id: deployment + uses: actions/deploy-pages@v1 From e46445007f2aa35e5cfc9b1a21dbb65512adadae Mon Sep 17 00:00:00 2001 From: Harry Doan <35623720+phuongdoan13@users.noreply.github.com> Date: Sat, 19 Nov 2022 00:28:05 +1100 Subject: [PATCH 28/38] Update expired link The old link was expired and warned as phsing link. 
I updated the new link (licensed and free) --- r-resources.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/r-resources.md b/r-resources.md index cdfa2414..0cbdfd6f 100644 --- a/r-resources.md +++ b/r-resources.md @@ -13,7 +13,7 @@ _[Note: The core of The Open Source Data Science Masters focuses on programmatic #### Basic Statistics with R - * An Introduction to Statistical Learning [Book pdf](http://www-bcf.usc.edu/~gareth/ISL/ISLR%20First%20Printing.pdf) ^also a Machine Learning resource + * An Introduction to Statistical Learning [Book pdf](https://www.statlearning.com/) ^also a Machine Learning resource #### Data Science with R * Introduction to Data Science [Syracuse University / ebook](http://jsresearch.net/index.html) From da46337399691fc70374e87da1f771de353248b8 Mon Sep 17 00:00:00 2001 From: Florian Buetow <2320560+florianbuetow@users.noreply.github.com> Date: Thu, 2 Mar 2023 10:40:12 +0000 Subject: [PATCH 29/38] NLTK book URL fix bit.ly link 404-ed --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 6b2cc442..d1fcb37c 100644 --- a/README.md +++ b/README.md @@ -120,7 +120,7 @@ A branch of statistics that uses graphical models and specialized statistics to ### Natural Language Processing The imperfect and immensely useful art (science?) of transforming human language into data. 
* From Languages to Information / Stanford CS147 [Materials](http://bit.ly/nlpcs124) - * NLP with Python (NLTK library) [Digital](http://bit.ly/ebook-nltk), [Book ```$55```](https://bookshop.org/a/2958/9780596516499) + * NLP with Python (NLTK library) [Digital](https://www.nltk.org/book/), [Book ```$55```](https://bookshop.org/a/2958/9780596516499) * How to Write a Spelling Correcter / Norvig [Tutorial](http://norvig.com/spell-correct.html) ### Graph Analysis From 2d91ff0e45b4fbfad124446ae0f9f6072daaf13b Mon Sep 17 00:00:00 2001 From: Aniket Potabatti Date: Tue, 11 Apr 2023 08:15:36 +0530 Subject: [PATCH 30/38] README.md Updated --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 6b2cc442..56be5af4 100644 --- a/README.md +++ b/README.md @@ -197,7 +197,7 @@ A document conveying the motives, direction, investment, and expected value of t #### Results Presentation A slide deck or document with the goal of conveying the results of the work and how the findings support an important decision(s). -Best appended to the Spec, and summarized in a slide deck for easy consumption. Depending on the culture of the group, slides or a short docuemnt may be easier to look through to understand the results of the work. In the remote work era, think about how your work will be passed around and make sure your "above the fold" is easy to understand and clearly conveys the "why" and results in particular. +Best appended to the Spec, and summarized in a slide deck for easy consumption. Depending on the culture of the group, slides or a short document may be easier to look through to understand the results of the work. In the remote work era, think about how your work will be passed around and make sure your "above the fold" is easy to understand and clearly conveys the "why" and results in particular. 
__Example__: A particularly polished [presentation](https://medium.com/lyft-engineering/how-lyft-discovered-openstreetmap-is-the-freshest-map-for-rideshare-a7a41bf92ec) of [map quality study results](https://drive.google.com/file/d/1Sb-dOUjeP1Ljqz4ra931D3Pe8B5C3pde/view) showing higher data quality in US maps on OSM than commercially available alternatives. The impact of this work was a) increased confidence in service reliability and b) enabled the company to decide against buying a commercially available annual license costing ~$10mi/yr. From 85d12570274e5dab24a73b392ff2317874f31333 Mon Sep 17 00:00:00 2001 From: Clare Date: Sun, 16 Apr 2023 14:02:46 -0700 Subject: [PATCH 31/38] Update README.md nltk bitly fix --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 00ae3927..15dd4a9a 100644 --- a/README.md +++ b/README.md @@ -120,7 +120,7 @@ A branch of statistics that uses graphical models and specialized statistics to ### Natural Language Processing The imperfect and immensely useful art (science?) of transforming human language into data. 
* From Languages to Information / Stanford CS147 [Materials](http://bit.ly/nlpcs124) - * NLP with Python (NLTK library) [Digital](https://www.nltk.org/book/), [Book ```$55```](https://bookshop.org/a/2958/9780596516499) + * NLP with Python (NLTK library) [Digital](http://bit.ly/py-nltk), [Book ```$55```](https://bookshop.org/a/2958/9780596516499) * How to Write a Spelling Correcter / Norvig [Tutorial](http://norvig.com/spell-correct.html) ### Graph Analysis From 2c3e09c287b3c57e1659c882efa091296efabedd Mon Sep 17 00:00:00 2001 From: Clare Date: Mon, 17 Apr 2023 16:34:04 -0700 Subject: [PATCH 32/38] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 15dd4a9a..b7364c63 100644 --- a/README.md +++ b/README.md @@ -231,7 +231,7 @@ Show the process you used to disprove your hypothesis, preferably in a jupyter n * Introduction to Information Retrieval / Stanford [Digital](http://bit.ly/ebook-stanford-inforetrieval) & [Book ```$70```](https://bookshop.org/a/2958/9780521865715) * [Data Science in IPython Notebooks](http://bit.ly/ipynb-ds) (Linear Regression, Logistic Regression, Random Forests, K-Means Clustering) * Probabilistic Graphical Models [Stanford / Coursera](http://bit.ly/stanford-pgm) - * Differential Equations in Data Science [Python Tutorial](http://bit.ly/ipynb-differentialeq) + * Differential Equations in Data Science [Python Tutorial](https://web.archive.org/web/20190617023702/https://nbviewer.jupyter.org/github/URXtech/techblog/blob/master/continuousTimeMarkovChain/markovChain.ipynb) * Algorithm Design, Kleinberg & Tardos [Book ```$125```](http://amzn.to/1iMnWm5) * [Tidy Data in Python](http://www.jeannicholashould.com/tidy-data-in-python.html) * Designing, Visualizing and Understanding Deep Neural Networks [Berkeley CS294-129](https://bcourses.berkeley.edu/courses/1453965/pages/cs294-129-designing-visualizing-and-understanding-deep-neural-networks) From 
3fd6f158fbc968c776020873179fd073ba71dea2 Mon Sep 17 00:00:00 2001 From: Clare Date: Mon, 17 Apr 2023 16:36:33 -0700 Subject: [PATCH 33/38] Update README.md --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index b7364c63..16441a87 100644 --- a/README.md +++ b/README.md @@ -153,6 +153,7 @@ If you have interest in operations management, manufacturing, supply chains, or ### Deep Learning / Neural Networks * Neural Networks [Andrej Karpathy / Python Walkthrough](http://bit.ly/karpathyneuralnets) + * Neural Networks for Machine Learning [Geoffrey Hinton / U Toronto](https://www.youtube.com/playlist?list=PLoRl3Ht4JOcdU872GhiYWf6jwrk_SNhz9) * Deep Learning for Natural Language Processing CS224d [Stanford](http://cs224d.stanford.edu/syllabus.html) ## 🤝 Doing Data Science From 2d8010a5ed465c6c62833b1a95d568e48114bb38 Mon Sep 17 00:00:00 2001 From: Clare Date: Mon, 17 Apr 2023 16:43:51 -0700 Subject: [PATCH 34/38] Update README.md rm dead link --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index 16441a87..856fe0d8 100644 --- a/README.md +++ b/README.md @@ -230,7 +230,6 @@ Show the process you used to disprove your hypothesis, preferably in a jupyter n * Exploratory Data Analysis [Tukey / Book ```$81```](http://amzn.to/1kNUEPa) [```$113```](https://bookshop.org/books/exploratory-data-analysis-classic-version/9780134995458) * Mining Massive Data Sets / Stanford [Course & Digital Textbook](http://bit.ly/mmds-course) & [Book ```$58```](https://bookshop.org/a/2958/9781108476348) * Introduction to Information Retrieval / Stanford [Digital](http://bit.ly/ebook-stanford-inforetrieval) & [Book ```$70```](https://bookshop.org/a/2958/9780521865715) - * [Data Science in IPython Notebooks](http://bit.ly/ipynb-ds) (Linear Regression, Logistic Regression, Random Forests, K-Means Clustering) * Probabilistic Graphical Models [Stanford / Coursera](http://bit.ly/stanford-pgm) * Differential Equations in Data Science 
[Python Tutorial](https://web.archive.org/web/20190617023702/https://nbviewer.jupyter.org/github/URXtech/techblog/blob/master/continuousTimeMarkovChain/markovChain.ipynb) * Algorithm Design, Kleinberg & Tardos [Book ```$125```](http://amzn.to/1iMnWm5) From a958072e9fe8ac6a062315b1337675c2ff77ba48 Mon Sep 17 00:00:00 2001 From: Clare Date: Mon, 17 Apr 2023 17:46:07 -0700 Subject: [PATCH 35/38] Update README.md Thanks @nanofaroque for the suggestion! --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index 856fe0d8..7d42ded0 100644 --- a/README.md +++ b/README.md @@ -243,6 +243,7 @@ Show the process you used to disprove your hypothesis, preferably in a jupyter n * SQL Tutorials [SQLZOO / Tutorials](http://bit.ly/tut-sqlzoo) * Machine Learning [Caltech / Edx](http://bit.ly/caltech-ml) * A Course in Machine Learning [UMD / Digital Book](http://bit.ly/22WyV3N) + * Designing Data Intensive Applications [Book ```$56```](https://bookshop.org/a/2958/9781449373320) *** From c1109b4431b26eb7f3d747b677e18138e1c69957 Mon Sep 17 00:00:00 2001 From: Clare Date: Mon, 17 Apr 2023 18:11:47 -0700 Subject: [PATCH 36/38] Update README.md Thanks to @NathanEpstein for the foundational calculus links! --- README.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/README.md b/README.md index 7d42ded0..b215628f 100644 --- a/README.md +++ b/README.md @@ -64,6 +64,11 @@ Get familiar and comfortable with manipulating data in a database with a common * SQL School [Mode Analytics / Tutorials](http://bit.ly/sqlschool) ### Math & Statistics + +#### Calculus + * Single Variable Calculus [MIT OpenCourseWare](http://ocw.mit.edu/courses/mathematics/18-01-single-variable-calculus-fall-2006/) + * Multivariable Calculus [MIT OpenCourseWare](http://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/) + #### Linear Algebra The foundational mathematics for working with large samples of data. 
Spend time in exercises until you feel highly confident in the key topics of Linear Algebra. It will serve you well. * An Intuitive Guide to Linear Algebra [Better Explained / Article](https://betterexplained.com/articles/linear-algebra-guide/) From c5da9ac74e3baf183cf1a3b0897788d66e14aa2b Mon Sep 17 00:00:00 2001 From: Clare Date: Mon, 17 Apr 2023 18:13:30 -0700 Subject: [PATCH 37/38] Update README.md Credit @NathanEpstein for the linear algebra OCW link --- README.md | 1 + 1 file changed, 1 insertion(+) diff --git a/README.md b/README.md index b215628f..95ec7b11 100644 --- a/README.md +++ b/README.md @@ -76,6 +76,7 @@ The foundational mathematics for working with large samples of data. Spend time * Vector Calculus: Understanding the Cross Product [Better Explained / Article](https://betterexplained.com/articles/cross-product/) * Vector Calculus: Understanding the Dot Product [Better Explained / Article](https://betterexplained.com/articles/vector-calculus-understanding-the-dot-product/) * Linear Algebra [Khan Academy / Videos](http://bit.ly/khanlinalg) + * Linear Algebra [MIT](http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/) #### Statistics How can we answer questions with data? Everywhere you look, you'll see methods from statistics. Spend a lot of time here! From 3e4e19c12a93e11a29c65807844ceb2db32edf96 Mon Sep 17 00:00:00 2001 From: Clare Date: Mon, 17 Apr 2023 20:23:28 -0700 Subject: [PATCH 38/38] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 95ec7b11..385a104f 100644 --- a/README.md +++ b/README.md @@ -206,7 +206,7 @@ A slide deck or document with the goal of conveying the results of the work and Best appended to the Spec, and summarized in a slide deck for easy consumption. Depending on the culture of the group, slides or a short document may be easier to look through to understand the results of the work. 
In the remote work era, think about how your work will be passed around and make sure your "above the fold" is easy to understand and clearly conveys the "why" and results in particular. -__Example__: A particularly polished [presentation](https://medium.com/lyft-engineering/how-lyft-discovered-openstreetmap-is-the-freshest-map-for-rideshare-a7a41bf92ec) of [map quality study results](https://drive.google.com/file/d/1Sb-dOUjeP1Ljqz4ra931D3Pe8B5C3pde/view) showing higher data quality in US maps on OSM than commercially available alternatives. The impact of this work was a) increased confidence in service reliability and b) enabled the company to decide against buying a commercially available annual license costing ~$10mi/yr. +__Example__: A particularly polished [presentation](https://medium.com/lyft-engineering/how-lyft-discovered-openstreetmap-is-the-freshest-map-for-rideshare-a7a41bf92ec) of [map quality study results](https://drive.google.com/file/d/1Sb-dOUjeP1Ljqz4ra931D3Pe8B5C3pde/view) showing higher data quality in US maps on OSM than commercially available alternatives. The impact of this work was a) increased confidence in service reliability for the company and b) the company's decision against buying a commercially available license costing millions of dollars annually. ## 🧑‍💻 Capstone Project _Choose a meaningful project or dataset to demonstrate what you've learned._