UM Data Science Infrastructure & Consulting Resources
- Find out more about the Advanced Research Computing Technology Services at the University of Michigan. This presentation and video provide an overview of the tools available.
Experimental design / Working with machine learning algorithms / Feature engineering / Prediction vs. explanation / Network analysis / Collaborative filtering / Code up machine learning algorithms on single machines and on clusters of machines / Amazon AWS / Working on problems with terabytes of data / Machine learning pipelines for petabyte-scale data / Algorithmic design / Parallel computing (with MapReduce)
Python / Python libraries for linear algebra, plotting, machine learning: NumPy, Matplotlib, scikit-learn / GitHub for submitting project code / MapReduce / Hadoop / mrjob / Spark / Spark Core / DataFrames / Spark Shell / Spark Streaming / Spark SQL / MLlib
The Future of Big Data / ZDNET article
Cloud / Distributed Storage / Ethereum Blockchain / Apache Spark / Docker / CouchDB / Apache Cassandra / OpenStack Swift / Apache Solr / BVLC Caffe / NVIDIA DIGITS / Keras / IBM Watson / GATK
* Elizabeth Austic, Data Science Resources, GitHub Repository