Notes for Big Data Class, December 1, 2019

This will be both a paper for the Law Center and a module of my Big Data class. The plan is to take in a bunch of Word documents (Supreme Court decisions) as input, and then subject them to a series of automated textual analyses: classification, clustering, and sentiment analysis. Possibly a couple of visualizations as well.
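For a sense of what one of those analyses involves, here is a minimal sketch in Python of the clustering step, assuming the decisions have already been exported from Word to plain-text files in a decisions/ folder. The actual workflow will live in KNIME, so treat this as illustration only; the folder name and cluster count are made up.

```python
# Illustrative only: cluster plain-text court decisions with scikit-learn.
from pathlib import Path

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Assume the decisions have already been exported from Word to plain text.
paths = sorted(Path("decisions").glob("*.txt"))
docs = [p.read_text(encoding="utf-8") for p in paths]

# Turn each decision into a TF-IDF vector, then group similar decisions.
vectors = TfidfVectorizer(stop_words="english", max_features=5000).fit_transform(docs)
labels = KMeans(n_clusters=5, random_state=0).fit_predict(vectors)

for path, label in zip(paths, labels):
    print(label, path.name)
```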

[Image: Annotation 2019-12-01 175556.png]

I'm going to try this out with KNIME, which is turning out to be an easy-to-use (translation: non-STEM-undergraduate friendly, no coding) data analytics platform. So far, I was able to bang out a workflow that ingests text files and turns them into a data structure. The next step is to join that against another data structure that maps words to sentiments.
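For reference, here is roughly what that word-to-sentiment mapping could look like in plain Python. In KNIME the same thing would be a dictionary join between two tables; the lexicon file name, its columns, and the input file are all hypothetical.

```python
# Sketch of the "map words to sentiments" step; file names are hypothetical.
import csv

# Hypothetical lexicon file: one word and one sentiment score per row.
with open("sentiment_lexicon.csv", newline="", encoding="utf-8") as f:
    lexicon = {row["word"]: float(row["score"]) for row in csv.DictReader(f)}

def score_document(text: str) -> float:
    """Sum the sentiment scores of every word found in the lexicon."""
    words = text.lower().split()
    return sum(lexicon.get(w, 0.0) for w in words)

with open("decision.txt", encoding="utf-8") as f:
    print(score_document(f.read()))
```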

I wonder what will happen if I feed Republic v. Sereno into a neural network, and then plug that into a chatbot. Will it be a rambling misogynist? Or the very model of reason?

Notes for Big Data Class, November 11, 2019

I’m going to start posting notes on the papers and books I am writing, as well as the lessons I’m planning. At the very least, I can have a low-stakes way of mapping out ideas and documenting them for any claim of priority. Maybe it will even get some feedback, who knows.


[Image: Screen Shot 2019-11-11 at 8.11.14 AM.png]

Right now I’m designing the toolchain for the Big Data course I’m teaching next year. Just installed Jupyter, which allows me to intersperse notes with running code and visualizations. It will be the default way to demonstrate and document each lesson (and the way students can turn in exercises). Kaggle also looks interesting: it has some data sets, and can host Jupyter notebooks. The next problem is what language/execution environment to use and embed in the notebooks — R? Python? Octave? Right now I am partial towards Octave. And then there’s Anaconda, which offers a way to have all these tools in one convenient package, with environments and dependencies automatically managed.
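As a rough sketch of what a lesson cell could look like, assuming Python ends up as the notebook kernel and using a made-up exercise data set (a markdown note would sit above the cell, and the chart renders inline below it):

```python
# Example notebook cell: load a tiny data set and draw an inline chart.
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical exercise data set.
df = pd.DataFrame({
    "year": [2015, 2016, 2017, 2018, 2019],
    "cases_filed": [120, 135, 160, 155, 180],
})

df.plot(x="year", y="cases_filed", kind="bar", legend=False)
plt.ylabel("Cases filed")
plt.title("Sample exercise: cases filed per year")
plt.show()
```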

The Rundown, September 27, 2019