Posts

Time to Make the Switch

Upgrade your macOS default from Python 2 to Python 3

2020 is now in full swing, and it’s time to switch your default Python if you haven’t done so already. Maintenance of Python 2.7 stopped on January 1st. This means that your current version of Python is now legacy code! While keeping a version of Python 2 on your system may still be handy for older scripts, now is the time to update your system. In this briefing, we will download Python 3, make it the default, and store Python 2 as an alias.

To begin, let's explore our current Python environment. To find where our default Python is located on the system, simply type:

which python

This will show you the path of the default Python. This may be informative if you have a default Python outside /usr/local/bin, as I do. My default Python is part of an Anaconda distribution. To see how to change the default Python in an Anaconda distribution, scroll down a bit past the default methods.

Default MacOS System T
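The inspection step described in this excerpt can be sketched a little more broadly. This is a minimal sketch, assuming a POSIX-style shell such as bash or zsh; the helper name `describe_interpreter` is hypothetical and not part of the original post. It uses `command -v`, which reports how the shell resolves a name (including aliases), for `python`, `python2`, and `python3`.

```shell
# Hypothetical helper: print where the shell resolves a given interpreter
# name, or report that it is not available on PATH.
describe_interpreter() {
  if found=$(command -v "$1"); then
    printf '%s -> %s\n' "$1" "$found"
  else
    printf '%s -> not found on PATH\n' "$1"
  fi
}

# Check the three names most likely to matter when switching defaults.
for name in python python2 python3; do
  describe_interpreter "$name"
done
```

Running this before and after changing the default makes it easy to confirm which interpreter each name points to.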

College Football Coaching Salaries

Web Scraping to Find the Price Per Student by School, State, and Conference

As I am writing this story, the 2020 National Championship game is playing in the background. The pageantry of the college football playoff system rivals that of the NFL’s own tournament bracket. However, this one is orchestrated by a non-profit: a non-profit that reaps billions in ad revenue, TV deals, and merchandise licensing, then distributes the money to the member institutions’ athletic departments. All this money has to go somewhere, right? In a strange paradox, these academic institutions spend millions to recruit the top coaches and many millions more to support the big-name coaches with capable assistants. The justification for such spending is to compete within the NCAA to recruit the best staff, succeed, and receive an even larger distribution of the non-profit’s revenue stream. More articles about coaches’ salaries in college football can be found [here], [here], or [here]. The main goal of this article

What or Why in Machine Learning

A comprehensive guide to interpreting models using Python

Machine learning using big data is all the rage in business. These phrases are giving “synergistic office speak” a run for its money. Behind these buzzwords, the development of machine learning techniques and the machines that implement them in the past decade has been truly remarkable. The increasing complexity of models has allowed machines to better classify, label, and predict continuous values. However, as the models become more complex, how can we be sure the models are not exploiting training biases or predicting on subtle changes to the background noise? Machines make errors differently than humans. (See examples here, here, and here.) Using the Python libraries ELI5, PDPbox, Lime, and SHAP, we can visualize how a model predicts an outcome, weights the importance of features, or distinguishes boundaries in an image. Without further ado, let’s peek behind the curtain of the black box model to see how our mod

Google’s Project Nightingale and Emergent Medical Data

What Others’ Health Data Says About You

You know phone apps collect location, call history, social media posts, and more. These data-collecting machines are perpetually improving at gathering relevant information to identify customers for selective advertisements. While this dystopian targeted advertising landscape we occupy is frightening, it is merely the tip of the iceberg of the potential for commercialization or discrimination based on personal data. Data holds enormous value in our digital economy, and no data holds a higher valuation than patient medical records. This data can be used by advertisers to push expensive pharmaceuticals and tailor behavioral ads to target a person’s medical conditions. This medical data can also be collected by insurance companies to calculate patient premiums. Health data influences so many aspects of our lives and our identity. A diagnosis could subject us to lifestyle changes or life-long treatments. For those not wanting to announce an underlying m

RpoB Processing

This blog details the dereplication and taxonomic assignment of reads using rpoB. Also included in this markdown script are taxonomic visualizations.

rpoB_analysis
Cody Glickman
11/18/2018

Purpose

This pipeline details the steps to process rpoB sequencing reads into a format for BLAST processing and subsequent analysis. The workflow details the steps needed to process the data from raw reads to taxonomic identification. This pipeline requires additional tools detailed below and was tested on macOS Mojave. More information can be found at my GitHub.

Requirements

BLAST // NR Database // seqkit // vsearch // Concatenation Script

Concatenate rpoB Reads

The chunk below contains commands executed on the command line. The script concatenate_reads should be located in the folder above the paired unzipped fasta files.

## Unzip paired end reads
for f in folder/*; do gzip -d "$f"; done;
## Make directory and move zippe