Linear regression in Python (UPenn ENM 375 guest lecture)

gilgi

2019-04-16

I was recently invited to give a guest lecture in the course ENM 375 Biological Data Science I - Fundamentals of Biostatistics at the University of Pennsylvania on the topic of linear regression in Python. As part of my lecture, I walked through this notebook. It might serve as a useful reference, covering everything from simulation and fitting to a wide variety of diagnostics. The walkthrough includes explanations of how to do everything in vanilla numpy/scipy, scikit-learn, and statsmodels. As a bonus, there's even a section on logistic regression at the end.
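To give a flavor of the "vanilla numpy" route, here is a minimal sketch that simulates data from a line and fits it by ordinary least squares. It is not taken from the notebook: the sample size, true coefficients, and noise level are made-up values for illustration.

```python
import numpy as np

# Simulate data from y = 1.5 + 2.0 * x + noise (illustrative values only)
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-3, 3, size=n)
y = 1.5 + 2.0 * x + rng.normal(scale=0.5, size=n)

# Design matrix with an intercept column, then solve the least-squares problem
X = np.column_stack([np.ones(n), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# A basic diagnostic: coefficient of determination from the residuals
residuals = y - X @ beta_hat
r_squared = 1.0 - residuals.var() / y.var()

print("intercept, slope:", beta_hat)
print("R^2:", r_squared)
```

The same fit could be done with scikit-learn's `LinearRegression` or statsmodels' `OLS`; the numpy version just makes the design matrix explicit.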

Read on for more!

Stein's paradox

gilgi

2018-11-29

I recently heard of Stein's paradox, and at first I couldn't believe it! In this post, I'll convince myself by comparing the risk of a James–Stein estimator to a naive estimator on a simulated high-dimensional dataset.
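As a rough preview of that comparison, the sketch below estimates the squared-error risk of the naive estimator (the observation itself) and of the classic James–Stein estimator that shrinks toward the origin. The dimension, number of trials, unit noise variance, and the way the true mean is drawn are all assumptions chosen for illustration, not the setup used in the post.

```python
import numpy as np

def james_stein(x, sigma2=1.0):
    """Shrink observations x ~ N(theta, sigma2 * I) toward the origin.

    x has shape (n_trials, d); each row is shrunk independently.
    """
    d = x.shape[-1]
    norm2 = np.sum(x**2, axis=-1, keepdims=True)
    return (1.0 - (d - 2) * sigma2 / norm2) * x

rng = np.random.default_rng(0)
d, n_trials = 50, 2000             # assumed dimension and number of repetitions
theta = rng.normal(size=d)         # an arbitrary fixed true mean vector
x = theta + rng.normal(size=(n_trials, d))  # one noisy observation per trial

# Monte Carlo estimates of the risk E||estimate - theta||^2
risk_naive = np.mean(np.sum((x - theta) ** 2, axis=1))
risk_js = np.mean(np.sum((james_stein(x) - theta) ** 2, axis=1))

print("naive risk:      ", risk_naive)
print("James-Stein risk:", risk_js)
```

The naive risk should come out close to the dimension `d`, while the James–Stein risk should be noticeably lower, which is the surprising part.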

Creating Dota 2 hero embeddings with Word2vec

gilgi

2018-05-27

One of the coolest results in natural language processing is the success of word embedding models like Word2vec. These models are able to extract rich semantic information from words using surprisingly simple architectures like CBOW or skip-gram. What if we could use these generic modelling strategies to learn embeddings for something completely different - say, Dota 2 heroes?

In this post, we'll use the OpenDota API to collect data from professional Dota 2 matches and use Keras to train a Word2vec-like model for hero embeddings.

Estimating enzyme kinetics parameters from steady-state observations

gilgi

2018-01-12

In yesterday's post, we did some simple fitting of a Michaelis-Menten enzyme kinetics model for a single step of an isolated reaction. What happens when we have multiple reactions involving multiple species occurring at the same time? Is it possible to infer something about the kinetic parameters of such a system by only looking at the steady-state concentrations of the species in the system under different experimental conditions? In this post, we'll apply some differential equations and simple optimization in Python to try to find out.

Fitting a Michaelis-Menten model to biochemical kinetics data

gilgi

2018-01-11

Biochem students will likely remember the mathematical beauty of enzyme kinetics models like the Michaelis-Menten model. In this short post, we'll take a look at how we can fit this kind of model to experimental data in Python using some straightforward optimization.
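For a sense of what such a fit can look like, here is a minimal sketch using `scipy.optimize.curve_fit` on the Michaelis-Menten rate law v = Vmax * [S] / (Km + [S]). The substrate concentrations, "true" parameters, and noise level are synthetic values invented for the example, and `curve_fit` is just one reasonable choice of optimizer, not necessarily the one used in the post.

```python
import numpy as np
from scipy.optimize import curve_fit

def michaelis_menten(s, vmax, km):
    """Reaction rate as a function of substrate concentration s."""
    return vmax * s / (km + s)

# Synthetic "experimental" data (assumed values, for illustration only)
rng = np.random.default_rng(0)
s = np.array([0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0])
true_vmax, true_km = 10.0, 4.0
v = michaelis_menten(s, true_vmax, true_km) + rng.normal(scale=0.1, size=s.size)

# Nonlinear least squares, starting from a crude initial guess
(vmax_hat, km_hat), _ = curve_fit(
    michaelis_menten, s, v, p0=[v.max(), np.median(s)]
)

print("estimated Vmax:", vmax_hat)
print("estimated Km:  ", km_hat)
```

Fitting the nonlinear model directly avoids the distortion of measurement error that comes with linearized forms such as the Lineweaver-Burk plot.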

MyAnimeList hype visualization with D3.js

gilgi

2017-11-10

Everyone loves watching anime, but when there are so many shows airing it can be hard to keep track of what to watch and who's watching what. In this post, we'll use data from MyAnimeList to drive a custom D3.js visualization showing the ratings and number of episodes watched for all our friends.