Welcome!

This is a blog primarily about topics in mathematics, machine learning, and technology, but occasionally about other things.

New posts every couple of months!

Featured Content

Browse a selection of my finest words:

Latest Posts:

Transfer learning with PyTorch and Huggingface Transformers

Almost exactly as easy as it sounds
September 10, 2024

One of the most powerful arguments for incorporating deep learning models into your workflow is the possibility of transfer learning: using a pre-trained model’s latent representations as a starting point for your own modeling task. This can be particularly useful when you have a fairly small number of labeled examples, but the task in question is similar to the one a pre-existing model was trained on. So how easy is it to do transfer learning with an LLM? As we’ll see, with Hugging Face’s transformers library, it’s actually quite easy.

Keep reading...

Deep learning in low dimensions

Or, how to build bad implicit factor models
September 3, 2024

Neural nets work great for problems involving lots of high-dimensional data (such as image or text problems). But they often struggle on problems with low-dimensional inputs, particularly in cases where the response is not especially smooth with respect to the input. In this post, we’ll explore one way to get around this using “Fourier encodings”, and tackle a couple of interesting applications of this technique, including its use as a tool for portfolio construction.
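
For a rough sense of what a Fourier encoding looks like, here is a minimal sketch. It assumes the common sine/cosine form with power-of-two frequencies; the function name `fourier_encode` and the `num_frequencies` parameter are illustrative rather than taken from the post, which may well use a different variant.

```python
import math


def fourier_encode(x, num_frequencies=4):
    """Map a scalar x to sine/cosine features at increasing frequencies.

    Assumed form (not necessarily the post's):
    [sin(2^k * pi * x), cos(2^k * pi * x)] for k = 0 .. num_frequencies - 1.
    """
    features = []
    for k in range(num_frequencies):
        angle = (2.0 ** k) * math.pi * x
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


# A single low-dimensional input becomes a longer feature vector
# (2 * num_frequencies values) that a network can consume instead of raw x.
encoded = fourier_encode(0.25, num_frequencies=3)
```

The idea is that the higher-frequency terms change quickly as x changes, which gives a network an easier handle on responses that are not smooth in the raw input.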

Keep reading...

GPT and Technofetishistic Egotism

Hey ChatGPT, can you write a blog post for me?
March 21, 2023

AI is having a moment! New models like ChatGPT and Stable Diffusion have captured the imagination and challenged assumptions about the capabilities and limits of machine learning. But how do these “generative” models work? What does a future where these models are commonplace look like? And what are their limitations? I’ll focus primarily on GPT, but some of this analysis will also apply to image generation models like Stable Diffusion (and indeed, with GPT-4’s new visual capabilities, the line between these two categories of models is now rather blurry).

Keep reading...

Did I write something you want to use in your own work? (I'm seriously flattered!) Please do: all code is MIT licensed; any other content is licensed CC-BY-NC unless otherwise indicated. (If neither of those licenses works for you, let's get in touch.)