Data Science

Common architectures in convolutional neural networks. In this post, I'll discuss commonly used architectures for convolutional networks. As you'll see, almost all CNN architectures follow the same general design principles of successively applying convolutional layers to the input, periodically …

Variational autoencoders. In my introductory post on autoencoders, I discussed various models (undercomplete, sparse, denoising, contractive) which take data as input and discover some latent state representation of that data. More specifically, our input data …

Introduction to autoencoders. Autoencoders are an unsupervised learning technique in which we leverage neural networks for the task of representation learning. Specifically, we'll design a neural network architecture such that we impose a bottleneck in the …

Setting the learning rate of your neural network. In previous posts, I've discussed how we can train neural networks using backpropagation with gradient descent. One of the key hyperparameters to set when training a neural network is the learning …

Learning from imbalanced data. In this blog post, I'll discuss a number of considerations and techniques for dealing with imbalanced data when training a machine learning model. The blog post will rely heavily on a sklearn contributor …

Normalizing your data (specifically, input and batch normalization). In this post, I'll discuss considerations for normalizing your data, with a specific focus on neural networks. In order to understand the concepts discussed, it's important to have an understanding of gradient …
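The batch normalization this excerpt refers to can be sketched in a few lines: subtract the mini-batch mean and divide by the mini-batch standard deviation, feature-wise. The data, `gamma`, and `beta` values below are illustrative assumptions, not taken from the post.

```python
import numpy as np

# A mini-batch of 3 examples with 2 features on very different scales
# (synthetic data for illustration).
X = np.array([[1.0, 200.0],
              [2.0, 400.0],
              [3.0, 600.0]])

eps = 1e-5                          # numerical-stability constant
mu = X.mean(axis=0)                 # per-feature batch mean
var = X.var(axis=0)                 # per-feature batch variance
X_hat = (X - mu) / np.sqrt(var + eps)

# gamma and beta are learned scale/shift parameters; identity values here.
gamma, beta = 1.0, 0.0
out = gamma * X_hat + beta
```

After normalization, each feature of `out` has roughly zero mean and unit standard deviation regardless of its original scale.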

Hyper-parameter tuning for machine learning models. When creating a machine learning model, you'll be presented with design choices as to how to define your model architecture. Often, we don't immediately know what the optimal model architecture should be …

Generalizing value functions for large state spaces. Up until now, we've discussed the concept of a value function primarily as a lookup table. As our agent visits specific state-action pairs and continues to explore an environment, we update the value …

Implementations of Monte Carlo and Temporal Difference learning. In the previous post, I discussed two different learning methods for reinforcement learning: Monte Carlo learning and temporal difference learning. I then provided a unifying view by considering $n$-step TD learning and …

Learning in a stochastic environment. Previously, I discussed how we can use the Markov Decision Process for planning in stochastic environments. For the process of planning, we already have an understanding of our environment via access to information …

Overview of reinforcement learning. Reinforcement learning is a method of learning where we teach the computer to perform some task by providing it with feedback as it performs actions. This is different from supervised learning in that …

SQL for data analysis. As a data scientist, you deal with a lot of data. For small datasets, maybe you just store this information in a CSV file and load it into Pandas. However, this isn't really …

Convolutional neural networks. In my introductory post on neural networks, I introduced the concept of a neural network that looked something like this. As it turns out, there are many different neural network architectures, each with …

Deep neural networks: preventing overfitting. In previous posts, I've introduced the concept of neural networks and discussed how we can train neural networks. For these posts, we examined neural networks that looked like this. However, many of the …

Planning in a stochastic environment. In this post, I'll be discussing how to calculate the best set of actions to complete a task whilst operating in a known environment, otherwise known as planning. For this scenario, we have …

Evaluating a machine learning model. So you've built a machine learning model and trained it on some data... now what? In this post, I'll discuss how to evaluate your model and offer practical advice for improving it based …

Neural networks: training with backpropagation. In my first post on neural networks, I discussed a model representation for neural networks and how we can feed in inputs and calculate an output. We calculated this output, layer by layer, …

Gradient descent. Gradient descent is an optimization technique commonly used in training machine learning algorithms. Often when we're building a machine learning model, we'll develop a cost function which is capable of measuring how well …
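A minimal sketch of the update rule this excerpt describes: repeatedly step a parameter against the gradient of a cost function. The quadratic cost, starting point, and learning rate below are illustrative assumptions.

```python
# Minimize J(w) = (w - 3)^2 with gradient descent.
# Cost function, initial w, and learning rate are assumed for illustration.

def grad(w):
    # dJ/dw for J(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

w = 0.0      # initial parameter value
lr = 0.1     # learning rate (step size)
for _ in range(100):
    w -= lr * grad(w)   # step opposite the gradient
```

After enough iterations, `w` converges to the cost function's minimum at 3.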

Principal components analysis (PCA). Principal components analysis (PCA) is the most popular dimensionality reduction technique to date. It allows us to take an $n$-dimensional feature-space and reduce it to a $k$-dimensional feature-space while maintaining as …
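The $n$-to-$k$ dimensional reduction the excerpt mentions can be sketched with NumPy's SVD on centered data; the strongly correlated synthetic dataset below is an assumption for illustration.

```python
import numpy as np

# Synthetic 2-D data where the second feature is nearly a multiple of
# the first, so one principal component captures almost all the variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
X[:, 1] = 2.0 * X[:, 0] + 0.05 * X[:, 1]

Xc = X - X.mean(axis=0)                       # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 1
Z = Xc @ Vt[:k].T                             # project onto top k components

# Fraction of total variance retained by the first component
explained = S[0] ** 2 / np.sum(S ** 2)
```

Here `Z` is the reduced $(100, 1)$ representation, and `explained` shows how much variance survives the projection.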

Neural networks: activation functions. Activation functions are used to determine the firing of neurons in a neural network. Given a linear combination of inputs and weights from the previous layer, the activation function controls how we'll pass …

Feature selection for a machine learning model. Feature selection can be an important part of the machine learning process as it has the ability to greatly improve the performance of our models. While it might seem intuitive to provide your …

Soft clustering with Gaussian mixture models (EM). Sometimes when we're performing clustering on a dataset, there exist points which don't belong strongly to any given cluster. If we were to use something like k-means clustering, we'd be forced to make a …
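The soft assignment that the excerpt contrasts with k-means can be sketched as the E-step responsibility computation for a 1-D mixture of two Gaussians; the means, variances, and mixing weights below are assumed values rather than fitted parameters.

```python
import numpy as np

def gaussian_pdf(x, mu, var):
    # Density of a 1-D Gaussian with mean mu and variance var
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Two assumed cluster components with equal mixing weights
mus = np.array([0.0, 4.0])
variances = np.array([1.0, 1.0])
pis = np.array([0.5, 0.5])

x = 2.0  # a point halfway between the two cluster means

# Responsibilities: weighted likelihoods, normalized to sum to 1
likelihoods = pis * gaussian_pdf(x, mus, variances)
resp = likelihoods / likelihoods.sum()
```

Unlike a hard k-means assignment, `resp` gives this ambiguous point a 50/50 membership split between the two clusters.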

Neural networks: representation. This post aims to discuss what a neural network is and how we represent it in a machine learning model. Subsequent posts will cover more advanced topics such as training and optimizing a …

Support vector machines. Today we'll be talking about support vector machines (SVM); this classifier works well in complicated feature domains, provided there is clear separation between classes. SVMs don't work well with noisy data, and the algorithm …

Ensemble learning. An ensemble approach to machine learning involves building a collection of models, trained on subsets of data, which are then combined to provide a robust model for classification or prediction. The basic idea …