Cat or not a cat with Logistic Regression: Part 1 - Understanding the Basics

Binary classification, a fundamental concept in machine learning, involves categorizing data into one of two distinct classes. Like predicting whether an email is spam or not, whether an image is of a cat or not, or whether a tumor is malignant or benign.
In this blog post, we will explore the concept of binary classification, focusing on logistic regression. We will cover the mathematical foundations, practical implementation, and provide interactive examples to illustrate the concepts.

Binary Classification?
Let's say we have an image of a cat and we want to classify it as either a cat or not a cat. This is a binary classification problem, where the two classes are "cat" and "not cat". The goal is to develop a model that can accurately predict the class of new images based on the features extracted from the images.
Image Decomposition
Every image can be represented as a vector of pixel values. For example, a image can be represented as a vector of values, where each value represents the intensity of a pixel in one of the three color channels (red, green, blue).
So, the feature vector for a image can be represented as:
where is the intensity of the pixel in the image. Let be the number of features in our dataset. The feature vector is a vector, where is the number of features (pixel values). In this case, .
and the target variable can be represented as:
so if we have training examples in our dataset, we can represent the training set as:
where is the feature matrix with X.shape=(n_x, m)
, and is the number of training examples. The target variable can be represented as:
where is the target variable vector with Y.shape=(1, m)
, representing the output labels for each training example.
Logistic Regression
Logistic regression is a statistical method used for binary classification. It models the probability that a given input belongs to a particular class. The logistic regression model uses the logistic function (also known as the sigmoid function) to map any real-valued number into the range of 0 and 1.
In simple words, Given (the input image), we want or what is the probability that the image is a cat given the pixel values.
here, and , and we have some extra parameters and , where is the weight vector and is the bias term.
Output,
is the sigmoid function. The output is interpreted as the probability that the input belongs to class 1 (cat). If , we classify the input as class 1 (cat), otherwise we classify it as class 0 (not cat).
This is the basic idea of logistic regression. In the next post, we will discuss the cost function, gradient descent optimization, and implement a complete logistic regression model from scratch.
Stay tuned for the next part of this series!