Machine Learning - Logistic Regression


Logistic Regression

Logistic regression is a statistical method for binary classification: it predicts the probability of an event occurring, where there are two possible outcomes typically denoted as 0 and 1 (or "yes" and "no," "true" and "false," etc.).

Here's how logistic regression works:

  1. Binary Outcome: In logistic regression, you have a dependent variable (the one you want to predict) that is binary. For example, it could be whether a customer will buy a product (yes/no), whether a patient has a disease (positive/negative), or whether an email is spam (spam/not spam).

  2. Linear Combination: Logistic regression makes use of a linear combination of predictor variables (also known as independent variables or features). These predictor variables are combined using weights (coefficients) and summed up, similar to linear regression.

  3. Logistic Function (Sigmoid): Unlike linear regression, where the output is a continuous value, logistic regression uses a logistic function (also known as the sigmoid function) to transform the linear combination into a value between 0 and 1. The sigmoid function has an S-shaped curve and is defined below this list.

  4. Decision Boundary: The logistic regression model calculates probabilities between 0 and 1. You can set a threshold (usually 0.5) to classify the outcomes. If the predicted probability is greater than the threshold, you classify it as 1; otherwise, you classify it as 0.

h_θ(x) = g(θ^T x)

Let z = θ^T x.

We can define the sigmoid function g(z) as

g(z) = 1 / (1 + e^(-z))

so that h_θ(x) = 1 / (1 + e^(-θ^T x)), which always lies between 0 and 1.
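
To make the pieces above concrete, here is a minimal NumPy sketch (not part of the original notes) of the sigmoid, the hypothesis h_θ(x) = g(θ^T x), and the 0.5 decision threshold; the example data and the weights in `theta` are made-up values used purely for illustration.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid (logistic) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, X):
    """h_theta(x) = g(theta^T x) for each row of X (X includes a bias column of 1s)."""
    return sigmoid(X @ theta)

def predict(theta, X, threshold=0.5):
    """Classify as 1 when the predicted probability exceeds the threshold, else 0."""
    return (hypothesis(theta, X) >= threshold).astype(int)

# Illustrative example: 3 samples, a bias term plus one feature.
X = np.array([[1.0, -2.0],
              [1.0,  0.5],
              [1.0,  3.0]])
theta = np.array([0.1, 1.2])   # assumed weights, for demonstration only

print(hypothesis(theta, X))    # probabilities in (0, 1)
print(predict(theta, X))       # 0/1 class labels
```

With these assumed weights, the third example (feature value 3.0) gets a probability well above 0.5 and is classified as 1, while the first example is classified as 0.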

Cost Function

If we simply reuse the squared-error cost function from linear regression, the resulting cost for logistic regression is non-convex (it has many local minima), so gradient descent is not guaranteed to reach the global minimum. Logistic regression therefore uses the log-loss (cross-entropy) cost, which is convex:

J(θ) = -(1/m) Σ_{i=1}^{m} [ y^(i) log(h_θ(x^(i))) + (1 - y^(i)) log(1 - h_θ(x^(i))) ]
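
As a minimal sketch of this cost (again not part of the original notes), the snippet below evaluates J(θ) in NumPy. It reuses the sigmoid from the previous snippet, and the small clipping constant is an assumption added only to avoid taking log(0).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Log-loss (cross-entropy) cost J(theta) for logistic regression.

    X includes a bias column of 1s; y is a vector of 0/1 labels.
    """
    m = len(y)
    h = sigmoid(X @ theta)
    # Clip predictions away from 0 and 1; the epsilon is an arbitrary assumption.
    h = np.clip(h, 1e-12, 1.0 - 1e-12)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

# Illustrative example reusing the small made-up dataset from the previous snippet.
X = np.array([[1.0, -2.0],
              [1.0,  0.5],
              [1.0,  3.0]])
y = np.array([0, 0, 1])
theta = np.array([0.1, 1.2])

print(cost(theta, X, y))   # a single scalar: lower is a better fit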




