Welcome to this beginner-friendly tutorial! If you've ever been curious about how neural networks work but felt overwhelmed by the complexity, you're in the right place. In this guide, I'll walk you through creating a basic neural network from scratch. Whether you're just starting out or looking to brush up on the fundamentals, I'll break down each step in a way that's easy to understand and follow. Let's dive in and demystify the magic behind these powerful algorithms together!
1. Preparing the Data
Data can come in many forms, but to start, we'll generate a straightforward line of data (y = mx + c) for simplicity.
2. Constructing the Model
We'll build a model designed to recognize patterns in our data. This involves selecting a loss function, an optimizer, and setting up a training loop.
3. Training the Model
With our data and model in place, we'll train the model to discover patterns by feeding it the training data.
4. Making Predictions and Assessing the Model
After training, we'll evaluate the model's predictions against the testing data to see how well it has learned.
5. Conclusion
Machine learning is a game of two parts:
- Turn your data, whatever it is, into numbers (a representation).
- Pick or build a model to learn the representation as best as possible.
Sometimes one and two can be done at the same time.
But what if you don't have data?
Well, that's where we're at now.
No data.
But we can create some.
1. Preparing the Data
import torch
from torch import nn # nn contains all building blocks for neural networks
import matplotlib.pyplot as plt
Let's create our data as a straight line.
We'll use linear regression to create the data with known parameters (things that can be learned by a model) and then we'll use PyTorch to see if we can build a model to estimate these parameters using gradient descent.
# Let's create some known parameters for now
weight = 0.7
bias = 0.3
# Create data
start = 0
end = 1
step = 0.02
X = torch.arange(start, end, step).unsqueeze(dim=1)
y = weight * X + bias
X[:10], y[:10]
Before building a model, split your data into training and test sets to train and evaluate the model effectively.
"But you just said we're going to create a model!" We will. Think of it like teaching a child step by step: first the model learns from one portion of the data (training), then we check what it has learned on a portion it has never seen (testing).
# Create train/test split
train_split = int(0.8 * len(X)) # 80% of data used for training set, 20% for testing
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]
len(X_train), len(y_train), len(X_test), len(y_test)
Wonderful, we have 40 samples for training (X_train & y_train) and 10 samples for testing (X_test & y_test). Hang on, we're almost there!
Our model will learn the relationship between X_train & y_train, and then we'll assess its performance on X_test and y_test.
Currently, our data is just raw numbers. Let's create a function to visualize it.
def plot_predictions(train_data=X_train,
                     train_labels=y_train,
                     test_data=X_test,
                     test_labels=y_test,
                     predictions=None):
    """
    Plots training data, test data and compares predictions.
    """
    plt.figure(figsize=(10, 7))

    # Plot training data in blue
    plt.scatter(train_data, train_labels, c="b", s=4, label="Training data")

    # Plot test data in green
    plt.scatter(test_data, test_labels, c="g", s=4, label="Testing data")

    if predictions is not None:
        # Plot the predictions in red (predictions were made on the test data)
        plt.scatter(test_data, predictions, c="r", s=4, label="Predictions")

    # Show the legend
    plt.legend(prop={"size": 14});
plot_predictions() # visualize the data
Awesome!
Now our data isn't just raw numbers — it's visualized as a straight line.
Remember the data explorer's motto: "visualize, visualize, visualize!" Visualizing data helps both machines and humans grasp insights better.
2. Constructing the Model
Let's dive straight into writing the linear regression model.
(I'm assuming you are familiar with basic OOP concepts.)
# Create a Linear Regression model class
class LinearRegressionModel(nn.Module): # <- almost everything in PyTorch is a nn.Module (think of this as neural network lego blocks)
def __init__(self):
super().__init__()
self.weights = nn.Parameter(torch.randn(1, # <- start with random weights (this will get adjusted as the model learns)
dtype=torch.float), # <- PyTorch loves float32 by default
requires_grad=True) # <- can we update this value with gradient descent?)
self.bias = nn.Parameter(torch.randn(1, # <- start with random bias (this will get adjusted as the model learns)
dtype=torch.float), # float32 is by default
requires_grad=True) # <- can we update this value with gradient descent?))
# Forward defines the computation in the model
def forward(self, x: torch.Tensor) -> torch.Tensor: # <- "x" is the input data (e.g. training/testing features)
return self.weights * x + self.bias # <- this is the linear regression formula (y = m*x + b)Now that we've handled that, let's create an instance of our model class and inspect its parameters using .parameters().
# Set manual seed since nn.Parameter are randomly initialized
torch.manual_seed(42)
# Create an instance of the model (this is a subclass of nn.Module that contains nn.Parameter(s))
model_0 = LinearRegressionModel()
# Check the nn.Parameter(s) within the nn.Module subclass we created
list(model_0.parameters())
You'll see that the weights and bias values are random float tensors. This randomness comes from initializing them with torch.randn().
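If you're wondering what torch.manual_seed(42) is doing above: it makes the "random" numbers reproducible, so you and I start from the same random weights. A quick sketch (standalone, not part of our model code):

```python
import torch

torch.manual_seed(42)
a = torch.randn(1)  # "random" tensor

torch.manual_seed(42)
b = torch.randn(1)  # reset the seed, draw again

print(a, b)  # identical: the same seed gives the same "random" values every time
```

This is why tutorials (including this one) set a seed: your numbers should match mine exactly.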
The idea is to start with random parameters and adjust them through training so that they better match the true parameters we used to generate our straight line data.
Using torch.inference_mode() to make predictions, we can pass the test data X_test to our model and see how well it predicts y_test.
When we feed data into the model, it goes through the forward() method and generates results based on the computations we've set up.
Now, let's see how our model performs with some predictions.
# Make predictions with model
with torch.inference_mode():
    y_preds = model_0(X_test)
Let's print the predictions.
# Check the predictions
print(f"Number of testing samples: {len(X_test)}")
print(f"Number of predictions made: {len(y_preds)}")
print(f"Predicted values:\n{y_preds}")
Number of testing samples: 10
Number of predictions made: 10
Predicted values:
tensor([[0.3982],
        [0.4049],
        [0.4116],
        [0.4184],
        [0.4251],
        [0.4318],
        [0.4386],
        [0.4453],
        [0.4520],
        [0.4588]])
Our predictions are still numbers on a page; let's visualize them with the plot_predictions() function we created above.
plot_predictions(predictions=y_preds)
The red dots are nowhere near where we want them. If the model were learning, the predictions (red) would sit right on top of the test data (green).
No problem, let's fix that!
3. Training the Model
Right now, our model is just making random guesses with its parameters. To improve this, we need to adjust these parameters (weights and biases) so they better fit the data. Instead of manually setting them to the known values (0.7 for the weight and 0.3 for the bias), it's more interesting to let the model figure out the best parameters on its own through learning.
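To make "learning" concrete, here's a minimal hand-rolled sketch of a single gradient descent step. The toy numbers and the bias-free one-weight model are my own invention for illustration; the optimizer we set up next automates exactly this:

```python
import torch

w = torch.tensor([0.0], requires_grad=True)  # a (bad) starting guess for the weight
x = torch.tensor([0.5])                      # one toy input
y_true = 0.7 * x + 0.3                       # same true parameters as our data

y_pred = w * x                               # a simplified model with no bias
loss = torch.abs(y_pred - y_true).mean()     # mean absolute error
loss.backward()                              # compute d(loss)/dw

with torch.no_grad():                        # update without tracking gradients
    w -= 0.01 * w.grad                       # step against the gradient direction

print(w)  # w has been nudged upward, toward a better fit
```

One step barely moves w; repeating this many times is what training is.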
Let's set up a loss function and an optimizer to help our model get better at making predictions. In our case we'll be using MAE (mean absolute error) as the loss and stochastic gradient descent (SGD) as the optimizer.
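If MAE is new to you: it's simply the average of the absolute differences between predictions and targets. A quick sanity check with made-up numbers shows that nn.L1Loss computes exactly that:

```python
import torch
from torch import nn

preds = torch.tensor([0.4, 0.5, 0.6])    # made-up predictions
targets = torch.tensor([0.5, 0.5, 0.9])  # made-up ground truth

manual_mae = torch.mean(torch.abs(preds - targets))  # MAE by hand
builtin_mae = nn.L1Loss()(preds, targets)            # PyTorch's MAE loss

print(manual_mae, builtin_mae)  # both tensor(0.1333)
```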
# Create the loss function
loss_fn = nn.L1Loss() # MAE loss
# Create the optimizer
optimizer = torch.optim.SGD(params=model_0.parameters(), # parameters of target model to optimize
                            lr=0.01) # learning rate (how much the optimizer changes parameters at each step; higher = bigger but less stable steps, lower = smaller steps that may take a long time)
With our loss function and optimizer ready, we now need to create two loops: one for training and one for testing.
The training loop will let the model learn from the training data by finding patterns between features and labels.
The testing loop will then evaluate how well the model performs on new, unseen data. Each loop ensures the model reviews every sample in the respective dataset.
torch.manual_seed(42)

# Set the number of epochs (how many times the model will pass over the training data)
epochs = 100

# Create empty loss lists to track values
train_loss_values = []
test_loss_values = []
epoch_count = []

for epoch in range(epochs):
    ### Training
    # Put model in training mode (the default state of a model)
    model_0.train()

    # 1. Forward pass on train data using the forward() method inside
    y_pred = model_0(X_train)

    # 2. Calculate the loss (how different are our model's predictions to the ground truth)
    loss = loss_fn(y_pred, y_train)

    # 3. Zero the gradients of the optimizer
    optimizer.zero_grad()

    # 4. Perform backpropagation on the loss
    loss.backward()

    # 5. Step the optimizer (perform gradient descent)
    optimizer.step()

    ### Testing
    # Put the model in evaluation mode
    model_0.eval()

    with torch.inference_mode():
        # 1. Forward pass on test data
        test_pred = model_0(X_test)

        # 2. Calculate loss on test data
        test_loss = loss_fn(test_pred, y_test.type(torch.float)) # predictions come in torch.float, so comparisons need tensors of the same type

    # Print out what's happening
    if epoch % 10 == 0:
        epoch_count.append(epoch)
        train_loss_values.append(loss.detach().numpy())
        test_loss_values.append(test_loss.detach().numpy())
        print(f"Epoch: {epoch} | MAE Train Loss: {loss} | MAE Test Loss: {test_loss}")
Whoa, what's all that? Let me break it down in simple steps.
Training Loop
- Forward Pass: The model processes the training data and makes predictions.
- Calculate Loss: Compares the model's predictions with the actual values to measure accuracy.
- Zero Gradients: Resets the gradients to prepare for the new calculations.
- Backpropagation: Computes how much each parameter needs to change to reduce the loss.
- Update Optimizer: Adjusts the model's parameters based on the computed gradients to improve performance.
Testing Loop
- Forward Pass: The model processes the test data to make predictions.
- Calculate Loss: Measures how far off the model's predictions are from the actual values.
- Calculate Evaluation Metrics: Optionally, you can compute additional metrics, like accuracy, to further assess the model's performance.
Notice there's no backpropagation step in the testing loop: the parameters are only updated during training, and by test time they've already been learned. For testing, we're interested only in the forward pass.
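This is also why we wrap the test forward pass in torch.inference_mode(): it tells PyTorch not to build the computation graph at all, so no backward pass is possible (and inference runs a little faster). A tiny demonstration:

```python
import torch

x = torch.ones(3, requires_grad=True)  # a tensor that would normally track gradients

with torch.inference_mode():
    y = (x * 2).sum()  # computed with gradient tracking switched off

print(y.requires_grad)  # False: no graph was built, so y can't be backpropagated through
```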
4. Making Predictions and Assessing the Model
Notice how the loss decreases with every epoch.
We're not quite there yet; we could reduce the loss further by increasing the number of epochs.
Cool, isn't it? Our model is getting smarter. But do you remember the motto? "visualize, visualize, visualize!"
Let's plot the loss curves!
plt.plot(epoch_count, train_loss_values, label="Train loss")
plt.plot(epoch_count, test_loss_values, label="Test loss")
plt.title("Training and test loss curves")
plt.ylabel("Loss")
plt.xlabel("Epochs")
plt.legend();
The loss curves show the model improving over time because the loss decreases. This happens because our loss function and optimizer adjust the model's weights and biases to better match the data patterns.
5. Conclusion
We started with a model making random guesses and aimed to improve it by adjusting its parameters. We implemented a loss function and optimizer, then created training and testing loops to help the model learn and evaluate its performance. As the model's parameters were refined, the loss decreased, showing that it was better capturing the data patterns and making more accurate predictions.
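As a final sanity check, here's a self-contained sketch of the whole workflow in one place. It re-creates the data and uses nn.Linear as a stand-in for our hand-written class (an assumption made for brevity; the math is the same): after enough epochs, the learned parameters land close to the true weight = 0.7 and bias = 0.3.

```python
import torch
from torch import nn

torch.manual_seed(42)

# Re-create the straight-line data with the known parameters
X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = 0.7 * X + 0.3

model = nn.Linear(in_features=1, out_features=1)  # built-in stand-in for LinearRegressionModel
loss_fn = nn.L1Loss()                             # MAE, as before
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(1000):                         # more epochs than before, to converge further
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

print(model.weight.item(), model.bias.item())  # close to the true 0.7 and 0.3
```

The model never saw 0.7 or 0.3 directly; it recovered them purely by nudging random parameters downhill on the loss.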
Thanks for reading! Stay Curious.