Adrian Rubio

Finalizing SSD Model and Training

April 13, 2025 | 2 Minute Read

Hello and welcome! This week, I focused on deepening my understanding of last week’s model by creating inference code and making further improvements. The final results are looking even more promising than before!

This week, I spent most of my time reviewing the code and making sure I fully understood some of the harder concepts. I'm not writing the model completely from scratch (I use Claude to help take it to the next level), so some parts of the code, such as the loss function definition, can be harder to grasp. To stay organized, I followed some Org mode notes I made a while back:

# Tips to improve coding
- Start by learning concepts for new model using coding journal
- Use claude effectively when learning new concepts:
    - Make small modifications to code generated by claude
    - Ask for lots of explanations of concepts
    - Request code outlines to fill in 
    - Ask "why" questions about how the generated code works
- Challenge to write small functions I know how to write:
    - Have claude review the code
- Write in blog weekly to recap week's work
- Create podcasts for the finished model and listen to further deepen knowledge

To deepen my knowledge of the model, I created some AI-generated podcasts with NotebookLM. I listened to them in the car, at the library, while studying, and more, which helped me take my understanding to the next level. By the end of Thursday, I felt that I understood the code much better.



Once I understood the code better, I created some simple inference code for the model (with Claude’s help) to test it out. I didn’t upload the code to GitHub, but running the tests showed that Claude’s predictions from last week were a bit off.

The real model wasn't able to detect objects and place bounding boxes at the expected 75–80% mAP. It recognized something, but the box placements were often inaccurate, and the classification results had noticeable errors too.

SSD model at this phase

So, I knew what I had to do.

The improved SSD model now uses an EfficientNet-B1 backbone with a Feature Pyramid Network (FPN). The training process combines SGD optimization with momentum (0.9) and a three-phase learning rate schedule. I also applied extensive data augmentation techniques, including mosaic transformations, which merge four training images to help improve small object detection.
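To make the mosaic idea concrete, here is a heavily simplified sketch in plain Python. It is not the augmentation code from the model: the function name `mosaic_2x2`, the nested-list image format, and the fixed 2x2 layout are all assumptions made for illustration. A real pipeline would work on arrays or tensors and would usually pick a random split point and crop. The part worth noticing is that every bounding box has to be shifted by the offset of the tile it came from.

```python
# Simplified 2x2 mosaic: tile four equally sized images and shift their boxes.
# Hypothetical helper for illustration only; images are nested lists
# (rows of pixel values) so the sketch has no dependencies.

def mosaic_2x2(images, boxes_per_image):
    """Tile four H x W images into one 2H x 2W canvas.

    images: list of four images, each a list of H rows with W pixels.
    boxes_per_image: four lists of (x_min, y_min, x_max, y_max, label).
    Returns (canvas, boxes) with boxes in canvas pixel coordinates.
    """
    if len(images) != 4 or len(boxes_per_image) != 4:
        raise ValueError("mosaic needs exactly four images and four box lists")

    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("this sketch assumes all four images share one size")

    top_left, top_right, bottom_left, bottom_right = images

    # Join rows side by side, then stack the two halves vertically.
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    canvas = top + bottom

    # Each tile's boxes move by that tile's (x, y) offset on the canvas.
    offsets = [(0, 0), (width, 0), (0, height), (width, height)]
    merged = []
    for (dx, dy), boxes in zip(offsets, boxes_per_image):
        for x_min, y_min, x_max, y_max, label in boxes:
            merged.append((x_min + dx, y_min + dy, x_max + dx, y_max + dy, label))
    return canvas, merged
```

If the 2H x 2W canvas is then resized back to the network's input size, every object ends up at roughly half its original scale, which is one reason mosaic is often credited with helping small object detection.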

For more precise localization, I implemented Generalized IoU (GIoU) loss for bounding box regression. To boost classification accuracy and better handle class imbalance, I used Focal Loss. According to Claude’s predictions, the updated model should achieve 80–84% mAP on the Pascal VOC dataset.
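Since the loss functions were the hardest part to grasp, a small sketch of both may help. This is plain Python on single boxes and single probabilities, not the batched training code. The function names, the `(x_min, y_min, x_max, y_max)` box format, and the binary form of Focal Loss are assumptions for illustration. The defaults `alpha=0.25` and `gamma=2.0` are the values commonly quoted from the Focal Loss paper, not necessarily the ones used in this model.

```python
import math

EPS = 1e-9  # guards against division by zero for degenerate boxes


def giou_loss(box_a, box_b):
    """Return 1 - GIoU for two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)

    # Overlap rectangle (zero width or height if the boxes do not intersect).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / (union + EPS)

    # Smallest box that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # GIoU subtracts the share of the enclosing box not covered by the union,
    # so disjoint boxes still get a signal that grows with their distance.
    giou = iou - (enclose - union) / (enclose + EPS)
    return 1.0 - giou


def focal_loss(prob, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability and a 0/1 target."""
    prob = min(max(prob, 1e-7), 1.0 - 1e-7)  # keep log() finite
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

Plain IoU is zero for any pair of non-overlapping boxes, so it cannot tell a near miss from a far miss; the enclosing-box term in GIoU is what separates the two. In Focal Loss, a confident correct prediction (`p_t` close to 1) is scaled down sharply, which keeps the many easy background boxes from dominating the total.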

After making these changes, I created new podcasts with the updated code and plan to start listening to them next week.



This post documents the state of the SSD model as of April 13, 2025.
To see the current state of the model, visit:

SSD object detection model