AI PyTorch Linear model

Overview

PyPI module: N/A
git repository: https://bitbucket.org/arrizza-public/ai-pytorch-linear
git command: git clone git@bitbucket.org:arrizza-public/ai-pytorch-linear.git
Verification Report: https://arrizza.com/web-ver/ai-pytorch-linear-report.html
Version Info:
  • macOS 14.5, Python 3.10
  • Ubuntu 20.04 focal, Python 3.10
  • Ubuntu 22.04 jammy, Python 3.10
  • Ubuntu 24.04 noble, Python 3.10

Summary

This project creates a simple AI model to determine if a value is >3 or not.

Kudos for this project to my son Thomas. He's studying AI in his Master's and understands Linear vs Convolution models and the remaining aspects needed to make these models work correctly.

Why >3?

As a starting point for learning AI, this has some good properties:

  • I can create thousands or millions of samples easily
  • I know the answer. There is no subtlety about what is and is not ">3". In some models it is difficult to know what result a sample should produce. In this case, easy as pie.
  • There are no gaps in the training data. In some models it is difficult to have representative sample data for all possible conditions. In this case, I can have samples less than 3, equal to 3 and above 3. No problem.
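
As a rough illustration of how easy such samples are to produce, here is a minimal sketch in plain Python. The function name make_samples, the value range and the label encoding (1.0 for ">3", 0.0 otherwise) are assumptions made for this example; the project's own generator may look quite different.

```python
import random

THRESHOLD = 3.0  # the ">3" boundary from the text


def make_samples(count, low=-10.0, high=10.0, seed=None):
    """Return a list of (value, label) pairs; label is 1.0 if value > 3, else 0.0.

    The range [low, high] and the label encoding are illustrative assumptions.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        value = rng.uniform(low, high)
        samples.append((value, 1.0 if value > THRESHOLD else 0.0))
    # Always include the boundary itself: exactly 3 is "not >3".
    samples.append((THRESHOLD, 0.0))
    return samples


if __name__ == "__main__":
    data = make_samples(1000, seed=42)
    positives = sum(1 for _, label in data if label == 1.0)
    print(f"{len(data)} samples, {positives} labeled >3")
```

Passing a seed makes the data reproducible, which helps when comparing one training run against another.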

The downside? The resulting curve is non-linear and discontinuous. All values below 3 and equal to 3 are false, and then it jumps to true immediately afterward. AI models normally deal with smooth, continuous, differentiable functions.

The solution is to use a sigmoid function (https://en.wikipedia.org/wiki/Sigmoid_function). It uses the exponential function (powers of e) to turn the jump into a smooth, monotonically increasing (or decreasing) curve.
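
To make the difference concrete, the sketch below compares the hard ">3" rule with a sigmoid centered on 3. The names step and soft_step and the steepness value are assumptions chosen for illustration. In a trained model the equivalent numbers would be learned rather than hard-coded.

```python
import math


def step(x, threshold=3.0):
    """The hard ">3" rule: 0.0 up to and including the threshold, then 1.0."""
    return 1.0 if x > threshold else 0.0


def sigmoid(z):
    """Logistic sigmoid 1 / (1 + e^-z), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_step(x, threshold=3.0, steepness=5.0):
    """A smooth stand-in for step(): a sigmoid centered on the threshold.

    steepness is a hypothetical parameter; larger values give a sharper jump.
    """
    return sigmoid(steepness * (x - threshold))


if __name__ == "__main__":
    for x in (1.0, 2.5, 3.0, 3.5, 5.0):
        print(f"x={x:4.1f}  step={step(x):.0f}  soft_step={soft_step(x):.4f}")
```

Note that the smooth version returns exactly 0.5 at x = 3, while the hard rule says false there, so values very close to the boundary are where a sigmoid-based model is most likely to disagree with the rule.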

To check all of this out, you can run either a linear model or a sigmoid model.

To run

You must first train the model:

./doit --train

Note that training also invokes a run of the model:

OK   running 1000 tests
     accuracy:  98.0% passed

To just run the linear model without training:

./doit

You can explicitly run the linear model or the sigmoid model:

./doit --model linear --train # train the linear model
./doit --model linear # load the local model in model_linear.pt and run sample data

./doit --model sigmoid --train # train the sigmoid model
./doit --model sigmoid # load the local model in model_sigmoid.pt and run sample data
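
For readers curious how options like these are commonly handled, here is a hypothetical sketch using Python's argparse. It is not the actual doit script, which may be structured quite differently. Only the option names, the linear default and the model_<name>.pt file names are taken from the commands above; parse_args and model_path are invented for the example.

```python
import argparse


def parse_args(argv=None):
    """Parse the two options shown above: --model and --train."""
    parser = argparse.ArgumentParser(prog="doit")
    parser.add_argument(
        "--model",
        choices=("linear", "sigmoid"),
        default="linear",  # running with no options uses the linear model
        help="which model to train or run",
    )
    parser.add_argument(
        "--train",
        action="store_true",
        help="train the model before running the sample data",
    )
    args = parser.parse_args(argv)
    # File name pattern taken from the comments above: model_linear.pt, model_sigmoid.pt
    args.model_path = f"model_{args.model}.pt"
    return args


if __name__ == "__main__":
    options = parse_args()
    action = "train then run" if options.train else "run"
    print(f"{action}: {options.model} ({options.model_path})")
```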

Linear model

The linear model works! ...most of the time. If during training you see low loss values:

$ ./doit --model linear --train

 training linear...
 loss: 0.11485961079597473
   0] loss: 0.092
   1] loss: 0.083
   2] loss: 0.076
   3] loss: 0.071
   4] loss: 0.067
   5] loss: 0.065
   6] loss: 0.062
   7] loss: 0.060
   8] loss: 0.058
   9] loss: 0.057

then the accuracy normally turns out pretty good:

OK   running 1000 tests
     accuracy:  98.0% passed

But every so often, it doesn't go as well:

$ ./doit --model linear --train

loss: 0.41200000047683716
 0] loss: 0.395
 1] loss: 0.395
 2] loss: 0.395
 3] loss: 0.395
 4] loss: 0.395
 5] loss: 0.395
 6] loss: 0.395
 7] loss: 0.395
 8] loss: 0.395
 9] loss: 0.395
<snip>
OK   running 1000 tests
     accuracy:  63.8% passed
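
The accuracy lines above are a pass rate over the test samples. One plausible way to compute such a figure is sketched below, assuming the model outputs a number that is read as "true" when it is above 0.5. The cutoff, the function name accuracy and the deliberately flawed predictor are all assumptions for illustration, not the project's code.

```python
def accuracy(predict, samples, cutoff=0.5):
    """Return the percentage of samples where the prediction matches the label.

    predict: callable taking a value and returning a score
    samples: list of (value, expected) pairs, expected being True or False
    cutoff:  scores above this are read as True (an assumed convention)
    """
    if not samples:
        raise ValueError("need at least one sample")
    passed = 0
    for value, expected in samples:
        if (predict(value) > cutoff) == expected:
            passed += 1
    return 100.0 * passed / len(samples)


def flawed_predict(x):
    """A stand-in 'model' that wrongly puts the boundary at 3.5 instead of 3."""
    return 1.0 if x > 3.5 else 0.0


if __name__ == "__main__":
    # Values 0.0, 0.1, ... 5.9 with their true ">3" labels.
    tests = [(i / 10, i / 10 > 3) for i in range(60)]
    print(f"OK   running {len(tests)} tests")
    print(f"     accuracy: {accuracy(flawed_predict, tests):5.1f}% passed")
```

The flawed predictor only gets the values between 3 and 3.5 wrong, which shows how a boundary that is slightly off still produces a fairly high pass rate.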

Sigmoid model

The sigmoid model is more reliable and seems to be more accurate, but it takes longer to train.

$ ./doit --model sigmoid --train
training sigmoid...
loss: 0.21909745037555695
 0] loss: 0.069
 1] loss: 0.054
 2] loss: 0.047
 3] loss: 0.043
 4] loss: 0.041
 5] loss: 0.039
 6] loss: 0.037
 7] loss: 0.036
 8] loss: 0.035
 9] loss: 0.035
10] loss: 0.034
11] loss: 0.033
12] loss: 0.033
13] loss: 0.032
14] loss: 0.032
15] loss: 0.032
16] loss: 0.031
17] loss: 0.031
18] loss: 0.030
19] loss: 0.030
<snip>
OK   running 1000 tests
     accuracy:  99.9% passed

That training run took 2 minutes, compared to 10 seconds for a typical linear training run.

Note that the sigmoid model can sometimes be inaccurate too:

training sigmoid...
loss: 0.25008127093315125
 0] loss: 0.318
 1] loss: 0.318
 <snip>
19] loss: 0.318
OK   running 1000 tests
     accuracy:  60.3% passed
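
To give a feel for what a loss value measures during training, here is a minimal sketch of a one-input sigmoid classifier trained in plain Python. It is not the project's model and does not use PyTorch. The cross-entropy loss, the scaling of inputs by 10, the learning rate and the step count are all assumptions picked to keep the example small.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def make_data(count, seed=0):
    """(scaled_value, label) pairs: raw values in [-10, 10], label 1.0 if raw > 3."""
    rng = random.Random(seed)
    raw = [rng.uniform(-10.0, 10.0) for _ in range(count)]
    return [(x / 10.0, 1.0 if x > 3.0 else 0.0) for x in raw]


def train(samples, steps=500, learning_rate=1.0):
    """Fit p = sigmoid(w * x + b) by full-batch gradient descent.

    Returns (w, b, history), where history holds the mean loss before each update.
    """
    w, b = 0.0, 0.0
    history = []
    n = len(samples)
    for _ in range(steps):
        loss = grad_w = grad_b = 0.0
        for x, y in samples:
            p = sigmoid(w * x + b)
            p_safe = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss -= y * math.log(p_safe) + (1.0 - y) * math.log(1.0 - p_safe)
            grad_w += (p - y) * x  # gradient of cross-entropy with respect to w
            grad_b += p - y        # gradient of cross-entropy with respect to b
        history.append(loss / n)
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b, history


def accuracy(w, b, samples):
    """Fraction of samples where sigmoid(w * x + b) > 0.5 matches the label."""
    hits = sum(1 for x, y in samples if (sigmoid(w * x + b) > 0.5) == (y == 1.0))
    return hits / len(samples)


if __name__ == "__main__":
    data = make_data(400)
    w, b, history = train(data)
    for step in range(0, len(history), 100):
        print(f"{step:3d}] loss: {history[step]:.3f}")
    print(f"learned boundary near {-10.0 * b / w:.2f}")
    print(f"accuracy: {100.0 * accuracy(w, b, data):.1f}%")
```

With both parameters starting at zero, every prediction is 0.5, so the first loss value is ln 2 (about 0.693). The loss then falls as the learned boundary moves toward 3.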

Overall

A few lessons:

  • You must always test the resulting model! No matter how long it takes to generate or how elegant the model is, there can be inaccuracies.
  • A "loss" value that doesn't shrink or isn't tiny seems to be a pretty good clue that the model won't be very accurate.
  • Once you have an accurate model, save it! Especially if it takes a long time to build.
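
The second lesson can be turned into a quick automated check. The sketch below flags a training run whose loss either fails to shrink or ends up too large. The function name and both thresholds are assumptions for illustration. The loss lists are copied from the linear training logs shown earlier.

```python
# Per-epoch loss values copied from the two linear training logs above.
GOOD_LINEAR_RUN = [0.092, 0.083, 0.076, 0.071, 0.067, 0.065, 0.062, 0.060, 0.058, 0.057]
STUCK_LINEAR_RUN = [0.395] * 10


def loss_looks_healthy(losses, min_relative_drop=0.05, max_final_loss=0.1):
    """Return True if the loss both shrank and ended up small.

    min_relative_drop and max_final_loss are hypothetical thresholds;
    sensible values depend on the model and the loss function in use.
    """
    if len(losses) < 2:
        return False  # not enough history to judge
    first, last = losses[0], losses[-1]
    relative_drop = (first - last) / first if first > 0 else 0.0
    return relative_drop >= min_relative_drop and last <= max_final_loss


if __name__ == "__main__":
    for name, run in (("good", GOOD_LINEAR_RUN), ("stuck", STUCK_LINEAR_RUN)):
        verdict = "looks healthy" if loss_looks_healthy(run) else "retrain before trusting it"
        print(f"{name:5s} run: {verdict}")
```

A check like this is only a hint. The first lesson still applies, and the model should be tested against sample data either way.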

- John Arrizza