Overview
| | |
|---|---|
| PyPi module | N/A |
| git repository | https://bitbucket.org/arrizza-public/ai-pytorch-linear |
| git command | git clone git@bitbucket.org:arrizza-public/ai-pytorch-linear.git |
| Verification Report | https://arrizza.com/web-ver/ai-pytorch-linear-report.html |
| Version Info | |
- installation: see https://arrizza.com/setup-common
Summary
This project creates a simple AI model that determines whether a value is >3.
Kudos for this project to my son Thomas. He's studying AI in his Master's program and understands linear vs. convolutional models and the other aspects needed to make these models work correctly.
Why >3?
As a starting point for learning AI, this has some good properties:
- I can create thousands or millions of samples easily (see the sketch after this list)
- I know the answer; there is no subtlety about what is and is not ">3". For some models it is difficult to know what result a sample should produce. In this case, it's easy as pie.
- There are no gaps in the training data. For some models it is difficult to have representative sample data for all possible conditions. In this case, I can have samples less than 3, equal to 3, and above 3. No problem.
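As a rough illustration, samples like these could be generated with PyTorch as follows; the helper name, value range, and sample count are illustrative, not taken from this repository.

```python
import torch

def make_samples(n):
    # Hypothetical helper: random values covering below 3,
    # (near) equal to 3, and above 3.
    x = torch.rand(n, 1) * 6.0   # values in [0, 6)
    y = (x > 3.0).float()        # label: 1.0 if the value is > 3, else 0.0
    return x, y

x_train, y_train = make_samples(100_000)
```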
The downside? The resulting curve is not smooth: every value at or below 3 is false, and then it jumps immediately to true. AI models normally deal with smooth, continuous, differentiable functions.
The solution is to use a sigmoid function (https://en.wikipedia.org/wiki/Sigmoid_function). It uses the exponential e^-x, as in 1 / (1 + e^-x), to turn the hard jump into a smooth, monotonically increasing (or decreasing) function.
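As a quick illustration (not the repo's code), `torch.sigmoid` can soften the ">3" jump; the scale factor below is an arbitrary choice that only controls how sharp the curve is.

```python
import torch

x = torch.tensor([1.0, 2.0, 2.9, 3.0, 3.1, 4.0, 5.0])
smooth = torch.sigmoid(4.0 * (x - 3.0))   # 1 / (1 + e^-(4 * (x - 3)))
print(smooth)  # near 0 below 3, exactly 0.5 at 3, near 1 above 3
```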
To check all of this out, you can run either a linear model or a sigmoid model.
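In PyTorch, the two model shapes could look roughly like this (assumed layer sizes; the repo's actual architecture may differ):

```python
from torch import nn

# raw linear output: a weighted sum plus bias, not confined to (0, 1)
linear_model = nn.Linear(1, 1)

# the same linear layer, but squashed into (0, 1) by a sigmoid
sigmoid_model = nn.Sequential(
    nn.Linear(1, 1),
    nn.Sigmoid(),
)
```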
To run
You must first train the model:
```
./doit --train
```
Note that training also invokes a run of the model:
```
OK running 1000 tests
accuracy: 98.0% passed
```
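For reference, a training run like the one above could be produced by a loop of roughly this shape; the loss function, optimizer, learning rate, and epoch count here are assumptions, not the repo's actual settings.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(1, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()                                   # assumed loss choice
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)  # assumed optimizer

x = torch.rand(10_000, 1) * 6.0   # training samples in [0, 6)
y = (x > 3.0).float()             # labels: 1.0 if the value is > 3

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)   # how far predictions are from the labels
    loss.backward()               # compute gradients
    optimizer.step()              # nudge the weights
    print(f"{epoch}] loss: {loss.item():.3f}")
```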
To just run the linear model:
```
./doit
```
You can explicitly run the linear model or the sigmoid model:
```
./doit --model linear --train    # train the linear model
./doit --model linear            # load the local model in model_linear.pt and run sample data
./doit --model sigmoid --train   # train the sigmoid model
./doit --model sigmoid           # load the local model in model_sigmoid.pt and run sample data
```
Linear model
The linear model works! ...most of the time. If you see low loss values during training:
```
$ ./doit --model linear --train
training linear...
loss: 0.11485961079597473
0] loss: 0.092
1] loss: 0.083
2] loss: 0.076
3] loss: 0.071
4] loss: 0.067
5] loss: 0.065
6] loss: 0.062
7] loss: 0.060
8] loss: 0.058
9] loss: 0.057
```
The accuracy normally turns out pretty good:
```
OK running 1000 tests
accuracy: 98.0% passed
```
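The 1000-test check itself is straightforward. A hedged sketch (the 0.5 decision threshold and the stand-in model are assumptions, not the repo's code):

```python
import torch
from torch import nn

model = nn.Linear(1, 1)   # stand-in for the trained linear model
model.eval()

with torch.no_grad():
    x_test = torch.rand(1000, 1) * 6.0      # 1000 fresh samples
    y_test = (x_test > 3.0).float()         # ground-truth labels
    preds = (model(x_test) > 0.5).float()   # the model's yes/no decision
    accuracy = (preds == y_test).float().mean().item() * 100
    print(f"accuracy: {accuracy:.1f}% passed")
```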
But every so often, it doesn't go as well:
```
$ ./doit --model linear --train
loss: 0.41200000047683716
0] loss: 0.395
1] loss: 0.395
2] loss: 0.395
3] loss: 0.395
4] loss: 0.395
5] loss: 0.395
6] loss: 0.395
7] loss: 0.395
8] loss: 0.395
9] loss: 0.395
<snip>
OK running 1000 tests
accuracy: 63.8% passed
```
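One hedged way to guard against a stuck run like this is to check whether the loss actually shrank and retrain from a fresh initialization if it didn't; the helper names, threshold, and retry count here are illustrative, not part of the repo.

```python
def train_until_learning(make_model, train_fn, max_retries=3, min_improvement=0.01):
    """Retrain from scratch if the loss barely moves (hypothetical helper)."""
    for attempt in range(max_retries):
        model = make_model()                     # fresh random initialization
        first_loss, last_loss = train_fn(model)  # assumed to return first/last epoch loss
        if first_loss - last_loss > min_improvement:
            return model                         # the loss actually shrank; keep this model
        print(f"attempt {attempt}: loss stuck near {last_loss:.3f}, retrying")
    return model                                 # still stuck; the accuracy check will catch it
```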
Sigmoid model
The sigmoid model is more reliable and seems to be more accurate, but it takes longer to train.
```
$ ./doit --model sigmoid --train
training sigmoid...
loss: 0.21909745037555695
0] loss: 0.069
1] loss: 0.054
2] loss: 0.047
3] loss: 0.043
4] loss: 0.041
5] loss: 0.039
6] loss: 0.037
7] loss: 0.036
8] loss: 0.035
9] loss: 0.035
10] loss: 0.034
11] loss: 0.033
12] loss: 0.033
13] loss: 0.032
14] loss: 0.032
15] loss: 0.032
16] loss: 0.031
17] loss: 0.031
18] loss: 0.030
19] loss: 0.030
<snip>
OK running 1000 tests
accuracy: 99.9% passed
```
That training run took 2 minutes, compared to about 10 seconds for a typical linear training run.
Note that the sigmoid model can sometimes be inaccurate too:
```
training sigmoid...
loss: 0.25008127093315125
0] loss: 0.318
1] loss: 0.318
<snip>
19] loss: 0.318
OK running 1000 tests
accuracy: 60.3% passed
```
Overall
A few lessons:
- You must always test the resulting model! No matter how long it takes to generate or how elegant the model is, there can be inaccuracies.
- A "loss" value that doesn't shrink or isn't tiny seems to be a pretty good clue that the model won't be very accurate.
- Once you have an accurate model, save it! Especially if it takes a long time to build (see the sketch below).
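A minimal sketch of saving and reloading a trained model with PyTorch; the file name matches the model_sigmoid.pt mentioned above, but the model shape and the exact save format this project uses are assumptions.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(1, 1), nn.Sigmoid())   # assumed model shape

# after a good training run, save the weights...
torch.save(model.state_dict(), "model_sigmoid.pt")

# ...and reload them later instead of retraining
model.load_state_dict(torch.load("model_sigmoid.pt"))
model.eval()   # inference mode for running sample data
```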