When You Realise Your Dataset is Too Small

Oh no! My dataset’s too small!

OK so you have been brilliant at planning and have acquired a dataset of 20,000 images of Blackgrass. You proudly train the AI and it works in the field. Some of the time.

You move to another field and the classification accuracy drops to around 10%. Oh dear. What’s gone wrong?

Well, the AI does as it’s trained to do. You haven’t provided enough image data that is representative of the Real World. OK, let’s get some more then!

And that’s where it all goes wrong.

The first 20,000 images were of two weed species in a few fields close to each other on a single farm, with a specific crop variety on a specific soil type. That’s what the AI knows. In a part of that field there may be poor growth due to soil conditions, and so the crop and weeds can look slightly different. We therefore need more data.

But that was in February and it’s now 3rd March and all the fields you know are growing away rapidly. We will need to move to a spring crop to get those species. Trouble is, they look different as they have developed really fast.

And that’s when you realise that you need to wait until next year, and even then there are no guarantees.
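
One check that might flag this sooner is to break the accuracy down by the field each test image came from, rather than looking at one overall number. The sketch below is only an illustration: the function name, the field names and the handful of results are all made up, and it assumes you have recorded which field each image was taken in.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy per group (e.g. per field).

    records: iterable of (group, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, predicted in records:
        total[group] += 1
        if truth == predicted:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical results, invented purely for illustration.
results = [
    ("field_A", "blackgrass", "blackgrass"),
    ("field_A", "crop", "crop"),
    ("field_B", "blackgrass", "crop"),
    ("field_B", "crop", "crop"),
]

print(accuracy_by_group(results))
# → {'field_A': 1.0, 'field_B': 0.5}
```

If one field scores far below the others, that is a hint the training images are not representative of it, which is cheaper to learn at your desk than out in the crop.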