Hi everyone,
I’m new to Google AutoML Tables and have a basic question about which data is worth including when training my model.
I have a dataset of golfers and will be looking at each player’s average score over different periods, for example the past 3 months, 6 months, 1 year, etc.
My question is whether it is also worthwhile to include the sample size for each date range for each player. For example, over the past 3 months some players will have a sample size of 28 rounds while others will only have 2, and the averages for players with 28 rounds will be more reliable than those for players with 2. However, I don’t know whether Google AutoML Tables would pick up this link automatically, whether I should create a separate weighting/reliability variable, or whether there’s a way to specify a link between columns. Alternatively, is this kind of automated approach not really suitable here, or should I just leave out the sample size variable?
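To make the setup concrete, here is a minimal sketch of the kind of feature table I mean, with one average column and one sample-size column per window. Everything in it is made up for illustration: the round data, the `window_features` helper, the `avg_*`/`n_*` column names, and the approximation of 3 months, 6 months and 1 year as 90, 180 and 365 days. It only shows the shape of the data and does not use any AutoML API.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical round-level data: (player, date played, score).
rounds = [
    ("A", date(2023, 5, 1), 70),
    ("A", date(2023, 4, 10), 72),
    ("A", date(2022, 12, 1), 74),
    ("B", date(2023, 5, 20), 68),
]

def window_features(rounds, player, as_of, windows):
    """Return the average score and round count for each look-back window."""
    features = {}
    for label, days in windows.items():
        start = as_of - timedelta(days=days)
        scores = [s for p, d, s in rounds if p == player and start <= d <= as_of]
        # If a player has no rounds in the window, leave the average missing.
        features[f"avg_{label}"] = mean(scores) if scores else None
        # The sample size goes in its own column next to the average.
        features[f"n_{label}"] = len(scores)
    return features

# Assumed window lengths in days (approximations of 3m / 6m / 1y).
windows = {"3m": 90, "6m": 180, "1y": 365}
as_of = date(2023, 6, 1)

# One row of features per player, ready to be written out as a table.
table = {p: window_features(rounds, p, as_of, windows) for p in ("A", "B")}
```

So each player row would end up with pairs such as `avg_3m` and `n_3m`, and the question is whether the `n_*` columns are worth keeping.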
Thanks in advance