My Grandpa asked if the programs I was using could calculate his Golf League’s handicaps, so I decided to play around with SQL and Google Sheets to see if I could functionally recreate what they were doing.
The goal is to calculate a player’s handicap, which is the average of the last six months of their scores minus 29. The average is calculated based on how many games they have actually played in the last six months, and the number of scores averaged correlates to total games. For example, Clem played over 20 games so his handicap will be calculated with the maximum possible scores accounted for, that being 8. Schomo only played six games, so the lowest 4 will be used for their average. Handicap is always calculated with the lowest available scores.
This league uses Excel, so upon receiving the data I converted it into a CSV and uploaded it into BigQuery.
The first thing I did was rename the columns to better represent what they held and to simplify the code. It is much easier to remember ‘someone_scores’ than ‘int64_field_number’. It also seemed to confuse SQL less, since INT64 is also the name of a data type.
ALTER TABLE `grandpa-golf.grandpas_golf_35.should only need the one`
RENAME COLUMN int64_field_4 TO schomo_scores;
To find the average of Clem’s scores:
SELECT AVG(clem_scores)
FROM `grandpa-golf.grandpas_golf_35.should only need the one`
LIMIT 8;
RESULT: 43.1
Remembering that handicap is the average minus 29, the final computation looks like:
SELECT AVG(clem_scores) - 29
FROM `grandpa-golf.grandpas_golf_35.should only need the one`
LIMIT 8;
RESULT: 14.1
To find Schomo’s handicap the same way:
SELECT AVG(schomo_scores) - 29
FROM `grandpa-golf.grandpas_golf_35.should only need the one`
LIMIT 6;
RESULT: 10.5
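One caveat worth checking with queries shaped like the ones above: in standard SQL, LIMIT trims the rows of the final result, and AVG without a GROUP BY already returns a single row, so the LIMIT does not restrict which scores go into the average. To average only the lowest few scores, the sorting and limiting have to happen in a subquery first. Below is a minimal sketch of the difference in Python, using the standard-library sqlite3 module as a stand-in for BigQuery. The table name `golf` and the scores are made up for illustration and are not the league's data.

```python
import sqlite3

# Hypothetical scores for illustration only; not the league's data.
SCORES = [41, 45, 39, 50, 44, 47, 42, 48, 43, 46]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE golf (clem_scores INTEGER)")
conn.executemany("INSERT INTO golf VALUES (?)", [(s,) for s in SCORES])

# LIMIT applies to the single aggregated row, so this averages all ten scores.
avg_all = conn.execute(
    "SELECT AVG(clem_scores) FROM golf LIMIT 4"
).fetchone()[0]

# Sorting and limiting inside a subquery averages only the lowest four scores.
avg_lowest = conn.execute(
    """
    SELECT AVG(clem_scores)
    FROM (SELECT clem_scores FROM golf ORDER BY clem_scores ASC LIMIT 4)
    """
).fetchone()[0]

print(avg_all, avg_lowest)
```

The same subquery shape (ORDER BY the score ascending, LIMIT to the number of scores wanted, then AVG over that) should carry over to BigQuery, though I have only sketched it here against SQLite.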
The league’s Excel spreadsheet already calculated handicaps automatically, so I asked for more data to see if I could recreate those functions.
Grandpa provided the past three years of league data. The names were all replaced with generic “Golfer 001, Golfer 002, etc”. I had planned on converting this Excel sheet into a CSV and manipulating it in SQL like with the smaller sample, but this did not work.
Immediately, there were problems. Some of the functions did not transfer properly from what was functionally the PDF I had been emailed, so instead of working with SQL, I decided to pull this into Google Sheets and recreate the functions for this spreadsheet. We only need the most recent 6 months of scores to calculate our handicap, so once I made a working copy I deleted the data from before this time period.
Once that was cleaned up, I started working on a function that would pull the working average from these values, which is still determined by how many total values there were. This correlates as follows: for 20 or more scores average the lowest 8, for 15 to 19 scores average the lowest 6, for 6 to 14 scores average the lowest 4, and for fewer than 6 scores average the lowest 2. We also need to ensure that a player with no scores gets a value of 0 so our handicap calculator works. My formula ended up being:
=IF(COUNT(E2:AT2)>=20, AVERAGE(SMALL(E2:AT2, ROW(INDIRECT("1:"&8)))), IF(COUNT(E2:AT2)>=15, AVERAGE(SMALL(E2:AT2, ROW(INDIRECT("1:"&6)))), IF(COUNT(E2:AT2)>=6, AVERAGE(SMALL(E2:AT2, ROW(INDIRECT("1:"&4)))), IF(COUNT(E2:AT2)>=1, AVERAGE(SMALL(E2:AT2, ROW(INDIRECT("1:"&2)))), IF(COUNT(E2:AT2)=0, 0, "")))))
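To make the tiers easier to follow than the nested IFs, here is a minimal Python sketch of the same rule. The function name `working_average` is my own, and it is a restatement of the logic rather than what the league's spreadsheet runs. One assumption: a player with a single score simply gets that score as their average.

```python
def working_average(scores):
    """Average the lowest N scores, where N depends on how many were played."""
    count = len(scores)
    if count == 0:
        return 0  # no scores: return 0 so the handicap step can detect it
    if count >= 20:
        keep = 8
    elif count >= 15:
        keep = 6
    elif count >= 6:
        keep = 4
    else:
        keep = 2
    # Slicing past the end is safe, so one score yields a one-element list.
    lowest = sorted(scores)[:keep]
    return sum(lowest) / len(lowest)


# Six hypothetical scores: the lowest four (40, 42, 44, 45) are averaged.
print(working_average([40, 50, 45, 42, 48, 44]))
```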
The handicap is just this value minus 29, so for the handicap column the formula is relatively simple:
=IF(D2=0,0,IF(D2>47,18,D2-29))
This pulls the working average from column D, sets the handicap to zero if there are no scores present, and caps the handicap at 18 for any average above 47. As written, it does not stop an average below 29 from producing a negative handicap.
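The same three branches can be written as a short Python sketch. The function name `handicap` is hypothetical, and the code mirrors the formula exactly as written, including how it treats an average below 29.

```python
def handicap(average):
    """Mirror of =IF(D2=0,0,IF(D2>47,18,D2-29)) for a single average."""
    if average == 0:
        return 0  # no scores on record
    if average > 47:
        return 18  # cap: any average above 47 gets the maximum handicap
    return average - 29


# Clem's average from the SQL section, 43.1, gives a handicap of about 14.1.
print(round(handicap(43.1), 1))
```

Note that an average of exactly 47 also comes out as 18 through the subtraction, and that an average under 29 would come out negative. Wrapping the last branch in a `max(0, ...)` would be one way to prevent that if the league wants a floor of zero.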
Now that we have our spreadsheet back in working order with our new formulas, we are functionally done. We have recreated what my Grandpa’s league uses to generate handicaps.