Grand River Handicap

I've been working on this handicapping system long enough that it deserves a name. What could be better than Grand River Handicap (GRH)? So there it is. Here I'll try to describe how it's derived, keeping the technical details to a minimum.

GRH is based on actual boat performance. To rate a boat I need actual race data for that boat sailing in a fleet of other boats that are also being rated. I need lots of boats racing together in lots of races, and the races should span true wind speeds from very light to very heavy. (This is very different from how PHRF does it. PHRF throws out all of the light air races because the boats spread out too much and the data aren't reliable. But they're wrong to do this!) The GRSC race result archive that I've gotten from Gary is just what's required. My current database has 18 boats and almost 90 races.

I start by creating a big table of elapsed times with boats in columns and races in rows. I take each boat's elapsed time and divide it by the course length to get what's called a time allowance, or TA. This number is the number of seconds per mile that a boat takes to complete the course.

There's actually a big problem with the GRSC data at this point. We aren't careful about recording the actual starting time, since it isn't a factor in calculating corrected times under PHRF (time on distance). Consequently there are lots of cases where we start the race early or late but the elapsed times are based on the theoretical starting time. This really compromises the data. I need to approach other clubs who score more carefully, perhaps using time on time, to get a really good database to demonstrate how GRH works.
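The time allowance table described above can be sketched in a few lines of Python. This is only an illustration: the function name, the boat labels, and every number in it are made up for the example, and a boat that did not sail a race is simply marked with None.

```python
def time_allowance_table(elapsed, course_lengths):
    """Convert elapsed times (seconds) into time allowances (sec/mi).

    elapsed: one dict per race, mapping boat label -> elapsed seconds,
             or None if the boat did not sail that race.
    course_lengths: course length in miles for each race, same order.
    """
    table = []
    for row, miles in zip(elapsed, course_lengths):
        table.append({
            boat: (None if secs is None else secs / miles)
            for boat, secs in row.items()
        })
    return table


# Hypothetical data: two races on a 6 mile course, two boats.
elapsed = [
    {"boat_a": 3600.0, "boat_b": 3300.0},   # breezy race
    {"boat_a": 7200.0, "boat_b": None},     # light air race, boat_b absent
]
course_lengths = [6.0, 6.0]

ta = time_allowance_table(elapsed, course_lengths)
for race_number, row in enumerate(ta, start=1):
    print(race_number, row)
```

Note that any error in the recorded start time goes straight into the elapsed time, and therefore straight into the TA, which is why the start time problem matters so much here.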

Now for the tricks. When a boat sails in heavy air she sails fast, so her TA will be low. When the same boat sails in light air she sails slowly, so her TA will be high. Consequently, if you know a boat's TA for a race you have some idea of how much wind there was. (If it takes 400 sec/mi to complete a course it was honking. If it takes 1200 sec/mi to complete a course the wind was light.) This is how GRH (and Americap) avoid the problem of having to measure the wind speed.

The next trick: in any wind speed each boat has its own performance characteristic and consequently its own TA. However, she only sails to that TA if she sails without mistakes. Whenever you make a mistake it takes you longer to complete the course and your TA goes up. Assuming that everyone is sailing in the same wind conditions, the boat with the lowest relative TA is the one that made the best use of the available wind, made the fewest mistakes, and is the one who deserves to win.

Once I've got my matrix of TAs created for all of the different boats in different races I need to estimate the wind speed. It's not really important that I get it right, just that I have something to start with. I usually use the IMS data for one of the boats in the fleet to estimate the true wind speed (TWS). Then the fun starts. I use the predicted wind speed from the first boat to derive performance models for all of the other boats. I use those models to recalculate wind speed predictions and I search for the boat that found the most wind. This is the boat with the lowest relative TA, the one which made the fewest mistakes, …, and the one who deserves to win. This one wind speed is the value that's used to score the race. The GRH wind speeds are not quite true wind speeds. They would be if I knew how just one boat actually performed in different wind conditions. I can make GRH conform to the IMS VPP model, but I don't trust the IMS code and it's not necessary to be concerned with IMS. GRH is a stand-alone performance based model. The software actually processes the data over and over again to refine the wind speeds and performance models. One nice thing about this approach is that regardless of where I start, I usually end up in the same place.

Here's some propaganda I wrote on GRH:

 

Grand River Handicap: A performance based two number handicap system using course length ($d$) and elapsed time ($t_{\mathrm{elapsed}}$) to determine corrected times. The correction equation is:

$$t_{\mathrm{corrected}} = A\,t_{\mathrm{elapsed}} - B\,d$$

where A and B are determined from actual performance data and reflect how boats actually sail in different wind conditions.
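As a rough illustration of how the correction equation would be applied on race day, here is a minimal Python sketch. The A and B values, boat labels, elapsed times, and course length are all invented for the example and are not real GRH ratings.

```python
def corrected_time(elapsed_s, course_miles, a, b):
    """GRH-style corrected time in seconds: A * t_elapsed - B * d."""
    return a * elapsed_s - b * course_miles


# Hypothetical fleet: (elapsed seconds, A, B). None of these are real ratings.
fleet = {
    "boat_a": (4800.0, 0.75, 50.0),
    "boat_b": (4200.0, 0.875, 40.0),
}
course_miles = 6.0

# Correct every boat's time using only its own A and B, its elapsed time,
# and the course length.
results = {
    boat: corrected_time(elapsed_s, course_miles, a, b)
    for boat, (elapsed_s, a, b) in fleet.items()
}

# Lowest corrected time wins.
standings = sorted(results, key=results.get)
for place, boat in enumerate(standings, start=1):
    print(place, boat, results[boat])
```

In this made-up case boat_b finishes first on the water but boat_a wins on corrected time. No wind speed appears anywhere in the calculation.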

These graphs show the number of seconds per mile that Moonshadow, Bird of Prey, and Thriller finished behind a theoretical reference boat (RB) in many races and in different wind conditions. I've taken the RB to be a Santa Cruz 70. The line that marks the lower bound of the races for a boat corresponds to the performance limit that the boat can approach. The GRH A and B coefficients are determined from this line. By definition a boat cannot cross its performance line. Points that fall on or near the line correspond to wins or at least close finishes behind the leader. The vertical distance between a point and the line corresponds to the seconds per mile that the boat sailed slower than the winning boat.

The graphs clearly show that a PHRF rating based on the distance sailed must fail. The one number PHRF rating corresponds to a single horizontal line drawn through these graphs, such as at 170 sec/mi for Moonshadow (Moonshadow's PHRF rating is 117 and an SC70 is about minus 54, a difference of about 170 sec/mi). This is clearly an inadequate model for Moonshadow's actual performance. In heavy air Moonshadow can expect to be about 150 sec/mi behind the RB, but in very light air she will be 400 sec/mi behind. No single number handicap can correct for the differences in speed that are inherent in real boat performance.
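The arithmetic behind that horizontal line can be spelled out in a short Python sketch. The ratings and the 150 and 400 sec/mi figures are the ones quoted above; the function name and the 6 mile course length are assumptions made only for illustration.

```python
def phrf_gap(rating, reference_rating):
    """Seconds per mile that PHRF (time on distance) assumes separate
    a boat from the reference boat, regardless of wind speed."""
    return rating - reference_rating


# Ratings quoted in the text: Moonshadow 117, SC70 about minus 54.
gap = phrf_gap(117, -54)

# Approximate gaps behind the reference boat read from the graphs (sec/mi).
observed = {"heavy air": 150, "very light air": 400}

# How far the single PHRF number misses the observed gap in each condition.
errors = {condition: seen - gap for condition, seen in observed.items()}

course_miles = 6  # assumed course length, for illustration only
for condition, error in errors.items():
    print(condition, error, "sec/mi,", error * course_miles, "sec over the course")
```

A negative error means PHRF gives Moonshadow more time than she needs in those conditions, and a positive error means it gives her less.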

Another PHRF flaw becomes apparent by comparing the performance of the three boats. At high wind speeds everyone goes fast and boats finish in a pack. In low wind speeds everyone goes slow so boats spread out and finish with big gaps between them. PHRF ignores this behavior by throwing out the light air race data. The result is that boats with high PHRF ratings have an advantage in heavy air and boats with low PHRF ratings have an advantage in light air. Since Lake Erie sailing is mostly in light air everyone's been forced to buy faster and faster boats to stay competitive. This makes it unfair for the older slower boats and drives them out of the sport. Some people might argue that older slower boats shouldn't expect to win, but where will that leave us five or ten years from now?

One advantage of GRH is that there is no need to record or even guess at the true wind speed (TWS) for the race. The only data needed to correct a boat's time are its A and B GRH coefficients, the elapsed time, and the course length.

There are some problems with the data presented here. The biggest problem is the quality of my database, specifically inaccuracies in the elapsed times. Committee boats often start races late but the actual start time never gets recorded so elapsed times are wrong. This doesn't matter for PHRF (time on distance) but it is critical to GRH or any other time on time based scoring system. These time errors compromise the calculation of the A and B rating coefficients. Errors like these are what can make it look like Thriller was 400 sec/mi behind the leader in some races. Given a clean data set with accurate elapsed times and course lengths I can generate an accurate set of rating coefficients for each boat.

Last Revised: 01/26/2002.