TheAnalyser
Listed user 936
Offline
Posts : 339
Original Post 2010-Apr-25, 05:19 PM
As requested, here is a topic to talk about calculations, rules and ratings formulas.
As an added bonus, I will release my code in the next few days (when I get time to write the install instructions) so that others who cannot program will be able to collect their own data and devise their own rules, systems and formulas.
If anyone wishes to add comments on ideas they use in their computer programs, this would be greatly appreciated.
Wenona
VIP Club
Group 2 user 175
Online
Posts : 4254
 2010-Jul-19, 06:15 PM
"Tables 3&4 - In virtually every range the actual frequency is significantly different than the fundamental model's estimate, and always in the direction of being closer to the public's estimate."
When they say closer to the public estimate, they are commenting only on the direction of the difference. E.g. the model says 10/1, the public says 3/1, and the actual may be 9/1. Or the model says 10/1, the public says 15/1, and the actual may be 11/1. They're not saying the public estimate is a better or more accurate approximation. Replace the words "closer to" with "toward" and you get the same meaning, and the one they intended.
« Last Edit: 2010-Jul-19, 06:23 PM by Wenona »
Wenona
VIP Club
Group 2 user 175
Online
Posts : 4254
 2010-Jul-19, 06:21 PM
"For example the bias in the Harville formula is not a function of probability, but of favouritism.
So why bucket the data in meaningless probability groups?
Bucketing by favourite rank instead would provide far more meaningful results."
I'll think about this more but I don't think that's so. I'd guess the magnitude of bias within the ranks would vary a lot depending on the actual probabilities. By analysing by probability groups the effect of rank would be adjusted for when the probability of running second for example is adjusted back to 100% wouldn't it?
« Last Edit: 2010-Jul-19, 06:24 PM by Wenona »
jfc
Group 2 user 723
Online
Alias: jfc
Posts : 1201
 2010-Jul-19, 06:50 PM
"I'll think about this more but I don't think that's so.
I'd guess the magnitude of bias within the ranks would vary a lot depending on the actual probabilities.
By analysing by probability groups the effect of rank would be adjusted for when the probability of running second for example is adjusted back to 100% wouldn't it?"
I am sure it is so. In a 100% market a $5 selection might be favourite or (say) 3rd favourite. The favourite will always underperform Harville for 2nd. The 3rd favourite will either outperform Harville for 2nd or underperform by less. Benter would lump these 2 conflicting results in the same bucket. My assertion is based on my personal computer simulations of such situations.
Wenona
VIP Club
Group 2 user 175
Online
Posts : 4254
 2010-Jul-19, 07:05 PM
But wouldn't there be at least a similar discrepancy in the actual bias relating to $3 second favourites and $8 second favourites?
Wenona
VIP Club
Group 2 user 175
Online
Posts : 4254
 2010-Jul-19, 07:08 PM
"I am sure it is so.
In a 100% market a $5 selection might be favourite or (say) 3rd favourite."
But don't the calculations, when processing the probabilities for second for example, take into account the relative probabilities when working the numbers back to 100% for the field?
So what I'm saying is that the resultant probability for second for those $5 chances, be they favourites or third favourites, won't all end up the same just because they are all $5.
The probabilities for those $5 chances will differ because their relative probability to the others in the race, including the favourite (if they aren't favourite), differs, and those differences are taken into account in the calculations required to adjust the market back to 100%.
To illustrate the point with an extreme example, if the betting was $5 the field in a field of 6 runners, the traditional workout would in fact end up with each runner having zero bias, i.e. it would calculate each runner's probability for second at 16.67%. It would end up at that conclusion because it does adjust for relative probabilities.
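A minimal Python sketch of that workout, assuming "the traditional workout" means the Harville formula for 2nd place. The function names and the two 100% markets at the end are made up for illustration only.

```python
def normalise(prices):
    """Turn dollar prices into win probabilities that sum to 1 (a 100% market)."""
    raw = [1.0 / price for price in prices]
    total = sum(raw)
    return [r / total for r in raw]


def harville_second(probs):
    """Harville probability of each runner finishing 2nd.

    P(i runs 2nd) = sum over j != i of P(j wins) * P(i) / (1 - P(j))
    """
    second = []
    for i, p_i in enumerate(probs):
        total = 0.0
        for j, p_j in enumerate(probs):
            if j != i:
                total += p_j * p_i / (1.0 - p_j)
        second.append(total)
    return second


# Six runners all at $5: a 120% market that normalises to 1/6 each.
even_field = harville_second(normalise([5.0] * 6))
print([round(p, 4) for p in even_field])  # each runner comes out at 0.1667

# Hypothetical 100% markets in which a $5 (0.20) runner is favourite in one
# and third favourite in the other.
fav_market = [0.20, 0.16, 0.16, 0.16, 0.16, 0.16]
third_market = [0.40, 0.25, 0.20, 0.10, 0.05]
second_as_fav = harville_second(fav_market)[0]
second_as_third = harville_second(third_market)[2]
print(round(second_as_fav, 4), round(second_as_third, 4))
```

The two $5 runners come out with different probabilities for 2nd, because the formula works from every other runner's price as well as their own.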
« Last Edit: 2010-Jul-19, 07:23 PM by Wenona »
Wenona
VIP Club
Group 2 user 175
Online
Posts : 4254
 2010-Jul-19, 07:37 PM
On another point, the thing that's always pissed me off with the multitude of papers on Harville bias is the fact they all just do the math to prove it exists.
Well thanks a lot, but why does it need to be proven a hundred times?
It would be far more interesting to hypothesize about WHY it exists and what factors cause it. When you start to think about why it exists, you begin to realise the factors that cause it are not uniform across all runners in a field, nor across all runners within a price band.
A more fruitful study would be to try to isolate the variables that can influence the level of Harville bias for each particular runner and work towards a method of calculating the individual Harville bias for each runner. I believe this would lead to even better estimates of place probabilities.
That would be a real step in advancing all this theory, but I've never read a paper that takes it to that next level.
I've got my own theories and have played around with various approaches for many years, but am hampered as always by time and resource constraints.
I have to admit it's an area that's always engrossed me, but I do despair at the level of intellectual rigour or inquiry found in the published material.
« Last Edit: 2010-Jul-19, 07:39 PM by Wenona »
jfc
Group 2 user 723
Online
Alias: jfc
Posts : 1201
 2010-Jul-20, 05:35 AM
Before possibly addressing Wenona's points, some might find this useful:
Benter's pertinent formula (7) on page 193 can easily be made accessible to many via a spreadsheet.
If column A contains the probabilities for 1st, then these formulae in B and C provide the answer for running 2nd.
=POWER(A1,0.81)
=B1/SUM(B:B)
Not that hard to grasp.
However Benter's effort for the B formula is:
=EXP(0.81*LN(A1))
For some bizarre reason Benter uses the expression
"e raised to the power of the natural logarithm"
instead of merely using the equivalent
"power".
And obviously, had he instead provided those spreadsheet cells then far more people might have derived some use from his material.
Perhaps he thinks his style makes him appear extremely clever.
Certainly not the impression I'm left with.
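For anyone who prefers code to a spreadsheet, here is a minimal Python sketch that mirrors the two columns above. It only reproduces the spreadsheet as posted (raise each win probability to the power 0.81, then rescale so the column sums to 1). The function name and the three sample probabilities are invented for illustration.

```python
import math


def discount(win_probs, exponent=0.81):
    """Column B then column C: raise to a power, then rescale to sum to 1."""
    powered = [p ** exponent for p in win_probs]   # =POWER(A1,0.81)
    total = sum(powered)
    return [x / total for x in powered]            # =B1/SUM(B:B)


win_probs = [0.5, 0.3, 0.2]   # hypothetical win probabilities (column A)
adjusted = discount(win_probs)
print([round(a, 4) for a in adjusted])

# The exp/ln form gives the same numbers as the plain power form.
same = all(
    math.isclose(math.exp(0.81 * math.log(p)), p ** 0.81) for p in win_probs
)
print(same)
```

With an exponent below 1 the favourite's share shrinks and the outsider's grows, which is the direction of adjustment being discussed.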
Wenona
VIP Club
Group 2 user 175
Online
Posts : 4254
 2010-Jul-20, 07:20 AM
Determining if .81 is correct for the markets you're betting in is the real issue isn't it?
jfc
Group 2 user 723
Online
Alias: jfc
Posts : 1201
 2010-Jul-20, 07:48 AM
"Determining if .81 is correct for the markets you're betting in is the real issue isn't it?"
Whether the Benter value for Rho of 0.81 applies elsewhere is only one of a number of irksome issues. When I was trying my own calculations for local races, I wasted a lot of time using Betfair data. Betfair data only provides last price matched. That figure is so erratic that it is impossible to get any meaningful results from it. Many older local studies have used SP. But this figure is now also useless, particularly with markets in the 200% vicinity.
Wenona
VIP Club
Group 2 user 175
Online
Posts : 4254
 2010-Jul-20, 08:01 AM
I used the final TAB dividends for every Brisbane Metropolitan race in the year 2000, but could not find a methodology that could profitably identify value in the place market over that year.
The real point was to inform my quinella betting based on my own markets and getting my head around this stuff was a great asset in doing that.
jfc
Group 2 user 723
Online
Alias: jfc
Posts : 1201
 2010-Jul-20, 08:26 AM
Now I suggest anyone trying to get their head around this material try working through simple examples.
The simplest I can think of is a game with 3 dice A, B, C. Each has pairs of faces marked as:
A: 1 2 5 B: 1 2 4 C: 1 2 3
There are 3^3 (=27) outcomes. But if you ignore dead heats you are left with a manageable sample of 13.
The average winning scores are (obviously):
A: 5 B: 4 C: 3
If they fail to win their average scores are:
A: 1.5 B: 2.33 C: 2.18
And the gaps are:
A: 3.5 B: 1.67 C: 0.82
Note in particular the situation with the favourite, A.
If it fails to win, its average performance declines far more severely than the others, so much so that it can't hope to come 2nd anywhere near as well as winning.
Here it wins 7 from 13.
But it comes 2nd only 2 from 6.
And strangely it runs last more often than 2nd.
Assuming my calculations are right, that paradox could lend itself to a neat betting proposition.
In a 3 way contest most would be happy to back at evens about a losing favourite coming second. But here they would lose.
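The hand-worked figures are easy to check by brute force. The sketch below is one way to do it in Python, assuming each of the three face values on a die is equally likely and that any roll with a tie counts as a dead heat and is dropped. The names are arbitrary.

```python
from itertools import product

# Face values of each die (each value appears on two faces, so each is equally likely).
DICE = {"A": (1, 2, 5), "B": (1, 2, 4), "C": (1, 2, 3)}
names = list(DICE)

# All 27 rolls, keeping only those where the three scores are all different.
rolls = [
    dict(zip(names, scores))
    for scores in product(*DICE.values())
    if len(set(scores)) == 3
]


def summarise(name):
    """Win/2nd/last counts and average scores for one die over the kept rolls."""
    win_scores = [r[name] for r in rolls if r[name] == max(r.values())]
    lose_scores = [r[name] for r in rolls if r[name] != max(r.values())]
    seconds = sum(1 for r in rolls if r[name] == sorted(r.values())[1])
    lasts = sum(1 for r in rolls if r[name] == min(r.values()))
    avg_win = sum(win_scores) / len(win_scores)
    avg_lose = sum(lose_scores) / len(lose_scores)
    return {
        "wins": len(win_scores),
        "seconds": seconds,
        "lasts": lasts,
        "avg_win": avg_win,
        "avg_lose": avg_lose,
        "gap": avg_win - avg_lose,
    }


summary = {name: summarise(name) for name in names}
print("rolls without dead heats:", len(rolls))
for name in names:
    print(name, {k: round(v, 2) for k, v in summary[name].items()})
```

Changing the face values in `DICE` is a quick way to see how the gap between winning and losing scores moves with favouritism.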
Grega9430
Listed user 267
Offline
Posts : 476
 2010-Jul-20, 03:53 PM
"When they say closer to the public estimate, they are commenting only on the direction of the difference. E.g. the model says 10/1, the public says 3/1, and the actual may be 9/1. Or the model says 10/1, the public says 15/1, and the actual may be 11/1.
They're not saying the public estimate is a better or more accurate approximation. Replace the words "closer to" with "toward" and you get the same meaning, and the one they intended."
Wenona, Tables 3 and 4 are all unders and overs, and therefore in their probability groups they are all by default toward the public estimate, so there is no point in them saying toward. If the model says 10/1 and the public says 15/1, the actual may be 5/1 for an individual runner. But as a group, when they said closer, I think that is what they meant, i.e. for their 10/1 probability group, when the public says 15/1 the actual is 13.5/1 and therefore "closer" to the public estimate.
el zoro
Group 2 user 367
Offline
Posts : 4654
 2010-Jul-20, 04:34 PM
These are only theoretical & cannot be proved as close to correct unless you establish what the correct result is. To theorise as to the 'correctness' of one theory over another is merely a conclusion not to be confused with a fact.
jfc
Group 2 user 723
Online
Alias: jfc
Posts : 1201
 2010-Jul-21, 07:04 AM
A die has its performance marked on its faces.
But in horseracing no one has yet managed to properly measure performance on a universal scale. Let alone know the performance distribution of a runner.
However key principles are common to both contests.
If you trial every possible outcome and compute the performance gap between winners and non-winners, then the largest decline will be for favourites. And the gaps will monotonically decrease with favourite rank.
In turn this means an outright favourite will always underperform Harville for running 2nd.
And the situation will improve as favouritism decreases. At some point 2nd performance will overtake Harville.
Benter seems unaware of this.
His claim that "The tendency ... for high probability horses to finish 2nd and 3rd less often" is disturbing and misguided.
The key factor is performance rank not probability. And then it is not merely a tendency but an inescapable fact.
Anyone bothering to work through my simple die example should see why and thus know more about this than Benter appears to.
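The comparison with Harville can also be run directly on the three-dice game (A: 1 2 5, B: 1 2 4, C: 1 2 3). This is a rough Python sketch under stated assumptions: rolls with any tie are discarded, the remaining rolls are treated as equally likely, and the win probabilities fed to Harville are simply the observed win frequencies. The variable and function names are illustrative.

```python
from itertools import product

DICE = {"A": (1, 2, 5), "B": (1, 2, 4), "C": (1, 2, 3)}
names = list(DICE)

# Keep only rolls with three different scores (no dead heats).
rolls = [r for r in product(*DICE.values()) if len(set(r)) == 3]
n = len(rolls)

# Exact win and 2nd-place frequencies over the kept rolls.
win = {
    name: sum(1 for r in rolls if r[i] == max(r)) / n
    for i, name in enumerate(names)
}
actual_second = {
    name: sum(1 for r in rolls if r[i] == sorted(r)[1]) / n
    for i, name in enumerate(names)
}


def harville_second(name):
    """Harville 2nd-place probability built from the win probabilities alone."""
    return sum(
        win[other] * win[name] / (1.0 - win[other])
        for other in names
        if other != name
    )


for name in names:
    print(
        name,
        "win", round(win[name], 3),
        "Harville 2nd", round(harville_second(name), 3),
        "actual 2nd", round(actual_second[name], 3),
    )
```

On this toy game the favourite A comes 2nd well short of its Harville figure, while the outsider C comes 2nd more often than Harville predicts.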
Wenona
VIP Club
Group 2 user 175
Online
Posts : 4254
 2010-Jul-21, 08:09 AM
"Let alone know the performance distribution of a runner."
I agree with this. Everything I've read about weight/class handicapping works on the principle of assigning a single value to a horse's expected performance and then assuming a normal distribution around that value, even if they assign differing standard deviations to various runners in a race. I think one of the best improvements I ever made to my handicapping and market compilation was to let go of that strict assumption of normal distribution.