Reverse engineering of scoring algorithm

  • Reverse engineering of scoring algorithm

    Since the scoring algorithm has been regarded as a state secret, I took a shot at reverse engineering the results with a simple regression analysis. First, a few words about the data I used. I have done about 350 puzzles and have solved 65% of them (typical of most puzzles). However, I have only collected data from my last 130 puzzles, which had from 9 to 26 clues. I am no 'whiz': my best result on a puzzle is 110% of the average time. My highest score is 480 points and my lowest is 100 points. My average solution time is about 2.5 times the average time.

    When I started fiddling with the results, I noticed that if I calculated the % of points I obtained on a particular puzzle and compared it to the ratio average time/my time, there was a monotonic relationship between the two for puzzles where my time was less than or equal to twice the average time (30 puzzles). I also noticed that the percent solution rate was totally uncorrelated with the number of points awarded. I applied a linear regression analysis to the data and developed the following relationship:

    points awarded = ((0.971 * average time/my time)-0.263)*maximum points on the puzzle.

    The regression showed a correlation coefficient of 0.99 with a standard deviation of 2.8 points. Now what about the puzzles that took longer than twice the average time to finish?
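
    For anyone who wants to repeat this kind of fit on their own results, here is a minimal sketch of an ordinary least squares line fit with its correlation coefficient. The function name `fit_line` and the sample points are illustrative assumptions; the points are generated from the equation above purely as placeholders and are not real puzzle data.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r), where r is the Pearson correlation
    coefficient between xs and ys.
    """
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("xs and ys must each contain some variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Placeholder inputs only: x = average time / my time, y = fraction of
# maximum points, generated from the posted line rather than measured.
ratios = [0.5, 0.6, 0.8, 1.0]
fractions = [0.971 * x - 0.263 for x in ratios]
slope, intercept, r = fit_line(ratios, fractions)
```

    With real results, `ratios` would hold average time/my time per puzzle and `fractions` the points awarded divided by the maximum points on that puzzle.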

    As my lowest score suggests, there seems to be a floor of 100 points when no hints are used (I never use hints). Thus as my time/average time goes above 2.0, the score calculated by the equation above drops below 100, especially for puzzles with fewer than 15 clues. I haven't been able to figure out how the equation changes in that range, but I do notice that if my time/average time goes above 2.25, the awarded scores are all between 100 and 125 with no obvious methodology.
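
    To see where the linear equation falls under 100 points, here is a small sketch that evaluates it directly. The function names are illustrative assumptions, and the code only encodes the fitted line from this post, not the site's actual scoring rule.

```python
def linear_fraction(average_time, my_time):
    """Fraction of maximum points predicted by the fitted line."""
    return 0.971 * (average_time / my_time) - 0.263


def linear_points(average_time, my_time, max_points):
    """Points predicted by the fitted line for one puzzle."""
    return linear_fraction(average_time, my_time) * max_points


def max_points_for_target(target_points, average_time, my_time):
    """Maximum points a puzzle would need for the line to predict target_points.

    Returns None when the line predicts a fraction of zero or less.
    """
    fraction = linear_fraction(average_time, my_time)
    if fraction <= 0:
        return None
    return target_points / fraction


# At exactly twice the average time the line predicts about 0.22 of the
# maximum, so the estimate is under 100 unless the puzzle is worth
# roughly 450 points or more.
fraction_at_double = linear_fraction(100.0, 200.0)
needed_for_100 = max_points_for_target(100, 100.0, 200.0)
```

    The times passed in can be in any unit, since only their ratio matters.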

    Since my data is limited to solution times that are greater than the average time, I would be interested in having people who have solved puzzles in faster times PM me their results: points, times and number of clues. I'm sure that the equation doesn't hold for very, very fast solutions, as if the solution time is about 1/3.5 times the average, maximum points would be awarded.
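
    As a rough check on how far the linear equation can be pushed toward fast solves, the sketch below inverts it to find the ratio at which it would predict full points. The names are illustrative assumptions, and this is just arithmetic on the fitted coefficients.

```python
SLOPE = 0.971
INTERCEPT = -0.263


def fraction_at(ratio):
    """Predicted fraction of maximum points at ratio = average time / my time."""
    return SLOPE * ratio + INTERCEPT


def ratio_for_fraction(fraction):
    """Ratio (average time / my time) at which the line predicts `fraction`."""
    return (fraction - INTERCEPT) / SLOPE


# The line reaches 1.0 (full points) at a ratio of about 1.30 and keeps
# climbing past that, so it cannot be valid for much faster solves.
full_points_ratio = ratio_for_fraction(1.0)
fraction_at_3_5 = fraction_at(3.5)
```

    Taken at face value, the line already predicts more than the maximum well before a solve 3.5 times faster than average, which fits the expectation that it breaks down for very fast solutions.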

    Maybe the administrator will pipe in and let me know if I am on the right track.


  • #2
    Originally posted by duhmel
    Maybe the administrator will pipe in and let me know if I am on the right track.
    You're warm. :-)



    • #3
      I have now solved over 400 puzzles and have been able to deduce a very accurate algorithm to estimate the number of points awarded per puzzle, with a few caveats. While my first attempt to derive the formulas used a linear regression analysis, as I obtained more results it became apparent that a quadratic regression model fit the data more accurately. The algorithm is broken into three parts: 1) puzzles solved in less than the average time, 2) those solved between the average time and 2.2 times the average time and 3) puzzles solved in more than 2.2 times the average time. The models for 1) and 2) have a correlation coefficient of 0.99 (for you probability junkies), and the estimated number of points differs from the actual points with a standard deviation of about 5 points. It is important to note that for items 1) and 2), the derivation of points is ONLY dependent on the ratio of my time to the average time. Neither the puzzle's solution percentage nor the solve time on its own enters into the determination of points.

      1) For puzzles solved in less time than the average but more than 1/3 of the average (the fastest time in which I have solved a puzzle)

      points awarded = (-0.0835*x*x + 0.451*x + 0.240) * maximum points on the puzzle

      where x = average time/my time. For a puzzle solved in the average time, the award is 0.61 times the maximum points for the puzzle.

      2) For puzzles solved in a time from the average up to 2.2 times the average

      points awarded = (-1.450*x*x + 2.914*x - 0.896) * maximum points on the puzzle

      where x = average time/my time. For a puzzle solved in the average time, the award is 0.57 times the maximum points for the puzzle. The model is slightly off at the crossover point of my time = average time.

      3) It became apparent that these equations do not hold when the solution time is greater than 2.2 times the average. For these cases the points awarded range between 100 and 132, and I see no consistency in the way points are awarded in this region. Obviously solving a puzzle without 'hints' gets you at least 100 points and up to 132 points. But how our friendly Administrator derives the award for solution times greater than 2.2 times the average is mysterious. It is clear to me that it is not related to solution time, (solution time/average time), percent of people solving the puzzle or number of clues.
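
      The three cases above can be sketched as one function. This is only an illustration of the fitted formulas, taking x = average time/my time as corrected later in the thread (post #5); the function name and the choice to return None for case 3 are assumptions, not the site's actual code.

```python
def estimate_fraction(average_time, my_time):
    """Estimated fraction of maximum points, or None where no model exists.

    x is average time / my time. The fast-solve formula was fitted only
    on solves slower than about 1/3 of the average, so values of x above
    3 are extrapolation.
    """
    if average_time <= 0 or my_time <= 0:
        raise ValueError("times must be positive")
    x = average_time / my_time
    if my_time > 2.2 * average_time:
        # Case 3: awards of 100 to 132 were observed with no clear pattern.
        return None
    if my_time <= average_time:
        # Case 1: solved in the average time or faster. The two fits
        # disagree slightly at x = 1 (0.61 here versus 0.57 below).
        return -0.0835 * x * x + 0.451 * x + 0.240
    # Case 2: slower than average, up to 2.2 times the average.
    return -1.450 * x * x + 2.914 * x - 0.896


def estimate_points(average_time, my_time, max_points):
    """Estimated points for one puzzle, or None in the unmodelled region."""
    fraction = estimate_fraction(average_time, my_time)
    return None if fraction is None else fraction * max_points
```

      For example, `estimate_fraction(100, 200)` evaluates the second formula at x = 0.5, and `estimate_fraction(100, 250)` returns None because the solve took more than 2.2 times the average.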

      I welcome any comments or data on very fast solution times.
      Last edited by duhmel; 11-24-2021, 01:52 AM.



      • #4
        Maybe I'm missing something, but it looks like formula #2 is zero at about x = 1.63, which would mean negative points for 1.63<x<2.2.



        • #5
          Originally posted by M Schereau
          Maybe I'm missing something, but it looks like formula #2 is zero at about x = 1.63, which would mean negative points for 1.63<x<2.2.
          Good catch - there was a typo in the formula. The x-quantity should be (average time/my time). Now a solve at 1.63 times the average time would give a score of about 0.36 x available points. I have corrected the formula in my original post.
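
          Both observations can be checked with a few lines. The sketch below finds the roots of formula #2 and then evaluates it with x = average time/my time; the helper names are illustrative assumptions.

```python
from math import sqrt

# Coefficients of formula #2: fraction = A*x*x + B*x + C
A, B, C = -1.450, 2.914, -0.896


def formula2(x):
    """Formula #2 evaluated at x."""
    return A * x * x + B * x + C


def real_roots(a, b, c):
    """Real roots of a*x*x + b*x + c = 0 in ascending order (a != 0)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return ()
    root = sqrt(disc)
    return tuple(sorted(((-b - root) / (2 * a), (-b + root) / (2 * a))))


# Roots are near 0.38 and 1.63, so reading x as my time / average time
# does give negative values for 1.63 < x < 2.2.
roots = real_roots(A, B, C)

# With x = average time / my time, a solve at 1.63 times the average
# means x = 1 / 1.63, where the formula gives roughly 0.35.
fraction_at_1_63 = formula2(1 / 1.63)

# The slowest modelled solve (2.2 times the average) has x = 1 / 2.2,
# which is still above the lower root, so the estimate stays positive.
fraction_at_2_2 = formula2(1 / 2.2)
```

          The value at 1.63 times the average comes out close to the figure quoted above.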
