How are statistics computed: success rate, penalty rate, median time

  • How are statistics computed: success rate, penalty rate, median time

    Upon completion, there is a box that shows statistics for the completed puzzle. The statistics include:
    Success Rate: 68%
    Penalty Rate: 20.3%
    Median Time: 321 seconds
    Record Time: 71 seconds by babs528
    Your Time: 473 seconds
    But how are these computed?

    Success Rate - Is this the number of attempts that finished divided by the number of times the puzzle was presented?
    Penalty Rate - Is this, for the number of attempts that finished, how many had at least one penalty?
    Median Time - Is this, for the number of attempts that finished, the average or the median? Parts of the website use "median" and "average" as interchangeable terms, but they are different: the median is the point where half the times are below and half are above, while the average is the total time divided by the number of attempts (attempts that finished?).
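
To make the median/average distinction concrete, here is a minimal sketch using Python's standard `statistics` module. The five finishing times are made up for illustration and are not real site data.

```python
from statistics import mean, median

# Hypothetical finishing times in seconds (illustrative only, not site data).
times = [71, 250, 321, 473, 2000]

# Median: the middle value once the times are sorted.
print(median(times))  # 321

# Mean ("average"): total time divided by the number of finishers.
print(mean(times))    # 623
```

One very slow finisher pulls the mean well above the median, which is why the two terms should not be used interchangeably.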

    Thanks.

    Update, 13 June 2020:
    Ok, I think I finally understand.
    Attempts (100%) = SuccessRate (%) + PenaltyRate (%) + Didn'tFinishRate (% not shown)
    where
    Attempts = presentations of a puzzle, regardless of what the user did thereafter;
    SuccessRate = finished the puzzle without any penalties for hints or wrong entries;
    PenaltyRate = finished the puzzle but had at least one penalty for hints or wrong entries; and
    Didn'tFinishRate = did not finish (didn't even start, computer died, abandoned without finishing)

    MedianTime = (sum of durations of puzzle finishings, with 0-n penalties) / (number of puzzle finishings)

    So, say the puzzle for the statistics above was presented 1,000 times to various users (Attempts = 1,000).
    680 users finished the puzzle with 0 penalties.
    203 users finished the puzzle with at least 1 penalty.
    117 users didn't finish the puzzle (117 = 1,000 - 680 - 203)
    The so-called MedianTime = (sum of durations of puzzles for the 680 + 203 users) / (680 + 203)
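
The arithmetic in this example can be sketched as follows. This assumes the interpretation given in this post (success, penalty, and didn't-finish rates sum to 100% of presentations); the function name `breakdown` is hypothetical and nothing here is confirmed by the site.

```python
def breakdown(attempts, success_rate, penalty_rate):
    """Split presentations into (clean finishes, penalized finishes, unfinished).

    Rates are percentages of all presentations, per the interpretation above.
    """
    clean = round(attempts * success_rate / 100)
    penalized = round(attempts * penalty_rate / 100)
    # Whatever is left over is the unshown "didn't finish" group.
    unfinished = attempts - clean - penalized
    return clean, penalized, unfinished


print(breakdown(1000, 68, 20.3))  # (680, 203, 117)
```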

    Thus, a low SuccessRate means a hard puzzle.
    Given a low SuccessRate, a low PenaltyRate means a fiendishly difficult puzzle, because it implies a high Didn'tFinishRate.
    Given a low SuccessRate, a high PenaltyRate means a moderately difficult puzzle, because it implies a low Didn'tFinishRate.

    Fascinating factoid: the few times I've tried, I can finish a puzzle in less than the MedianTime by NOT reading the clues at all, (a) just putting X in a few boxes, (b) then checking if they were correct, (c) revising as needed, (d) repeating steps a-c until done. Of course, the numerous penalties make my adjusted time awful.

    Reply to uigrad (see original post below):

    "The chart doesn't ever show the "slowest", and there is no way to know the slowest."

    That is not necessarily correct. On several puzzles, I have had the slowest time, because of numerous penalties due to not reading clues (see Fascinating factoid, above), or because I saved the puzzle to finish later, not realizing that (a) the saved time counts as part of my time taken to finish, and (b) the maximum save time is only a single-digit number of hours, not adequate if you need to do other stuff for a day. When I have the slowest time, it seems to immediately update as a new slowest-of-all-time on the far right of the chart.

    Maybe someday I will have a new fastest-of-all-time. Sure. I can die happy if, just once, I finish in the fastest category. I've finished in the next-to-fastest category many times, so it is conceivable that I will eventually, once, finish in the fastest category. I need to get a life.
    Last edited by hxxhxx; 06-13-2020, 07:55 PM.

  • #2
    I expect you are correct with what you believe the success rate represents.

    With regard to times, the 'Median' is a type of average: it is the middle result once all the results are sorted, so it is normally one of the actual results. That makes it useful when we have many results, as on a website like this, because a few extreme times barely affect it. What you and parts of the site are referring to as the 'average' is technically the 'Mean' average, which is the total divided by the number of results and usually lies between individual results, making it more useful where you have a smaller range of results to compare or need a very precise average, for scientific purposes for example. I expect you are correct on this again, in that it is just the Median result of all users who successfully completed the puzzle.

    The one I have often wondered about, especially in the books, is the Record Time. I am hoping that it is the fastest time for each user's very first attempt, ideally discarding the result if they restarted at any point; otherwise it is effectively worthless, as people could just be filling them in from rote memory, having already worked out the answers. Considering I have seen some of the Record Times for very difficult puzzles down around the 2-minute mark, it does seem likely some people are just restarting them for fast times.

    • #3
      I've seen that this site says "Median" in some places and "Average" in others. The term "average" is imprecise, but usually means "mean". However, I think that when this website says average, it is actually giving the median (it seems that hxxhxx thinks this also).

      The real question is what data is used to compute the median. I don't think any values are thrown out, even though it seems that slower times are off the chart. The chart doesn't ever show the "slowest", and there is no way to know the slowest. The last point in the chart is just the median multiplied by 5, which isn't terribly useful, but that's what we get. Best times aren't thrown out either, but if they are faster than the minimum for the puzzle (30 seconds, for example, for the 3x4 puzzles), then they are set to the minimum instead. This is conjecture, partially based on my experience with other puzzle sites here at Puzzle Baron.
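
A small sketch of the two rules conjectured above: times below the puzzle's minimum are raised to the minimum, and the chart's last point is five times the median. Both rules are guesses from this post, and the function names are hypothetical.

```python
def clamp_to_minimum(raw_seconds, minimum_seconds=30):
    """Raise a too-fast time to the puzzle's minimum (30 s assumed for 3x4)."""
    return max(raw_seconds, minimum_seconds)


def chart_last_point(median_seconds):
    """Last point on the chart, conjectured to be the median times 5."""
    return median_seconds * 5


print(clamp_to_minimum(12))   # 30
print(clamp_to_minimum(71))   # 71
print(chart_last_point(321))  # 1605
```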

      "it is effectively worthless as people could just be filling them in from rote memory, having already worked out the answers. Considering I have seen some of the Record Times for very difficult puzzles down around the 2-minute mark, it does seem likely some people are just restarting them for fast times."
      Since there are approximately 50,000 puzzles at the site, I don't think it is likely anyone has them memorized.

      Restarting them for fast times doesn't work either. If you spend 10 minutes solving it on paper, then reset it, and solve it in 30 seconds on the site, your time recorded is 630 seconds. All that matters is the time on the server when you started the puzzle the first time and the time on the server when you finished it.
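
As a rough illustration of why a reset doesn't help, assume the recorded time is simply the server's finish time minus the server time of the very first start. The function name and the timestamps below are hypothetical.

```python
from datetime import datetime, timedelta


def recorded_seconds(first_start, finish):
    """Elapsed seconds from the first start to the finish; resets are ignored."""
    return int((finish - first_start).total_seconds())


# Hypothetical timestamps: 10 minutes on paper, then reset, then a 30 s solve.
first_start = datetime(2020, 5, 28, 10, 0, 0)
finish = first_start + timedelta(minutes=10, seconds=30)

print(recorded_seconds(first_start, finish))  # 630
```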

      If there are any times that are "unfair", it's either because people guessed at the answer, or wrote a script to solve the puzzle for them. I don't think it's possible to eliminate either of these possibilities, but I am fairly certain that most people who have set records have done it the same way that I have, which is no guessing, just lots and lots of practice.

      I suppose it is possible that if a single user has solved the same puzzle 4 times, the system could throw out the 3 slowest times for that user for computing the median. I know that if you play the "Circuit" puzzles, the system does keep track of the fastest time for each player (the chart shows if one of your times is in the top 10, but doesn't show if you have 2 or more in the top 10). Only Stephen (I think that's his name) would be able to answer that question.

      One thing that I have realized is that the median (and the success rate and penalty rate) isn't updated immediately after finishing a puzzle. Another user has been posting those stats in the comments after he or she finishes a puzzle, and when I come across the same puzzle, often all 3 stats are still the same. When one stat changes, all 3 change.

      So, I think that the stats are only updated after a certain amount of time has passed, or if a new record is set. I think it works something like this. If the 1000th solve of a puzzle sets a record, then the stats are all updated at that point. Then they aren't updated again until either A) a new record is set or B) the number of solves has increased by more than 10%. This is entirely conjecture, based solely on how often the stats that I see match with that other person who puts them in the comments.
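
The guessed refresh rule can be written down as a tiny sketch. This is only the conjecture from this post, not the site's actual logic, and the name `should_refresh` and its parameters are hypothetical.

```python
def should_refresh(solves_now, solves_at_last_refresh, new_record=False):
    """Conjectured rule: refresh on a new record or on more than 10% new solves."""
    # Integer arithmetic avoids floating-point surprises at exactly 10%.
    grew_over_10_percent = solves_now * 10 > solves_at_last_refresh * 11
    return new_record or grew_over_10_percent


print(should_refresh(1100, 1000))                   # False (exactly 10%)
print(should_refresh(1101, 1000))                   # True
print(should_refresh(1001, 1000, new_record=True))  # True
```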

      Success Rate - Is this the number of attempts that finished divided by the number of times the puzzle was presented?
      Penalty Rate - Is this, for the number of attempts that finished, how many had at least one penalty?
      That's how I interpret those stats, and I appreciate the precise terms you used. I can't think of any way to test this, though.

      • #4
        Ok, I have actually discovered now that setting a record doesn't necessarily update the stats. That must have been a bad assumption.

        My evidence is in this screenshot: https://i.imgur.com/04HwHr7.png

        In fact, it looks like it has been 368 days since mohamm1 recorded the stats for that puzzle, and they still haven't been updated.
        Last edited by uigrad; 05-28-2020, 10:27 AM.

        • #5
          Originally posted by uigrad
          Since there are approximately 50,000 puzzles at the site, I don't think it is likely anyone has them memorized.

          Restarting them for fast times doesn't work either. If you spend 10 minutes solving it on paper, then reset it, and solve it in 30 seconds on the site, your time recorded is 630 seconds. All that matters is the time on the server when you started the puzzle the first time and the time on the server when you finished it.

          If there are any times that are "unfair", it's either because people guessed at the answer, or wrote a script to solve the puzzle for them. I don't think it's possible to eliminate either of these possibilities, but I am fairly certain that most people who have set records have done it the same way that I have, which is no guessing, just lots and lots of practice.
          Thank you, that is interesting to know. I am mostly just working my way through the old books, so it was really conjecture on my part with regard to the online record times.

          • #6
            I think it's quite interesting that "Hints" have less of a penalty than an incorrect guess. Use hints after an incorrect guess and the score drops down to 12 points (on the easy 4x4 grid) no matter how fast your time is. The "cost" of the penalties seems to vary constantly. What's up with that?

            • #7
              Originally posted by hxxhxx

              "The chart doesn't ever show the "slowest", and there is no way to know the slowest."

              That is not necessarily correct. On several puzzles, I have had the slowest time, because of numerous penalties due to not reading clues (see Fascinating factoid, above), or because I saved the puzzle to finish later, not realizing that (a) the saved time counts as part of my time taken to finish, and (b) the maximum save time is only a single-digit number of hours, not adequate if you need to do other stuff for a day. When I have the slowest time, it seems to immediately update as a new slowest-of-all-time on the far right of the chart.
              I'm responding to this now, because I didn't see it until now.

              When your time is slower than the "calculated" slowest time, then it shows yours in the bottom of the graph, and I believe the point between the median and yours is simply the average of the two. For example, if the median is 400, and your time is 20000, then it shows 400 as the median, 10200 as the 4th point, and 20000 as the 5th point (slowest time). But the next time someone else gets the puzzle (or even you), that "record" that you set will be gone.
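
The example numbers work out as in this minimal sketch, assuming the 4th point really is the plain average of the median and your time. That is a belief stated in this post, not confirmed behavior, and the function name is hypothetical.

```python
def slow_chart_points(median_seconds, your_seconds):
    """Return (median, 4th point, 5th point) when your time is off the chart."""
    # 4th point: midpoint between the median and your (slowest) time.
    fourth = (median_seconds + your_seconds) / 2
    return median_seconds, fourth, your_seconds


print(slow_chart_points(400, 20000))  # (400, 10200.0, 20000)
```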
