CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 1 - Posted on 3/20/2019 5:57:32

First of all, I'm enjoying the site; there's a lot of useful stuff that's working pretty well here.

I was wondering if we could have the option of using a geometric mean for games that don't have a lot of completion times and may have outliers. For example, Okami HD currently has an average time (arithmetic mean) of about 2714 hours to complete, largely due to a single completion time of 16000 hours. The geometric mean, however, is around 144.3 hours, which feels more realistic even for a game of Okami's length (excluding the outlier, the arithmetic mean is 56.9 hours). One site even went so far as to say that the geometric mean is the "most accurate measure of the middle task time for sample sizes less than 25." [1] Its main weakness would be an inability to work with negative numbers and a huge disadvantage working with zeros (a single zero makes the whole mean zero), but I don't think that would be a problem here.

Also, even though graphs are great, would it be possible to have more statistics or at least be able to export them to a spreadsheet (along with rating, difficulty, completions, etc.)? It may have the potential to provide some interesting insights. If you're interested, Rosetta Code has some code samples for computing means [2].
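
To make the comparison concrete, here is a minimal Python sketch of both means. It is not taken from the Rosetta Code samples, and the six sample times are made up for illustration (they are not the real Okami HD entries). The helper name `geometric_mean` is likewise just an illustrative choice, and it rejects zero or negative values instead of returning zero.

```python
import math
import statistics

# Hypothetical completion times in hours (NOT the real Okami HD entries):
# five ordinary playthroughs and one extreme outlier.
times = [40.0, 50.0, 60.0, 65.0, 70.0, 16000.0]


def geometric_mean(values):
    """Geometric mean computed through logarithms.

    Working in log space avoids overflow from multiplying many large
    numbers. Every value must be strictly positive.
    """
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


arithmetic = statistics.mean(times)
geometric = geometric_mean(times)

print(f"arithmetic mean: {arithmetic:.1f} h")
print(f"geometric mean:  {geometric:.1f} h")
```

With this sample the arithmetic mean lands in the thousands of hours while the geometric mean stays in the low hundreds, which is the same pattern as the figures quoted above.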

As for bugs, clicking on someone's name in the Mercenary Guild takes me to my own collection page instead of theirs. Also, unlike the other bars on a game page, the difficulty bar shows each difficulty as a percentage of its own count rather than of all votes, so every individual difficulty comes out as 100%.
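
For the difficulty bar, the expected behaviour is presumably that each difficulty is divided by the total across all difficulties. A small Python sketch of that calculation follows; the function name, the difficulty labels and the vote counts are all hypothetical, and this is not the site's actual code.

```python
def difficulty_shares(counts):
    """Return each difficulty's share of ALL votes, as a percentage."""
    total = sum(counts.values())
    if total == 0:
        # No votes yet: show empty bars instead of dividing by zero.
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical vote counts for one game.
votes = {"Easy": 2, "Medium": 5, "Hard": 3}
shares = difficulty_shares(votes)
print(shares)
```

Dividing each count by itself instead of by `total` would give 100% for every non-empty difficulty, which matches the behaviour described above.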

Thanks and sorry for the wall of text!

dhobo Curator Backer Posts: 1965 Registered: 1/5/2015
darwinsocialism
# 2 - Posted on 3/20/2019 16:58:35

Hrm, it used to be that curators could go in and mark any "erroneous" playthrough times as invalid so they wouldn't count towards site statistics, but I can't seem to find that anymore. Either it got lost in the shuffle to the new site layout, or the functionality was removed since it was rarely needed and I simply don't remember.

Not as elegant a solution as using the geometric mean, but it would have at least fixed this particular instance until moho_00 was able to address it personally.

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 3 - Posted on 3/21/2019 3:50:21

Sounds pretty cool. I guess if that strategy works long term, there's also the option of using a trimmed mean to automatically prune the outliers or certain percentiles from the top and bottom. About removing times, I guess it's probably up to a curator's discretion if the times are legitimate, right? I mean, who's to say that a particular time (for example, the 16000 hours or about 1.8 years) is actually fake? It's really unlikely, yeah, but it's still possible. But yeah, whatever you all decide will hopefully be good. Thanks for replying back, too. I've been wanting to ask about this whole thing for a while.

moho_00 Curator Backer Posts: 6846 Registered: 6/10/2011
moho_00
# 4 - Posted on 3/21/2019 11:42:05

Currently, all completion times are considered "valid" as long as they aren't 00:00:00. And by "valid", I mean they will show up on the public Game Details page and will be included in aggregated statistics (e.g. average completion time). As @dhobo mentioned, I added the ability for curators to basically mark completion times as invalid in a more sophisticated manner, but this was never applied throughout the site due to the sheer amount of maintenance it required. It's something I would like to implement, but there are a lot of completion times and it's very difficult to identify "invalid" ones unless you happen to stumble upon one. I think this is probably the best solution since it would keep things as "clean" as possible when viewing the completion times and their averages, but perhaps having a different way of performing aggregation would make more sense so we don't have to monitor the completion queue.
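
The rule described above (skip 00:00:00 times, and skip anything a curator has marked invalid) could be sketched in Python roughly as follows. The `Completion` class, the `marked_invalid` flag and the function names are assumptions made for illustration; they do not reflect the site's real data model.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass
class Completion:
    duration: timedelta
    marked_invalid: bool = False  # hypothetical curator flag


def is_valid(completion: Completion) -> bool:
    """A time counts if it is not 00:00:00 and no curator flagged it."""
    return completion.duration > timedelta(0) and not completion.marked_invalid


def average_hours(completions: List[Completion]) -> Optional[float]:
    """Average of the valid completion times in hours, or None if none qualify."""
    valid = [c for c in completions if is_valid(c)]
    if not valid:
        return None
    total_seconds = sum(c.duration.total_seconds() for c in valid)
    return total_seconds / len(valid) / 3600


# Hypothetical entries: two normal times, one empty time, one flagged outlier.
entries = [
    Completion(timedelta(hours=50)),
    Completion(timedelta(0)),
    Completion(timedelta(hours=16000), marked_invalid=True),
    Completion(timedelta(hours=60)),
]
print(average_hours(entries))
```

Only the 50-hour and 60-hour entries contribute to the average here; the empty time and the flagged outlier are ignored.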

As a side-note, I've long wanted to include 00:00:00 times on the Game Details page, just in a separate section so they wouldn't be impacting statistics.

I've added the bugs you reported to my list, though, and those should be included in the next release (probably 1-2 weeks or so).

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 5 - Posted on 3/22/2019 2:55:13

Alright, sounds good.

Marcelloz Curator Backer Posts: 277 Registered: 9/14/2014
Marcelloz071
# 6 - Posted on 3/23/2019 1:35:34

(I have seen this several times and actually invalidated them, but that didn't work. I always wanted to ask why, but if it was not actually implemented, that explains a lot.)
Geometric mean looks like a good alternative and doesn't look hard to implement. You could also do something like ditch the 10% lowest and 10% highest values to get rid of these artifacts.
And also put a max on the hours entered. 999 should be impressive enough ;)

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 7 - Posted on 3/23/2019 8:48:11

I looked for a code example for the truncated/trimmed mean, which removes the desired percentages from the highest and lowest numbers in the group before calculating a mean, but I couldn't find any. The truncated/trimmed mean is described in more detail here: [1]. I did find a JavaScript example of the interquartile mean, an average related to the trimmed mean, though. The interquartile mean gets rid of the lowest 25% and the highest 25% of the numbers and then calculates the mean. The code example is here: [2]. It kind of looks like the code could be modified to work with the top and bottom 10% of the data as well (or whatever other percentages you choose).
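
As a rough illustration of the idea (not the linked JavaScript code), a trimmed mean with a configurable fraction can be sketched in Python like this. The function name and sample data are hypothetical. Note that this simple version drops a whole number of entries from each end, so with `fraction=0.25` it only approximates the interquartile mean, which uses fractional weights when the number of entries is not divisible by four.

```python
def trimmed_mean(values, fraction=0.10):
    """Mean after dropping int(n * fraction) entries from each end.

    `fraction` is the share removed from EACH side, so 0.10 removes the
    lowest 10% and the highest 10%, and 0.25 keeps the middle half.
    """
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    k = int(len(ordered) * fraction)  # entries to drop from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Hypothetical completion times in hours, with one extreme outlier.
times = [40, 50, 60, 65, 70, 16000]

print(trimmed_mean(times, 0.10))  # 10% of 6 entries rounds down to 0: nothing is trimmed
print(trimmed_mean(times, 0.25))  # drops one entry from each end
```

One thing this makes visible: for games with only a handful of completion times, a 10% trim removes nothing at all, so the outlier survives unless a larger fraction is used.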

Thanks again!

Post Edited on 3/23/2019 16:01:59
dhobo Curator Backer Posts: 1965 Registered: 1/5/2015
darwinsocialism
# 8 - Posted on 3/23/2019 16:35:44

@Marcelloz

Putting a max on hours doesn't fit terribly well with certain kinds of games, such as MMOs, clicker/idle games, etc.
While some of these aren't technically able to be "beaten" in the traditional sense, I can think of at least one clicker/idle game on Steam (Realm Grinder) that requires at least a thousand hours for the average player to earn every in-game achievement... and it's still adding content.

I can actually think of one more game that would likely be legitimately excessively long at this point. White Knight Chronicles would take at minimum hundreds, possibly a thousand+ hours to platinum since the online servers are kaput. Makes the grind so very, very brutal.

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 9 - Posted on 3/27/2019 13:25:08

Sorry for bumping again, but can we also get a "seconds" field for the playthrough time entries? I'm using a stopwatch to time my play sessions and it'd be slightly easier (and more accurate) to put the whole time as-is without having to round. It would also work more seamlessly with some of the other time entry sections, such as completion time, because the actual completion time could just be taken from the sum of the playtimes (which is partially done already) and input directly into the completion time fields, seconds and all. Thanks!
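
For what it's worth, summing play sessions down to the second and formatting the result as a completion time is straightforward. Here is a small Python sketch; the function names and the three session lengths are made up, and the `HH:MM:SS` output format is only an assumption about how the fields might be displayed.

```python
from datetime import timedelta


def total_playtime(sessions):
    """Add up individual play sessions without rounding any of them."""
    return sum(sessions, timedelta(0))


def format_hms(duration):
    """Format a duration as HH:MM:SS; hours may exceed two digits."""
    total_seconds = int(duration.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Hypothetical stopwatch readings for three sessions.
sessions = [
    timedelta(hours=1, minutes=12, seconds=47),
    timedelta(minutes=58, seconds=30),
    timedelta(hours=2, minutes=3, seconds=5),
]

print(format_hms(total_playtime(sessions)))  # 04:14:22
```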

Marcelloz Curator Backer Posts: 277 Registered: 9/14/2014
Marcelloz071
# 10 - Posted on 3/27/2019 16:11:05

@dhobo

Yeah, I played Clicker Heroes myself and have 300+ (idle) hours on that game, and I have completed 20% or so. A max is indeed not fitting for some games.
Perhaps a list of suspicious entries would be best, so we can flag them and they will be excluded. For example, all entries above 100 hours or ones that deviate too much from the global average. I guess it's up to moho to decide :)
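
A review list like that could be built along these lines. This Python sketch only flags entries for a curator to look at; it does not exclude anything by itself. It also swaps in the median and the median absolute deviation (MAD) for the "global average", because a plain average is itself dragged upward by the outliers it is supposed to catch. The function name, the optional hour cap and the factor of 5 are arbitrary illustrative choices.

```python
import statistics


def flag_suspicious(hours, max_hours=None, mad_factor=5.0):
    """Return the indexes of entries a curator may want to review.

    An entry is flagged when it exceeds `max_hours` (if a cap is given)
    or lies more than `mad_factor` MADs away from the median.
    """
    if not hours:
        return []
    median = statistics.median(hours)
    mad = statistics.median([abs(h - median) for h in hours])
    flagged = []
    for index, value in enumerate(hours):
        over_cap = max_hours is not None and value > max_hours
        # If all entries are nearly identical the MAD is 0; skip that test.
        far_off = mad > 0 and abs(value - median) > mad_factor * mad
        if over_cap or far_off:
            flagged.append(index)
    return flagged


# Hypothetical completion times in hours.
times = [40, 50, 60, 65, 70, 16000]
print(flag_suspicious(times))                 # [5]
print(flag_suspicious(times, max_hours=100))  # [5]
```

Because legitimately long games exist, leaving the final decision to a curator seems safer than excluding flagged entries automatically.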

moho_00 Curator Backer Posts: 6846 Registered: 6/10/2011
moho_00
# 11 - Posted on 3/29/2019 17:14:09

@CartoonFan - I updated the list of items in this thread to include seconds on playthroughs. It's actually been on my list for a while...somewhere.

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 12 - Posted on 3/29/2019 21:55:42

Alright, thanks!

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 13 - Posted on 3/30/2019 19:39:47

Awesome job, moho! I just noticed the big update went live; I'm going to see if things are working.

Post Edited on 3/30/2019 19:40:55
moho_00 Curator Backer Posts: 6846 Registered: 6/10/2011
moho_00
# 14 - Posted on 3/30/2019 19:43:09

I only did a couple of things for this release, but more to come in the future!

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 15 - Posted on 3/30/2019 19:58:49

Alright, good to hear. So far, the bugs seem to be fixed and the playthrough entries have seconds, so, great work! I'm awaiting more in the future!

Post Edited on 3/30/2019 19:59:05
CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 16 - Posted on 6/24/2019 18:47:46

Sorry for bumping an old topic, but I had a quick question. The 16000 hours playthrough time has been removed from Okami HD's details page, but the average time on the collection page is still 2714 hours, while on the details page, there are averages of 53.5 and 70.5 hours. Is this...a bug?

moho_00 Curator Backer Posts: 6846 Registered: 6/10/2011
moho_00
# 17 - Posted on 6/25/2019 1:20:47

Ah, good catch, I must've missed that one! I'll be sure to squish that bug in the next release.

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 18 - Posted on 6/25/2019 3:57:54

OK, thanks. I'm glad to hear that it wasn't just me somehow.

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 19 - Posted on 8/2/2019 4:49:13

Hi! Sorry to bother you again, but it looks like it still says 2,714.0 hours for Okami HD on the Collection page on my end. Can you check to see if it's the same for you? Thanks!

Post Edited on 8/2/2019 4:49:50
moho_00 Curator Backer Posts: 6846 Registered: 6/10/2011
moho_00
# 20 - Posted on 8/2/2019 12:31:59

Should be fixed now

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 21 - Posted on 8/2/2019 22:10:36

Awesome, thanks a lot!

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 22 - Posted on 10/5/2019 17:35:01

Hey there, it's me again, reviving a dead topic. Some time ago, I noticed that some of the average community times were off in the collection view. Examples include Counter-Strike (avg. 499h, max. 395h), Saints Row 2 (avg. 420h, max. 73h), and Europa Universalis IV (avg. 349h, max. 276h); in each case the average is higher than the maximum, which shouldn't be possible. I didn't post right away because I didn't want to seem bothersome by bumping a somewhat old topic (plus I was busy too). When I was working on the data today, I noticed a trend with longer community playtimes being associated with weirdly large means, so I decided to check out the database view to find some really long games. I was surprised to find that the database view seems to be (correctly) averaging from the current playtimes as opposed to what the collection view is doing. Is there a way to use the database averages in the collection view? I mean, it seems like it would be a win-win situation since it doesn't seem like the calculations need to be updated manually to get the correct results. Thanks again and I hope whatever you decide to do goes well!

UPDATE: So, I checked Super Smash Bros. Melee after I posted this, and it seems that the calculations are a little...off. The average I got was 338.22 hours, and the database view has 338.0 hours. It's not a huge difference, but it's there. On an unrelated note, someone apparently has the game on their PC-FX...a pretty amazing feat, to be sure.

Post Edited on 10/5/2019 17:49:47
moho_00 Curator Backer Posts: 6846 Registered: 6/10/2011
moho_00
# 23 - Posted on 10/5/2019 21:17:51

It looks like when I made some of these changes a while back, I forgot to update something and that was causing things to be out of sync in some scenarios. It should be fixed now though. There might still be some slight variations such as your Super Smash Bros. Melee example, but it looks like it's due to rounding differences that I need to look into. As a side-note, I did clean up some of the SSBM entries so the average time will be a bit less now.
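
As one possible illustration of how rounding alone can produce small mismatches like this, the Python sketch below averages the same playtimes two ways: from the exact minutes, and from values already rounded to whole hours. This is only a guess at the kind of difference involved, not a description of how the site actually computes its averages, and the sample playtimes are made up.

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical playtimes in minutes.
minutes = [100, 100, 140]

# Average the exact durations, then convert to hours.
exact_hours = mean(minutes) / 60

# Round each entry to whole hours first, then average.
rounded_first = mean([round(m / 60) for m in minutes])

print(f"average of exact times:   {exact_hours:.2f} h")
print(f"average of rounded times: {rounded_first:.2f} h")
```

The two results differ slightly even though they start from the same entries, so two views that round at different stages would not agree to the decimal.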

CartoonFan Curator Posts: 33 Registered: 8/25/2016
CartoonFan555
# 24 - Posted on 10/5/2019 21:35:28

Whoa, thanks for the fast reply. I'll keep on the lookout for other stuff, but I'm glad you found it quickly. I'll probably still feel like "This probably deserves a separate topic", but I'll probably just end up posting something here again if I find something else. Thanks for your hard work!