The METR chart was super interesting Casey! I've been thinking recently that a lot of these measurements of human performance are maybe idealised, right? I.e. it should take me 2 minutes to find a data point, but actually it takes 10 mins (or 10 hours!) because I get distracted.
I guess my mind then goes towards distractions/procrastination/pure abandonment compounding with task complexity. I eventually get round to finding the quote online, but might never get round to writing that screenplay. Maybe a more realistic chart would look like a hockey stick, given the non-linear nature? Wdyt?
Yeah that's interesting! In this case they actually measured how long it took humans, but absolutely, that would be an idealised version given they probably didn't procrastinate mid-measurement! That said, if it were me I'd leave the measurement as is, to make the "AGI bar" slightly higher for each task type. I don't think that hurts.
Stretch targets defo good!