Jonathan Stray
Jonathan is a research fellow at the Partnership on AI, where he works on the design of recommender systems for pro-social outcomes. He previously taught the dual masters degree in computer science and journalism at Columbia University and built document mining software for investigative journalism. He has worked as an editor at the Associated Press and a research scientist at Adobe Systems, and holds an MSc in Computer Science from the University of Toronto and an MA in Journalism from the University of Hong Kong.
Recommender systems drive YouTube, Facebook, Twitter, Amazon, Netflix, LinkedIn and their global equivalents. These are our new machine gatekeepers, and they choose the posts we see, the news we read, and the products suggested for purchase. Right now they are all driven by behavioral metrics: what we click on, share, like, and comment on, also known as “engagement.”
Engagement is easy to measure, and it correlates with both revenue and value for the users: a system that no one wants to use isn’t helping anybody. The catch is that engagement is a very incomplete measure. It’s very much like GDP in this regard: it’s easy to measure, but misses so much about what it means to have a flourishing society.
The simplest alternative is to ask people how these systems are affecting their well-being. This is exactly the work we are undertaking at PAI in 2021, developing concrete methods in collaboration with our industry, civil society, and academic partners.
I want to make three changes:
1) Adopt well-being metrics to measure human outcomes in specific domains. Consider, for example, measuring the well-being effects of a news recommendation system, like Google News. How do we think about news diet in terms of well-being? Some existing frameworks have civic engagement measures like voter turnout, but we need more granular outcomes to really drive these systems. For example, I am studying measures of diversity and polarization. For an online shopping system such as Amazon, by contrast, we will want other measures, for example the carbon footprint of products sold.
2) Use this information to drive both managerial and algorithmic decisions. I’d like to see well-being measures adopted as KPIs, the key performance indicators that are often used in management. So product managers might look to well-being measures when testing product changes. But there are also emerging methods that use real-time well-being data, for example from ongoing user surveys, to drive algorithms directly.
3) Understand that these metrics have to change. No metric can be static. You’ve probably heard of Goodhart’s law, the idea that any measure that becomes a target changes its meaning. Moreover, the world changes. One of the things we saw with COVID-19 was that all sorts of machine learning models broke as people changed their behavior: where they went, what they did. So metrics will never be set-and-forget. We have to constantly re-evaluate them.
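To make the second change more concrete, here is a minimal sketch of one way survey data could feed a ranking algorithm: blend a predicted engagement score with a survey-derived well-being estimate and re-rank on the result. This is an illustration only, not a description of how any real platform works. The `Candidate` class, the `engagement` and `wellbeing` fields, the `wellbeing_weight` parameter, and the linear blend are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """A hypothetical item the recommender could show."""
    item_id: str
    engagement: float  # assumed: predicted engagement, scaled to [0, 1]
    wellbeing: float   # assumed: well-being estimate from user surveys, in [0, 1]


def blended_score(c: Candidate, wellbeing_weight: float) -> float:
    """Linear blend of the two signals (an assumed, deliberately simple choice)."""
    if not 0.0 <= wellbeing_weight <= 1.0:
        raise ValueError("wellbeing_weight must be between 0 and 1")
    return (1.0 - wellbeing_weight) * c.engagement + wellbeing_weight * c.wellbeing


def rerank(candidates: List[Candidate], wellbeing_weight: float = 0.5) -> List[Candidate]:
    """Return candidates ordered from highest to lowest blended score."""
    return sorted(
        candidates,
        key=lambda c: blended_score(c, wellbeing_weight),
        reverse=True,
    )


if __name__ == "__main__":
    items = [
        Candidate("a", engagement=0.9, wellbeing=0.2),
        Candidate("b", engagement=0.6, wellbeing=0.8),
    ]
    # With weight 0 the ranking is pure engagement; raising the weight
    # lets the survey signal change the order.
    print([c.item_id for c in rerank(items, wellbeing_weight=0.0)])
    print([c.item_id for c in rerank(items, wellbeing_weight=0.5)])
```

In keeping with the third change, the weight and the survey signal in a sketch like this would themselves need regular review rather than being fixed once.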
So:
- Adopt well-being metrics to measure human outcomes in specific domains.
- Use this information to drive both managerial and algorithmic decisions.
- Understand that these metrics have to change.