The fields of machine learning and artificial intelligence are making rapid progress. Many people are starting to ask what a world with intelligent computers will look like. But what is the ratio of hype to real progress? What kinds of problems have been well solved by current machine learning techniques, which are close to being solved, and which remain exceptionally hard?

There is currently no good single place to find the state of the art on well-specified machine learning metrics, let alone to track the many problems in artificial intelligence that are still so hard that no good datasets and benchmarks exist for them yet. So we are trying to make one. Today, we’re launching the EFF AI Progress Measurement experiment, and encouraging machine learning researchers to give us feedback and contribute to the effort.

We have drawn data from a number of sources: blog posts that report on snapshots of progress; websites that try to collate data on specific subfields of machine learning; and review articles. Where those sources didn’t have coverage, we’ve gone to the research literature itself and gathered data.  

We’ve placed this information in a Jupyter/IPython Notebook, which you can read at https://eff.org/ai/metrics. The Notebook is hosted on GitHub, where the community can contribute directly.
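To give a concrete sense of the kind of record the Notebook tracks, here is a minimal Python sketch of how a single benchmark measurement might be represented. The class and field names below are our own invention for illustration, not the Notebook's actual API, and the example result is simply one well-known published figure.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Measurement:
    """One reported result on a benchmark metric (hypothetical schema)."""
    metric: str      # the benchmark being tracked, e.g. "ImageNet top-5 error (%)"
    value: float     # the reported score on that metric
    when: date       # publication date of the result
    source_url: str  # the paper or post reporting it

# Example entry: the ResNet result on ImageNet classification,
# recorded the way progress over time on a metric might be tracked.
resnet = Measurement(
    metric="ImageNet top-5 error (%)",
    value=3.57,
    when=date(2015, 12, 10),
    source_url="https://arxiv.org/abs/1512.03385",
)
print(f"{resnet.when}: {resnet.metric} = {resnet.value}")
```

A collection of such records per metric, plotted over time, is essentially what lets the Notebook show which problems are solved, which are improving quickly, and which have stalled.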

What we have thus far is an experiment, and we’d like to know: Is this information useful to the machine learning community? What important problems, datasets, and results are we missing?

EFF’s interest in AI progress is primarily from a policy perspective. We want to know what types of AI we need to start engaging with on legal, political, and technical safety fronts. Beyond that, we’re also just excited to see how many things computers are learning to do over time.

Given that machine learning tools and AI techniques are increasingly part of our everyday lives, it is critical that journalists, policymakers, and technology users understand the state of the field. When improperly designed or deployed, machine learning methods can violate privacy, threaten safety, and perpetuate inequality and injustice. Stakeholders must be able to anticipate such risks and policy questions before they arise, rather than playing catch-up with the technology. To this end, it’s part of the responsibility of researchers, engineers, and developers in the field to help make information about their life-changing research widely available and understandable. We hope you’ll join us.