Machine learning, and the related but broader concept of ‘artificial intelligence’, sound magical on first encounter. But look more closely and you will see there is no magic.
Machine learning uses sophisticated algorithms and computing power to detect patterns in large amounts of data. This pattern detection is done in a continual, iterative way that allows the machine to ‘learn’ and make predictions about the future.
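To make this concrete, here is a toy sketch (plain Python, hypothetical data) of what iterative 'learning' means: a model's parameters are repeatedly nudged to reduce prediction error on historical data, until it can predict unseen cases.

```python
# Toy example: iteratively 'learn' the linear relationship y = 2x + 1
# from historical data points (hypothetical values for illustration).
data = [(1, 3), (2, 5), (3, 7), (4, 9)]

w, b = 0.0, 0.0          # model parameters, initially naive
learning_rate = 0.01
for step in range(5000):  # each pass nudges w and b to reduce error
    for x, y in data:
        error = (w * x + b) - y
        w -= learning_rate * error * x
        b -= learning_rate * error

# After training, w and b should be close to 2 and 1, so the
# model can now 'predict' y for an x it has never seen.
print(w * 10 + b)
```

Real applications swap this toy loop for sophisticated algorithms and far larger data sets, but the underlying idea — iterative error reduction on historical examples — is the same.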
Machine learning has the potential to enable more intelligent and impartial public policy. The possibilities are everywhere: bail decisions, immigration decisions and the detection of tax avoidance are just some of the emerging areas of opportunity. Recognising the potential of applying machine learning across government, Carnegie Mellon now offers a joint PhD program in machine learning and public policy.
Unfortunately, the use of machine learning in a public policy context is a high-stakes exercise, and it is not without its challenges.
To mitigate these challenges, we recommend policy-makers apply the following five steps:
Machine learning is not suitable for all types of decisions. The types of decisions that are amenable to machine learning generally require a future prediction. Researchers from Harvard Business School, who studied the application of machine learning to bail decisions, noted subtle differences between bail decisions and sentencing. Bail decisions may be more suitable to machine learning than sentencing decisions. Bail explicitly requires a decision about an individual’s likelihood to re-offend, whereas sentencing “also depends on things like society’s sense of retribution, mercy, and redemption, which cannot be directly measured”.
Machine learning models are 'trained' using what is referred to as 'labelled data': a historical data set from which the model learns to make future predictions. Training data can contain implicit or explicit human bias. For example, in the context of bail decisions, historical data may reflect racial biases by human decision-makers. To overcome this, policy-makers should use their nuanced understanding of the labelled data to highlight potential pitfalls of its use as machine learning training data.
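One simple, practical check is to compare outcome rates across groups in the historical data before it is used for training. The sketch below uses hypothetical bail records; a large gap between groups does not prove bias, but it should prompt exactly the kind of scrutiny described above.

```python
from collections import defaultdict

# Hypothetical historical bail records: (group, was_denied_bail).
# Before using such records as training labels, inspect whether
# outcomes differ systematically across groups -- a gap may reflect
# past human bias rather than genuine differences in risk.
records = [
    ("A", 1), ("A", 1), ("A", 0), ("A", 1),
    ("B", 0), ("B", 1), ("B", 0), ("B", 0),
]

counts = defaultdict(lambda: [0, 0])  # group -> [denials, total]
for group, denied in records:
    counts[group][0] += denied
    counts[group][1] += 1

for group, (denied, total) in sorted(counts.items()):
    print(f"group {group}: bail denial rate {denied / total:.0%}")
```

In practice this check would sit alongside richer fairness diagnostics, but even this basic comparison can surface labelled data that is unsafe to train on as-is.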
The development of a model represents the beginning of the application of machine learning, not the end. Policy-makers must closely monitor outputs to determine whether the machine learning model is working as intended.
This monitoring process should include testing the model with 'unseen' data. A practical way to do this is to withhold a 'test set' of data during the build phase so it can be used later for evaluation. When testing, policy-makers should be clear on the relative importance of false negatives versus false positives. The computer that developed the model, and potentially the project team, will be indifferent to error type. But in reality, the consequences of a false negative (such as releasing someone on bail who then reoffends) may be much more significant than those of a false positive (such as holding someone who would not have reoffended).
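This asymmetry can be made explicit in evaluation. The sketch below counts false negatives and false positives on a held-out test set and applies relative costs; the labels, predictions and cost weights are all hypothetical — setting those weights is a policy judgement, not a technical one.

```python
# Hypothetical held-out test set (1 = reoffended / predicted to reoffend).
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

# Count the two error types separately rather than one overall error rate.
false_negatives = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))
false_positives = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))

# Policy-makers, not the model, must set these relative costs:
COST_FN = 10  # e.g. releasing someone on bail who then reoffends
COST_FP = 1   # e.g. holding someone who would not have reoffended
total_cost = COST_FN * false_negatives + COST_FP * false_positives

print(f"FN: {false_negatives}, FP: {false_positives}, cost: {total_cost}")
```

Two models with identical overall accuracy can have very different weighted costs, which is why error type matters more than a single accuracy figure.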
Once operationalised, policy-makers should create a system for ongoing monitoring. A model may be accurate initially but decline in accuracy over time. Ongoing monitoring will allow the team to adjust the model to reflect changes in the external environment and human behaviour over time.
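Ongoing monitoring can be as simple as tracking accuracy per period and flagging any decline past an agreed threshold. The period labels, accuracy figures and threshold below are hypothetical.

```python
# Sketch of ongoing monitoring: flag periods where model accuracy
# has drifted below an agreed threshold, prompting review/retraining.
THRESHOLD = 0.80
accuracy_by_period = {
    "2023-Q1": 0.88,
    "2023-Q2": 0.86,
    "2023-Q3": 0.81,
    "2023-Q4": 0.74,  # accuracy has declined over time
}

flagged = [p for p, acc in accuracy_by_period.items() if acc < THRESHOLD]
for period in flagged:
    print(f"{period}: accuracy below {THRESHOLD:.0%} -- review the model")
```

The decline itself is the signal: it suggests the external environment or human behaviour has shifted since the model was trained.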
The best model for prediction may not be the easiest to understand or explain. There is often an inverse relationship between the explicability of a machine learning model and its predictive power. A simple linear regression model is easy to understand: a change in a predictor has a proportional impact on the predicted outcome. But real-world relationships are rarely linear, and a more complex 'deep learning' model, built on 'artificial neural networks', may make more accurate predictions that cannot readily be explained. Sophisticated users of machine learning may be willing to accept lower, but still adequate, accuracy in exchange for greater explicability to stakeholders.
Finally, the best decisions are often made through a combination of machine learning and human decision-making. The analogy of ‘freestyle chess’ is directly relevant to the use of machine learning in a policy setting. Freestyle chess is a variant of chess where human players can use computer assistants. Grandmaster Garry Kasparov (famously defeated by Deep Blue) observed that “weak human + machine + better process was superior to a strong computer alone”. The thinking of ‘freestyle chess’ should inform policy-makers using machine learning. The challenge should be to design systems and processes and develop the capability of staff to make better decisions with the aid of machine learning and other forms of artificial intelligence.
Nous’ practice uniquely combines analytics, design and public policy to improve outcomes for citizens. Get in touch to discuss how you can effectively apply machine learning to implement public policy.