Building models to trust their gut

Dr Samiul Ansari
Chief Data Officer, Nudg

By manually constructing predictive models from granular components, and training them on derived data, we are able to give them a ‘gut instinct’ we can trust

As humans we justify our decisions through reasoning. Reasoning gives us confidence in our decisions, even when it is flawed because it relies on preconceived intuitions, hypothetical assumptions or unexamined axioms and anecdotes. Despite such flaws, if the reasoning is convincing enough, the decisions based on it are generally accepted – commonly known as ‘trusting your gut’. While this can be useful, the essence of ‘trusting your gut’ lies in establishing relationships between the underlying data points, the individual circumstances and a historical understanding of the scenario. Together, these factors lead to a much higher degree of confidence in the resulting decision.

When dealing with predictive tools in the child healthcare space, it is imperative to establish logical reasoning for known data patterns that reflect well-understood condition–symptom relationships. However, the only quantifiable way to validate a model’s predictions is to compare them with the historical data the model was built on, and with similar data patterns that have known outcomes. This reduces any predictive model to a tool with no human reasoning: a great piece of technology with no ‘gut’. Such limitations pose challenges to the model’s interpretability or explicability.


Currently there is no industry-agreed definition for model interpretability or what it encompasses. However, we can characterise interpretability through the following:

  1. Algorithm clarity: This relates to a comprehensive understanding of the internal workings of the algorithm, including the computations it executes to process and optimise inputs, how it applies mathematical intuitions to the inputs, how it maps processed inputs to a prediction, and how it ensures prediction accuracy.
  2. Model deconstructability: This is the ability to break the model apart into granular components, each of which can explain a part of the prediction. This is crucial to checking that the right level of weighting is applied to the inputs.
  3. Model synthesizability: This is the ability for a human to walk through the input data and manually apply the same computations to derive the model-predicted output. The ability to audit the algorithm to this level ensures that its fixed and random effects can all be explained.
  4. Post-hoc explainability: This helps to reinforce the prediction reasoning beyond the components described above. While this step does not rely on the internal workings of the algorithm, it provides additional supporting information for the reasoning, usually by employing other utility models.
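
Deconstructability and synthesizability can be illustrated with a deliberately small additive model. The sketch below is not drawn from any real clinical model: the feature names, weights and intercept are all invented for illustration. It shows a prediction being split into one contribution per input, and the same result being reproduced by hand.

```python
# Hypothetical weights for an illustrative additive (linear) model.
# Feature names and values are invented, not taken from any real model.
WEIGHTS = {"temperature_c": 0.5, "cough_days": 0.25, "age_years": -0.125}
INTERCEPT = -18.0


def contributions(inputs):
    """Deconstruct a prediction into one additive term per input."""
    return {name: WEIGHTS[name] * value for name, value in inputs.items()}


def predict(inputs):
    """Score = intercept + sum of the per-input contributions."""
    return INTERCEPT + sum(contributions(inputs).values())


example = {"temperature_c": 38.0, "cough_days": 4.0, "age_years": 8.0}
parts = contributions(example)
score = predict(example)

# Synthesizability: a person can redo the same arithmetic by hand.
manual = -18.0 + 0.5 * 38.0 + 0.25 * 4.0 + (-0.125) * 8.0

for name, part in parts.items():
    print(f"{name}: {part:+.3f}")
print(f"model score: {score:.3f}, manual score: {manual:.3f}")
```

Because every term is visible, a reviewer can check whether each input carries a sensible weight, which is far harder to do with an opaque model.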

How a model is built will have a considerable impact on its interpretability. A model’s degree of computational complexity will impact its algorithm clarity and synthesizability, while a model’s architecture will impact its deconstructability. Opaque algorithms with a low level of interpretability are categorised as black-box models, while more transparent and interpretable algorithms are considered white-box models.

Factoring in human reasoning

Two of the more famous and impressive instances of artificial intelligence are Deep Blue and AlphaGo. These are examples of algorithmic, predictive models that perform a type of ‘reasoning’ in a very narrow area, in each case a single strategy game: chess for Deep Blue and Go for AlphaGo.

The AlphaGo model was designed by Google’s DeepMind to play Go. It proved its superhuman ability in 2016 when it beat 18-time world champion Lee Se-dol, the first time a computer had defeated a Go player of the highest calibre. Go’s simple rules are deceptive – the ancient Chinese game is many times more complex than chess, with the number of possible positions calculated at vastly more than the number of atoms in the observable universe.

To defeat a Go grandmaster, AlphaGo needed to be one of the most sophisticated reasoning models ever developed. But despite this, AlphaGo is unable to beat even an amateur at chess. In fact, it cannot move a single piece on a chessboard, because it was built and trained solely for Go and has no model of chess from which to select moves.

We are still some way from machines that are capable of ‘general reasoning’, such that they can build on and optimise their existing knowledge to solve completely new problems. Without reams of structured input data, there is no pattern for the models to train on, and learn from, in order to make predictions.

How do we tackle this conundrum? How do we use what we already have to answer completely new questions? If we restrict ourselves to the resources we already have, the answer is simple: we algebraically manipulate existing data and knowledge to create new data points, which help form the structured data needed to answer the new questions.
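
As a rough illustration of deriving new data points from existing ones, the sketch below computes two derived fields from a single record. The field names and the choice of derived quantities (a body mass index and a day-on-day temperature change) are assumptions made for this example only; they are not taken from any actual platform.

```python
def derive_features(record):
    """Return a copy of the record with algebraically derived fields added.

    All field names here are hypothetical and chosen for illustration.
    """
    derived = dict(record)  # leave the original record untouched
    height_m = record["height_cm"] / 100.0
    # Derived from two existing measurements: weight / height squared.
    derived["bmi"] = record["weight_kg"] / (height_m ** 2)
    # Derived from two readings of the same measurement over time.
    derived["temp_change_c"] = (
        record["temp_today_c"] - record["temp_yesterday_c"]
    )
    return derived


record = {
    "height_cm": 100.0,
    "weight_kg": 16.0,
    "temp_today_c": 38.5,
    "temp_yesterday_c": 37.0,
}
enriched = derive_features(record)
print(enriched["bmi"], enriched["temp_change_c"])
```

No new data has been acquired here; the extra fields are computed entirely from values the record already held.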

At LovedBy, we evolve our platform Nudg through continuous research that uncovers relationships hidden within the data. By establishing these relationships with derived rather than acquired data, we have been able to create links to both clinical and behavioural patterns that were previously unknown.

We improve model deconstructability by forming the granular components of the algorithm manually. The final step is controlling the patterns the model trains on and learns from. We do this by weighting the individual patterns as required, then creating a stack of models that each target a specific area and considering all of them when making predictions. This allows our models to be adaptive, intuitive and flexible, while leveraging some level of human reasoning.
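
One way to picture a weighted stack of narrowly targeted models is sketched below. This is a simplified illustration and not a description of how Nudg is implemented: the two component models, their thresholds and their weights are all hypothetical.

```python
def fever_model(record):
    """Hypothetical narrow model: looks only at temperature."""
    return 1.0 if record["temperature_c"] >= 38.0 else 0.0


def cough_model(record):
    """Hypothetical narrow model: looks only at cough duration, capped at 1."""
    return min(record["cough_days"] / 8.0, 1.0)


# Each entry: (name, model, manually chosen weight). Weights are invented.
STACK = [
    ("fever", fever_model, 0.75),
    ("cough", cough_model, 0.25),
]


def stacked_prediction(record):
    """Combine every model in the stack and keep the per-model breakdown."""
    breakdown = {name: weight * model(record) for name, model, weight in STACK}
    return sum(breakdown.values()), breakdown


total, breakdown = stacked_prediction({"temperature_c": 38.5, "cough_days": 4.0})
print(f"total: {total:.3f}")
for name, part in breakdown.items():
    print(f"  {name}: {part:.3f}")
```

Keeping the breakdown alongside the total means each component's share of the final score stays visible, which is what makes the combined output traceable.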

In a sense, this gives our model a ‘gut instinct’ based on historical understanding, and because we have manually constructed the model from granular components, we can follow and understand the ‘instinctive’ reasoning processes by which the model’s outputs are derived. This provides confidence that the reasoning is valid; it allows us to trust the model’s ‘gut’.
