Data renaisscientists, the modern-day knowledge sculptors
By Moriba Jah | February/March 2021
The genius of Italian sculptors was not in carving the statues but in removing the portions of marble that trapped them, to set them free, as Michelangelo himself is said to have described it. As an astronautical data scientist, I too am a sculptor, except that my statues are freed bodies of knowledge, my block of enslaving marble is ignorance, and my tools to chip away at the ignorance are data. In an ideal world, I would have exact information so as to remove all ignorance, and what would remain is truth and knowledge, free and exposed for all to see. However, what actually remains is not complete knowledge but partial knowledge, because complete knowledge is cloaked in the intrinsic uncertainty of the data and the models.
To sculpt these knowledge statues, I make use of abductive reasoning. Much like a refrigerator indirectly cools by incrementally removing heat over time, I seek to learn by incrementally removing ignorance over data. If I were to ask you where Abraham Lincoln was born, you might not actually know. However, if along with the question I provide several hypotheses, such as Beijing, China; Dallas, Texas; Hodgenville, Kentucky; and Lexington, Kentucky, you’d quickly remove the obviously wrong answers and be left with Hodgenville and Lexington. You might want to ask more questions at this point that could remove the next wrong answer, but this is an example of you using data for its power to highlight the wrong answer. Any surviving hypothesis must explain the evidence. Otherwise, it fails the test.
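As a minimal sketch of this elimination process, the Python below starts with the four candidate birthplaces and discards every hypothesis that fails to explain the evidence. The two pieces of evidence, the `Hypothesis` class and the `eliminate` function are assumptions introduced only for illustration; they stand in for whatever you already know that makes Beijing and Dallas obviously wrong.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Hypothesis:
    city: str
    state: Optional[str]  # None when the city is outside the United States
    country: str


# The four candidate birthplaces offered in the text.
HYPOTHESES = [
    Hypothesis("Beijing", None, "China"),
    Hypothesis("Dallas", "Texas", "United States"),
    Hypothesis("Hodgenville", "Kentucky", "United States"),
    Hypothesis("Lexington", "Kentucky", "United States"),
]

# Assumed evidence (illustrative only): each item pairs a description
# with a test that a hypothesis must pass to survive.
Evidence = Tuple[str, Callable[[Hypothesis], bool]]
EVIDENCE: List[Evidence] = [
    ("born in the United States", lambda h: h.country == "United States"),
    ("born in Kentucky", lambda h: h.state == "Kentucky"),
]


def eliminate(hypotheses: List[Hypothesis], evidence: List[Evidence]) -> List[Hypothesis]:
    """Return the hypotheses that are consistent with every piece of evidence."""
    survivors = list(hypotheses)
    for _description, explains in evidence:
        # The data do not say which hypothesis is right, only which are wrong.
        survivors = [h for h in survivors if explains(h)]
    return survivors


survivors = eliminate(HYPOTHESES, EVIDENCE)
print([h.city for h in survivors])
```

Two hypotheses survive because nothing in this evidence distinguishes them. Removing one of them would take another question, that is, more data.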
In my professional experience, data tend to be much more powerful in telling you what something is not than in telling you what it is. Often, people ask me how I knew I wanted to become an astrodynamicist, and I tell them that I tried a list of things, determined and discarded what I disliked, and astrodynamics remained on my list. I’ve always enjoyed the musical talent of Michael Jackson, and scrolling through some of his interviews online, I came across one where he was asked what he hears when he listens to any given song, and he answered, “I hear what’s missing.” From my artistic or even metaphysical perspective, he created his masterpiece songs by removing silence from them. It’s as if the songs already existed but were trapped by silence, and as he removed the silence, he freed them, like a sculptor.
When Rudolf Emil Kalman, co-inventor of the Kalman filter so important in trajectory estimation, accepted the Kyoto Prize in 1985, he shared his perspective about data in his acceptance speech. If data were exact and complete, he said, then only one minimal, or simple, hypothesis could explain their cause. He called this the “Uniqueness Principle” of minimal modeling, and it is an idealization. Kalman went on to say, “uncertain data cannot provide exact models,” and he cautioned against allowing prejudice to influence the scientific process of deducing a unique model from uncertain data, highlighting Bayesian processes as an example of prejudiced inference. You see, in statistics, a Bayesian formulation requires a prior belief to be conditioned by evidence. Kalman argued, and I agree, that in reality it is unlikely that the traits ascribed to this prior belief are real or even knowable. Therefore, the conclusion of any Bayesian process is either correct by luck or flawed, because the result is forcibly prejudiced by the inferrer’s prior belief.
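To make the role of the prior concrete, here is a small sketch in Python, under assumptions that are mine and not Kalman’s: two inferrers observe the same data (3 successes in 10 trials, numbers invented for illustration) and each conditions a different Beta prior on that evidence. The function name `posterior_mean` and the prior parameters are hypothetical. For a Beta(a, b) prior and binomial data, the posterior mean is (a + successes) / (a + b + trials).

```python
def posterior_mean(prior_a: float, prior_b: float, successes: int, trials: int) -> float:
    """Posterior mean of a success probability, given a Beta(prior_a, prior_b)
    prior and binomial evidence (successes out of trials)."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


# The same evidence for both inferrers (invented numbers, for illustration).
successes, trials = 3, 10

# Two assumed prior beliefs about the same unknown probability.
skeptic = posterior_mean(1.0, 9.0, successes, trials)   # prior mean 0.1
believer = posterior_mean(9.0, 1.0, successes, trials)  # prior mean 0.9

print(f"skeptic:  {skeptic:.2f}")
print(f"believer: {believer:.2f}")
```

With identical data, the two conclusions differ (0.20 versus 0.60) only because the starting beliefs differ, which is the sense in which the result carries the inferrer’s prior.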
In deductive reasoning, one derives b from a, where b follows logically from a. Let’s assume that a are the data or evidence and b is the cause or model. Methods such as maximum likelihood estimation always deduce a unique model given the data. But as Kalman stated, we cannot conclude an exact model given uncertain data, unless we apply some measure of prejudice to remove this intrinsic uncertainty.
Alternatively, inductive reasoning allows us to infer b from a, but there is no requirement or guarantee that b follows logically from a. Lastly, abductive reasoning reverses this by allowing us to infer a as an explanation of b without guarantees of its truth or even uniqueness. This is why I’m a fan of abduction. It allows me to let go of prejudice and sculpt my way to what seems true by forcing me to subject all of my prejudices to evidence and making me reject those that fail to explain it. With abductive reasoning, I can have multiple hypotheses co-existing as long as they can explain the evidence, as in our Abraham Lincoln example. Now, if you were forced to choose one of your surviving hypotheses, there is what is known as Occam’s razor. The razor is meant to figuratively cut away all but the simplest of explanations for the evidence. This is attributed to an English Franciscan, William of Ockham, who favored simple explanations over complicated ones. However, applying Occam’s razor is also prejudiced because the evidence hasn’t discarded your other choices, even if they’re less simple.
There are only two ways to be prejudice-free: by making absolutely no assumptions or by making every possible assumption. Making all possible assumptions has been my approach to knowledge sculpting, using abductive reasoning to free the truth residing within the original block of ignorance. The data guide my chisel, and my goal is to remove all the ignorance possible, knowing that I’ll never be free of it entirely because there will always be uncertainty. To become a master knowledge sculptor should be the path of every data renaisscientist.