How to find the true reason behind any phenomenon mathematically, without any human bias?
This is useful for analysts, strategists, investment managers, data scientists, statisticians and managers in any field of work.
Data science and machine learning are the buzzwords of the day. They are now so common that even non-technical learners, such as students of arts and languages, have started learning and applying them in their own craft to study patterns and draw inferences, while anyone with even a remote connection to a professional course treats them as an almost necessary skill set before entering the job market. Needless to say, there are thousands of free and paid courses available to help one get started. For the uninitiated, however, there are two aspects of data science -
1. Finding the cause-and-effect relationship behind a phenomenon
2. Predicting the feature/variable/factor/phenomenon of interest
You'll find detailed work on the internet on the latter; however, no single definitive solution is offered for the former, that is, establishing the cause-and-effect relationship behind a phenomenon. In fact, this is an ongoing area of research in which statisticians, mathematicians and data scientists are working towards absolute or near-absolute solutions.
Currently, the status quo for establishing the cause-and-effect relationship behind a phenomenon is to rely on so-called subject matter experts. Analysts lean on their long experience to derive the possible factors behind a phenomenon, and then go on to build state-of-the-art prediction models based on those factors. That's why we often hear these popular adages - "This requires domain expertise / this requires experience / experience matters". Now, I do believe subject knowledge and experience matter, but what happens if these domain experts fail to look beyond their outdated knowledge of the domain when market circumstances change, and thus end up picking the wrong factors? In reality this happens a lot in the industry. The blame is often put on the organisation's data scientist or statistician for failing to predict the phenomenon accurately, who in turn blames the researcher for capturing the data incorrectly in the first place - and so the blame cycle goes on.
In fact, in almost every industry today it is an accepted reality that predictions can go wrong, and the workaround in such cases is to re-capture the data and redo the predictions. However, the core problem remains largely overlooked: nothing is done to validate whether the causal factors picked for a phenomenon are correct. The reason is mostly a lack of awareness that this is a problem at all, and the situation is often complacently justified by the lack of any available objective solution.
So in short - you can build the best of prediction models, ranging from a simple linear regression to state-of-the-art reinforcement learning, but all of it is rendered useless if you fail to identify the correct factors that contribute to the phenomenon.
The reason analysts today fail to capture the true causal factors behind a phenomenon is the effect that the actual causal factors have on other, pseudo-causal factors. When an analyst finds a significant correlation with such pseudo variables, or builds a state-of-the-art model on them with high accuracy or precision, they end up believing the pseudo factors to be the true causal factors, because at that point, based on the available data, the predictions look highly plausible. However, whenever the underlying conditions on which the predictions were made change fundamentally, the predictions can fail drastically. To make matters worse, it is practically infeasible to enumerate the total range of factors responsible for a phenomenon (exhaustively searching over candidate causal structures is computationally intractable in general, which is a weaker statement than "mathematically impossible"), thus leaving us at the mercy of the researcher or the so-called subject matter experts with their traditional knowledge.
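The failure mode described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the solution discussed in this post: the variable names (`z` for the true cause, `x` for the pseudo factor, `y` for the outcome), the coefficients and the noise levels are all made up for illustration. A model fitted on the pseudo factor looks fine while `x` tracks `z`, and should degrade sharply once that link is broken, while a model fitted on the true cause keeps working.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def mse(xs, ys, a, b):
    """Mean squared error of the line a + b*x on the given data."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def simulate(n, coupled, rng):
    """Draw n samples (hypothetical data-generating process).

    z is the true cause of y. x is a pseudo factor: it tracks z when
    `coupled` is True and is independent of z otherwise. y never
    depends on x directly.
    """
    zs, xs, ys = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = z + rng.gauss(0, 0.3) if coupled else rng.gauss(0, 1)
        y = 2.0 * z + rng.gauss(0, 0.3)
        zs.append(z)
        xs.append(x)
        ys.append(y)
    return zs, xs, ys


rng = random.Random(0)  # fixed seed so the run is repeatable
z_old, x_old, y_old = simulate(2000, True, rng)   # conditions at training time
z_new, x_new, y_new = simulate(2000, False, rng)  # conditions after the change

pseudo_model = fit_line(x_old, y_old)  # fitted on the pseudo factor
causal_model = fit_line(z_old, y_old)  # fitted on the true cause

results = {
    "pseudo_old_mse": mse(x_old, y_old, *pseudo_model),
    "pseudo_new_mse": mse(x_new, y_new, *pseudo_model),
    "causal_old_mse": mse(z_old, y_old, *causal_model),
    "causal_new_mse": mse(z_new, y_new, *causal_model),
}

for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Note that nothing in the training data alone flags `x` as a pseudo factor: its fit on the old data is good. The problem only shows up when the relationship between `x` and `z` changes, which is exactly when the predictions are needed most.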
The only way to break this chain of confusion is to devise new statistical methods that can mathematically validate decisions. Fortunately, after a lot of research and experiments, I have come up with some case-specific solutions, which I have already deployed on multiple occasions, from validating business decisions to risk modelling for clients of a leading global consulting firm.
This solution is currently in the process of being sold to an organisation and is therefore not publicly available, but I hope to help companies and individuals reading this post by implementing causal "fail proof" models on their datasets!