Our solution takes Excel records containing root cause analyses of problems encountered in the factory, identifies the root causes in these records, and builds dashboards in a web application that show how often each root cause occurs, sparing the factory staff from doing this work manually. It also provides deeper insight into the data by presenting it as graphs, categorising the root causes into five severity classes, and establishing the cohesion and correlation among the root cause classes. The web application lets factory staff upload the records and lets authorized personnel access the resulting analysis.

We were provided with the required factory-specific dataset during the Hackathon. We first preprocessed the data: cleaning the text with NLTK functions, removing stopwords, and reducing every word to its root form. We then built a model that uses an attentive LSTM to classify each record into one of the known root cause classes where possible. Records that did not fit any existing class (labelled 'Others' in the training set) went through a two-step procedure. In the first step, the records that could be assigned to an existing class with high accuracy were categorised. The remaining records were clustered, and new classes were devised by analysing the clusters. The second step made use of LDA (Latent Dirichlet Allocation), a generative statistical model used in Natural Language Processing.
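
The first step of the procedure for 'Others' records can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: it assumes the classifier returns a probability per known class for each record, and it reads "high accuracy" as "highest class probability above a cut-off". The function name `route_others`, the constant `CONFIDENCE_THRESHOLD` and its value of 0.8 are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical cut-off: the value actually used is not stated in this write-up.
CONFIDENCE_THRESHOLD = 0.8


def route_others(
    records: List[str],
    class_probs: List[Dict[str, float]],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split 'Others' records into confidently classified ones and a leftover pool.

    records     -- preprocessed record texts
    class_probs -- one {class_name: probability} mapping per record, assumed
                   to come from the classifier
    Returns (assigned, leftover): assigned holds (text, class_name) pairs whose
    best probability reaches the threshold; leftover holds the remaining texts.
    """
    if len(records) != len(class_probs):
        raise ValueError("records and class_probs must have the same length")

    assigned: List[Tuple[str, str]] = []
    leftover: List[str] = []
    for text, probs in zip(records, class_probs):
        # Pick the known class the classifier considers most likely.
        best_class = max(probs, key=probs.get)
        if probs[best_class] >= threshold:
            assigned.append((text, best_class))
        else:
            # Not confident enough: keep for the clustering step.
            leftover.append(text)
    return assigned, leftover
```

Under these assumptions, the `leftover` list is what would be handed on to clustering and LDA in the second step, while `assigned` is merged back into the existing root cause classes.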
