For the past few days we have been working out how to recognise duplicate client names. The PHP script takes each company name and first removes any “dust”, such as dots
and commas, as well as “words” of only one letter. For example, A.M.F. Inc becomes AMF Inc and Starship,S.R.L. becomes Starship SRL.
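
A minimal sketch of that cleaning step, written in Python rather than the PHP of the real script. The function name clean_company_name is made up for illustration, and the exact rules are an assumption chosen to reproduce the two examples: dots are deleted so that initials collapse, while commas are treated as word separators.

```python
def clean_company_name(name: str) -> str:
    """Strip 'dust' and one-letter words from a company name (illustrative only)."""
    # Assumption: dots are deleted outright so initials collapse (A.M.F. -> AMF),
    # while commas separate words (Starship,S.R.L. -> Starship SRL).
    name = name.replace(".", "").replace(",", " ")
    # Drop any "word" that is only one character long.
    words = [word for word in name.split() if len(word) > 1]
    return " ".join(words)
```

Deleting the dots before dropping one-letter words matters: if dots became spaces instead, A.M.F. would fall apart into three single letters and disappear completely.
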
It then measures how often each word of the original name occurs in the database and selects the subset of rows that contain the least frequent word. Within that subset it
looks for further matching words; if it finds them, it decides that the MySQL database contains a duplicate and warns the operator. The operator can accept or reject the
replacements proposed by the PHP application.
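
The rarest-word idea can be sketched as follows, again in Python and on an in-memory dictionary instead of the MySQL table. The function name, the dictionary layout (entityid mapped to cleaned name) and the min_shared threshold are all assumptions; the script's real rule for "enough matching words" may differ.

```python
from collections import Counter

def find_duplicate_candidates(target_id, names, min_shared=2):
    """Return entity ids that look like duplicates of target_id.

    names maps entityid -> cleaned company name (hypothetical layout).
    min_shared is an assumed threshold for how many words must match.
    """
    words_of = {eid: set(name.lower().split()) for eid, name in names.items()}
    # Count in how many names each word occurs.
    freq = Counter(word for words in words_of.values() for word in words)
    target_words = words_of[target_id]
    if not target_words:
        return []
    # Pick the least frequent word of the target name (ties broken alphabetically).
    rarest = min(target_words, key=lambda word: (freq[word], word))
    # Narrow the search to the rows that contain that word.
    subset = [eid for eid, words in words_of.items()
              if eid != target_id and rarest in words]
    # Within the subset, require enough shared words overall.
    return [eid for eid in subset
            if len(target_words & words_of[eid]) >= min_shared]
```

Starting from the rarest word keeps the subset small: a word like Inc appears in many names, while a word like Starship narrows the search to a handful of rows.
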
If the operator accepts, the script searches the database for every table that contains an “entityid” column, such as invoice, booking, rates and quotations, and updates
each of them with the new entityid. After these updates the duplicate row is removed from the entity table.
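
A hedged sketch of that merge step, using Python's built-in sqlite3 module as a stand-in for MySQL so that it can be run without a server. The function name merge_entities is invented, and it assumes that the entity table is called entity and that its key column is also named entityid.

```python
import sqlite3

def merge_entities(conn, keep_id, duplicate_id, entity_table="entity"):
    """Repoint every table that has an entityid column, then drop the duplicate.

    Assumption: the entity table's key column is also called entityid.
    Returns the list of tables that were updated.
    """
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    updated = []
    with conn:  # one transaction: commit on success, roll back on error
        for table in tables:
            if table == entity_table:
                continue
            # PRAGMA table_info returns one row per column; index 1 is the name.
            columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
            if "entityid" in columns:
                conn.execute(
                    f'UPDATE "{table}" SET entityid = ? WHERE entityid = ?',
                    (keep_id, duplicate_id))
                updated.append(table)
        # Only after all references are moved is the duplicate row removed.
        conn.execute(
            f'DELETE FROM "{entity_table}" WHERE entityid = ?', (duplicate_id,))
    return updated
```

In MySQL the list of tables could instead come from information_schema.COLUMNS filtered on COLUMN_NAME = 'entityid'. Running the updates and the delete in a single transaction avoids leaving invoices that point at a row that no longer exists.
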
As a joke we decided to add a machine learning part to the PHP script that records every decision made by the operator. When the next replacement proposal is made, the
Artificial Intelligence Unit searches the recorded data for the nearest situation and advises whether it would have replaced or not, had the AI been given that autonomy.
If its advice was wrong, this is recorded as well, and the AI then has to analyse why it made the wrong decision and increase or decrease the weight factor of the variable
that was overrated or underrated. Easier said than done. At the moment we are struggling with recording all the decision results. Next we will write the database search
that lets the AI find the most similar recorded situation, and then we will design the calculations that have to produce the Machine’s advice. Challenging!
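
Since the calculations are not designed yet, the following is only one possible shape for them, sketched in Python: a weighted nearest-neighbour lookup over the recorded decisions. Everything in it is an assumption, including the feature names (shared_words, length_diff), the distance formula and the rule for adjusting the weights.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    features: dict   # hypothetical variables, e.g. {"shared_words": 2, "length_diff": 0}
    accepted: bool   # what the operator actually decided

def distance(a, b, weights):
    # Weighted sum of absolute differences; a larger weight makes a variable count more.
    return sum(weights[key] * abs(a[key] - b[key]) for key in weights)

def nearest_decision(features, history, weights):
    """Return the most similar recorded decision, or None if nothing is recorded."""
    if not history:
        return None
    return min(history, key=lambda d: distance(features, d.features, weights))

def record_decision(features, accepted, history, weights, rate=0.1):
    """Record the operator's decision and adjust weights if the advice was wrong.

    Returns the advice the machine would have given (True, False or None).
    """
    nearest = nearest_decision(features, history, weights)
    advice = None if nearest is None else nearest.accepted
    if nearest is not None and advice != accepted:
        for key in weights:
            if features[key] != nearest.features[key]:
                # This variable differed but was underrated: make it count more.
                weights[key] *= 1 + rate
            else:
                # This variable matched and pulled the wrong case closer: count it less.
                weights[key] *= 1 - rate
    history.append(Decision(dict(features), accepted))
    return advice
```

The weight rule here is a deliberately simple heuristic: after a wrong advice, variables on which the two situations differed gain weight, so that similar-looking but differently decided cases drift apart over time.
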