Saturday, 15 April 2017
Those who ignore history are condemned to retweet it.
The algorithm purportedly used by the Department to match business names between the ATO dataset and Centrelink data was leaked to the media, and I undertook an analysis of it. This algorithm is breathtakingly naïve and will result in incorrect matches for common situations such as typographical errors, misplaced punctuation, and the legal entity name being different from the business trading name. The potential for mismatches is significant. Various more sophisticated fuzzy matching algorithms are readily available. [Senate Standing Committees On Community Affairs, Inquiry Into Design, Scope, Cost-Benefit Analysis, Contracts Awarded And Implementation Associated With The Better Management Of The Social Welfare System Initiative, Submission 38]
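To make the gap between naive and fuzzy matching concrete, here is a minimal sketch in Python. It is not the leaked algorithm, which is not reproduced in this post: `exact_match` is a hypothetical stand-in for any strict string comparison, and `normalise`, `fuzzy_match`, the 0.85 threshold, and the business names are all illustrative assumptions. The similarity measure is `difflib.SequenceMatcher` from the standard library, one of many readily available options.

```python
import string
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case the name, drop punctuation, and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(name.lower().translate(table).split())


def exact_match(a: str, b: str) -> bool:
    """Hypothetical stand-in for a naive matcher: strict string equality."""
    return a == b


def fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat two names as the same if their normalised forms are similar enough.

    The threshold is an assumed value chosen for illustration, not a
    recommendation; a real system would tune it against labelled data.
    """
    ratio = SequenceMatcher(None, normalise(a), normalise(b)).ratio()
    return ratio >= threshold


if __name__ == "__main__":
    pairs = [
        ("Smith & Sons Pty. Ltd.", "Smith and Sons Pty Ltd"),   # punctuation
        ("Acme Cleaning Services", "Acme Cleanig Services"),    # typo
        ("Acme Cleaning Services", "Zenith Plumbing"),          # unrelated
    ]
    for a, b in pairs:
        print(f"{a!r} vs {b!r}: exact={exact_match(a, b)}, fuzzy={fuzzy_match(a, b)}")
```

Under these assumptions the punctuation and typo pairs fail the exact comparison but pass the fuzzy one, while the unrelated pair fails both. String similarity alone would not help with the third problem the submission names, a legal entity name that differs from the trading name, since the two strings may share nothing; that case needs a lookup against a business register rather than a cleverer comparison.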