Around 18 billion invoices are issued each year in the USA and Europe alone. Form-like documents such as invoices, purchase orders, tax forms and insurance quotes are common in everyday business, but current techniques for processing them still require a large amount of manual effort and time, or rely on OCR-based heuristics for extraction. Although OCR has been fairly successful at digitizing machine-printed text, it has many limitations when dealing with form-like data.
Using AI to process form-like data is challenging because it requires both Computer Vision and NLP. In addition, the data entered in forms need not be natural language, so the NLP algorithms have to be trained to handle unknown words. In this article we will look at the challenges involved in dealing with dynamic data and at how various AI techniques can be used to attack the problem, along with corresponding code references.