Industrial Talk Abstract

Machine Learning Considered Effective for Commercial Document Understanding Systems

Jakub Zavrel, Textkernel BV
Session: Industrial Session 2

Machine learning methods are widely used as a heuristic approach to knowledge acquisition when building commercial text mining and document understanding systems, even though learning from data inherently delivers imperfect results. In this talk I will look at the practical value of imperfect solutions in document understanding systems, and at how machine learning is effective in this context, particularly in transaction-oriented domains. I will point out practical issues from an industrial perspective, illustrated by cases from our practice at Textkernel. Machine learning is proving to be a viable commercial methodology for knowledge acquisition and offers a principled way towards progress in systems engineering for Language Technology.