Adobe’s UDoc Captures Cross-Modal Correlations in a Unified Pretraining Framework to Improve Document Understanding
Although modern machine learning models have made tremendous advances in natural language processing (NLP), they have focused strictly on text. Real-world documents, however, often also contain important visual formatting information and features such as tables, figures and charts…