How to cite this paper
Paoli, Jean. “We Created Document Dysfunction: It Is Time to Fix It.” Presented at Balisage: The Markup Conference 2019, Washington, DC, July 30 - August 2, 2019. In Proceedings of Balisage: The Markup Conference 2019. Balisage Series on Markup Technologies, vol. 23 (2019). https://doi.org/10.4242/BalisageVol23.Paoli01.
Balisage: The Markup Conference 2019
July 30 - August 2, 2019
Balisage Paper: We Created Document Dysfunction
It Is Time to Fix It
Jean Paoli is the Founder of Docugami Inc., a startup that uses AI to transform
the unique document business processes of individual companies, making frontline
users more efficient
while giving COOs better compliance and insights – inspired by his deep belief
that openness and interoperability raise all boats.
He was formerly President of Microsoft Open Technologies, Inc., and one of the
co-creators of the XML 1.0 standard with the World Wide Web Consortium (W3C).
Throughout his career, Jean has worked in startups: before Microsoft, with Inria,
the renowned French research labs (Gipsi S.A. and Grif S.A.); and within Microsoft
creating four new startups:
XML, InfoPath, opening the Office formats and MS OpenTech (Microsoft’s open source
The startups he built created breakthrough platform technologies used today by
He is the recipient of multiple industry awards for his work on XML, semi-structured
the convergence of documents and data and openness at large.
In addition to core technical design, Jean takes great care in building healthy
ecosystems at worldwide scale.
He is credited as one of the key leaders responsible for shifting in a fundamental
under the guidance of the CEO, Microsoft’s strategy to embrace and love open
Copyright © Docugami Inc.
Some of us building software need to take a hard look in the mirror.
For years, we have promised that technology would solve the world’s information management problems,
but 85% of business information is still
dark data, with potentially useful insights
lost in a rising tide of disconnected documents, emails, Slack conversations, voice-to-text messages, and myriad other forms.
We need an effective approach to documents and want to start a public conversation
about these issues.
We believe that effective solutions should be based on: Declarative Markup;
AI sympathetic to
Small Data; focus on company-specific documents;
applying AI to documents as a whole; and solutions that do not disrupt
existing workflows or require massive investment.
The future is not about AI making human beings obsolete;
the future is about AI making human beings and companies more productive, effective, and creative.
Table of Contents
- What does document dysfunction look like?
- Five principles
- Starting a public conversation
It is time for some of us building software to take a hard look in the mirror.
For years, we promised technology would solve the world’s information management problems,
but 85% of business information is still
dark data, potentially useful insights lost in a rising tide of disconnected documents, emails,
Slack conversations, voice-to-text messages, and myriad other forms.
As the digital transformation accelerates, the sheer volume and opacity of documents
make it harder to ensure quality, consistency, accountability, and regulatory compliance.
We call this problem
document dysfunction, and it affects nearly every type of organization, from finance to health care to
real estate to government and more, impacting millions of citizens, customers and
What does document dysfunction look like?
It is a bank with thousands of loan documents, but zero visibility into the terms
and conditions that impact the value of those loans.
A government agency with hundreds of project agreements that need to be audited and
updated due to a regulatory change.
A commercial real estate firm with hundreds of contracts, but no insight into millions
of dollars in underlying obligations.
A health care system with dozens of doctors spending
pajama time every night recording and writing patient notes in a laborious and disconnected process.
Now multiply those cases by hundreds of thousands of companies and organizations around
the world. That is document dysfunction.
Five principles
We see five principles that can lead us to more effective solutions:
First, we need to bring together multiple scientific domains in innovative and powerful
ways, including diverse AI approaches and Declarative Markup (https://markupdeclaration.org/).
Second, instead of “Big Data,” we need AI that understands
Small Data: the unique sets of business documents distinctive to individual companies.
Third, this focus on company-specific
Small Data will enable us to maintain the privacy and security of each individual customer.
Fourth, past attempts to use AI to solve business data and document problems
have failed because they focused on the wrong altitude: helping to complete words
or sentences instead of applying AI to the document as a whole.
And fifth, to be truly effective, we need solutions that do not disrupt existing workflows
or require massive investments in staff training, IT development, or armies of consultants.
The future is not about AI making human beings obsolete. The future is about AI making
human beings and companies more productive, effective, and creative.
Starting a public conversation
Our goal is to start a public conversation about these issues.
We published an Open Letter about Document Dysfunction.
We believe that the problems run deep, but at the same time we see an opportunity
for the industry at large and envision how a set of new technologies
can impact the core of the document industry.
Tell us your document dysfunction horror stories, or your dream for how technology
could give you greater efficiency and control. Or maybe you completely disagree
and have never met a dysfunctional document in your life.
Or maybe you think our principles are all wrong —
we would still like to hear from you!