There's a fairly large ecosystem of tools for creating and processing financial data in XBRL which doesn't exist for SQLite. All the Big 4 handle XBRL already - I know because I write the software they use. It's quite straightforward to take a Word document, for example, and turn it into XBRL; we even use machine learning to automatically tag the tables.
I can easily imagine how painful it is for you to process XBRL from scratch, but it's not crazy to exploit the existing infrastructure.
Of course if you give a project to IBM I wouldn't be surprised if it costs a billion dollars, especially given they know roughly nothing about XBRL...
I'd never heard of XBRL, looks very interesting. It seems it's primarily used in financial reporting environments though. Is it suitable for general purpose reporting as well?
In principle yes, it was intended for general business reporting.
Essentially each filing (called an instance) consists of a series of 'facts', each of which reports a single value and some metadata, and footnotes, which are XHTML content attached to facts. Fact metadata includes dimensions, which can specify arbitrary properties of a fact. So a fact might be e.g. 'profit' with metadata declaring it's in 2018, in the UK, and on beer, but all of those aspects would be defined by a specific set of rules called a taxonomy. You can create a taxonomy for any form of reporting you want.
There's also a language, XBRL Formula, which allows taxonomies to define validation. It allows something semantically similar to SQL queries over the facts in an instance, with the resulting rows being fed into arbitrary XPath expressions.
Unfortunately the tools for working with XBRL are mostly quite expensive, which probably limits its application outside finance. Arelle is a free and fairly standards compliant tool that will parse, validate and render XBRL and even push the data into an SQL database, but it's written in Python and isn't very performant. (Although it's probably good enough for most uses since it's used as the backend for US XBRL filing.) I'm not sure if there are any open source tools to help with creating instances.
Also creating a taxonomy itself is quite challenging. There are (expensive) tools to help, and using them it's still quite challenging. For real-world taxonomies it usually involves a six or seven figure payment to one of the few companies with the right expertise.
XBRL is the format the SEC (security and exchange commission, the government "accountants of last resort" that check publicly traded companies). This means there's regular XBRL files (every quarter) for every large publicly traded US company.
Please do keep in mind that this is a sort of XML key-value database with a number of keys standardized, and some level of rules defined that say "if a bank lends Volkswagen money and it uses 12% of that money to Google to run ads with a repo clause, you enter add A to value X and B to value Y". In other words, there's rules that define how complex financial data is entered into those standardized values. To find those rules, there's a SEC textbook that you wouldn't wish on your worst enemy, nothing about that in the files themselves.
There's a large directory at the SEC with the quarterly XBRLs for all US publicly traded companies. Used to be accessible over FTP until a little over a year ago.
XBRL files exist for all forms to be filed with the SEC. The ones you probably want are the 10-Q and 10-K ones (q = quarterly, k = no idea, but somehow means yearly)
(of course there's an entire industry of accountants essentially about hacking those rules, and therefore the meaning of those files. So let me give you a free quick 5 year experience in financial analysis: search for "GAAP vs non-GAAP", read 2 articles, decide the conspiracy theorists are less informative than the government, and just believe the SEC is at least trying. That doesn't mean nobody's lying, but GAAP vs non-GAAP is generally not what they're lying about)
I can easily imagine how painful it is for you to process XBRL from scratch, but it's not crazy to exploit the existing infrastructure.
Of course if you give a project to IBM I wouldn't be surprised if it costs a billion dollars, especially given they know roughly nothing about XBRL...