As far as I can remember, the usefulness of online public data has always been pitted against the effort of extracting and structuring it. Going from raw data to well-structured, parsed output takes a considerable amount of time, effort, and resources. Even once the initial prototype has been deployed, constant maintenance is required.
Once scale comes into the picture, data parsing often becomes something that only a select few can do. What’s more, web data parsing has unique challenges that stem from how HTML has been used over the years.