Tom MacWright

tom@macwright.com

Three Things

Slides

A presentation given at NYPL to librarians in the historical mapping space.

Change over Time

I think the biggest unsolved problem in maps is change over time. This is unfortunately an unsolved problem in most data structures, which either represent change as replacement, as you can see with street maps - Google Maps, Mapbox, and so on - or as copying, which you can see in annual sources like the US Census. Tools that manage change over time in a mature way, and that can present it as a fundamental characteristic of data, are extremely rare. OpenStreetMap is the only mainstream example.
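To make the difference concrete, here is a minimal sketch of the third option: keeping every state of a feature instead of replacing or copying it. This is not how OpenStreetMap or any other system is implemented; the names Version, Feature, update, and as_of, and the example street, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    number: int       # 1-based counter within one feature (hypothetical scheme)
    year: int         # year from which this state is considered true
    properties: dict  # attributes of the feature at that time


@dataclass
class Feature:
    feature_id: str
    history: list = field(default_factory=list)

    def update(self, year, properties):
        # Append a new version instead of overwriting the previous one.
        version = Version(len(self.history) + 1, year, dict(properties))
        self.history.append(version)
        return version

    def as_of(self, year):
        # Return the latest version at or before the given year, or None.
        candidates = [v for v in self.history if v.year <= year]
        return max(candidates, key=lambda v: v.year) if candidates else None


# Made-up example data: one street that was renamed.
street = Feature("street-1")
street.update(1850, {"name": "Old Road"})
street.update(1920, {"name": "Main Street"})
```

With this shape, a question like "what was this street called in 1900?" is a lookup with as_of(1900) rather than a hunt through separate yearly copies.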

Cracking this code is necessary because it’s the only way to solve collaboration, as software development has shown us. Once multiple people interact with a source, you can no longer count on changes being ordered and ‘versions’ being a strong concept.
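A rough sketch of why a single version number stops being enough once two people edit at once, assuming a simple "reject stale edits" rule. SharedRecord, Conflict, and commit are hypothetical names, and real systems handle this in much more sophisticated ways (merging, branching, and so on).

```python
class Conflict(Exception):
    """Raised when an edit was based on a version that is no longer current."""


class SharedRecord:
    def __init__(self, value):
        self.value = value
        self.version = 1

    def commit(self, base_version, new_value):
        # Accept the edit only if the editor started from the current version.
        if base_version != self.version:
            raise Conflict(
                f"edit based on version {base_version}, "
                f"but current version is {self.version}"
            )
        self.value = new_value
        self.version += 1
        return self.version


# Two editors both read version 1, then both try to save.
record = SharedRecord({"name": "Old Road"})
alice_base = record.version
bob_base = record.version

record.commit(alice_base, {"name": "Main Street"})  # accepted, now version 2

try:
    record.commit(bob_base, {"name": "High Street"})
    bob_conflicted = False
except Conflict:
    bob_conflicted = True  # the second edit is stale and has to be reconciled
```

The second editor did nothing wrong; their change simply has no clean place in a single line of versions, which is the problem version control tools in software had to solve.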

Mapbox is working on this problem as part of our larger push towards creating the first scalable geospatial database. GeoGit and dat are also working on potential solutions. We should talk about this, because it’s a massive technology change whose success will depend on how well it fits your use cases, and even more on network effects.

Things to read:

Rights

The second stumbling block for data to work is copyright. I know there’s a big copyright symbol there, but a better description would be rights and expectations.

Historical data often has the luxury of Public Domain status in the United States, but Public Domain is an American flavor of a concept that takes different forms elsewhere and sometimes has no equivalent at all. As for the products derived from Public Domain data, whether they’re extracted buildings or even just scans, the licensing of these artifacts is more or less the choice of the maker.

Maps in particular are a battleground for copyright law for two reasons: they take the form of databases, data, and images, and they are mostly useful in combination with other data.

Copyright matters to you because combinations of data are combinations of licenses, and the future is in datasets that come from a lot of places. At Mapbox we already have projects like OpenAddresses and a Satellite layer that combine more than 30 datasets and thus carry the aggregated legal boilerplate of all of them. The friction of this, combined with the legal risk, is a brake.
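As a small illustration of "combinations of data are combinations of licenses", here is a sketch that collects the obligations of a merged product. The dataset names are made up, the license strings are just labels, and combined_obligations is a hypothetical helper; it says nothing about whether the licenses are actually compatible, which is the hard legal question.

```python
def combined_obligations(datasets):
    # Every distinct license in the inputs applies to the combined product.
    licenses = sorted({d["license"] for d in datasets})
    # Every attribution requested by a source has to be carried along.
    attributions = [d["attribution"] for d in datasets if d.get("attribution")]
    return {"licenses": licenses, "attributions": attributions}


# Hypothetical sources, for illustration only.
sources = [
    {"name": "city-addresses", "license": "CC0-1.0", "attribution": None},
    {"name": "county-addresses", "license": "CC-BY-4.0",
     "attribution": "Example County GIS"},
    {"name": "community-map", "license": "ODbL-1.0",
     "attribution": "Example community contributors"},
]

obligations = combined_obligations(sources)
```

Three sources already mean three licenses and two attribution lines; with more than 30 sources the list only grows.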

Things to read:

Standards

Finally, maps need standards. Like the copyright symbol before, I don’t really mean standards. I mean a standards body. Much like copyright, this is a prickly subject, which is why I chose it. But storage and collaboration are the pain points where being unique is a risk: nobody wants to store data in a format that won’t be openable in 10 years, or 5 years even. And everyone wants to share data in the format that everyone else uses.

There are roughly three kinds of standards: those that come from a standards body, like WMS from the OGC, those from a company, like KML from Keyhole, and those from interest groups, like GeoJSON. The lines admittedly blur. In the last five years, little of importance has come out of a standards body. And the reason for companies like Google or Esri to publish standards is for their products to be more successful. Those products used to be desktop applications like Google Earth, which inspired KML, but are moving to the web: so the way they communicate is behind the scenes and in APIs, not formats. File formats as a user-facing concern are losing their shine for startups.

Interest groups are the theoretically purest way to produce standards, but they’re rare and the people who participate need to be operating in a sort of time surplus, or have complex motivations that boil down to the previous two types.

Things to read: