Spatial data science (SDS) is a subset of Data Science that focuses on the unique characteristics of spatial data, moving beyond simply looking at where things happen to understand why they happen there. SDS treats location, distance & spatial interactions as core aspects of the data, using specialized methods & software to analyze, visualize & apply learnings to spatial use cases.
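To make the idea of distance as a core aspect of the data concrete, here is a minimal sketch in plain Python. It computes great-circle (haversine) distances and uses them to find each customer's nearest store. The store and customer names and coordinates are hypothetical, made up purely for illustration, and a real project would more likely rely on a dedicated geospatial library than on hand-rolled functions like these.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical locations (illustrative only, not real data).
stores = {"store_a": (40.4168, -3.7038), "store_b": (41.3874, 2.1686)}
customers = {"cust_1": (40.4500, -3.6900), "cust_2": (41.4000, 2.1500)}


def nearest_store(lat, lon):
    """Return (store_name, distance_km) for the store closest to the given point."""
    candidates = (
        (name, haversine_km(lat, lon, s_lat, s_lon))
        for name, (s_lat, s_lon) in stores.items()
    )
    return min(candidates, key=lambda pair: pair[1])


# Distance becomes a feature of each record rather than an afterthought.
for cust, (lat, lon) in customers.items():
    name, dist = nearest_store(lat, lon)
    print(f"{cust}: nearest={name}, distance={dist:.1f} km")
```

The point of the sketch is that the distance to the nearest store becomes a column in the analysis like any other, which is the kind of question a purely tabular workflow tends to skip.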
What’s the difference between Spatial Data Science & GIS?
Geographic information systems (GIS) serve a wide range of users & use cases. Yet GIS is a strange anomaly: despite its value spanning many industries, it has remained a niche field, often siloed from other business units. The term typically refers to varied types of information systems, such as websites, apps or databases, that store different types of spatial data.
With new types of users such as Data Scientists, GIS is starting to happen more outside of traditional GIS tools – allowing more sophisticated spatial analyses to take place in connection with new Data Science & Big Data solutions. This shift is allowing Spatial Data Science to emerge as a discipline with greater interactivity with Open Source & Cloud technologies.
GIS is exploding, and our industry has never been bigger than it is now, with a growing number of players that includes not only providers of cross-industry platforms but also niche, industry-specific geospatial specialists. This means that GIS is happening where we’re not used to it happening: outside of traditional GIS tools.
What are the consequences of the shift we’re seeing?
GIS specialists will have to adapt and join new communities, such as Data Engineering, Data Science or Development groups. As they upskill, they will need to work across multidisciplinary teams and think of how their work can provide value beyond a single tool, bringing their spatial knowledge to the table.
The days of the full-stack GIS expert are over. We cannot expect GIS specialists to also be experts in creating websites, managing servers, spatial modeling, creating data pipelines, making dashboards and telling stories. These are totally different skill sets, with entire industries devoted to each of them, so GIS experts will have to choose where they want to focus their professional development.
Geospatial products will have to adapt or die. GIS is no longer an island where you start and finish with GIS. It is part of a bigger ecosystem, which means that many of our capabilities will need to be connected or blended with other products, rather than competing with them. It is our responsibility as an industry to ensure that newcomers to GIS, who may not even know what it is, can do great spatial analysis, use good cartography, and learn how not to lie with maps.
Spatial data infrastructures have to change. Infrastructures will need to play well with key cloud solutions and databases, as well as Open Source technologies that are now widely used in the enterprise. Connectivity will be key, which is why our integration with Google BigQuery is so important to many of our users.
Expand awareness of good GIS among different communities. There should always be a track on spatial analysis at Data Science, Data Engineering, and Developer conferences, as we have at the Spatial Data Science Conference.
How does the adoption of spatial analytics vary across industries?
Retail, out-of-home (OOH) advertising, Real Estate and Private Equity have been growing rapidly in terms of SDS capability, which is something our Data Science team here at CARTO has discussed in a video.
However, we are also seeing huge growth in verticals like Pharmaceuticals, Healthcare and Logistics, which are turning to location data. You can see more specifics on the use cases here.
One thing we see in the market is that hiring talent in the SDS space is very competitive, so more lucrative industries (such as management consulting or private equity) often attract some of the top candidates.
What are the biggest challenges facing spatial data science?
Beyond the competition for talent, the other huge challenge we see is that companies and their Data Science professionals are wasting too much time on everything but analysis:
Only 20% of their time is spent on analysis, while the other 80% goes to data discovery, evaluation and ETL (extract, transform, load). This means that companies have Data Scientists working on everything but what they were hired for, while hundreds of different departments wait to work with them on projects that require spatial analysis muscle.
Our mission is to spatially enable every Data Scientist, saving them time and making it easier to access high-quality location data for their spatial models, whether it’s foot traffic, financial, housing, geosocial or climate data. That’s why we’re focusing on CARTOframes and our Data Observatory, making it easier for them to carry out such analysis within their Jupyter notebooks without needing to switch context all the time.
Lots of GIS experts have already realized the changes we’ve discussed in this article. In fact, at the latest edition of the Spatial Data Science Conference, approximately 20% of our attendees came from a GIS background and were looking to connect with the world of Data Science.
We need to make the work of Spatial Data Scientists more productive and enjoyable, making it easier for Developers and Analysts to collaborate with them. GIS is reincarnating, and the current situation we’re living through in society and our economy is only going to accelerate that evolution.