makepath is excited to share our recent collaboration with the Texas Natural Resources Information System (TNRIS) and Tessellations.
Together we developed an approach for comparing historical grayscale imagery with reference RGB imagery captured 18 years later.
The result is a reproducible prototype for matching historical imagery to present-day RGB imagery using open source machine learning tools.
We created two categories of notebooks for aligning historical grayscale imagery with reference RGB imagery:
- Rough Alignment: This notebook matches each historical grayscale image to the most similar reference RGB image.
- Fine Alignment: This notebook mosaics the historical grayscale imagery with the reference RGB imagery.
Two samples that the model correctly predicted to show the same geographic location across an 18-year period. In each sample, the first image is a historical grayscale image from 1978 and the second is the reference RGB image from 1996.
This project was completed using open source GIS and open source machine learning libraries:
- Xarray: to read historical and reference imagery and metadata
- Pandas: to produce a manifest directory of training images
- GeoPandas: to find intersecting regions in historical and reference imagery and create training, validation, and testing splits
- OpenCV: to read large-scale imagery and tile images into patches for training and evaluating ML models
- NumPy: to filter noisy samples
- TensorFlow: to train models, predict matching scores for historical and reference imagery, and visualize training and predictions (TensorBoard)
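To give a feel for the tiling step mentioned above, here is a minimal sketch of how patch windows could be computed for a large image. It is not the project's actual code: the function name `patch_windows`, the patch size, and the stride are all illustrative assumptions, and the sketch only computes slice bounds rather than reading imagery.

```python
from typing import Iterator, Tuple

Window = Tuple[int, int, int, int]  # (row_start, row_stop, col_start, col_stop)


def patch_windows(height: int, width: int, patch: int, stride: int) -> Iterator[Window]:
    """Yield the bounds of every full patch that fits inside the image.

    Hypothetical helper: partial patches at the right and bottom edges
    are skipped rather than padded.
    """
    if patch <= 0 or stride <= 0:
        raise ValueError("patch and stride must be positive")
    for row in range(0, height - patch + 1, stride):
        for col in range(0, width - patch + 1, stride):
            yield (row, row + patch, col, col + patch)


# Example: a 512 x 768 image cut into non-overlapping 256 x 256 patches.
windows = list(patch_windows(height=512, width=768, patch=256, stride=256))
print(len(windows))  # 6
print(windows[0])    # (0, 256, 0, 256)
```

Each window can then be used to slice an image array, for example `image[r0:r1, c0:c1]` on an array loaded with OpenCV.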
Here’s how we did it:
1. Data Curation and Database Creation
The first and most important step in training a machine learning model is acquiring quality training data.
makepath developed automated methods for producing machine learning datasets from historical imagery using the Matagorda data. To accomplish this, the training data was divided into four scene types (classes).
Incorporating a variety of scenes into the training data allows the model to learn similarities and differences between historical and new imagery across a diverse range of geographical regions.
2. Preparing Machine Learning Training Datasets
To prepare the machine learning training datasets, all training imagery was cropped, and any imagery containing metadata (information such as time stamps and coordinates printed directly onto the historical imagery) was filtered out.
The remaining data was divided into three splits (training, validation, and testing) based on latitude and longitude.
Dividing the data by latitude and longitude ensures there is no geographic overlap between the splits.
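The idea of a geographic split can be sketched as follows. This is a simplified illustration, not the project's method: it splits along longitude only, the function `assign_split` is hypothetical, and the tile extents and boundary values are made up. Tiles that straddle a boundary are dropped so that no ground area appears in two splits.

```python
from typing import Dict, Optional, Tuple


def assign_split(min_lon: float, max_lon: float,
                 val_start: float, test_start: float) -> Optional[str]:
    """Assign a tile to a split by its longitude extent.

    Hypothetical layout: train lies west of val_start, val lies between
    val_start and test_start, test lies east of test_start.
    Returns None when the tile crosses a boundary.
    """
    if max_lon <= val_start:
        return "train"
    if min_lon >= val_start and max_lon <= test_start:
        return "val"
    if min_lon >= test_start:
        return "test"
    return None  # straddles a boundary, so it is excluded


# Made-up tiles: id -> (min_lon, max_lon) in degrees.
tiles: Dict[str, Tuple[float, float]] = {
    "a": (-96.30, -96.25),
    "b": (-96.12, -96.08),
    "c": (-96.02, -95.98),
    "d": (-95.90, -95.85),
}
splits = {
    tile_id: assign_split(lo, hi, val_start=-96.10, test_start=-95.95)
    for tile_id, (lo, hi) in tiles.items()
}
print(splits)  # {'a': 'train', 'b': None, 'c': 'val', 'd': 'test'}
```

With real footprints, a library such as GeoPandas could perform the same check using polygon intersections instead of longitude bounds.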
3. Approach: Machine Learning Models
Two machine learning models were used to match and mosaic the imagery:
- SuperPoint [1]: used to obtain keypoints and descriptors in order to mosaic the imagery.
- H-Net [2]: used to provide matching scores between historical grayscale and reference RGB imagery.
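Once a model produces matching scores, rough alignment comes down to picking the highest-scoring reference patch for each historical patch. The sketch below illustrates that selection step only. The score matrix is made up, and the function `best_matches` and the `min_score` threshold are assumptions for illustration, not part of H-Net or of the project's notebooks.

```python
from typing import Dict, List, Tuple


def best_matches(scores: List[List[float]],
                 min_score: float = 0.5) -> Dict[int, Tuple[int, float]]:
    """Pick the best reference patch for each historical patch.

    scores[i][j] is the matching score between historical patch i and
    reference patch j. Matches below min_score (a hypothetical
    threshold) are discarded.
    """
    matches: Dict[int, Tuple[int, float]] = {}
    for i, row in enumerate(scores):
        if not row:
            continue  # no reference candidates for this patch
        j = max(range(len(row)), key=row.__getitem__)
        if row[j] >= min_score:
            matches[i] = (j, row[j])
    return matches


# Made-up scores for 3 historical patches against 3 reference patches.
scores = [
    [0.1, 0.9, 0.3],
    [0.2, 0.3, 0.4],
    [0.8, 0.1, 0.7],
]
print(best_matches(scores))  # {0: (1, 0.9), 2: (0, 0.8)}
```

Historical patch 1 has no score above the threshold, so it is left unmatched rather than forced onto a poor candidate.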
Future plans include extending this model to other applications, such as geographic areas larger than those tested so far.
Further development, including a front-end interface, would open up the opportunity for the public to access historical imagery based on a desired location.
[1] DeTone, Daniel, Tomasz Malisiewicz, and Andrew Rabinovich. “SuperPoint: Self-Supervised Interest Point Detection and Description.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2018.
[2] Liu, Weiquan, et al. “H-Net: Neural Network for Cross-domain Image Patch Matching.” IJCAI. 2018.