3 Jun 2021, 13:05

How to use Planet-Box-Extractor API to create bounding boxes for your CV models

In this article, we introduce the Planet-Box-Extractor API (developed by VAIS), which extracts bounding boxes of interest from Mercator web tiles. Extracting bounding boxes from the curved surface of the Earth requires special mathematical treatment, which we discuss in the first section of the article. Then, we explore the cartography used by modern web-based map providers. Finally, we illustrate how our API can be used with the Planet.com API to retrieve a bounding box around a given location, using the Lacuna - Correct Field Detection Challenge as an example.

Satellite images are a great resource for analytics applications in agriculture, wildlife monitoring, weather forecasting, and many more. Agricultural applications rely on satellite images for tasks such as crop yield estimation and plant disease detection, which help improve crop production. Robust models can improve agricultural planning and provide farmers with forecasts and timely interventions for better crops. Deep learning methods can develop powerful agricultural models by learning from data. Successful deep learning models, however, depend on satellite image data that has been processed appropriately so that it carries useful information.

Bounding Boxes using WGS84

Extracting a region of interest around a given longitude-latitude coordinate would be straightforward on a planar projection. However, for patches of the Earth’s surface, treating the geometry as if it were on a 2D plane gives only a weak approximation, even for small bounding boxes in low-curvature regions. For larger bounding boxes and higher-curvature regions, the error of the planar approximation becomes much higher. Figure 1 illustrates the ellipsoidal model of the Earth. For an interactive illustration of the difference between planar and spherical projection, see [1].

More precise bounding box calculation requires cartography standards such as the World Geodetic System 1984 (WGS84) [2]. WGS84 models the Earth as an ellipsoid with two radii: equatorial (the semi-major axis) and polar (the semi-minor axis). Projecting a region of interest onto the surface of the Earth involves some calculations, as explained in [3].
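As a rough illustration of this kind of calculation, the following Python sketch computes a latitude-longitude bounding box around a point. It is not taken from Planet-Box-Extractor: the function names `wgs84_radius` and `bounding_box` are hypothetical, and the approach (take the WGS84 ellipsoid radius at the point's latitude, then treat the Earth as locally spherical) is one common approximation, assumed here to be adequate for boxes a few kilometers across.

```python
import math

# WGS84 ellipsoid constants, in meters.
WGS84_A = 6378137.0        # equatorial radius (semi-major axis)
WGS84_B = 6356752.314245   # polar radius (semi-minor axis)


def wgs84_radius(lat_rad):
    """Distance from the Earth's center to the ellipsoid surface at a latitude."""
    a_cos = WGS84_A * math.cos(lat_rad)
    b_sin = WGS84_B * math.sin(lat_rad)
    num = (WGS84_A * a_cos) ** 2 + (WGS84_B * b_sin) ** 2
    den = a_cos ** 2 + b_sin ** 2
    return math.sqrt(num / den)


def bounding_box(lat_deg, lon_deg, radius_km):
    """Return (south, west, north, east) in degrees for a box whose sides
    lie radius_km from the center point (hypothetical helper)."""
    lat = math.radians(lat_deg)
    half_side = radius_km * 1000.0
    earth_r = wgs84_radius(lat)
    # Radius of the circle of constant latitude through the point.
    parallel_r = earth_r * math.cos(lat)
    d_lat = math.degrees(half_side / earth_r)
    d_lon = math.degrees(half_side / parallel_r)
    return (lat_deg - d_lat, lon_deg - d_lon, lat_deg + d_lat, lon_deg + d_lon)
```

The longitude span widens as the latitude increases because the parallels shrink toward the poles; the sketch breaks down at the poles themselves, where `parallel_r` approaches zero.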

Figure 1. Equatorial (a), polar (b) and mean Earth radii as defined in the 1984 World Geodetic System revision (not to scale). Source

Mercator Tiles

Tiling is the most common method of rendering maps in web applications as seemingly continuous images [4]. Tiled maps are bandwidth-efficient: upon panning, many tiles remain the same, and only the newly needed tiles are retrieved. Web Mercator projection is the de facto standard for web tiles; it is based on the cylindrical map projection introduced by the Flemish cartographer Gerardus Mercator in 1569. Figure 2 illustrates the borders of tiles on a map.

Figure 2. An example of a tiled web map. Tiled web maps are normally displayed with no gap between tiles. Source: Sentinel-2

Tiled web maps served by the Planet.com API [6] have the following properties:

  • Tiles are 256 × 256 pixels.
  • The lowest zoom level, 0, represents the entire planet as a single tile.
  • The highest zoom level varies between map providers; at level 20, a tile covers roughly a mid-sized building.
  • Tiles are accessed with an XYZ naming convention, where Z is the zoom level and X and Y are the tile identifiers.
  • In the tile URL, “tiles{0–3}” are the different server indices, “item_type” is the type of the item to view, and “item_id” is the id of the item to view.
  • To identify the web tile for a given longitude and latitude:
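The conversion named in the last bullet can be sketched with the standard Web Mercator ("slippy map") tile formula. The function name `lonlat_to_tile` is hypothetical, and this is a minimal illustration rather than the code used by Planet-Box-Extractor.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) index of the Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    lat = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)
```

Tile rows are counted from the north, so y grows as the latitude decreases, and the tile index together with the zoom level gives the X, Y, and Z values of the naming convention above.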

Planet-Box-Extractor API

We developed the Planet-Box-Extractor API to facilitate the extraction of bounding boxes with a given radius around a specific longitude-latitude location. As explained above, both the projection onto the surface of the Earth and the retrieval of the correct web map tiles require careful processing. Planet-Box-Extractor provides two key functionalities to facilitate this process:

(1) Stitching tiled images together

The rectangular nature of bounding boxes and map tiles may result in one of three cases (Figure 3):

i. A single tile completely encompasses the bounding box.

ii. The border of two horizontally or vertically adjacent tiles passes through the bounding box.

iii. The point of intersection of a grid of four tiles is inside the bounding box.

Figure 3. The three possible cases of the location of a bounding box relative to Mercator tiles.

Given the point coordinates, zoom level, and bounding box radius, the Planet-Box-Extractor API detects which of these cases applies. The tiles intersecting the bounding box are then retrieved, and their relative positions are computed clockwise starting from the top-left corner, i.e., up-left, up-right, down-right, down-left. Finally, a single image is constructed by placing the tiles in the correct order according to these positions.
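The stitching step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Planet-Box-Extractor implementation: the names `covering_tiles`, `stitch`, and the `fetch_tile` callback are hypothetical, tiles are represented as lists of pixel rows instead of real images, and the tiles are ordered row by row (north to south, west to east) rather than clockwise. The sketch also ignores boxes that cross the antimeridian.

```python
def covering_tiles(corner_tiles):
    """Given the (x, y) tiles holding the four box corners, return the grid
    of tiles covering the box, row by row from north to south."""
    xs = [x for x, _ in corner_tiles]
    ys = [y for _, y in corner_tiles]
    return [[(x, y) for x in range(min(xs), max(xs) + 1)]
            for y in range(min(ys), max(ys) + 1)]


def stitch(grid, fetch_tile):
    """Build one image (a list of pixel rows) from a grid of tiles.

    fetch_tile(x, y) is a hypothetical callback that returns a tile as a
    list of equal-length pixel rows.
    """
    image = []
    for tile_row in grid:
        tiles = [fetch_tile(x, y) for x, y in tile_row]
        # Join the tiles of this row side by side, one pixel row at a time.
        for pixel_rows in zip(*tiles):
            image.append([p for row in pixel_rows for p in row])
    return image
```

A 1 × 1 grid corresponds to case (i), a 1 × 2 or 2 × 1 grid to case (ii), and a 2 × 2 grid to case (iii). In practice, an image library such as NumPy or Pillow would replace the nested lists.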

(2) Cropping the desired bounding box

Once we have a single image which covers the entire bounding box, we project the coordinates of the corners of the bounding box onto the pixel space of the image using the following equation:

pixel_x = int((longitude - tile.west) / (tile.east - tile.west) * 256)
pixel_y = int((latitude - tile.north) / (tile.south - tile.north) * 256)

where “pixel_x” and “pixel_y” give the location in the image of each longitude-latitude pair at the corners of the bounding box. The “tile” variable holds the geographic bounds of the Mercator tile in which the given longitude-latitude coordinates are located, e.g., “tile.east” is the longitude of the east border of the tile. Finally, the corresponding pixels of the bounding box are cropped and exported.
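To make the projection concrete, here is a minimal Python sketch that derives the bounds of a tile with the standard Web Mercator formulas and then applies the two equations above. The names `TileBounds`, `tile_bounds`, and `to_pixel` are hypothetical and chosen only to mirror the “tile.west”, “tile.east”, “tile.north”, and “tile.south” fields used in the text.

```python
import math
from collections import namedtuple

TILE_SIZE = 256  # tile width and height in pixels
TileBounds = namedtuple("TileBounds", "west south east north")


def tile_bounds(x, y, zoom):
    """Return the longitude-latitude bounds (in degrees) of tile (x, y)."""
    n = 2 ** zoom

    def lat_of(row):
        # Latitude of the northern edge of a tile row.
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * row / n))))

    west = x / n * 360.0 - 180.0
    east = (x + 1) / n * 360.0 - 180.0
    return TileBounds(west, lat_of(y + 1), east, lat_of(y))


def to_pixel(longitude, latitude, tile):
    """Project a point onto the pixel grid of a tile, as in the equations above."""
    pixel_x = int((longitude - tile.west) / (tile.east - tile.west) * TILE_SIZE)
    pixel_y = int((latitude - tile.north) / (tile.south - tile.north) * TILE_SIZE)
    return pixel_x, pixel_y
```

Note that the vertical equation interpolates linearly in latitude, whereas Mercator rows are not evenly spaced in latitude; the resulting error is small within a single tile at high zoom levels but grows for tiles that span many degrees.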

Zindi Competition

Figure 4. Zindi Competition task: predict the correct center (red) of a given crop field from an initial center (black). Source [7]

We use the Planet-Box-Extractor API to prepare the data for the Zindi competition “Lacuna - Correct Field Detection Challenge” [7]. The goal of this competition is to accurately find the center of each crop field given satellite images of the field. We provide data from both Copernicus Sentinel-2 [8] and Planet Labs Imagery [9]. The Planet data is provided by Norway’s International Climate and Forests Initiative (NICFI) [10] Imagery Program with Planet Labs. Figure 4 illustrates the task of the competition: predicting the correct center (red) of a given crop field from an initial center (black).


The Planet-Box-Extractor API can be found here.

Currently, we use the Planet.com API [6] to retrieve satellite images. You can sign up at [9].

Once installed, you can specify the longitude and latitude of the point of interest, the tile zoom level, and the radius of the bounding box in kilometers as follows:


Author’s Bio

Kareem Eissa is a Senior AI/ML Engineer at VAIS. Kareem has a master’s degree in Informatics from Nile University, where he was a Research Assistant. Kareem has broad experience in the field of deep learning, with publications in computer vision, medical imaging, and natural language processing.