Reproducible Research Repository

Reproducibility package for Fine-Scale Spatial Disaggregation Of Statistical Data Via Graph Neural Networks

2026
Reference ID
RR_WLD_2026_606
DOI
https://doi.org/10.60572/sj2b-wp55
Author(s)
Kamwoo Lee, Brian Blankespoor, David Newhouse
Collections
World Bank Policy Research Working Papers
Created on
Apr 08, 2026
Last modified
Apr 28, 2026
  • Overview

    Abstract

    Fine-grained spatial data are critical for informed decision-making in domains ranging from economic planning to environmental management. However, many statistics are only available for coarse administrative units, necessitating techniques for fine-scale spatial disaggregation. In this paper, we introduce a graph neural network (GNN) based framework for disaggregating aggregated indicators to a finer spatial resolution. The GNN approach leverages graph representations of spatial units to incorporate both feature information and spatial relationships, addressing challenges of heterogeneity and data sparsity. The approach also adopts the H3 hierarchical hexagonal indexing system to define fine-resolution cells, providing a globally consistent, multi-resolution spatial grid well suited to graph-based modeling. We demonstrate the framework using gross domestic product (GDP) as a representative example, disaggregating national or regional GDP to fine-resolution cells. While illustrated with GDP, the proposed methodology is applicable to a broad class of aggregate indicators, offering a flexible and scalable tool for spatial analysis of economic, social, and environmental statistics. Our results show that the framework produces high-resolution estimates that are consistent with known aggregates and aligned with ancillary covariate patterns. This general-purpose approach to spatial disaggregation enables more detailed mapping of indicators like GDP and beyond, unlocking finer insights from coarse data.
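The abstract states that the fine-resolution estimates are consistent with known aggregates. As a rough illustration of what that property means, the sketch below rescales per-cell scores proportionally so that the cells of each region sum to the region's reported total. This is not the paper's GNN model; the function name, cell IDs, region IDs, and numbers are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def rescale_to_totals(scores, cell_region, region_totals):
    """Scale per-cell scores so each region's cells sum to its known total.

    scores:        dict cell_id -> non-negative raw score (hypothetical model output)
    cell_region:   dict cell_id -> region_id
    region_totals: dict region_id -> known aggregate (for example, regional GDP)
    """
    # Sum raw scores and count cells within each region.
    region_sums = defaultdict(float)
    region_counts = defaultdict(int)
    for cell, score in scores.items():
        region = cell_region[cell]
        region_sums[region] += score
        region_counts[region] += 1

    estimates = {}
    for cell, score in scores.items():
        region = cell_region[cell]
        if region_sums[region] == 0:
            # Assumption: split the total equally when every score is zero.
            estimates[cell] = region_totals[region] / region_counts[region]
        else:
            estimates[cell] = region_totals[region] * score / region_sums[region]
    return estimates


# Hypothetical toy input: two regions, three cells.
estimates = rescale_to_totals(
    scores={"a": 1.0, "b": 3.0, "c": 2.0},
    cell_region={"a": "R1", "b": "R1", "c": "R2"},
    region_totals={"R1": 100.0, "R2": 50.0},
)
```

In this toy case cells "a" and "b" receive 25.0 and 75.0, which sum to the R1 total of 100.0, and cell "c" receives the full R2 total of 50.0.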

    Reproducibility Package

    Scripts
    Readme
    Link: https://reproducibility.worldbank.org/catalog/530/download/1562/README.pdf
    Reproducibility package for Fine-Scale Spatial Disaggregation Of Statistical Data Via Graph Neural Networks
    File name
    RR_WLD_2026_606
    Zip package
    RR_WLD_2026_606.zip
    Title
    Reproducibility package for Fine-Scale Spatial Disaggregation Of Statistical Data Via Graph Neural Networks
    Date
    2026-04
    Dependencies
    Dependencies are stored in the requirements.txt file.
    Instructions
    See README in reproducibility package.
    Notes
    Computational reproducibility verified by Development Impact (DECDI) Analytics team, World Bank.
    Source code repository
    Repository name Type URI
    Reproducible Research Repository (World Bank) https://reproducibility.worldbank.org
    GitHub Available to World Bank Staff https://github.com/worldbank/gnn-gdp-disaggregation/releases/tag/v1.01
    Software
    Python
    Name
    Python
    Version
    3.11.15

    Reproducibility

    Technology environment

    Paper exhibits were reproduced on a computer with the following specifications:
    • OS: macOS
    • Processor: Apple M4 Pro
    • Memory available: 24 GB

    Technology requirements

    Runtime: 10 minutes

    Reproduction instructions

    To reproduce the findings of this paper, a new user should:

    1. Recreate the environment by installing the packages listed in requirements.txt.
    2. Run the Jupyter notebooks in the 09_model_validation folder.
    3. Run the Jupyter notebooks in the 11_visualization folder.

    Note: All data needed to run the package and reproduce the findings in the manuscript are included in the reproducibility package. The workflow starts from intermediate data, which are included. The code that produces these intermediate data from the raw data is in the associated repository (https://github.com/worldbank/gnn-gdp-disaggregation/releases/tag/v1.01), which is currently accessible to World Bank staff only. All raw data sources are documented in the Data section of this entry.
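The reproduction steps above could be scripted roughly as follows. This is a minimal sketch, not part of the package: it assumes that the `jupyter nbconvert` command is available after installing requirements.txt, that the notebooks can be run in alphabetical order within each folder, and that the script is pointed at the unzipped package root. The helper names are hypothetical; the README in the package remains the authoritative source for the run order.

```python
import subprocess
import sys
from pathlib import Path


def install_command(requirements="requirements.txt"):
    """Build the pip command for step 1 (install the listed packages)."""
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


def notebook_commands(package_root,
                      folders=("09_model_validation", "11_visualization")):
    """Build one nbconvert command per notebook for steps 2 and 3."""
    commands = []
    for folder in folders:
        # Assumption: alphabetical order within a folder is acceptable.
        for nb in sorted(Path(package_root, folder).rglob("*.ipynb")):
            if ".ipynb_checkpoints" in nb.parts:
                continue  # skip Jupyter autosave copies
            commands.append(["jupyter", "nbconvert", "--to", "notebook",
                             "--execute", "--inplace", str(nb)])
    return commands


def run_all(package_root=".", dry_run=True):
    """Print every command; execute them only when dry_run is False."""
    commands = [install_command(Path(package_root) / "requirements.txt")]
    commands += notebook_commands(package_root)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_all(".", dry_run=True)` from the package root only lists the commands; passing `dry_run=False` executes them and stops at the first failure.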

    Data

    Datasets
    Global Dataset of Reported Subnational Economic Output (DOSE) (V2.11)
    Name
    Global Dataset of Reported Subnational Economic Output (DOSE) (V2.11)
    Note
    Harmonized subnational GDP data for 1,661 sub-national regions across 83 countries (1953–2020), with sectoral detail. Regional identifiers use GADM 3.6 GID codes. Downloaded files: DOSE_V2.11.csv; DoseV2p11_changes.pdf (01_raw_data/admin_gdp/DOSE/). Corresponding administrative boundary shapefiles downloaded separately (DOSE_shapefiles.gpkg; 01_raw_data/admin_boundaries/DOSE/). Used in the model training pipeline (steps 02–08). Intermediate files derived from this source and included in the reproducibility package: 04_admin_mapping/output/h3_res6_to_dose_adm1.csv (H3-to-DOSE admin-1 mapping, input to 11_visualization); 08_model_inference/output/ and 10_data_product/output/ (gdp_intensity_r6_estimates_{year}.csv, years 2015–2024, inputs to 09_model_validation and 11_visualization).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://creativecommons.org/licenses/by/4.0/
    Data URL
    https://zenodo.org/records/16313760
    Citation
    Druckenmiller, H., & Burke, M. (2025). Global dataset of reported subnational economic output (DOSE), V2.11 [dataset]. Zenodo. https://doi.org/10.5281/zenodo.16313760
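The dataset notes refer to yearly estimate files through the template gdp_intensity_r6_estimates_{year}.csv for 2015–2024. A small sketch like the one below could be used to expand that template and check which files are present before running the notebooks. The function names are hypothetical, and the default folder is simply one of the two output locations named in the note.

```python
from pathlib import Path

# Years 2015 to 2024 inclusive, as stated in the dataset notes.
YEARS = range(2015, 2025)


def expected_estimate_files(package_root, output_dir="10_data_product/output"):
    """Expand the {year} template into one path per year."""
    root = Path(package_root) / output_dir
    return [root / f"gdp_intensity_r6_estimates_{year}.csv" for year in YEARS]


def missing_files(paths):
    """Return the paths that do not exist as regular files."""
    return [p for p in paths if not p.is_file()]
```

For example, `missing_files(expected_estimate_files("."))` returns an empty list when all ten yearly files are in place under the package root.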
    World Development Indicators
    Name
    World Development Indicators
    Note
    Country-level GDP data are used in two stages of the pipeline. (1) Raw GDP series from 01_raw_data/admin_gdp/WB_WDI/: GDP (current LCU).csv, GDP (current USD).csv, GDP (constant 2015 USD).csv — used in the model training pipeline (steps 02–08). (2) PPP-adjusted GDP series from 10_data_product/additional_input/WB_WDI/downloaded_data/: GDP, PPP (constant 2021 international $).csv; GDP, PPP (current international $).csv — directly included in the reproducibility package (see data_hash_report.csv) and used in step 10_data_product to normalize cell-level estimates. Intermediate files using this source: 10_data_product/output/gdp_intensity_r6_estimates_{year}.csv (2015–2024), inputs to 11_visualization.
    Access policy
    Data are publicly available and included in the reproducibility package.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
    Data URL
    https://databank.worldbank.org/source/world-development-indicators
    Citation
    World Bank. (2025). World Development Indicators [dataset]. Available at https://databank.worldbank.org/source/world-development-indicators
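The World Development Indicators note points to data_hash_report.csv for the files shipped in the package. The layout of that report and the hash algorithm it uses are not described in this entry, so the sketch below only shows the general idea of checking files against recorded digests. SHA-256 and the relative-path-to-digest dictionary are assumptions, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large inputs do not need to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_mismatches(expected, root=".", algorithm="sha256"):
    """Return relative paths that are missing or whose digest differs.

    expected: dict relative_path -> hex digest (assumed layout, not taken
    from data_hash_report.csv itself).
    """
    mismatches = []
    for rel, digest in expected.items():
        path = Path(root) / rel
        if not path.is_file() or file_digest(path, algorithm) != digest.lower():
            mismatches.append(rel)
    return mismatches
```

An empty result from `find_mismatches` would indicate that every listed file is present and matches its recorded digest.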
    Regional Economic Database
    Name
    Regional Economic Database
    Note
    Sub-national (TL2 and TL3) GDP data in PPP USD for OECD member and partner countries. Downloaded files: OECD.CFE.EDS,DSD_REG_ECO@DF_GDP,2.0+all.csv; OECD Territorial correspondence - TL2024.xlsx (01_raw_data/admin_gdp/OECD/). Used in the model training pipeline (steps 02–08). Non-NUTS OECD country mapping files included in package: 04_admin_mapping/output/h3_res6_to_oecd_non_nuts_{country}_adm1.csv for Australia, Canada, Chile, Colombia, Egypt, Indonesia, Japan, Korea, Mexico, New Zealand, and Peru (inputs to 11_visualization). Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://www.oecd.org/en/about/terms-conditions.html
    Data URL
    https://data-explorer.oecd.org/vis?df[id]=DSD_REG_ECO%40DF_GDP&df[ag]=OECD.CFE.EDS
    Citation
    Organisation for Economic Co-operation and Development (OECD). (2025). Regional Economic Database [dataset]. Available at https://data-explorer.oecd.org
    World Economic Outlook Database
    Name
    World Economic Outlook Database
    Note
    Country-level macroeconomic data including nominal and real GDP used to fill gaps in national accounts coverage. Downloaded file: weoapr2025all.xls (01_raw_data/admin_gdp/IMF_WEO/). Used in the model training pipeline (steps 02–08). Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    IMF Data Terms of Use
    License URL
    https://www.imf.org/external/terms.htm
    Data URL
    https://www.imf.org/en/publications/weo/weo-database/2025/april
    Citation
    International Monetary Fund. (2025). World Economic Outlook Database [dataset]. Available at https://www.imf.org/en/publications/weo/weo-database/2025/april
    National Accounts Estimates of Main Aggregates
    Name
    National Accounts Estimates of Main Aggregates
    Note
    Country-level GDP in current and constant prices (LCU and USD) from the United Nations Statistics Division (UNSD). Downloaded files: UNdata_Export_GDP_current_lcu.csv; UNdata_Export_GDP_current_usd.csv; UNdata_Export_GDP_constant_2015_usd.csv (01_raw_data/admin_gdp/UNData/). Used in the model training pipeline (steps 02–08). Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://data.un.org/Host.aspx?Content=UNdataUse
    Data URL
    http://data.un.org/Explorer.aspx
    Citation
    United Nations Statistics Division. (2025). National Accounts Estimates of Main Aggregates [dataset]. Available at http://data.un.org/Explorer.aspx
    Regional Accounts in Albania
    Name
    Regional Accounts in Albania
    Note
    Gross Domestic Product by statistical regions for Albania (2019–2023). Downloaded file: llogaritë_rajonale_në_shqipëri_2019-2023-angl.xlsx (01_raw_data/admin_gdp/NSO_ALB/). Used in the model training pipeline (steps 02–08). Boundary mapping file included in the package: 04_admin_mapping/output/h3_res6_to_nso_alb_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Direct download, no licensing information found
    Data URL
    https://www.instat.gov.al/en/themes/economy-and-finance/regional-accounts-in-albania/
    Citation
    Institute of Statistics of Albania (INSTAT). (2025). Gross Domestic Product by Statistical Regions in Albania, 2019–2023 [dataset]. Available at https://www.instat.gov.al/en/themes/economy-and-finance/regional-accounts-in-albania/
    Regional Accounts of Brazil
    Name
    Regional Accounts of Brazil
    Note
    State-level gross domestic product for Brazil. Downloaded file: Especiais_2010_2023_xls.zip (01_raw_data/admin_gdp/NSO_BRA/). An administrative lookup file derived from this source is included in the package: 02_preprocessing/admin_gdp/06_NSO_BRA/admin_lookup.xlsx. Boundary mapping file: 04_admin_mapping/output/h3_res6_to_nso_bra_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://www.ibge.gov.br/en/institutional/institutional.html
    Data URL
    https://www.ibge.gov.br/en/statistics/economic/national-accounts/16855-regional-accounts-of-brazil.html
    Citation
    Instituto Brasileiro de Geografia e Estatística (IBGE). (2025). Regional Accounts of Brazil [dataset]. Available at https://www.ibge.gov.br/en/statistics/economic/national-accounts/16855-regional-accounts-of-brazil.html
    National Accounts and Regional GDP
    Name
    National Accounts and Regional GDP
    Note
    National and provincial-level gross domestic product for China. Downloaded files: Annual.csv (national accounts); AnnualbyProvince.csv (provincial GDP) (01_raw_data/admin_gdp/NSO_CHN/). An administrative lookup file derived from this source is included in the package: 02_preprocessing/admin_gdp/06_NSO_CHN/admin_lookup.xlsx. Boundary mapping file: 04_admin_mapping/output/h3_res6_to_nso_chn_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    National Bureau of Statistics of China Terms of Service
    License URL
    https://www.stats.gov.cn/english/nbs/200701/t20070104_59236.html
    Data URL
    https://data.stats.gov.cn/english/index.htm
    Citation
    National Bureau of Statistics of China. (2025). National Accounts and GDP by Province [dataset]. Available at https://data.stats.gov.cn/english/index.htm
    Handbook of Statistics on Indian Economy and Indian States
    Name
    Handbook of Statistics on Indian Economy and Indian States
    Note
    Macro-economic aggregates at current prices (national) and Gross State Domestic Product (GSDP) at current prices for Indian states. Downloaded files from Handbook of Statistics on Indian Economy (years 2022–2025) and Handbook of Statistics of Indian States (year 2025) (01_raw_data/admin_gdp/NSO_IND/). An administrative lookup file derived from this source is included in the package: 02_preprocessing/admin_gdp/06_NSO_IND/admin_lookup.xlsx. Boundary mapping file: 04_admin_mapping/output/h3_res6_to_nso_ind_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Reserve Bank of India Terms of Use
    License URL
    https://www.rbi.org.in/scripts/Disclaimer.aspx
    Data URL
    https://www.rbi.org.in/Scripts/AnnualPublications.aspx?head=Handbook%20of%20Statistics%20on%20Indian%20Economy
    Citation
    Reserve Bank of India. (2025). Handbook of Statistics on Indian Economy and Handbook of Statistics of Indian States [dataset]. Available at https://www.rbi.org.in/Scripts/publications.aspx
    Bureau of National Statistics of Kazakhstan — Gross Regional Product
    Name
    Bureau of National Statistics of Kazakhstan — Gross Regional Product
    Note
    Gross Regional Product (GRP) at current prices for Kazakhstan's regions. Downloaded file: 1. GRP.xlsx (01_raw_data/admin_gdp/NSO_KAZ/). Boundary mapping file included in package: 04_admin_mapping/output/h3_res6_to_nso_kaz_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Open Data of the Republic of Kazakhstan
    License URL
    https://stat.gov.kz/en/description/
    Data URL
    https://stat.gov.kz/en/industries/economy/national-accounts/dynamic-tables/
    Citation
    Bureau of National Statistics, Agency for Strategic Planning and Reforms of the Republic of Kazakhstan. (2025). Gross Regional Product [dataset]. Available at https://stat.gov.kz/en/industries/economy/national-accounts/dynamic-tables/
    National Statistical Committee of the Kyrgyz Republic — Gross Regional Product
    Name
    National Statistical Committee of the Kyrgyz Republic — Gross Regional Product
    Note
    Gross Regional Product (GRP) at current prices for Kyrgyz regions. Downloaded file: 1.01.00.09 Валовой региональный продукт (ВРП) в текущих ценах..xlsx (01_raw_data/admin_gdp/NSO_KGZ/). Boundary mapping file included in package: 04_admin_mapping/output/h3_res6_to_nso_kgz_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution-NonCommercial-ShareAlike license
    License URL
    https://www.stat.gov.kg/en/
    Data URL
    https://www.stat.gov.kg/en/statistics/nacionalnye-scheta/
    Citation
    National Statistical Committee of the Kyrgyz Republic. (2025). Gross Regional Product (GRP) at current prices [dataset]. Available at https://www.stat.gov.kg/en/statistics/download/dynamic/743/
    Malta National Statistics Office — Regional Gross Domestic Product
    Name
    Malta National Statistics Office — Regional Gross Domestic Product
    Note
    GDP at market prices by region (NUTS 3) for Malta. Downloaded file: NR-237-2025-Table-2-b053b1aa83d93e3b.xlsx (01_raw_data/admin_gdp/NSO_MLT/). Boundary mapping file included in package: 04_admin_mapping/output/h3_res6_to_nso_mlt_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Direct download
    Data URL
    https://nso.gov.mt/regional_and_geospatial_statistics/
    Citation
    National Statistics Office Malta. (2025). Regional Gross Domestic Product: 2024 [dataset]. Available at https://nso.gov.mt/regional-and-geospatial/regional-gross-domestic-product-2024/
    Philippine Statistics Authority — Gross Domestic Regional Product
    Name
    Philippine Statistics Authority — Gross Domestic Regional Product
    Note
    GRDP by region for the Philippines (2000–2024). Downloaded files: GRDP_Reg_2018PSNA_2022_to_2024_with_NIR.xlsx; GRDP_Reg_2018PSNA_2000_to_2023_without_NIR.xlsx (01_raw_data/admin_gdp/NSO_PHL/). Boundary mapping file included in package: 04_admin_mapping/output/h3_res6_to_nso_phl_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Philippine Government Open Data License
    License URL
    https://psa.gov.ph/
    Data URL
    https://psa.gov.ph/statistics/grdp/data-series
    Citation
    Philippine Statistics Authority (PSA). (2025). Gross Domestic Regional Product Data Series [dataset]. Available at https://psa.gov.ph/statistics/grdp/data-series
    Rosstat Federal State Statistics Service — Russian Statistical Yearbook
    Name
    Rosstat Federal State Statistics Service — Russian Statistical Yearbook
    Note
    Regional gross domestic product data for Russia, extracted from Russian Statistical Yearbooks 2022, 2023, and 2024. Downloaded files: Russian Statistical Yearbook 2024.rar; Russian Statistical Yearbook 2023.rar; Russian Statistical Yearbook 2022.rar (01_raw_data/admin_gdp/NSO_RUS/). An administrative lookup file derived from this source is included in the package: 02_preprocessing/admin_gdp/06_NSO_RUS/admin_lookup.xlsx. Boundary mapping file: 04_admin_mapping/output/h3_res6_to_nso_rus_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Rosstat Open Data Terms
    License URL
    https://eng.rosstat.gov.ru/
    Data URL
    https://eng.rosstat.gov.ru/Publications/document/74811
    Citation
    Federal State Statistics Service of Russia (Rosstat). (2024). Russian Statistical Yearbook 2024 [dataset]. Available at https://eng.rosstat.gov.ru/Publications/document/74811
    Tanzania National Bureau of Statistics — National Accounts of Mainland Tanzania
    Name
    Tanzania National Bureau of Statistics — National Accounts of Mainland Tanzania
    Note
    National accounts data for Mainland Tanzania including regional GDP estimates. Downloaded file: en-1737111899-National Accounts of Mainland Tanzania 2024.xlsx (01_raw_data/admin_gdp/NSO_TZA/). Boundary mapping file included in package: 04_admin_mapping/output/h3_res6_to_nso_tza_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Direct download
    Data URL
    https://www.nbs.go.tz/index.php/statistics/topic/annual-national-accounts-publications
    Citation
    National Bureau of Statistics, United Republic of Tanzania. (2025). National Accounts of Mainland Tanzania 2024 [dataset]. Available at https://www.nbs.go.tz/index.php/statistics/topic/annual-national-accounts-publications
    Gross Domestic Product by County and Metropolitan Statistical Area (CAGDP2)
    Name
    Gross Domestic Product by County and Metropolitan Statistical Area (CAGDP2)
    Note
    GDP in current dollars by county and MSA for the United States. Downloaded file: CAGDP2.zip (01_raw_data/admin_gdp/NSO_USA/). Boundary mapping files included in package: 04_admin_mapping/output/h3_res6_to_nso_usa_adm1.csv; h3_res6_to_nso_usa_adm2.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024). US county and state shapefiles (downloaded separately from US Census Bureau TIGER/Line; see separate entry) are used as boundary reference for this dataset.
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Open Data
    License URL
    https://www.bea.gov/open-data
    Data URL
    https://www.bea.gov/data/gdp/gdp-county-metro-and-other-areas
    Citation
    Bureau of Economic Analysis, U.S. Department of Commerce. (2025). Gross Domestic Product by County and Metropolitan Statistical Area (CAGDP2) [dataset]. Available at https://www.bea.gov/data/gdp/gdp-county-metro-and-other-areas
    Statistics South Africa — Provincial Gross Domestic Product (P0441.2)
    Name
    Statistics South Africa — Provincial Gross Domestic Product (P0441.2)
    Note
    GDP by province at current prices for South Africa (2014–2024). Downloaded file: P044122024.pdf (01_raw_data/admin_gdp/NSO_ZAF/). Boundary mapping file included in package: 04_admin_mapping/output/h3_res6_to_nso_zaf_adm1.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Direct download
    Data URL
    https://www.statssa.gov.za/?page_id=1854&PPN=P0441.2
    Citation
    Statistics South Africa. (2025). P0441.2 — Provincial Gross Domestic Product, 2024 [dataset]. Available at https://www.statssa.gov.za/?page_id=1854&PPN=P0441.2
    Comprehensive Global Administrative Zones (CGAZ)
    Name
    Comprehensive Global Administrative Zones (CGAZ)
    Note
    Global composite administrative boundary files (ADM1 and ADM2 GeoPackages) and individual country files (Morocco ADM1). Downloaded files: geoBoundariesCGAZ_ADM1.gpkg; geoBoundariesCGAZ_ADM2.gpkg; geoBoundaries-MAR-ADM1-all.zip (01_raw_data/admin_boundaries/CGAZ/). Used as administrative boundary files to generate the H3-to-admin mapping files in the pipeline (steps 03–04). H3-to-admin mapping files derived from CGAZ boundaries and included in the package: 04_admin_mapping/output/h3_res6_to_adm0.csv, and various h3_res6_to_nso_* and h3_res6_to_oecd_non_nuts_* files. These are intermediate inputs to 11_visualization.
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://www.geoboundaries.org/index.html#usage
    Data URL
    https://www.geoboundaries.org/globalDownloads.html
    Citation
    Runfola, D., Anderson, A., Baier, H., Crittenden, M., Dowker, E., Fuhrig, S., ... & Hobbs, B. (2020). geoBoundaries: A global database of political administrative boundaries. PLOS ONE, 15(4), e0231866. https://doi.org/10.1371/journal.pone.0231866
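The H3-to-admin mapping files described in these entries link each resolution-6 cell to an administrative unit, which allows cell-level estimates to be summed back to admin level. The sketch below illustrates that join with in-memory CSV text. The column names (h3_index, admin_id, value), the toy rows, and the assumption that each cell maps to exactly one unit are hypothetical and are not taken from the package files.

```python
import csv
import io
from collections import defaultdict


def aggregate_by_admin(mapping_rows, estimate_rows,
                       cell_col="h3_index", admin_col="admin_id",
                       value_col="value"):
    """Sum cell-level values per admin unit using a cell-to-admin lookup."""
    # Assumption: one admin unit per cell; later rows overwrite earlier ones.
    cell_to_admin = {row[cell_col]: row[admin_col] for row in mapping_rows}
    totals = defaultdict(float)
    for row in estimate_rows:
        admin = cell_to_admin.get(row[cell_col])
        if admin is not None:  # cells without a mapping are skipped
            totals[admin] += float(row[value_col])
    return dict(totals)


# Hypothetical toy data standing in for a mapping file and an estimates file.
mapping_csv = "h3_index,admin_id\ncell_a,R1\ncell_b,R1\ncell_c,R2\n"
estimates_csv = "h3_index,value\ncell_a,1.5\ncell_b,2.5\ncell_c,4.0\ncell_d,9.0\n"

totals = aggregate_by_admin(
    csv.DictReader(io.StringIO(mapping_csv)),
    csv.DictReader(io.StringIO(estimates_csv)),
)
```

With the toy data, R1 and R2 each total 4.0, and cell_d is ignored because it has no mapping row.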
    Database of Global Administrative Areas (GADM)
    Name
    Database of Global Administrative Areas (GADM)
    Note
    Global database of administrative area boundaries at multiple levels. Versions 4.1 and 3.6 downloaded (gadm_410-levels.zip; gadm36_levels_gpkg.zip; 01_raw_data/admin_boundaries/GADM/). Note: DOSE V2.11 subnational regions use GADM 3.6 GID codes as region identifiers. Used in steps 03–04 for spatial crosswalks and H3-to-admin boundary mapping. H3-to-admin mapping files derived (in part) using GADM boundaries are included in the package: 04_admin_mapping/output/ h3_res6_to_* CSV files. These are intermediate inputs to 11_visualization.
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    GADM License
    License URL
    https://gadm.org/license.html
    Data URL
    https://gadm.org/index.html
    Citation
    University of California, Davis. (2022). GADM database of Global Administrative Areas, version 4.1 [dataset]. Available at https://gadm.org/download_world.html
    Territorial Units for Statistics (2021 and 2024)
    Name
    Territorial Units for Statistics (2021 and 2024)
    Note
    GeoPackage boundary files for NUTS 2021 and NUTS 2024 statistical regions of Europe at 1:1M scale (EPSG:4326). Downloaded files: NUTS_RG_01M_2021_4326.gpkg; NUTS_RG_01M_2024_4326.gpkg (01_raw_data/admin_boundaries/NUTS/). H3-to-NUTS mapping files derived from these boundaries and included in the package: 04_admin_mapping/output/h3_res6_to_oecd_nuts_2021_adm1.csv; h3_res6_to_oecd_nuts_2021_adm2.csv; h3_res6_to_oecd_nuts_2024_adm1.csv; h3_res6_to_oecd_nuts_2024_adm2.csv. These are intermediate inputs to 11_visualization.
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://ec.europa.eu/eurostat/web/gisco/geodata
    Data URL
    https://ec.europa.eu/eurostat/web/gisco/geodata/statistical-units/territorial-units-statistics
    Citation
    Eurostat GISCO. (2024). Territorial units for statistics (NUTS 2021 and NUTS 2024) [dataset]. Available at https://ec.europa.eu/eurostat/web/gisco/geodata/statistical-units/territorial-units-statistics
    OpenStreetMap (OSM) Boundaries
    Name
    OpenStreetMap (OSM) Boundaries
    Note
    OpenStreetMap-derived administrative boundary polygons for Tanzania – Zanzibar and Taiwan. Downloaded file: OSMB-6b1f9a1a800f048091aade6e439c2a0594d9e726.geojson (01_raw_data/admin_boundaries/OSM/). Used in steps 03–04 for H3-to-admin mapping for these jurisdictions. Downstream intermediate files: 04_admin_mapping/output/h3_res6_to_* CSV files and 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution-ShareAlike 2.0 license (CC BY-SA 2.0)
    License URL
    https://www.openstreetmap.org/copyright
    Data URL
    https://osm-boundaries.com/
    Citation
    OpenStreetMap Contributors. (2025). OpenStreetMap (OSM) Boundaries - administrative boundary polygons [dataset]. Available at https://osm-boundaries.com/
    OpenStreetMap — Planet OSM
    Name
    OpenStreetMap — Planet OSM
    Note
    Global OpenStreetMap data (full history extract, November 2025 snapshot) used to derive road network and settlement features as geospatial covariates. Downloaded via AWS S3 (s3://osm-planet-eu-central-1; history-251117.osm.pbf; 01_raw_data/geo_covariates/OSM/). Note: a separate OSM-Boundaries download was also used for administrative boundary polygons (see OSM-Boundaries entry). Used in the model training pipeline (steps 02–08) for geospatial feature extraction. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution-ShareAlike 2.0 license (CC BY-SA 2.0)
    License URL
    https://www.openstreetmap.org/copyright
    Data URL
    https://wiki.openstreetmap.org/wiki/Planet.osm
    Citation
    OpenStreetMap Contributors. (2025). OpenStreetMap — Planet OSM [dataset]. Available at https://wiki.openstreetmap.org/wiki/Planet.osm
    TIGER/Line Shapefiles (Counties and States)
    Name
    TIGER/Line Shapefiles (Counties and States)
    Note
    County and state boundary shapefiles for the United States for years 2021–2024. Downloaded files: tl_{year}_us_county.zip and tl_{year}_us_state.zip for years 2021, 2022, 2023, 2024 (01_raw_data/admin_boundaries/TIGER/). Used in steps 03–04 for H3-to-admin mapping for the United States. Boundary mapping files derived from TIGER and included in package: 04_admin_mapping/output/h3_res6_to_nso_usa_adm1.csv; h3_res6_to_nso_usa_adm2.csv. Downstream intermediate files: 08_model_inference/ and 10_data_product/output/ gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Direct download
    Data URL
    https://www.census.gov/cgi-bin/geo/shapefiles/index.php
    Citation
    US Census Bureau. (2024). TIGER/Line Shapefiles: Counties and States [dataset]. Available at https://www.census.gov/cgi-bin/geo/shapefiles/index.php
    World Bank Official Boundaries
    Name
    World Bank Official Boundaries
    Note
    World Bank Official Boundaries GeoPackage files (Admin 0, Admin 0 all layers, Admin 1, Admin 2, NDLSA, Ocean Mask) used as the canonical country boundary reference for the model pipeline. Downloaded from World Bank Data Catalog (01_raw_data/admin_boundaries/WBOB/). (1) Used in steps 03–04 for spatial operations and country-level administrative mapping: intermediate outputs include 04_admin_mapping/output/h3_res6_to_adm0.csv and other h3_res6_to_* mapping files. (2) Admin 2 and NDLSA boundary layers are also read directly in step 10 (data product creation) by 10_data_product/01_create_final_dataset.ipynb to assign WB administrative codes and territory status to each H3 cell: files 01_raw_data/admin_boundaries/WBOB/downloaded_data/World Bank Official Boundaries - Admin 2.gpkg; World Bank Official Boundaries - NDLSA.gpkg. Final data product files used as inputs to steps 09 (model validation) and 11 (visualization): 10_data_product/output/gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
    Data URL
    https://datacatalog.worldbank.org/search/dataset/0038272/World-Bank-Official-Boundaries
    Citation
    World Bank. (2025). World Bank Official Boundaries [dataset]. World Bank Data Catalog. Available at https://datacatalog.worldbank.org/search/dataset/0038272/World-Bank-Official-Boundaries
    MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500m (MCD12Q1 v061)
    Name
    MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500m (MCD12Q1 v061)
    Note
    Annual land cover classification rasters at 500 m resolution (EPSG:4326) for years 2015–2024. Granule IDs: MCD12Q1.A{year}001.* (01_raw_data/geo_covariates/MCD12/). Used as geospatial covariate features in the model training pipeline (steps 03–08). Downstream intermediate files embodying this information: 08_model_inference/ and 10_data_product/output/gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Direct download
    Data URL
    https://doi.org/10.5067/MODIS/MCD12Q1.061
    Citation
    Friedl, M., & Sulla-Menashe, D. (2022). MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500m SIN Grid V061 [dataset]. https://doi.org/10.5067/MODIS/MCD12Q1.061
    VIIRS Day/Night Band Annual Composites (VNL v21 and v22)
    Name
    VIIRS Day/Night Band Annual Composites (VNL v21 and v22)
    Note
    Annual nighttime light radiance composites at 15 arc-second (~500 m) resolution (EPSG:4326) for 2015–2024. VNL v21 covers 2015–2021; VNL v22 covers 2022–2024. Downloaded files: VNL_v21_npp_{year}_global_vcmslcfg_c202205302300.median_masked.dat.tif (2015–2021); VNL_v22_npp-j01_2022_global_vcmslcfg_c202303062300.median_masked.dat.tif (2022); VNL_npp_{year}_global_vcmslcfg_v2_*.median_masked.dat.tif (2023–2024) (01_raw_data/geo_covariates/VNL/). Used as geospatial covariate features in the model training pipeline (steps 03–08). Downstream intermediate files: 08_model_inference/ and 10_data_product/output/gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://eogdata.mines.edu/products/vnl/
    Data URL
    https://developers.google.com/earth-engine/datasets/catalog/NOAA_VIIRS_DNB_ANNUAL_V22
    Citation
    Elvidge, C. D., Zhizhin, M., Ghosh, T., Hsu, F.-C., & Taneja, J. (2021). Annual time series of global VIIRS nighttime lights derived from monthly averages: 2012 to 2019. Remote Sensing 13(5), 922. https://doi.org/10.3390/rs13050922
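    Because the VIIRS composites switch product version and file naming partway through the 2015–2024 series, a small lookup can make the year-to-file mapping explicit. This is an illustrative sketch only: the function name is hypothetical, and for 2023–2024 the entry gives only a wildcard pattern, which is kept as-is rather than filled in.

```python
def vnl_file_pattern(year):
    """Return (version, file name or glob pattern) for one year of the VNL series.

    Mapping taken from the VIIRS entry: v21 for 2015-2021, v22 for 2022-2024.
    """
    if 2015 <= year <= 2021:
        name = f"VNL_v21_npp_{year}_global_vcmslcfg_c202205302300.median_masked.dat.tif"
        return "v21", name
    if year == 2022:
        name = "VNL_v22_npp-j01_2022_global_vcmslcfg_c202303062300.median_masked.dat.tif"
        return "v22", name
    if year in (2023, 2024):
        # Only a wildcard pattern is documented for these years; keep the "*".
        return "v22", f"VNL_npp_{year}_global_vcmslcfg_v2_*.median_masked.dat.tif"
    raise ValueError(f"year {year} is outside the documented 2015-2024 range")


# Build the full year -> (version, pattern) table for the study period.
vnl_table = {year: vnl_file_pattern(year) for year in range(2015, 2025)}
```

    The wildcard entries could be resolved with a glob over 01_raw_data/geo_covariates/VNL/ once the files have been downloaded.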
    WorldPop Global Population Counts — Global Mosaics 2015–2030 (1 km, R2025A v1)
    Name
    WorldPop Global Population Counts — Global Mosaics 2015–2030 (1 km, R2025A v1)
    Note
    Annual global population count rasters at 30 arc-second (~1 km) resolution (EPSG:4326) for 2015–2024 (R2025A v1). Downloaded files: global_pop_{year}_CN_1km_R2025A_UA_v1.tif for years 2015–2024 (01_raw_data/geo_covariates/WorldPop/). Used as a geospatial population covariate in the model training pipeline (steps 03–08). Downstream intermediate files: 08_model_inference/ and 10_data_product/output/gdp_intensity_r6_estimates_{year}.csv (2015–2024).
    Access policy
    Data are publicly available but not directly included in the reproducibility package. Intermediate files derived from this source are included.
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://hub.worldpop.org/
    Data URL
    https://hub.worldpop.org/geodata/listing?id=137
    Citation
    WorldPop (School of Geography and Environmental Science, University of Southampton). (2025). Global Population Counts — Global Mosaics 2015–2030, 1 km resolution, R2025A v1 [dataset]. https://doi.org/10.5258/SOTON/WP00789
    World Bank Country and Lending Groups
    Name
    World Bank Country and Lending Groups
    Note
    Historical income group classifications (low, lower-middle, upper-middle, high income) for all World Bank member countries from 1987 to 2025. Downloaded file: OGHIST_2025_10_07.xlsx (11_visualization/additional_input/WB_country_classification/downloaded_data/). This file is directly included in the reproducibility package (see data_hash_report.csv) and is a direct input to 11_visualization/02_check_temporal_dynamic.ipynb, where it is used to classify countries by income group for distributional analysis of cell-level GDP estimates.
    Access policy
    Data are publicly available and included in the reproducibility package (11_visualization/additional_input/WB_country_classification/downloaded_data/OGHIST_2025_10_07.xlsx).
    License
    Creative Commons Attribution 4.0 International license (CC BY 4.0)
    License URL
    https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
    Data URL
    https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups
    Citation
    World Bank. (2025). World Bank Country and Lending Groups [dataset]. Available at https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups
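    The entry above refers to data_hash_report.csv for files shipped with the package. The report's column layout and hash algorithm are not stated here, so the following is only a sketch of how a shipped file might be checked against a recorded digest, assuming SHA-256 and hypothetical function names.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks.

    Assumption: SHA-256 is the algorithm used; the source does not say.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_hash(path, recorded_hex):
    """Return True if the file's digest equals the recorded hex string."""
    return file_sha256(path) == recorded_hex.strip().lower()
```

    A call such as matches_recorded_hash(path_to_oghist_file, value_from_report) would then flag a changed or corrupted download, where both arguments are placeholders for values taken from the package.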
    Demographic and Health Surveys — Household Recode Datasets
    Name
    Demographic and Health Surveys — Household Recode Datasets
    Note
    Cluster-level wealth index scores derived from DHS Household Recode (HR) Stata files and geographic cluster locations from GE shapefiles, for 83 DHS surveys conducted in 2015 or later. Raw data downloaded from the DHS Program data portal (requires a registered account). Processing notebooks in 09_model_validation/additional_input/DHS/ query the DHS API for the survey catalog, download files, and aggregate wealth indices to H3 resolution-6 cells, producing the processed intermediate file: 09_model_validation/additional_input/DHS/output/dhs_cluster_wealth_r6.csv. This processed file is a direct input to 09_model_validation/04_validate_against_DHS_wealth_index.ipynb for external validation of the GNN model. Access date: 2025.
    Access policy
    Data access requires purchase or human approval and is not included in the reproducibility package. Intermediate data is included in the reproducibility package.
    License
    DHS Program Data Use Agreement
    License URL
    https://dhsprogram.com/data/terms-of-use.cfm
    Data URL
    https://api.dhsprogram.com/rest/dhs/datasets
    Citation
    DHS Program. (2025). Demographic and Health Surveys — Household Recode Datasets [dataset]. Available at https://dhsprogram.com/data/
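    The aggregation of cluster-level wealth scores to H3 resolution-6 cells can be pictured with a small example. This is a sketch under stated assumptions, not the notebooks' actual method: it assumes each cluster has already been assigned a cell ID upstream, uses a simple mean as the aggregate (the source does not specify the statistic), and uses made-up cell IDs and scores.

```python
from collections import defaultdict
from statistics import fmean


def aggregate_wealth_by_cell(clusters):
    """Group (cell_id, wealth_score) pairs and return {cell_id: (mean, n)}.

    Assumption: a plain mean per cell; the real pipeline may weight clusters.
    """
    grouped = defaultdict(list)
    for cell_id, score in clusters:
        grouped[cell_id].append(score)
    return {cell: (fmean(scores), len(scores)) for cell, scores in grouped.items()}


# Hypothetical clusters: two fall in the same cell, one in another.
sample = [("cell_a", 1.0), ("cell_a", 3.0), ("cell_b", -0.5)]
result = aggregate_wealth_by_cell(sample)
```

    Keeping the cluster count alongside the mean makes it possible to filter out cells backed by a single cluster when comparing against the model estimates.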
    Data statement

    All source data are publicly available, except the DHS datasets, which require a registered account. Not all source data are directly included in the reproducibility package, but all intermediate data needed to run the package are included.

    Description

    Output
    Fine-Scale Spatial Disaggregation Of Statistical Data Via Graph Neural Networks
    Type
    Working Paper
    Title
    Fine-Scale Spatial Disaggregation Of Statistical Data Via Graph Neural Networks
    Description
    Policy Research Working Papers (PRWP)
    Authors
    Author | Affiliation | Email
    Kamwoo Lee | World Bank | klee16@worldbank.org
    Brian Blankespoor | World Bank | bblankespoor@worldbank.org
    David Newhouse | World Bank | dnewhouse@worldbank.org
    Date of production

    2026-04-08

    Scope and coverage

    Geographic locations
    Location Code
    World WLD
    Keywords
    Spatial Disaggregation; Graph Neural Networks; Fine-Scale Statistical Mapping; Regional GDP; H3 Spatial Indexing
    Topics
    ID | Topic | Parent topic ID | Vocabulary | Vocabulary URI
    C45 | Neural Networks and Related Topics | C4 | Journal of Economic Literature (JEL) |
    C55 | Large Data Sets: Modeling and Analysis | C5 | Journal of Economic Literature (JEL) |
    C81 | Methodology for Collecting, Estimating, and Organizing Microeconomic Data • Data Access | C8 | Journal of Economic Literature (JEL) |
    R12 | Size and Spatial Distributions of Regional Economic Activity | R1 | Journal of Economic Literature (JEL) |
    R15 | Econometric and Input–Output Models • Other Models | R1 | Journal of Economic Literature (JEL) |

    Disclaimer

    Disclaimer

    The materials in the reproducibility packages are distributed as they were prepared by the staff of the International Bank for Reconstruction and Development/The World Bank. The findings, interpretations, and conclusions expressed in these materials do not necessarily reflect the views of the World Bank, the Executive Directors of the World Bank, or the governments they represent. The World Bank does not guarantee the accuracy of the materials included in the reproducibility package.

    Access and rights

    License
    Name URI
    Modified BSD3 https://opensource.org/license/bsd-3-clause/

    Contacts

    Contacts
    Name | Affiliation | Email
    Kamwoo Lee | World Bank | klee16@worldbank.org
    Reproducibility WBG | World Bank | reproducibility@worldbank.org

    Information on metadata

    Producers
    Name | Abbreviation | Affiliation | Role
    Reproducibility WBG | DECDI | World Bank - Development Impact Department | Verification and preparation of metadata
    Date of Production

    2026-04-08

    Document version

    1
