Reproducible Research Repository

Reproducibility package for Closing Data Gaps In Fertilizer Subsidy Analysis In Bangladesh: A Survey-To-Survey Imputation Approach

2026
Reference ID
RR_BGD_2026_620
DOI
https://doi.org/10.60572/1z28-h024
Author(s)
FNU Jonaed, Ivan Gachet, Leopoldo Tornarolli
Created on
Apr 22, 2026
Last modified
May 05, 2026
  • Overview

    Abstract

    Bangladesh's fertilizer subsidy costs $2.5 billion annually and accounts for nearly two-thirds of agricultural spending, yet its distributional impact remains unknown due to data limitations. This impedes reform of a policy that may favor larger farmers while crowding out investment in public goods, the real engines for long-term productivity growth. The study develops a survey-to-survey imputation method to address this gap: the Household Income and Expenditure Survey (HIES) 2022 records total fertilizer expenditure but not subsidized types, preventing accurate incidence analysis. The method combines cross-validated LASSO regression with randomized hot-deck matching to transfer type-specific fertilizer patterns from the Bangladesh Integrated Household Survey (BIHS) 2018-2019 to HIES 2022. Our procedure predicts household urea shares using 42 harmonized predictors, then assigns complete fertilizer compositions through nearest-neighbor matching within welfare-by-agro-ecological strata. The method achieves strong predictive accuracy (test RMSE = 0.169) and preserves distributional properties. Imputed shares replicate donor patterns closely: mean urea share is 51.5 percent versus 51.1 percent in BIHS, with overlapping confidence intervals across fertilizer types and regions. The enriched dataset provides the foundation for assessing whether subsidy benefits are concentrated among larger, wealthier farmers or distributed more equitably across farm households—a question previously unanswerable with existing data. More broadly, the study demonstrates a scalable framework for integrating complementary surveys in data-constrained settings.
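    The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code (the package implements the method in R and Stata), and it assumes that predicted urea shares from the LASSO step are already available for both surveys. The function name hot_deck_impute, the neighbor count k, the stratum labels, the fertilizer labels other than urea, and all numbers in the toy records are hypothetical and serve only as illustration.

```python
import random
from collections import defaultdict


def hot_deck_impute(donors, recipients, k=5, seed=0):
    """Randomized nearest-neighbor hot-deck within strata (illustrative).

    donors:     dicts with 'stratum', 'pred_urea', and observed 'shares'
    recipients: dicts with 'stratum' and 'pred_urea'
    Returns one donated shares dict per recipient, or None when the
    recipient's stratum contains no donor.
    """
    rng = random.Random(seed)  # fixed seed keeps the random draw reproducible
    pool_by_stratum = defaultdict(list)
    for d in donors:
        pool_by_stratum[d["stratum"]].append(d)

    imputed = []
    for r in recipients:
        pool = pool_by_stratum.get(r["stratum"], [])
        if not pool:
            imputed.append(None)
            continue
        # Keep the k donors whose predicted urea share is closest to the recipient's.
        nearest = sorted(pool, key=lambda d: abs(d["pred_urea"] - r["pred_urea"]))[:k]
        # Draw one of them at random and copy its complete composition.
        imputed.append(dict(rng.choice(nearest)["shares"]))
    return imputed


# Made-up toy records; strata stand in for welfare-by-agro-ecological cells.
donors = [
    {"stratum": ("poor", "zone_a"), "pred_urea": 0.45,
     "shares": {"urea": 0.50, "tsp": 0.30, "mop": 0.20}},
    {"stratum": ("poor", "zone_a"), "pred_urea": 0.55,
     "shares": {"urea": 0.60, "tsp": 0.25, "mop": 0.15}},
    {"stratum": ("poor", "zone_a"), "pred_urea": 0.90,
     "shares": {"urea": 0.95, "tsp": 0.05, "mop": 0.00}},
    {"stratum": ("rich", "zone_b"), "pred_urea": 0.40,
     "shares": {"urea": 0.40, "tsp": 0.35, "mop": 0.25}},
]
recipients = [
    {"stratum": ("poor", "zone_a"), "pred_urea": 0.50},
    {"stratum": ("rich", "zone_b"), "pred_urea": 0.70},
    {"stratum": ("rich", "zone_c"), "pred_urea": 0.50},  # no donor in this stratum
]

imputed = hot_deck_impute(donors, recipients, k=2, seed=42)
for r, shares in zip(recipients, imputed):
    print(r["stratum"], shares)
```

    Copying a donor's whole composition, rather than predicting each fertilizer share separately, keeps the imputed shares internally consistent (they sum to one whenever the donor's do). Drawing at random among the k nearest donors, instead of always taking the single closest one, is one common way to avoid reusing the same donor too often.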

    Reproducibility Package

    Scripts
    Readme
    Link: https://reproducibility.worldbank.org/catalog/540/download/1571/README.pdf
    Reproducibility package for Closing Data Gaps In Fertilizer Subsidy Analysis In Bangladesh: A Survey-To-Survey Imputation Approach
    File name
    RR_BGD_2026_620
    Zip package
    RR_BGD_2026_620.zip
    Title
    Reproducibility package for Closing Data Gaps In Fertilizer Subsidy Analysis In Bangladesh: A Survey-To-Survey Imputation Approach
    Date
    2026-04
    Dependencies
    R dependencies are listed in the file renv.lock. Stata dependencies are listed in the ado folder.
    Instructions
    See README in reproducibility package.
    Notes
    Computational reproducibility verified by Development Impact (DECDI) Analytics team, World Bank.
    Source code repository
    Repository name | URI
    Reproducible Research Repository (World Bank) | https://reproducibility.worldbank.org
    Software
    R
    Name
    R
    Version
    4.5.2
    Stata
    Name
    Stata
    Version
    18.0 MP

    Reproducibility

    Technology environment

    Paper exhibits were reproduced on a computer with the following specifications:
    • OS: Windows 11 Enterprise
    • Processor: Intel(R) Core(TM) i5-1145G7 CPU @ 2.60GHz
    • Memory available: 15.7 GB

    Technology requirements

    Runtime: 30 minutes

    Reproduction instructions

    To reproduce the exhibits in this paper, follow these steps:

    1. Request access to the restricted data and place the files in the appropriate folders (see the Data section below for access instructions and file locations per dataset).
    2. Open the Stata do-file 00_master_stage1_harmonization, update the working directory, and run the code.
    3. Open the R project fertilizer_s2s_reproducibility.Rproj.
    4. Restore the R environment by running renv::restore(), or manually install the packages listed in renv.lock.
    5. Open the R script 03_figure1_kernel_density, update the working directory, and run the script.
    6. Open the R script 05_s2s_imputation, update the working directory, and run the script.
    7. Open the Stata do-file 00_master_stage3, update the working directory, install the required packages, and run the code.

    Because the restricted data are not included in the package, the outputs produced by the verification team are included so that users can compare them with the published manuscript.
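    Step 1 of the instructions can be checked before any script is run. The Python sketch below is illustrative and not part of the package: it takes file locations listed in the Data section of this record and reports which of them are absent under a given package root. The helper name missing_inputs and the REQUIRED list are assumptions. The list is not exhaustive (it checks the HIES_2022 folder but not the individual SARMD files), and the README in the package remains the authoritative description of the folder layout.

```python
from pathlib import Path

# Locations copied from the Data section of this record.
# Directories are checked for existence only, not for their contents.
REQUIRED = [
    "01. data/01. raw/HIES_2022",
    "01. data/01. raw/BIHS_2019",
    "01. data/01. raw/BIHS_2015",
    "distributional_assessment/data/I_O_matrix.xlsx",
    "distributional_assessment/data/input_prices.xlsx",
    "distributional_assessment/data/items_expenditure2.dta",
]


def missing_inputs(root, required=REQUIRED):
    """Return the entries of `required` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]


def report(root):
    """Print one line per missing input, then a short summary."""
    missing = missing_inputs(root)
    for rel in missing:
        print("missing:", rel)
    if missing:
        print(f"{len(missing)} of {len(REQUIRED)} inputs not found under {root}")
    else:
        print("all listed inputs found")
    return missing
```

    Calling report(".") from the unzipped package folder lists whatever still needs to be requested or downloaded before the Stata master do-file is run.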

    Data

    Datasets
    Household Income and Expenditure Survey (HIES) 2022
    Name
    Household Income and Expenditure Survey (HIES) 2022
    Note
    Data accessed in 2023. Raw HIES 2022 modules provided by the Bangladesh Bureau of Statistics. Replication requires authorized access to the HIES 2022 files. To access raw HIES 2022 data, an external user may contact Kabir Uddin Ahmed, Director, Computer Wing, Bangladesh Bureau of Statistics (kabir.ddd@gmail.com). File location: 01. data/01. raw/HIES_2022/
    Access policy
    Data access was granted directly to the study authors by the data owners. It was obtained with a custom data license that does not allow for redistribution, and it is not included in the reproducibility package.
    License
    Custom License
    Citation
    Bangladesh Bureau of Statistics. 2023. "Household Income and Expenditure Survey (HIES) 2022" [dataset]. Unpublished data. Accessed 2023.
    Household Income and Expenditure Survey (HIES) 2022 - South Asia Regional Micro Database Derived Files
    Name
    Household Income and Expenditure Survey (HIES) 2022 - South Asia Regional Micro Database Derived Files
    Note
    Data accessed in 2025. Accessed via the World Bank South Asia Regional Micro Database (SARMD) derivatives of HIES 2022. Replication requires authorized access. For an authorized World Bank user, these SARMD files can be accessed through Datalibweb by subscribing to the relevant survey and module (for example, BGD_2022_HIES, SARMD, IND/INC). Access requires a World Bank computer or virtual machine, Stata, the Datalibweb package, and an approved token/subscription. File location: 01. data/01. raw/HIES_2022/BGD_2022_HIES_v02_M_v03_A_SARMD_INC.dta; BGD_2022_HIES_v02_M_v03_A_SARMD_IND_full.dta; and distributional_assessment/data/BGD_2022_HIES_v02_M_v05_A_SARMD_INC.dta; BGD_2022_HIES_v02_M_v05_A_SARMD_IND_full.dta
    Access policy
    Data access requires purchase or human approval and is not included in the reproducibility package.
    License
    Custom License
    Data URL
    https://github.com/worldbank/SARMD
    Citation
    World Bank. 2022. "Household Income and Expenditure Survey (HIES) 2022 - South Asia Regional Micro Database (SARMD) Derived Files" [dataset]. Available from: https://github.com/worldbank/SARMD. Accessed 2025.
    Bangladesh Integrated Household Survey (BIHS) 2018–2019
    Name
    Bangladesh Integrated Household Survey (BIHS) 2018–2019
    Note
    Data accessed in 2025. BIHS 2018–2019 files provided by the International Food Policy Research Institute. To download from the source, select the BIHS 2018–2019 files from the IFPRI Dataverse release and place them in the designated directory. File location: 01. data/01. raw/BIHS_2019/
    Access policy
    Data is publicly available and included in the reproducibility package.
    License
    Creative Commons Attribution 4.0 International License
    License URL
    https://creativecommons.org/licenses/by/4.0/deed.en
    Data URL
    https://dataverse.harvard.edu/dataverse/IFPRI/?q=title%3A%22Bangladesh+Integrated+Household+Survey+%28BIHS%29%22
    Citation
    International Food Policy Research Institute. 2025. "Bangladesh Integrated Household Survey (BIHS) 2018–2019" [dataset]. Accessed 2025. https://dataverse.harvard.edu/dataverse/IFPRI/?q=title%3A%22Bangladesh+Integrated+Household+Survey+%28BIHS%29%22
    Bangladesh Integrated Household Survey (BIHS) 2015
    Name
    Bangladesh Integrated Household Survey (BIHS) 2015
    Note
    Data accessed in 2025. BIHS 2015 files provided by the International Food Policy Research Institute. To download from the source, select the BIHS 2015 files from the IFPRI Dataverse release and place them in the designated directory. File location: 01. data/01. raw/BIHS_2015/
    Access policy
    Data is publicly available and included in the reproducibility package.
    License
    Creative Commons Attribution 4.0 International License
    License URL
    https://creativecommons.org/licenses/by/4.0/deed.en
    Data URL
    https://dataverse.harvard.edu/dataverse/IFPRI/?q=title%3A%22Bangladesh+Integrated+Household+Survey+%28BIHS%29%22
    Citation
    International Food Policy Research Institute. 2025. "Bangladesh Integrated Household Survey (BIHS) 2015" [dataset]. Accessed 2025. https://dataverse.harvard.edu/dataverse/IFPRI/?q=title%3A%22Bangladesh+Integrated+Household+Survey+%28BIHS%29%22
    Input-Output Matrix
    Name
    Input-Output Matrix
    Note
    Data accessed in 2025. The input-output table used in Stage 3 was received directly from Dr. Bazlul H. Khondker, who compiled the table and provided the authors permission to use it. A potential replicator may contact him directly to request access: Bazlul H. Khondker (bazlul.khondker@gmail.com). File location: distributional_assessment/data/I_O_matrix.xlsx.
    Access policy
    Data access was granted directly to the study authors by the data owner. It was obtained with a custom data license that does not allow for redistribution, and it is not included in the reproducibility package.
    License
    Custom License
    Citation
    Khondker, Bazlul H. 2025. "Input-Output Matrix" [dataset]. General Economics Division, Planning Commission. Unpublished data. Accessed 2025.
    Input Prices Data
    Name
    Input Prices Data
    Note
    Author-compiled, project-specific supporting files required for Stage 3 of the analysis.
    - items_expenditure2.dta was prepared from the raw HIES 2022 data. It is a core output dataset based on the consumption aggregate and contains monthly household expenditure by COICOP item. The file was prepared by the Bangladesh Bureau of Statistics (BBS) as part of the official poverty and inequality estimation work, and we received it from the HIES 2022 team in its current form. A potential replicator may use the file directly, as there is no confidentiality restriction on sharing it. They may also contact the HIES 2022 Project Director, Mohiuddin Ahmed (mohiuddin.bbs@gmail.com), for further information.
    - input_prices.xlsx is an author-compiled auxiliary file, and its provenance differs by sheet:
      1. In the expenditure sheet, the first three columns come from HIES 2022 and were received from the HIES 2022 team in their current form. These provide monthly household expenditure by COICOP code. A potential replicator may use this sheet directly, as there is no confidentiality restriction on sharing it. The Sector Code and Sector Name columns were added by the authors through a mapping exercise to align the HIES expenditure items with the sectors used in I_O_matrix.xlsx.
      2. In the reference_prices sheet, the retail price, import price, and domestic production information were compiled from administrative data received from the Ministry of Agriculture. Much of this information is also reported in Ministry annual reports.
      3. In the Fertilizers_weights sheet, most of the information was compiled from Ministry annual reports and direct communication with Ministry officials. For a small number of inputs, we also relied on newspaper reports.
      4. The input_agrSubsidies sheet is fully derived by the authors through arithmetic calculations using the information in the reference_prices and Fertilizers_weights sheets.
    File location: distributional_assessment/data/input_prices.xlsx; distributional_assessment/data/items_expenditure2.dta
    Access policy
    Data is publicly available and included in the reproducibility package.
    Citation
    Authors' compilation. 2025. Fertilizer Input Prices and Household Item Expenditure Data. [dataset]. Based on the Bangladesh Bureau of Statistics' Household Income and Expenditure Survey (HIES) 2022 consumption aggregate, administrative data from the Ministry of Agriculture, and Ministry of Agriculture annual reports.
    Data statement

    Some data is restricted and has not been included in the reproducibility package. For more details, please refer to the README file.

    Description

    Output
    Closing Data Gaps In Fertilizer Subsidy Analysis In Bangladesh: A Survey-To-Survey Imputation Approach
    Type
    Working Paper
    Title
    Closing Data Gaps In Fertilizer Subsidy Analysis In Bangladesh: A Survey-To-Survey Imputation Approach
    Description
    Policy Research Working Papers (PRWP)
    Authors
    Author | Affiliation | Email
    FNU Jonaed | World Bank | fjonaed@worldbank.org
    Ivan Gachet | World Bank | igachet@worldbank.org
    Leopoldo Tornarolli | World Bank | tornarolli@gmail.com
    Date of production

    2026-04-17

    Scope and coverage

    Geographic locations
    Location | Code
    Bangladesh | BGD
    Keywords
    Fertilizer Subsidy; Survey-To-Survey Imputation; Fiscal Incidence Analysis; Agricultural Policy
    Topics
    ID | Topic | Parent topic ID | Vocabulary
    Q18 | Agricultural Policy • Food Policy • Animal Welfare Policy | Q1 | Journal of Economic Literature (JEL)
    C55 | Large Data Sets: Modeling and Analysis | C5 | Journal of Economic Literature (JEL)
    H22 | Incidence | H2 | Journal of Economic Literature (JEL)

    Disclaimer

    The materials in the reproducibility packages are distributed as they were prepared by the staff of the International Bank for Reconstruction and Development/The World Bank. The findings, interpretations, and conclusions expressed in these materials do not necessarily reflect the views of the World Bank, the Executive Directors of the World Bank, or the governments they represent. The World Bank does not guarantee the accuracy of the materials included in the reproducibility package.

    Access and rights

    License
    Name | URI
    Modified BSD3 | https://opensource.org/license/bsd-3-clause/

    Contacts

    Name | Affiliation | Email
    FNU Jonaed | World Bank | fjonaed@worldbank.org
    Reproducibility WBG | World Bank | reproducibility@worldbank.org

    Information on metadata

    Producers
    Name | Abbreviation | Affiliation | Role
    Reproducibility WBG | DECDI | World Bank - Development Impact Department | Verification and preparation of metadata
    Date of Production

    2026-04-17

    Document version

    1
