{"type":"script","doc_desc":{"producers":[{"name":"Reproducibility WBG","abbr":"DECDI","affiliation":"World Bank - Development Impact Department","role":"Verification and preparation of metadata"}],"prod_date":"2026-04-17","version":"1"},"project_desc":{"authoring_entity":[{"name":"FNU Jonaed","affiliation":"World Bank","email":"fjonaed@worldbank.org"},{"name":"Ivan Gachet","affiliation":"World Bank","email":"igachet@worldbank.org"},{"name":"Leopoldo Tornarolli","affiliation":"World Bank","email":"tornarolli@gmail.com"}],"title_statement":{"title":"Reproducibility package for Closing Data Gaps In Fertilizer Subsidy Analysis In Bangladesh: A Survey-To-Survey Imputation Approach","idno":"RR_BGD_2026_620"},"data_statement":"Some data is restricted and has not been included in the reproducibility package. For more details, please refer to the README file. ","software":[{"name":"R","version":"4.5.2"},{"name":"Stata","version":"18.0 MP"}],"scripts":[{"title":"Reproducibility package for Closing Data Gaps In Fertilizer Subsidy Analysis In Bangladesh: \nA Survey-To-Survey Imputation Approach","date":"2026-04","notes":"Computational reproducibility verified by Development Impact (DECDI) Analytics team, World Bank.","instructions":"See README in reproducibility package.","file_name":"RR_BGD_2026_620","zip_package":"RR_BGD_2026_620.zip","dependencies":"R dependencies are listed in the file renv.lock. Stata dependencies are listed in the ado folder."}],"repository_uri":[{"name":"Reproducible Research Repository (World Bank)","uri":"https:\/\/reproducibility.worldbank.org"}],"production_date":"2026-04-17","abstract":"Bangladesh's fertilizer subsidy costs $2.5 billion annually and accounts for nearly two-thirds of agricultural spending, yet its distributional impact remains unknown due to data limitations. This impedes reform of a policy that may favor larger farmers while crowding out investment in public goods, the real engines for long-term productivity growth. 
The study develops a survey-to-survey imputation method to address this gap: the Household Income and Expenditure Survey (HIES) 2022 records total fertilizer expenditure but not subsidized types, preventing accurate incidence analysis. The method combines cross-validated LASSO regression with randomized hot-deck matching to transfer type-specific fertilizer patterns from the Bangladesh Integrated Household Survey (BIHS) 2018-2019 to HIES 2022. Our procedure predicts household urea shares using 42 harmonized predictors, then assigns complete fertilizer compositions through nearest-neighbor matching within welfare-by-agro-ecological strata. The method achieves strong predictive accuracy (test RMSE = 0.169) and preserves distributional properties. Imputed shares replicate donor patterns closely: mean urea share is 51.5 percent versus 51.1 percent in BIHS, with overlapping confidence intervals across fertilizer types and regions. The enriched dataset provides the foundation for assessing whether subsidy benefits are concentrated among larger, wealthier farmers or distributed more equitably across farm households\u2014a question previously unanswerable with existing data. 
More broadly, the study demonstrates a scalable framework for integrating complementary surveys in data-constrained settings.","geographic_units":[{"name":"Bangladesh","code":"BGD"}],"keywords":[{"name":"Fertilizer Subsidy"},{"name":"Survey-To-Survey Imputation"},{"name":"Fiscal Incidence Analysis"},{"name":"Agricultural Policy"}],"topics":[{"id":"Q18","uri":"https:\/\/www.aeaweb.org\/econlit\/jelCodes.php?view=jel","vocabulary":"Journal of Economic Literature (JEL)","name":"Agricultural Policy \u2022 Food Policy \u2022 Animal Welfare Policy","parent_id":"Q1"},{"id":"C55","uri":"https:\/\/www.aeaweb.org\/econlit\/jelCodes.php?view=jel","vocabulary":"Journal of Economic Literature (JEL)","name":"Large Data Sets: Modeling and Analysis","parent_id":"C5"},{"id":"H22","uri":"https:\/\/www.aeaweb.org\/econlit\/jelCodes.php?view=jel","vocabulary":"Journal of Economic Literature (JEL)","name":"Incidence","parent_id":"H2"}],"output":[{"type":"Working Paper","description":"Policy Research Working Papers (PRWP)","title":"Closing Data Gaps In Fertilizer Subsidy Analysis In Bangladesh: \nA Survey-To-Survey Imputation Approach"}],"language":[{"name":"English","code":"EN"}],"technology_requirements":"Runtime: 30 minutes","disclaimer":"The materials in the reproducibility packages are distributed as they were prepared by the staff of the International Bank for Reconstruction and Development\/The World Bank. The findings, interpretations, and conclusions expressed in this event do not necessarily reflect the views of the World Bank, the Executive Directors of the World Bank, or the governments they represent. 
The World Bank does not guarantee the accuracy of the materials included in the reproducibility package.","license":[{"name":"Modified BSD3","uri":"https:\/\/opensource.org\/license\/bsd-3-clause\/"}],"contacts":[{"name":"FNU Jonaed","affiliation":"World Bank","email":"fjonaed@worldbank.org"},{"name":"Reproducibility WBG","affiliation":"World Bank","email":"reproducibility@worldbank.org"}],"datasets":[{"name":"Household Income and Expenditure Survey (HIES) 2022","note":"Data accessed in 2023. Raw HIES 2022 modules provided by the Bangladesh Bureau of Statistics. Replication requires authorized access to the HIES 2022 files. To access raw HIES 2022 data, an external user may contact Kabir Uddin Ahmed, Director, Computer Wing, Bangladesh Bureau of Statistics (kabir.ddd@gmail.com). File location: 01. data\/01. raw\/HIES_2022\/","access_type":"Data access was granted directly to the study authors by the data owners. It was obtained with a custom data license that does not allow for redistribution, and it is not included in the reproducibility package.","license":"Custom License","citation":"Bangladesh Bureau of Statistics. 2023. \"Household Income and Expenditure Survey (HIES) 2022\" [dataset]. Unpublished data. Accessed 2023."},{"name":"Household Income and Expenditure Survey (HIES) 2022 - South Asia Regional Micro Database Derived Files","note":"Data accessed in 2025. Accessed via the World Bank South Asia Regional Micro Database (SARMD) derivatives of HIES 2022. Replication requires authorized access. For an authorized World Bank user, these SARMD files can be accessed through Datalibweb by subscribing to the relevant survey and module (for example, BGD_2022_HIES, SARMD, IND\/INC). Access requires a World Bank computer or virtual machine, Stata, the Datalibweb package, and an approved token\/subscription. File location: 01. data\/01. 
raw\/HIES_2022\/BGD_2022_HIES_v02_M_v03_A_SARMD_INC.dta; BGD_2022_HIES_v02_M_v03_A_SARMD_IND_full.dta; and distributional_assessment\/data\/BGD_2022_HIES_v02_M_v05_A_SARMD_INC.dta; BGD_2022_HIES_v02_M_v05_A_SARMD_IND_full.dta","access_type":"Data access requires purchase or human approval and is not included in the reproducibility package.","license":"Custom License","citation":"World Bank. 2022. \"Household Income and Expenditure Survey (HIES) 2022 - South Asia Regional Micro Database (SARMD) Derived Files\" [dataset]. Available from: https:\/\/github.com\/worldbank\/SARMD. Accessed 2025.","uri":"https:\/\/github.com\/worldbank\/SARMD"},{"name":"Bangladesh Integrated Household Survey (BIHS) 2018\u20132019","note":"Data accessed in 2025. BIHS 2018\u20132019 files provided by the International Food Policy Research Institute. To download from the source, select the BIHS 2018\u20132019 files from the IFPRI Dataverse release and place them in the designated directory. File location: 01. data\/01. raw\/BIHS_2019\/","access_type":"Data is publicly available and included in the reproducibility package.","uri":"https:\/\/dataverse.harvard.edu\/dataverse\/IFPRI\/?q=title%3A%22Bangladesh+Integrated+Household+Survey+%28BIHS%29%22","citation":"International Food Policy Research Institute. 2025. \"Bangladesh Integrated Household Survey (BIHS) 2018\u20132019\" [dataset]. Accessed 2025. https:\/\/dataverse.harvard.edu\/dataverse\/IFPRI\/?q=title%3A%22Bangladesh+Integrated+Household+Survey+%28BIHS%29%22","license":"Creative Commons Attribution 4.0 International License","license_uri":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/deed.en"},{"name":"Bangladesh Integrated Household Survey (BIHS) 2015","note":"Data accessed in 2025. BIHS 2015 files provided by the International Food Policy Research Institute. To download from the source, select the BIHS 2015 files from the IFPRI Dataverse release and place them in the designated directory. File location: 01. data\/01. 
raw\/BIHS_2015\/","access_type":"Data is publicly available and included in the reproducibility package.","uri":"https:\/\/dataverse.harvard.edu\/dataverse\/IFPRI\/?q=title%3A%22Bangladesh+Integrated+Household+Survey+%28BIHS%29%22","citation":"International Food Policy Research Institute. 2025. \"Bangladesh Integrated Household Survey (BIHS) 2015\" [dataset]. Accessed 2025. https:\/\/dataverse.harvard.edu\/dataverse\/IFPRI\/?q=title%3A%22Bangladesh+Integrated+Household+Survey+%28BIHS%29%22","license_uri":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/deed.en","license":"Creative Commons Attribution 4.0 International license"},{"name":"Input-Output Matrix","note":"Data accessed in 2025. The input-output table used in Stage 3 was received directly from Dr. Bazlul H. Khondker, who compiled the table and provided the authors permission to use it. A potential replicator may contact him directly to request access: Bazlul H. Khondker (bazlul.khondker@gmail.com).\nFile location: distributional_assessment\/data\/I_O_matrix.xlsx.\n","access_type":"Data access was granted directly to the study authors by the data owner. It was obtained with a custom data license that does not allow for redistribution, and it is not included in the reproducibility package.","license":"Custom License","citation":"Khondker, Bazlul H. 2025. \"Input-Output Matrix\" [dataset]. General Economics Division, Planning Commission. Unpublished data. Accessed 2025."},{"name":"Input Prices Data","note":"Author-compiled project-specific supporting files required for Stage 3 of the analysis. \n\n- items_expenditure2.dta was prepared from the raw HIES 2022 data. It is a core output dataset based on the consumption aggregate and contains monthly household expenditure by COICOP item. The file was prepared by the Bangladesh Bureau of Statistics (BBS) as part of the official poverty and inequality estimation work, and we received it from the HIES 2022 team in its current form. 
A potential replicator may use the file directly, as there is no confidentiality restriction on sharing it. They may also contact the HIES 2022 Project Director, Mohiuddin Ahmed (mohiuddin.bbs@gmail.com), for further information.\n\n- input_prices.xlsx is an author-compiled auxiliary file, and its provenance differs by sheet:\n\t1. In the expenditure sheet, the first three columns come from HIES 2022 and were received from the HIES 2022 team in their current form. These provide monthly household expenditure by COICOP code. A potential replicator may use this sheet directly, as there is no confidentiality restriction on sharing it. The Sector Code and Sector Name columns were added by the authors through a mapping exercise to align the HIES expenditure items with the sectors used in I_O_matrix.xlsx.\n\t2. In the reference_prices sheet, the retail price, import price, and domestic production information were compiled from administrative data received from the Ministry of Agriculture. Much of this information is also reported in Ministry annual reports.\n\t3. In the Fertilizers_weights sheet, most of the information was compiled from Ministry annual reports and direct communication with Ministry officials. For a small number of inputs, we also relied on newspaper reports.\n\t4. The input_agrSubsidies sheet is fully derived by the authors through arithmetic calculations using the information in the reference_prices and Fertilizers_weights sheets.\n\nFile location: distributional_assessment\/data\/input_prices.xlsx; distributional_assessment\/data\/items_expenditure2.dta","access_type":"Data is publicly available and included in the reproducibility package.","citation":"Authors' compilation. 2025. \"Fertilizer Input Prices and Household Item Expenditure Data\" [dataset]. 
Based on the Bangladesh Bureau of Statistics' Household Income and Expenditure Survey (HIES) 2022 consumption aggregate, administrative data from the Ministry of Agriculture, and Ministry of Agriculture annual reports."}],"reproduction_instructions":"To reproduce the exhibits in this paper, follow these steps:\n\n1. Request access to the restricted data and place the files in the appropriate folders (see the Data section above for access instructions per dataset).\n2. Open the Stata do-file `00_master_stage1_harmonization`, update the working directory, and run the code.\n3. Open the R project `fertilizer_s2s_reproducibility.Rproj`.\n4. Restore the R environment by running `renv::restore()`, or manually install the packages listed in `renv.lock`.\n5. Open the R script `03_figure1_kernel_density`, update the working directory, and run the script.\n6. Open the R script `05_s2s_imputation`, update the working directory, and run the script.\n7. Open the Stata do-file `00_master_stage3`, update the working directory, install the required packages, and run the code.\n\nBecause some of the data is restricted and not included in the package, the outputs produced by the verification team are included so that users can compare them with the published manuscript.","technology_environment":"Paper exhibits were reproduced on a computer with the following specifications:\n\u2022 OS: Windows 11 Enterprise\n\u2022 Processor: Intel(R) Core(TM) i5-1145G7 CPU @ 2.60GHz\n\u2022 Memory available: 15.7 GB"},"datacite":{"creators":[{"givenName":"FNU","familyName":"Jonaed","nameType":"Personal","affiliation":[{"name":"World Bank"}]},{"givenName":"Ivan","familyName":"Gachet","nameType":"Personal","affiliation":[{"name":"World Bank"}]},{"givenName":"Leopoldo","familyName":"Tornarolli","nameType":"Personal","affiliation":[{"name":"World Bank"}]}],"titles":[{"lang":"en","title":"Reproducibility package for Closing Data Gaps In Fertilizer Subsidy Analysis In Bangladesh: \nA Survey-To-Survey Imputation 
Approach"},{"title":"RR_BGD_2026_620","titleType":"Other"}],"publisher":"World Bank","publicationYear":"2026","types":{"resourceType":"Reproducibility package","resourceTypeGeneral":"Other"},"url":"https:\/\/reproducibility.worldbank.org\/index.php\/catalog\/study\/RR_BGD_2026_620","language":"en","doi":"10.60572\/1z28-h024","prefix":"10.60572","suffix":"1z28-h024"},"tags":[{"tag":"DOI"},{"tag":"Open Code"},{"tag":"Restricted Data"}],"schematype":"script"}