Skip to contents

Functions for generating pixellated population information and population frames at the `area` and `subarea` levels. The `area` and `subarea` levels can be thought of as big regions and little regions, where areas can be partitioned into unique sets of subareas. For example, Admin-1 and Admin-2 areas might be areas and subareas respectively. The population totals are either tabulated at the area x urban/rural level, the subarea x urban/rural level, or at the pixel level of a specified resolution. Totals are calculated using population density information, shapefiles, and, possibly, preexisting population frames at different areal levels. Note that area names should each be unique, and similarly for subarea names.

Usage

makePopIntegrationTab(
  kmRes = 5,
  pop,
  domainMapDat,
  eastLim,
  northLim,
  mapProjection,
  areaMapDat,
  subareaMapDat,
  areaNameVar = "NAME_1",
  subareaNameVar = "NAME_2",
  poppa = NULL,
  poppsub = NULL,
  stratifyByUrban = TRUE,
  areapa = NULL,
  areapsub = NULL,
  customSubsetPolygons = NULL,
  areaPolygonSubsetI = NULL,
  subareaPolygonSubsetI = NULL,
  mean.neighbor = 50,
  delta = 0.1,
  returnPoppTables = FALSE,
  setNAsToZero = TRUE,
  fixZeroPopDensitySubareas = FALSE,
  extractMethod = "bilinear"
)

getPoppsub(
  kmRes = 1,
  pop,
  domainMapDat,
  eastLim,
  northLim,
  mapProjection,
  poppa,
  areapa = NULL,
  areapsub,
  subareaMapDat,
  subareaNameVar = "NAME_2",
  stratifyByUrban = TRUE,
  areaMapDat = NULL,
  areaNameVar = "NAME_1",
  areaPolygonSubsetI = NULL,
  subareaPolygonSubsetI = NULL,
  customSubsetPolygons = NULL,
  mean.neighbor = 50,
  delta = 0.1,
  setNAsToZero = TRUE,
  fixZeroPopDensitySubareas = FALSE
)

adjustPopMat(
  popMat,
  poppaTarget = NULL,
  adjustBy = c("area", "subarea"),
  stratifyByUrban = TRUE
)

Arguments

kmRes

The resolution of the pixelated grid in km

pop

Population density raster

domainMapDat

A shapefile representing the full spatial domain (e.g. country)

eastLim

Range in km easting over the spatial domain under the input projection

northLim

Range in km northing over the spatial domain under the input projection

mapProjection

A projection function taking longitude and latitude and returning easting and northing in km. Or the inverse if inverse is set to TRUE. For example, projKenya. Check https://epsg.io/ for example for best projection EPSG codes for specific countries

areaMapDat

SpatialPolygonsDataFrame object with area level map information

subareaMapDat

SpatialPolygonsDataFrame object with subarea level map information

areaNameVar

The name of the area variable associated with areaMapDat@data and subareaMapDat@data

subareaNameVar

The name of the subarea variable associated with subareaMapDat@data

poppa

data.frame of population per area separated by urban/rural. If `poppsub` is not included, this is used for normalization of populations associated with population integration points. Contains variables:

area

name of area

popUrb

total urban (general) population of area

popRur

total rural (general) population of area

popTotal

total (general) population of area

pctUrb

percentage of population in the area that is urban (between 0 and 100)

poppsub

data.frame of population per subarea separated by urban/rural using for population normalization or urbanicity classification. Often based on extra fine scale population density grid. Has variables:

subarea

name of subarea

area

name of area

popUrb

total urban (general) population of subarea

popRur

total rural (general) population of subarea

popTotal

total (general) population of subarea

pctUrb

percentage of population in the subarea that is urban (between 0 and 100)

stratifyByUrban

Whether to stratify the pixellated grid by urban/rural. If TRUE, renormalizes population densities within areas or subareas crossed with urban/rural

areapa

A list with variables:

area

name of area

spatialArea

spatial area of the subarea (e.g. in km^2)

areapsub

A list with variables:

subarea

name of subarea

spatialArea

spatial area of the subarea (e.g. in km^2)

customSubsetPolygons

'SpatialPolygonsDataFrame' or 'SpatialPolygons' object to subset the grid over. This option can help reduce computation time relative to constructing the whole grid and subsetting afterwards. `areaPolygonSubsetI` or `subareaPolygonSubsetI` can be used when subsetting by areas or subareas in `areaMapDat` or `subareaMapDat`. Must be in latitude/longitude projection "EPSG:4326"

areaPolygonSubsetI

Index in areaMapDat for a specific area to subset the grid over. This option can help reduce computation time relative to constructing the whole grid and subsetting afterwards

subareaPolygonSubsetI

FOR EXPERIMENTAL PURPOSES ONLY. Index in subareaMapDat for a specific area to subset the grid over. This option can help reduce computation time relative to constructing the whole grid and subsetting afterwards

mean.neighbor

For determining what area or subarea points are nearest to if they do not directly fall into an area. See fields.rdist.near for details.

delta

For determining what area or subarea points are nearest to if they do not directly fall into an area. See fields.rdist.near for details.

returnPoppTables

If TRUE, poppa and poppsub will be calculated based on the generated population integration matrix and input area/subarea map data

setNAsToZero

If TRUE, sets NA populations to 0.

fixZeroPopDensitySubareas

If TRUE, if population density in a subarea is estimated to be zero, but the total population in the subarea is nonzero, population is filled into the area uniformly

extractMethod

Either 'bilinear' or 'simple'. see `method` from extract

popMat

Pixellated grid data frame with variables `area` and `pop` such as that generated by makePopIntegrationTab

poppaTarget

Target population per area stratified by urban rural. Same format as poppa

adjustBy

Whether to adjust population density by the `area` or `subarea` level

Functions

  • makePopIntegrationTab(): Generate pixellated `grid` of coordinates (both longitude/latitude and east/north) over spatial domain of the given resolution with associated population totals, areas, subareas, and urban/rural levels. For very small areas that might not otherwise have a grid point in them, a custom integration point is added at their centroid. Sets urbanicity classifications by thresholding input population density raster using area and subarea population tables, and generates area and subarea population tables from population density information if not already given. Can be used for integrating predictions from the given coordinates to area and subarea levels using population weights.

  • getPoppsub(): Generate table of estimates of population totals per subarea x urban/rural combination based on population density raster at `kmres` resolution "grid", including custom integration points for any subarea too small to include grid points at their centroids.

  • adjustPopMat(): Adjust population densities in grid based on a population frame.

Author

John Paige

Examples

if (FALSE) { # \dontrun{

library(sp)
library(sf)
# download Kenya GADM shapefiles from SUMMERdata github repository
githubURL <- paste0("https://github.com/paigejo/SUMMERdata/blob/main/data/", 
                    "kenyaMaps.rda?raw=true")
tempDirectory = "~/"
mapsFilename = paste0(tempDirectory, "/kenyaMaps.rda")
if(!file.exists(mapsFilename)) {
  download.file(githubURL,mapsFilename)
}

# load it in
out = load(mapsFilename)
out
adm1@data$NAME_1 = as.character(adm1@data$NAME_1)
adm1@data$NAME_1[adm1@data$NAME_1 == "Trans Nzoia"] = "Trans-Nzoia"
adm1@data$NAME_1[adm1@data$NAME_1 == "Elgeyo-Marakwet"] = "Elgeyo Marakwet"
adm2@data$NAME_1 = as.character(adm2@data$NAME_1)
adm2@data$NAME_1[adm2@data$NAME_1 == "Trans Nzoia"] = "Trans-Nzoia"
adm2@data$NAME_1[adm2@data$NAME_1 == "Elgeyo-Marakwet"] = "Elgeyo Marakwet"

# some Admin-2 areas have the same name
adm2@data$NAME_2 = as.character(adm2@data$NAME_2)
adm2@data$NAME_2[(adm2@data$NAME_1 == "Bungoma") & 
                   (adm2@data$NAME_2 == "Lugari")] = "Lugari, Bungoma"
adm2@data$NAME_2[(adm2@data$NAME_1 == "Kakamega") & 
                   (adm2@data$NAME_2 == "Lugari")] = "Lugari, Kakamega"
adm2@data$NAME_2[(adm2@data$NAME_1 == "Meru") & 
                   (adm2@data$NAME_2 == "Igembe South")] = "Igembe South, Meru"
adm2@data$NAME_2[(adm2@data$NAME_1 == "Tharaka-Nithi") & 
                   (adm2@data$NAME_2 == "Igembe South")] = "Igembe South, Tharaka-Nithi"

# The spatial area of unknown 8 is so small, it causes problems unless its removed or 
# unioned with another subarea. Union it with neighboring Kakeguria:
newadm2 = adm2
unknown8I = which(newadm2$NAME_2 == "unknown 8")
newadm2$NAME_2[newadm2$NAME_2 %in% c("unknown 8", "Kapenguria")] <- 
  "Kapenguria + unknown 8"
admin2.IDs <- newadm2$NAME_2

newadm2@data = cbind(newadm2@data, NAME_2OLD = newadm2@data$NAME_2)
newadm2@data$NAME_2OLD = newadm2@data$NAME_2
newadm2@data$NAME_2 = admin2.IDs
newadm2$NAME_2 = admin2.IDs
temp <- terra::aggregate(as(newadm2, "SpatVector"), by="NAME_2")

temp <- sf::st_as_sf(temp)
temp <- sf::as_Spatial(temp)

tempData = newadm2@data[-unknown8I,]
tempData = tempData[order(tempData$NAME_2),]
newadm2 <- SpatialPolygonsDataFrame(temp, tempData, match.ID = F)
adm2 = newadm2

# download 2014 Kenya population density TIF file

githubURL <- paste0("https://github.com/paigejo/SUMMERdata/blob/main/data/", 
                    "Kenya2014Pop/worldpop_total_1y_2014_00_00.tif?raw=true")
popTIFFilename = paste0(tempDirectory, "/worldpop_total_1y_2014_00_00.tif")
if(!file.exists(popTIFFilename)) {
  download.file(githubURL,popTIFFilename)
}

# load it in
pop = terra::rast(popTIFFilename)

eastLim = c(-110.6405, 832.4544)
northLim = c(-555.1739, 608.7130)

## Construct poppsubKenya, a table of urban/rural general population totals 
## in each subarea. Technically, this is not necessary since we can load in 
## poppsubKenya via data(kenyaPopulationData). First, we will need to calculate 
## the areas in km^2 of the areas and subareas


# use Lambert equal area projection of areas (Admin-1) and subareas (Admin-2)
midLon = mean(adm1@bbox[1,])
midLat = mean(adm1@bbox[2,])
p4s = paste0("+proj=laea +x_0=0 +y_0=0 +lon_0=", midLon, 
             " +lat_0=", midLat, " +units=km")

adm1_sf = st_as_sf(adm1)
adm1proj_sf = st_transform(adm1_sf, p4s)
adm1proj = as(adm1proj_sf, "Spatial")

adm2_sf = st_as_sf(adm2)
adm2proj_sf = st_transform(adm2_sf, p4s)
adm2proj = as(adm2proj_sf, "Spatial")

# now calculate spatial area in km^2
admin1Areas = as.numeric(st_area(adm1proj_sf))
admin2Areas = as.numeric(st_area(adm2proj_sf))

areapaKenya = data.frame(area=adm1proj@data$NAME_1, spatialArea=admin1Areas)
areapsubKenya = data.frame(area=adm2proj@data$NAME_1, subarea=adm2proj@data$NAME_2, 
                           spatialArea=admin2Areas)

# Calculate general population totals at the subarea (Admin-2) x urban/rural 
# level and using 1km resolution population grid. Assign urbanicity by 
# thresholding population density based on estimated proportion population 
# urban/rural, making sure total area (Admin-1) urban/rural populations in 
# each area matches poppaKenya.
require(fields)
# NOTE: the following function will typically take ~15-20 minutes. Can speed up 
#       by setting kmRes to be higher, but we recommend fine resolution for 
#       this step, since it only needs to be done once. Instead of running this, 
#       you can simply run data(kenyaPopulationData)
system.time(poppsubKenya <- getPoppsub(
  kmRes=1, pop=pop, domainMapDat=adm0,
  eastLim=eastLim, northLim=northLim, mapProjection=projKenya,
  poppa = poppaKenya, areapa=areapaKenya, areapsub=areapsubKenya, 
  areaMapDat=adm1, subareaMapDat=adm2, 
  areaNameVar = "NAME_1", subareaNameVar="NAME_2"))

# Now generate a general population integration table at 5km resolution, 
# based on subarea (Admin-2) x urban/rural population totals. This takes 
# ~1 minute
system.time(popMatKenya <- makePopIntegrationTab(
  kmRes=5, pop=pop, domainMapDat=adm0,
  eastLim=eastLim, northLim=northLim, mapProjection=projKenya,
  poppa = poppaKenya, poppsub=poppsubKenya, 
  areaMapDat = adm1, subareaMapDat = adm2,
  areaNameVar = "NAME_1", subareaNameVar="NAME_2"))

## Adjust popMat to be target (neonatal) rather than general population density. First
## creat the target population frame
## (these numbers are based on IPUMS microcensus data)
mothersPerHouseholdUrb = 0.3497151
childrenPerMotherUrb = 1.295917
mothersPerHouseholdRur = 0.4787696
childrenPerMotherRur = 1.455222
targetPopPerStratumUrban = easpaKenya$HHUrb * mothersPerHouseholdUrb * childrenPerMotherUrb
targetPopPerStratumRural = easpaKenya$HHRur * mothersPerHouseholdRur * childrenPerMotherRur
easpaKenyaNeonatal = easpaKenya
easpaKenyaNeonatal$popUrb = targetPopPerStratumUrban
easpaKenyaNeonatal$popRur = targetPopPerStratumRural
easpaKenyaNeonatal$popTotal = easpaKenyaNeonatal$popUrb + easpaKenyaNeonatal$popRur
easpaKenyaNeonatal$pctUrb = 100 * easpaKenyaNeonatal$popUrb / easpaKenyaNeonatal$popTotal
easpaKenyaNeonatal$pctTotal = 
  100 * easpaKenyaNeonatal$popTotal / sum(easpaKenyaNeonatal$popTotal)

# Generate the target population density by scaling the current population density grid 
# at the Admin1 x urban/rural level
popMatKenyaNeonatal = adjustPopMat(popMatKenya, easpaKenyaNeonatal)

# Generate neonatal population table from the neonatal population integration matrix.
# This is technically not necessary for population simulation purposes, but is here 
# for illustrative purposes
poppsubKenyaNeonatal = poppRegionFromPopMat(popMatKenyaNeonatal, popMatKenyaNeonatal$subarea)
poppsubKenyaNeonatal = cbind(subarea=poppsubKenyaNeonatal$region, 
                             area=adm2@data$NAME_1[match(poppsubKenyaNeonatal$region, 
                               adm2@data$NAME_2)], 
                             poppsubKenyaNeonatal[,-1])
print(head(poppsubKenyaNeonatal))
} # }