Generating pixellated populations, and population frames
Source:R/popGrid.R
makePopIntegrationTab.Rd
Functions for generating pixellated population information and population frames at the `area` and `subarea` levels. The `area` and `subarea` levels can be thought of as big regions and little regions, where areas can be partitioned into unique sets of subareas. For example, Admin-1 and Admin-2 areas might be areas and subareas respectively. The population totals are either tabulated at the area x urban/rural level, the subarea x urban/rural level, or at the pixel level of a specified resolution. Totals are calculated using population density information, shapefiles, and, possibly, preexisting population frames at different areal levels. Note that area names should each be unique, and similarly for subarea names.
Usage
makePopIntegrationTab(
kmRes = 5,
pop,
domainMapDat,
eastLim,
northLim,
mapProjection,
areaMapDat,
subareaMapDat,
areaNameVar = "NAME_1",
subareaNameVar = "NAME_2",
poppa = NULL,
poppsub = NULL,
stratifyByUrban = TRUE,
areapa = NULL,
areapsub = NULL,
customSubsetPolygons = NULL,
areaPolygonSubsetI = NULL,
subareaPolygonSubsetI = NULL,
mean.neighbor = 50,
delta = 0.1,
returnPoppTables = FALSE,
setNAsToZero = TRUE,
fixZeroPopDensitySubareas = FALSE,
extractMethod = "bilinear"
)
getPoppsub(
kmRes = 1,
pop,
domainMapDat,
eastLim,
northLim,
mapProjection,
poppa,
areapa = NULL,
areapsub,
subareaMapDat,
subareaNameVar = "NAME_2",
stratifyByUrban = TRUE,
areaMapDat = NULL,
areaNameVar = "NAME_1",
areaPolygonSubsetI = NULL,
subareaPolygonSubsetI = NULL,
customSubsetPolygons = NULL,
mean.neighbor = 50,
delta = 0.1,
setNAsToZero = TRUE,
fixZeroPopDensitySubareas = FALSE
)
adjustPopMat(
popMat,
poppaTarget = NULL,
adjustBy = c("area", "subarea"),
stratifyByUrban = TRUE
)
Arguments
- kmRes
The resolution of the pixelated grid in km
- pop
Population density raster
- domainMapDat
A shapefile representing the full spatial domain (e.g. country)
- eastLim
Range in km easting over the spatial domain under the input projection
- northLim
Range in km northing over the spatial domain under the input projection
- mapProjection
A projection function taking longitude and latitude and returning easting and northing in km. Or the inverse if inverse is set to TRUE. For example,
projKenya
. Check https://epsg.io/ for example for best projection EPSG codes for specific countries- areaMapDat
SpatialPolygonsDataFrame object with area level map information
- subareaMapDat
SpatialPolygonsDataFrame object with subarea level map information
- areaNameVar
The name of the area variable associated with
areaMapDat@data
andsubareaMapDat@data
- subareaNameVar
The name of the subarea variable associated with
subareaMapDat@data
- poppa
data.frame of population per area separated by urban/rural. If `poppsub` is not included, this is used for normalization of populations associated with population integration points. Contains variables:
- area
name of area
- popUrb
total urban (general) population of area
- popRur
total rural (general) population of area
- popTotal
total (general) population of area
- pctUrb
percentage of population in the area that is urban (between 0 and 100)
- poppsub
data.frame of population per subarea separated by urban/rural using for population normalization or urbanicity classification. Often based on extra fine scale population density grid. Has variables:
- subarea
name of subarea
- area
name of area
- popUrb
total urban (general) population of subarea
- popRur
total rural (general) population of subarea
- popTotal
total (general) population of subarea
- pctUrb
percentage of population in the subarea that is urban (between 0 and 100)
- stratifyByUrban
Whether to stratify the pixellated grid by urban/rural. If TRUE, renormalizes population densities within areas or subareas crossed with urban/rural
- areapa
A list with variables:
- area
name of area
- spatialArea
spatial area of the subarea (e.g. in km^2)
- areapsub
A list with variables:
- subarea
name of subarea
- spatialArea
spatial area of the subarea (e.g. in km^2)
- customSubsetPolygons
'SpatialPolygonsDataFrame' or 'SpatialPolygons' object to subset the grid over. This option can help reduce computation time relative to constructing the whole grid and subsetting afterwards. `areaPolygonSubsetI` or `subareaPolygonSubsetI` can be used when subsetting by areas or subareas in `areaMapDat` or `subareaMapDat`. Must be in latitude/longitude projection "EPSG:4326"
- areaPolygonSubsetI
Index in areaMapDat for a specific area to subset the grid over. This option can help reduce computation time relative to constructing the whole grid and subsetting afterwards
- subareaPolygonSubsetI
FOR EXPERIMENTAL PURPOSES ONLY. Index in subareaMapDat for a specific area to subset the grid over. This option can help reduce computation time relative to constructing the whole grid and subsetting afterwards
- mean.neighbor
For determining what area or subarea points are nearest to if they do not directly fall into an area. See
fields.rdist.near
for details.- delta
For determining what area or subarea points are nearest to if they do not directly fall into an area. See
fields.rdist.near
for details.- returnPoppTables
If TRUE, poppa and poppsub will be calculated based on the generated population integration matrix and input area/subarea map data
- setNAsToZero
If TRUE, sets NA populations to 0.
- fixZeroPopDensitySubareas
If TRUE, if population density in a subarea is estimated to be zero, but the total population in the subarea is nonzero, population is filled into the area uniformly
- extractMethod
Either 'bilinear' or 'simple'. see `method` from
extract
- popMat
Pixellated grid data frame with variables `area` and `pop` such as that generated by
makePopIntegrationTab
- poppaTarget
Target population per area stratified by urban rural. Same format as poppa
- adjustBy
Whether to adjust population density by the `area` or `subarea` level
Functions
makePopIntegrationTab()
: Generate pixellated `grid` of coordinates (both longitude/latitude and east/north) over spatial domain of the given resolution with associated population totals, areas, subareas, and urban/rural levels. For very small areas that might not otherwise have a grid point in them, a custom integration point is added at their centroid. Sets urbanicity classifications by thresholding input population density raster using area and subarea population tables, and generates area and subarea population tables from population density information if not already given. Can be used for integrating predictions from the given coordinates to area and subarea levels using population weights.getPoppsub()
: Generate table of estimates of population totals per subarea x urban/rural combination based on population density raster at `kmres` resolution "grid", including custom integration points for any subarea too small to include grid points at their centroids.adjustPopMat()
: Adjust population densities in grid based on a population frame.
Examples
if (FALSE) { # \dontrun{
library(sp)
library(sf)
# download Kenya GADM shapefiles from SUMMERdata github repository
githubURL <- paste0("https://github.com/paigejo/SUMMERdata/blob/main/data/",
"kenyaMaps.rda?raw=true")
tempDirectory = "~/"
mapsFilename = paste0(tempDirectory, "/kenyaMaps.rda")
if(!file.exists(mapsFilename)) {
download.file(githubURL,mapsFilename)
}
# load it in
out = load(mapsFilename)
out
adm1@data$NAME_1 = as.character(adm1@data$NAME_1)
adm1@data$NAME_1[adm1@data$NAME_1 == "Trans Nzoia"] = "Trans-Nzoia"
adm1@data$NAME_1[adm1@data$NAME_1 == "Elgeyo-Marakwet"] = "Elgeyo Marakwet"
adm2@data$NAME_1 = as.character(adm2@data$NAME_1)
adm2@data$NAME_1[adm2@data$NAME_1 == "Trans Nzoia"] = "Trans-Nzoia"
adm2@data$NAME_1[adm2@data$NAME_1 == "Elgeyo-Marakwet"] = "Elgeyo Marakwet"
# some Admin-2 areas have the same name
adm2@data$NAME_2 = as.character(adm2@data$NAME_2)
adm2@data$NAME_2[(adm2@data$NAME_1 == "Bungoma") &
(adm2@data$NAME_2 == "Lugari")] = "Lugari, Bungoma"
adm2@data$NAME_2[(adm2@data$NAME_1 == "Kakamega") &
(adm2@data$NAME_2 == "Lugari")] = "Lugari, Kakamega"
adm2@data$NAME_2[(adm2@data$NAME_1 == "Meru") &
(adm2@data$NAME_2 == "Igembe South")] = "Igembe South, Meru"
adm2@data$NAME_2[(adm2@data$NAME_1 == "Tharaka-Nithi") &
(adm2@data$NAME_2 == "Igembe South")] = "Igembe South, Tharaka-Nithi"
# The spatial area of unknown 8 is so small, it causes problems unless its removed or
# unioned with another subarea. Union it with neighboring Kakeguria:
newadm2 = adm2
unknown8I = which(newadm2$NAME_2 == "unknown 8")
newadm2$NAME_2[newadm2$NAME_2 %in% c("unknown 8", "Kapenguria")] <-
"Kapenguria + unknown 8"
admin2.IDs <- newadm2$NAME_2
newadm2@data = cbind(newadm2@data, NAME_2OLD = newadm2@data$NAME_2)
newadm2@data$NAME_2OLD = newadm2@data$NAME_2
newadm2@data$NAME_2 = admin2.IDs
newadm2$NAME_2 = admin2.IDs
temp <- terra::aggregate(as(newadm2, "SpatVector"), by="NAME_2")
temp <- sf::st_as_sf(temp)
temp <- sf::as_Spatial(temp)
tempData = newadm2@data[-unknown8I,]
tempData = tempData[order(tempData$NAME_2),]
newadm2 <- SpatialPolygonsDataFrame(temp, tempData, match.ID = F)
adm2 = newadm2
# download 2014 Kenya population density TIF file
githubURL <- paste0("https://github.com/paigejo/SUMMERdata/blob/main/data/",
"Kenya2014Pop/worldpop_total_1y_2014_00_00.tif?raw=true")
popTIFFilename = paste0(tempDirectory, "/worldpop_total_1y_2014_00_00.tif")
if(!file.exists(popTIFFilename)) {
download.file(githubURL,popTIFFilename)
}
# load it in
pop = terra::rast(popTIFFilename)
eastLim = c(-110.6405, 832.4544)
northLim = c(-555.1739, 608.7130)
## Construct poppsubKenya, a table of urban/rural general population totals
## in each subarea. Technically, this is not necessary since we can load in
## poppsubKenya via data(kenyaPopulationData). First, we will need to calculate
## the areas in km^2 of the areas and subareas
# use Lambert equal area projection of areas (Admin-1) and subareas (Admin-2)
midLon = mean(adm1@bbox[1,])
midLat = mean(adm1@bbox[2,])
p4s = paste0("+proj=laea +x_0=0 +y_0=0 +lon_0=", midLon,
" +lat_0=", midLat, " +units=km")
adm1_sf = st_as_sf(adm1)
adm1proj_sf = st_transform(adm1_sf, p4s)
adm1proj = as(adm1proj_sf, "Spatial")
adm2_sf = st_as_sf(adm2)
adm2proj_sf = st_transform(adm2_sf, p4s)
adm2proj = as(adm2proj_sf, "Spatial")
# now calculate spatial area in km^2
admin1Areas = as.numeric(st_area(adm1proj_sf))
admin2Areas = as.numeric(st_area(adm2proj_sf))
areapaKenya = data.frame(area=adm1proj@data$NAME_1, spatialArea=admin1Areas)
areapsubKenya = data.frame(area=adm2proj@data$NAME_1, subarea=adm2proj@data$NAME_2,
spatialArea=admin2Areas)
# Calculate general population totals at the subarea (Admin-2) x urban/rural
# level and using 1km resolution population grid. Assign urbanicity by
# thresholding population density based on estimated proportion population
# urban/rural, making sure total area (Admin-1) urban/rural populations in
# each area matches poppaKenya.
require(fields)
# NOTE: the following function will typically take ~15-20 minutes. Can speed up
# by setting kmRes to be higher, but we recommend fine resolution for
# this step, since it only needs to be done once. Instead of running this,
# you can simply run data(kenyaPopulationData)
system.time(poppsubKenya <- getPoppsub(
kmRes=1, pop=pop, domainMapDat=adm0,
eastLim=eastLim, northLim=northLim, mapProjection=projKenya,
poppa = poppaKenya, areapa=areapaKenya, areapsub=areapsubKenya,
areaMapDat=adm1, subareaMapDat=adm2,
areaNameVar = "NAME_1", subareaNameVar="NAME_2"))
# Now generate a general population integration table at 5km resolution,
# based on subarea (Admin-2) x urban/rural population totals. This takes
# ~1 minute
system.time(popMatKenya <- makePopIntegrationTab(
kmRes=5, pop=pop, domainMapDat=adm0,
eastLim=eastLim, northLim=northLim, mapProjection=projKenya,
poppa = poppaKenya, poppsub=poppsubKenya,
areaMapDat = adm1, subareaMapDat = adm2,
areaNameVar = "NAME_1", subareaNameVar="NAME_2"))
## Adjust popMat to be target (neonatal) rather than general population density. First
## creat the target population frame
## (these numbers are based on IPUMS microcensus data)
mothersPerHouseholdUrb = 0.3497151
childrenPerMotherUrb = 1.295917
mothersPerHouseholdRur = 0.4787696
childrenPerMotherRur = 1.455222
targetPopPerStratumUrban = easpaKenya$HHUrb * mothersPerHouseholdUrb * childrenPerMotherUrb
targetPopPerStratumRural = easpaKenya$HHRur * mothersPerHouseholdRur * childrenPerMotherRur
easpaKenyaNeonatal = easpaKenya
easpaKenyaNeonatal$popUrb = targetPopPerStratumUrban
easpaKenyaNeonatal$popRur = targetPopPerStratumRural
easpaKenyaNeonatal$popTotal = easpaKenyaNeonatal$popUrb + easpaKenyaNeonatal$popRur
easpaKenyaNeonatal$pctUrb = 100 * easpaKenyaNeonatal$popUrb / easpaKenyaNeonatal$popTotal
easpaKenyaNeonatal$pctTotal =
100 * easpaKenyaNeonatal$popTotal / sum(easpaKenyaNeonatal$popTotal)
# Generate the target population density by scaling the current population density grid
# at the Admin1 x urban/rural level
popMatKenyaNeonatal = adjustPopMat(popMatKenya, easpaKenyaNeonatal)
# Generate neonatal population table from the neonatal population integration matrix.
# This is technically not necessary for population simulation purposes, but is here
# for illustrative purposes
poppsubKenyaNeonatal = poppRegionFromPopMat(popMatKenyaNeonatal, popMatKenyaNeonatal$subarea)
poppsubKenyaNeonatal = cbind(subarea=poppsubKenyaNeonatal$region,
area=adm2@data$NAME_1[match(poppsubKenyaNeonatal$region,
adm2@data$NAME_2)],
poppsubKenyaNeonatal[,-1])
print(head(poppsubKenyaNeonatal))
} # }