Using R how to write a loop to calculate ndwi using green and nir in a dataframe with x timesteps?
up vote
0
down vote
favorite
I have Landsat data for 31 years in 6 NetCDF files.
Each files has about 4 million rows of data.
Each file has an as yet unknown number of timesteps in each
Except for the first file that I am using to write my script, which has 60 timesteps, and will be used to process all the files.
I have never used R before this wee exercise.
My task is to create one set of land only data for a statistician to work on.
How can I loop through the timesteps calculating ndwi?
I have this code working:
But it requires a lot of find/replace, copy/paste to work through the 60 timesteps and then there are 5 more files to process.
###############################################################
# Calculate ndwi for for the each timestep
###############################################################
#convert the timestep_1 for green to numeric value
green_nir_df02$t_1 <- as.numeric(as.character(green_nir_df02$t_1))
head (na.omit(green_nir_df02$t_1, 20))
#convert the timestep_1 for nir to numeric value
green_nir_df02$t_1.1 <- as.numeric(as.character(green_nir_df02$t_1.1))
head (na.omit(green_nir_df02$t_1.1, 20))
# calculate green minus nir for timestep_1
grnMinusNir_t_1 <- green_nir_df02$t_1 - green_nir_df02$t_1.1
head(grnMinusNir_t_1, 20)
# calculate green plus nir for timestep_1
grnPlusNir_t_1 <- green_nir_df02$t_1 + green_nir_df02$t_1.1
head(grnPlusNir_t_1, 20)
# calculate ndwi from greenMinusNir divided by greenPlusNir for timestep_1
ndwi_t_1 <- grnMinusNir_t_1 / grnPlusNir_t_1
head(ndwi_t_1, 20)
# write ndwi to the green_nir_df02 dataframe for timestep_1
green_nir_df02$ndwi_t_1 <- ndwi_t_1
####################################################
#convert the timestep_2 for green to numeric value
green_nir_df02$t_2 <- as.numeric(as.character(green_nir_df02$t_2))
head (na.omit(green_nir_df02$t_2, 20))
#convert the timestep_2 for nir to numeric value
green_nir_df02$t_2.1 <- as.numeric(as.character(green_nir_df02$t_2.1))
head (na.omit(green_nir_df02$t_2.1, 20))
# calculate green minus nir for timestep_2
grnMinusNir_t_2 <- green_nir_df02$t_2 - green_nir_df02$t_2.1
head(grnMinusNir_t_2, 20)
# calculate green plus nir for timestep_2
grnPlusNir_t_2 <- green_nir_df02$t_2 + green_nir_df02$t_2.1
head(grnPlusNir_t_2, 20)
# calculate ndwi from greenMinusNir divided by greenPlusNir for timestep_2
ndwi_t_2 <- grnMinusNir_t_2 / grnPlusNir_t_2
head(ndwi_t_2, 20)
# write ndwi to the green_nir_df02 dataframe for timestep_2
green_nir_df02$ndwi_t_2 <- ndwi_t_2
... etc to t_60
###############################################################
# END: Calculate ndwi for for the each timestep
###############################################################
and the next file starts at t_61 and so on
I have tried using this loop, which is not working because I can't point to the value in the column of t_1 in the dataframe rather than the value in the array of column names.
# initialize the "for" loop to calculate ndwi columns
seq <- 1:nt
i <- 0
green_t <- array(1:nt)
nir_t <- array(1:nt)
ndwi_t <- array(1:nt)
greenMinusNir <- array(1:nt)
greenPlusNir <- array(1:nt)
ndwi <- array(1:nt)
# set the value of the count
# count = nt from previous file #nt = num timesteps#THIS IS THE VALUE TO USE
count <- 0 ################################## EDIT THIS VALUE BEFORE RUNNING
# START: Loop for each timestep
# Step 1: create variable names for green, nir, ndwi
# Step 2: convert t_[i] and t_[i].1 to numeric
# Step 3: calculate green minus nir
# Step 4: calculate green plus nir
# Step 5: calculate ndwi = g-nir/g+nir
# Step 6: write ndwi to the green_nir_df02 dataframe
for(i in seq){
# Step 1: create variable names for green, nir, ndwi
green_t[i] <- paste("green_nir_df02$t_",count+i,sep="")
nir_t[i] <- paste("green_nir_df02$t_",count+i,".1",sep="")
ndwi_t[i] <- paste("green_nir_df02$ndwi_t_",count+i,sep="")
}
# initialize the "for" loop to calculate ndwi columns
seq <- 1:nt
i <- 0
for(i in seq){
# Step 2: convert green (i.e., t_[i]) and nir (i.e., t_[i].1) to numeric
green <- as.numeric(as.character(green_t[i]))
##
#ERROR when i=1 > green <-as.numeric(as.character("green_nir_df02$t_1"))
##
nir_t <- as.numeric(as.character(nir_t))
# Step 3: calculate green minus nir
greenMinusNir[i] <- green_t - nir_t
# Step 4: calculate green plus nir
greenPlusNir[i] <- green_t[i] + nir_t[i]
# Step 5: calculate ndwi = g-nir/g+nir
ndwi[i] <- greenMinusNir[i] / greenPlusNir[i]
# Step 6: write ndwi to the green_nir_df02 dataframe ndwi timestep
ndwi_t[i] <- ndwi[i]
# i <- i+1
}
Sample data
lon_lat t_1 t_2 t_60 t_1.1 t_2.1 t_60.1 ndwi_t_1 ndwi_t_2
1 -1609787.5_-2275087.5 180 216 247 80 197 192 0.3846154 0.04600484
2 -1609762.5_-2275087.5 102 252 281 80 197 227 0.1208791 0.12249443
3 -1609737.5_-2275087.5 102 216 281 80 156 227 0.1208791 0.16129032
4 -1609712.5_-2275087.5 141 216 281 80 156 227 0.2760181 0.16129032
5 -1609687.5_-2275087.5 180 181 281 80 156 227 0.3846154 0.07418398
6 -1609662.5_-2275087.5 180 216 281 80 197 227 0.3846154 0.04600484
r
New contributor
add a comment |
up vote
0
down vote
favorite
I have Landsat data for 31 years in 6 NetCDF files.
Each files has about 4 million rows of data.
Each file has an as yet unknown number of timesteps in each
Except for the first file that I am using to write my script, which has 60 timesteps, and will be used to process all the files.
I have never used R before this wee exercise.
My task is to create one set of land only data for a statistician to work on.
How can I loop through the timesteps calculating ndwi?
I have this code working:
But it requires a lot of find/replace, copy/paste to work through the 60 timesteps and then there are 5 more files to process.
###############################################################
# Calculate ndwi for for the each timestep
###############################################################
#convert the timestep_1 for green to numeric value
green_nir_df02$t_1 <- as.numeric(as.character(green_nir_df02$t_1))
head (na.omit(green_nir_df02$t_1, 20))
#convert the timestep_1 for nir to numeric value
green_nir_df02$t_1.1 <- as.numeric(as.character(green_nir_df02$t_1.1))
head (na.omit(green_nir_df02$t_1.1, 20))
# calculate green minus nir for timestep_1
grnMinusNir_t_1 <- green_nir_df02$t_1 - green_nir_df02$t_1.1
head(grnMinusNir_t_1, 20)
# calculate green plus nir for timestep_1
grnPlusNir_t_1 <- green_nir_df02$t_1 + green_nir_df02$t_1.1
head(grnPlusNir_t_1, 20)
# calculate ndwi from greenMinusNir divided by greenPlusNir for timestep_1
ndwi_t_1 <- grnMinusNir_t_1 / grnPlusNir_t_1
head(ndwi_t_1, 20)
# write ndwi to the green_nir_df02 dataframe for timestep_1
green_nir_df02$ndwi_t_1 <- ndwi_t_1
####################################################
#convert the timestep_2 for green to numeric value
green_nir_df02$t_2 <- as.numeric(as.character(green_nir_df02$t_2))
head (na.omit(green_nir_df02$t_2, 20))
#convert the timestep_2 for nir to numeric value
green_nir_df02$t_2.1 <- as.numeric(as.character(green_nir_df02$t_2.1))
head (na.omit(green_nir_df02$t_2.1, 20))
# calculate green minus nir for timestep_2
grnMinusNir_t_2 <- green_nir_df02$t_2 - green_nir_df02$t_2.1
head(grnMinusNir_t_2, 20)
# calculate green plus nir for timestep_2
grnPlusNir_t_2 <- green_nir_df02$t_2 + green_nir_df02$t_2.1
head(grnPlusNir_t_2, 20)
# calculate ndwi from greenMinusNir divided by greenPlusNir for timestep_2
ndwi_t_2 <- grnMinusNir_t_2 / grnPlusNir_t_2
head(ndwi_t_2, 20)
# write ndwi to the green_nir_df02 dataframe for timestep_2
green_nir_df02$ndwi_t_2 <- ndwi_t_2
... etc to t_60
###############################################################
# END: Calculate ndwi for for the each timestep
###############################################################
and the next file starts at t_61 and so on
I have tried using this loop, which is not working because I can't point to the value in the column of t_1 in the dataframe rather than the value in the array of column names.
# initialize the "for" loop to calculate ndwi columns
seq <- 1:nt
i <- 0
green_t <- array(1:nt)
nir_t <- array(1:nt)
ndwi_t <- array(1:nt)
greenMinusNir <- array(1:nt)
greenPlusNir <- array(1:nt)
ndwi <- array(1:nt)
# set the value of the count
# count = nt from previous file #nt = num timesteps#THIS IS THE VALUE TO USE
count <- 0 ################################## EDIT THIS VALUE BEFORE RUNNING
# START: Loop for each timestep
# Step 1: create variable names for green, nir, ndwi
# Step 2: convert t_[i] and t_[i].1 to numeric
# Step 3: calculate green minus nir
# Step 4: calculate green plus nir
# Step 5: calculate ndwi = g-nir/g+nir
# Step 6: write ndwi to the green_nir_df02 dataframe
for(i in seq){
# Step 1: create variable names for green, nir, ndwi
green_t[i] <- paste("green_nir_df02$t_",count+i,sep="")
nir_t[i] <- paste("green_nir_df02$t_",count+i,".1",sep="")
ndwi_t[i] <- paste("green_nir_df02$ndwi_t_",count+i,sep="")
}
# initialize the "for" loop to calculate ndwi columns
seq <- 1:nt
i <- 0
for(i in seq){
# Step 2: convert green (i.e., t_[i]) and nir (i.e., t_[i].1) to numeric
green <- as.numeric(as.character(green_t[i]))
##
#ERROR when i=1 > green <-as.numeric(as.character("green_nir_df02$t_1"))
##
nir_t <- as.numeric(as.character(nir_t))
# Step 3: calculate green minus nir
greenMinusNir[i] <- green_t - nir_t
# Step 4: calculate green plus nir
greenPlusNir[i] <- green_t[i] + nir_t[i]
# Step 5: calculate ndwi = g-nir/g+nir
ndwi[i] <- greenMinusNir[i] / greenPlusNir[i]
# Step 6: write ndwi to the green_nir_df02 dataframe ndwi timestep
ndwi_t[i] <- ndwi[i]
# i <- i+1
}
Sample data
lon_lat t_1 t_2 t_60 t_1.1 t_2.1 t_60.1 ndwi_t_1 ndwi_t_2
1 -1609787.5_-2275087.5 180 216 247 80 197 192 0.3846154 0.04600484
2 -1609762.5_-2275087.5 102 252 281 80 197 227 0.1208791 0.12249443
3 -1609737.5_-2275087.5 102 216 281 80 156 227 0.1208791 0.16129032
4 -1609712.5_-2275087.5 141 216 281 80 156 227 0.2760181 0.16129032
5 -1609687.5_-2275087.5 180 181 281 80 156 227 0.3846154 0.07418398
6 -1609662.5_-2275087.5 180 216 281 80 197 227 0.3846154 0.04600484
r
New contributor
1
Welcome to Stack Overflow! Could you make your problem reproducible by sharing a sample of your data so others can help (please do not usestr()
,head()
or screenshot)? You can use thereprex
anddatapasta
packages to assist you with that. See also Help me Help you & How to make a great R reproducible example?
– Tung
3 hours ago
Writing the question in jargon that 0.01% of users would understand is going to reduce the responses. 'NDWI' apparently means Normalized Difference Water Index, computed at each timestep t. So that boils down to you want to vectorize one calculation on all rows of a dataframe. Although you said the way the timesteps are represented causes an issue. Please explain more, i.e. post a snippet of (say 10-40) actual lines from your dataframe.
– smci
2 hours ago
"How can I loop through the timesteps calculating NDWI?" Don't. Vectorize them: write a custom functioncalculate_ndwi(...)
then useapply()
function to run it on all the rows and columns you need to. There should be zero for-loops in your final answer. Also, your timesteps should not be string to begin with, they should be read in directly as numeric from your netCDF file, or else converted immediately after reading. But, please show us a snippet of your input data, that's the best startpoint.
– smci
2 hours ago
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
I have Landsat data for 31 years in 6 NetCDF files.
Each files has about 4 million rows of data.
Each file has an as yet unknown number of timesteps in each
Except for the first file that I am using to write my script, which has 60 timesteps, and will be used to process all the files.
I have never used R before this wee exercise.
My task is to create one set of land only data for a statistician to work on.
How can I loop through the timesteps calculating ndwi?
I have this code working:
But it requires a lot of find/replace, copy/paste to work through the 60 timesteps and then there are 5 more files to process.
###############################################################
# Calculate ndwi for for the each timestep
###############################################################
#convert the timestep_1 for green to numeric value
green_nir_df02$t_1 <- as.numeric(as.character(green_nir_df02$t_1))
head (na.omit(green_nir_df02$t_1, 20))
#convert the timestep_1 for nir to numeric value
green_nir_df02$t_1.1 <- as.numeric(as.character(green_nir_df02$t_1.1))
head (na.omit(green_nir_df02$t_1.1, 20))
# calculate green minus nir for timestep_1
grnMinusNir_t_1 <- green_nir_df02$t_1 - green_nir_df02$t_1.1
head(grnMinusNir_t_1, 20)
# calculate green plus nir for timestep_1
grnPlusNir_t_1 <- green_nir_df02$t_1 + green_nir_df02$t_1.1
head(grnPlusNir_t_1, 20)
# calculate ndwi from greenMinusNir divided by greenPlusNir for timestep_1
ndwi_t_1 <- grnMinusNir_t_1 / grnPlusNir_t_1
head(ndwi_t_1, 20)
# write ndwi to the green_nir_df02 dataframe for timestep_1
green_nir_df02$ndwi_t_1 <- ndwi_t_1
####################################################
#convert the timestep_2 for green to numeric value
green_nir_df02$t_2 <- as.numeric(as.character(green_nir_df02$t_2))
head (na.omit(green_nir_df02$t_2, 20))
#convert the timestep_2 for nir to numeric value
green_nir_df02$t_2.1 <- as.numeric(as.character(green_nir_df02$t_2.1))
head (na.omit(green_nir_df02$t_2.1, 20))
# calculate green minus nir for timestep_2
grnMinusNir_t_2 <- green_nir_df02$t_2 - green_nir_df02$t_2.1
head(grnMinusNir_t_2, 20)
# calculate green plus nir for timestep_2
grnPlusNir_t_2 <- green_nir_df02$t_2 + green_nir_df02$t_2.1
head(grnPlusNir_t_2, 20)
# calculate ndwi from greenMinusNir divided by greenPlusNir for timestep_2
ndwi_t_2 <- grnMinusNir_t_2 / grnPlusNir_t_2
head(ndwi_t_2, 20)
# write ndwi to the green_nir_df02 dataframe for timestep_2
green_nir_df02$ndwi_t_2 <- ndwi_t_2
... etc to t_60
###############################################################
# END: Calculate ndwi for for the each timestep
###############################################################
and the next file starts at t_61 and so on
I have tried using this loop, which is not working because I can't point to the value in the column of t_1 in the dataframe rather than the value in the array of column names.
# initialize the "for" loop to calculate ndwi columns
seq <- 1:nt
i <- 0
green_t <- array(1:nt)
nir_t <- array(1:nt)
ndwi_t <- array(1:nt)
greenMinusNir <- array(1:nt)
greenPlusNir <- array(1:nt)
ndwi <- array(1:nt)
# set the value of the count
# count = nt from previous file #nt = num timesteps#THIS IS THE VALUE TO USE
count <- 0 ################################## EDIT THIS VALUE BEFORE RUNNING
# START: Loop for each timestep
# Step 1: create variable names for green, nir, ndwi
# Step 2: convert t_[i] and t_[i].1 to numeric
# Step 3: calculate green minus nir
# Step 4: calculate green plus nir
# Step 5: calculate ndwi = g-nir/g+nir
# Step 6: write ndwi to the green_nir_df02 dataframe
for(i in seq){
# Step 1: create variable names for green, nir, ndwi
green_t[i] <- paste("green_nir_df02$t_",count+i,sep="")
nir_t[i] <- paste("green_nir_df02$t_",count+i,".1",sep="")
ndwi_t[i] <- paste("green_nir_df02$ndwi_t_",count+i,sep="")
}
# initialize the "for" loop to calculate ndwi columns
seq <- 1:nt
i <- 0
for(i in seq){
# Step 2: convert green (i.e., t_[i]) and nir (i.e., t_[i].1) to numeric
green <- as.numeric(as.character(green_t[i]))
##
#ERROR when i=1 > green <-as.numeric(as.character("green_nir_df02$t_1"))
##
nir_t <- as.numeric(as.character(nir_t))
# Step 3: calculate green minus nir
greenMinusNir[i] <- green_t - nir_t
# Step 4: calculate green plus nir
greenPlusNir[i] <- green_t[i] + nir_t[i]
# Step 5: calculate ndwi = g-nir/g+nir
ndwi[i] <- greenMinusNir[i] / greenPlusNir[i]
# Step 6: write ndwi to the green_nir_df02 dataframe ndwi timestep
ndwi_t[i] <- ndwi[i]
# i <- i+1
}
Sample data
lon_lat t_1 t_2 t_60 t_1.1 t_2.1 t_60.1 ndwi_t_1 ndwi_t_2
1 -1609787.5_-2275087.5 180 216 247 80 197 192 0.3846154 0.04600484
2 -1609762.5_-2275087.5 102 252 281 80 197 227 0.1208791 0.12249443
3 -1609737.5_-2275087.5 102 216 281 80 156 227 0.1208791 0.16129032
4 -1609712.5_-2275087.5 141 216 281 80 156 227 0.2760181 0.16129032
5 -1609687.5_-2275087.5 180 181 281 80 156 227 0.3846154 0.07418398
6 -1609662.5_-2275087.5 180 216 281 80 197 227 0.3846154 0.04600484
r
New contributor
I have Landsat data for 31 years in 6 NetCDF files.
Each files has about 4 million rows of data.
Each file has an as yet unknown number of timesteps in each
Except for the first file that I am using to write my script, which has 60 timesteps, and will be used to process all the files.
I have never used R before this wee exercise.
My task is to create one set of land only data for a statistician to work on.
How can I loop through the timesteps calculating ndwi?
I have this code working:
But it requires a lot of find/replace, copy/paste to work through the 60 timesteps and then there are 5 more files to process.
###############################################################
# Calculate ndwi for for the each timestep
###############################################################
#convert the timestep_1 for green to numeric value
green_nir_df02$t_1 <- as.numeric(as.character(green_nir_df02$t_1))
head (na.omit(green_nir_df02$t_1, 20))
#convert the timestep_1 for nir to numeric value
green_nir_df02$t_1.1 <- as.numeric(as.character(green_nir_df02$t_1.1))
head (na.omit(green_nir_df02$t_1.1, 20))
# calculate green minus nir for timestep_1
grnMinusNir_t_1 <- green_nir_df02$t_1 - green_nir_df02$t_1.1
head(grnMinusNir_t_1, 20)
# calculate green plus nir for timestep_1
grnPlusNir_t_1 <- green_nir_df02$t_1 + green_nir_df02$t_1.1
head(grnPlusNir_t_1, 20)
# calculate ndwi from greenMinusNir divided by greenPlusNir for timestep_1
ndwi_t_1 <- grnMinusNir_t_1 / grnPlusNir_t_1
head(ndwi_t_1, 20)
# write ndwi to the green_nir_df02 dataframe for timestep_1
green_nir_df02$ndwi_t_1 <- ndwi_t_1
####################################################
#convert the timestep_2 for green to numeric value
green_nir_df02$t_2 <- as.numeric(as.character(green_nir_df02$t_2))
head (na.omit(green_nir_df02$t_2, 20))
#convert the timestep_2 for nir to numeric value
green_nir_df02$t_2.1 <- as.numeric(as.character(green_nir_df02$t_2.1))
head (na.omit(green_nir_df02$t_2.1, 20))
# calculate green minus nir for timestep_2
grnMinusNir_t_2 <- green_nir_df02$t_2 - green_nir_df02$t_2.1
head(grnMinusNir_t_2, 20)
# calculate green plus nir for timestep_2
grnPlusNir_t_2 <- green_nir_df02$t_2 + green_nir_df02$t_2.1
head(grnPlusNir_t_2, 20)
# calculate ndwi from greenMinusNir divided by greenPlusNir for timestep_2
ndwi_t_2 <- grnMinusNir_t_2 / grnPlusNir_t_2
head(ndwi_t_2, 20)
# write ndwi to the green_nir_df02 dataframe for timestep_2
green_nir_df02$ndwi_t_2 <- ndwi_t_2
... etc to t_60
###############################################################
# END: Calculate ndwi for for the each timestep
###############################################################
and the next file starts at t_61 and so on
I have tried using this loop, which is not working because I can't point to the value in the column of t_1 in the dataframe rather than the value in the array of column names.
# initialize the "for" loop to calculate ndwi columns
seq <- 1:nt
i <- 0
green_t <- array(1:nt)
nir_t <- array(1:nt)
ndwi_t <- array(1:nt)
greenMinusNir <- array(1:nt)
greenPlusNir <- array(1:nt)
ndwi <- array(1:nt)
# set the value of the count
# count = nt from previous file #nt = num timesteps#THIS IS THE VALUE TO USE
count <- 0 ################################## EDIT THIS VALUE BEFORE RUNNING
# START: Loop for each timestep
# Step 1: create variable names for green, nir, ndwi
# Step 2: convert t_[i] and t_[i].1 to numeric
# Step 3: calculate green minus nir
# Step 4: calculate green plus nir
# Step 5: calculate ndwi = g-nir/g+nir
# Step 6: write ndwi to the green_nir_df02 dataframe
for(i in seq){
# Step 1: create variable names for green, nir, ndwi
green_t[i] <- paste("green_nir_df02$t_",count+i,sep="")
nir_t[i] <- paste("green_nir_df02$t_",count+i,".1",sep="")
ndwi_t[i] <- paste("green_nir_df02$ndwi_t_",count+i,sep="")
}
# initialize the "for" loop to calculate ndwi columns
seq <- 1:nt
i <- 0
for(i in seq){
# Step 2: convert green (i.e., t_[i]) and nir (i.e., t_[i].1) to numeric
green <- as.numeric(as.character(green_t[i]))
##
#ERROR when i=1 > green <-as.numeric(as.character("green_nir_df02$t_1"))
##
nir_t <- as.numeric(as.character(nir_t))
# Step 3: calculate green minus nir
greenMinusNir[i] <- green_t - nir_t
# Step 4: calculate green plus nir
greenPlusNir[i] <- green_t[i] + nir_t[i]
# Step 5: calculate ndwi = g-nir/g+nir
ndwi[i] <- greenMinusNir[i] / greenPlusNir[i]
# Step 6: write ndwi to the green_nir_df02 dataframe ndwi timestep
ndwi_t[i] <- ndwi[i]
# i <- i+1
}
Sample data
lon_lat t_1 t_2 t_60 t_1.1 t_2.1 t_60.1 ndwi_t_1 ndwi_t_2
1 -1609787.5_-2275087.5 180 216 247 80 197 192 0.3846154 0.04600484
2 -1609762.5_-2275087.5 102 252 281 80 197 227 0.1208791 0.12249443
3 -1609737.5_-2275087.5 102 216 281 80 156 227 0.1208791 0.16129032
4 -1609712.5_-2275087.5 141 216 281 80 156 227 0.2760181 0.16129032
5 -1609687.5_-2275087.5 180 181 281 80 156 227 0.3846154 0.07418398
6 -1609662.5_-2275087.5 180 216 281 80 197 227 0.3846154 0.04600484
r
r
New contributor
New contributor
edited 3 hours ago
New contributor
asked 4 hours ago
Ilona
11
11
New contributor
New contributor
1
Welcome to Stack Overflow! Could you make your problem reproducible by sharing a sample of your data so others can help (please do not usestr()
,head()
or screenshot)? You can use thereprex
anddatapasta
packages to assist you with that. See also Help me Help you & How to make a great R reproducible example?
– Tung
3 hours ago
Writing the question in jargon that 0.01% of users would understand is going to reduce the responses. 'NDWI' apparently means Normalized Difference Water Index, computed at each timestep t. So that boils down to you want to vectorize one calculation on all rows of a dataframe. Although you said the way the timesteps are represented causes an issue. Please explain more, i.e. post a snippet of (say 10-40) actual lines from your dataframe.
– smci
2 hours ago
"How can I loop through the timesteps calculating NDWI?" Don't. Vectorize them: write a custom functioncalculate_ndwi(...)
then useapply()
function to run it on all the rows and columns you need to. There should be zero for-loops in your final answer. Also, your timesteps should not be string to begin with, they should be read in directly as numeric from your netCDF file, or else converted immediately after reading. But, please show us a snippet of your input data, that's the best startpoint.
– smci
2 hours ago
add a comment |
1
Welcome to Stack Overflow! Could you make your problem reproducible by sharing a sample of your data so others can help (please do not usestr()
,head()
or screenshot)? You can use thereprex
anddatapasta
packages to assist you with that. See also Help me Help you & How to make a great R reproducible example?
– Tung
3 hours ago
Writing the question in jargon that 0.01% of users would understand is going to reduce the responses. 'NDWI' apparently means Normalized Difference Water Index, computed at each timestep t. So that boils down to you want to vectorize one calculation on all rows of a dataframe. Although you said the way the timesteps are represented causes an issue. Please explain more, i.e. post a snippet of (say 10-40) actual lines from your dataframe.
– smci
2 hours ago
"How can I loop through the timesteps calculating NDWI?" Don't. Vectorize them: write a custom functioncalculate_ndwi(...)
then useapply()
function to run it on all the rows and columns you need to. There should be zero for-loops in your final answer. Also, your timesteps should not be string to begin with, they should be read in directly as numeric from your netCDF file, or else converted immediately after reading. But, please show us a snippet of your input data, that's the best startpoint.
– smci
2 hours ago
1
1
Welcome to Stack Overflow! Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use
str()
, head()
or screenshot)? You can use the reprex
and datapasta
packages to assist you with that. See also Help me Help you & How to make a great R reproducible example?– Tung
3 hours ago
Welcome to Stack Overflow! Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use
str()
, head()
or screenshot)? You can use the reprex
and datapasta
packages to assist you with that. See also Help me Help you & How to make a great R reproducible example?– Tung
3 hours ago
Writing the question in jargon that 0.01% of users would understand is going to reduce the responses. 'NDWI' apparently means Normalized Difference Water Index, computed at each timestep t. So that boils down to you want to vectorize one calculation on all rows of a dataframe. Although you said the way the timesteps are represented causes an issue. Please explain more, i.e. post a snippet of (say 10-40) actual lines from your dataframe.
– smci
2 hours ago
Writing the question in jargon that 0.01% of users would understand is going to reduce the responses. 'NDWI' apparently means Normalized Difference Water Index, computed at each timestep t. So that boils down to you want to vectorize one calculation on all rows of a dataframe. Although you said the way the timesteps are represented causes an issue. Please explain more, i.e. post a snippet of (say 10-40) actual lines from your dataframe.
– smci
2 hours ago
"How can I loop through the timesteps calculating NDWI?" Don't. Vectorize them: write a custom function
calculate_ndwi(...)
then use apply()
function to run it on all the rows and columns you need to. There should be zero for-loops in your final answer. Also, your timesteps should not be string to begin with, they should be read in directly as numeric from your netCDF file, or else converted immediately after reading. But, please show us a snippet of your input data, that's the best startpoint.– smci
2 hours ago
"How can I loop through the timesteps calculating NDWI?" Don't. Vectorize them: write a custom function
calculate_ndwi(...)
then use apply()
function to run it on all the rows and columns you need to. There should be zero for-loops in your final answer. Also, your timesteps should not be string to begin with, they should be read in directly as numeric from your netCDF file, or else converted immediately after reading. But, please show us a snippet of your input data, that's the best startpoint.– smci
2 hours ago
add a comment |
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
Ilona is a new contributor. Be nice, and check out our Code of Conduct.
Ilona is a new contributor. Be nice, and check out our Code of Conduct.
Ilona is a new contributor. Be nice, and check out our Code of Conduct.
Ilona is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53370614%2fusing-r-how-to-write-a-loop-to-calculate-ndwi-using-green-and-nir-in-a-dataframe%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Welcome to Stack Overflow! Could you make your problem reproducible by sharing a sample of your data so others can help (please do not use
str()
,head()
or screenshot)? You can use thereprex
anddatapasta
packages to assist you with that. See also Help me Help you & How to make a great R reproducible example?– Tung
3 hours ago
Writing the question in jargon that 0.01% of users would understand is going to reduce the responses. 'NDWI' apparently means Normalized Difference Water Index, computed at each timestep t. So that boils down to you want to vectorize one calculation on all rows of a dataframe. Although you said the way the timesteps are represented causes an issue. Please explain more, i.e. post a snippet of (say 10-40) actual lines from your dataframe.
– smci
2 hours ago
"How can I loop through the timesteps calculating NDWI?" Don't. Vectorize them: write a custom function
calculate_ndwi(...)
then useapply()
function to run it on all the rows and columns you need to. There should be zero for-loops in your final answer. Also, your timesteps should not be string to begin with, they should be read in directly as numeric from your netCDF file, or else converted immediately after reading. But, please show us a snippet of your input data, that's the best startpoint.– smci
2 hours ago