I'm using lapply to read multiple .xls files from a directory. Each file holds data collected at a different site, and the site ID is given by the filename, so I'd like to name each list element after the file it came from.

I am currently doing the following:

library(readxl)

# Set filepath
file_location = "FILEPATH"
# List all matching files within the folder (full paths)
filenames = list.files(file_location, pattern = "^ID.*xls", full.names = TRUE)
# Import all files
import_data = lapply(filenames, function(x) read_xls(x, col_names = TRUE))

I could then run something like this:

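# Strip the literal ".xls" extension (the dot is escaped so it is not treated as a wildcard)
filenames_short = gsub("\\.xls$", "", list.files(file_location, pattern = "^ID.*xls", full.names = FALSE))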
names(import_data)=filenames_short

However, my pessimistic self is telling me that there is a possibility that the order of the filenames won't match the order of the list. Surely there must be a way to set this in the original command?

  • You are too pessimistic.
    – jogo
    Commented Aug 20, 2018 at 9:08
  • You're probably right, @jogo. Wishful thinking for there to be a more elegant solution.
    – sym246
    Commented Aug 20, 2018 at 9:09

1 Answer

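I agree with @jogo: lapply returns its results in the same order as its input, and list.files sorts its results alphabetically, so the names will line up with the list. If you still feel uneasy about it, you can have each table carry the name of its file.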

One option is to add an attribute to the table:

import_data = lapply(filenames, function(x) {
  out <- read_xls(x, col_names = TRUE)
  # Store the source path on the table itself
  attr(out, "file") <- x
  return(out)
})
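As a rough sketch of how that attribute could be used afterwards, the snippet below reads the stored path back from each table and uses it to name the list. Here `read_stub()` and the two paths are hypothetical stand-ins so the example runs without any .xls files; in practice you would call read_xls instead.

```r
# Hypothetical stand-in for read_xls(): returns a small data frame for any path
read_stub <- function(path) data.frame(value = 1:3)

# Hypothetical file paths, used only for illustration
filenames <- c("data/ID001.xls", "data/ID002.xls")

import_data <- lapply(filenames, function(x) {
  out <- read_stub(x)
  attr(out, "file") <- x
  out
})

# Read the stored path back from each table
source_files <- vapply(import_data, function(d) attr(d, "file"), character(1))

# Name the list after the file, without directory or extension
names(import_data) <- tools::file_path_sans_ext(basename(source_files))
```

Keep in mind that custom attributes can be lost when a table is rebuilt by a later operation such as merge(), so it is probably safest to copy them into the list names early.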

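Another is to return a one-element list in which the table is already named after its file. Note that the result is then a list of one-element lists rather than a flat list of tables.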

import_data = lapply(filenames, function(x) {
  out <- list(read_xls(x, col_names = TRUE))
  # x is a full path, so drop the directory and the extension
  names(out) <- tools::file_path_sans_ext(basename(x))
  return(out)
})
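If a flat named list is what you are after, here is a minimal sketch of two possible ways to get there. It assumes the same kind of full paths as above; `read_stub()` and the paths are hypothetical stand-ins for read_xls and your real files. The first flattens the nested result by one level, the second names the input vector up front, relying on lapply keeping the names of its input.

```r
# Hypothetical stand-in for read_xls(): records which path it was given
read_stub <- function(path) data.frame(source = path, stringsAsFactors = FALSE)

# Hypothetical file paths, used only for illustration
filenames <- c("data/ID001.xls", "data/ID002.xls")

# Way 1: build the nested list, then flatten it by one level
nested <- lapply(filenames, function(x) {
  out <- list(read_stub(x))
  names(out) <- tools::file_path_sans_ext(basename(x))
  out
})
flat <- unlist(nested, recursive = FALSE)

# Way 2: name the input vector first; lapply() carries the names over
ids <- tools::file_path_sans_ext(basename(filenames))
import_data <- lapply(setNames(filenames, ids), read_stub)
```

With the second form the names are set in the same call that reads the files, so there is no separate step in which the order could drift.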
