Last month we looked at how to create Microsoft Excel (i.e. *.xls) files using the xlwt package. Today we will be looking at how we can read an *.xls/*.xlsx file using a package called xlrd. The xlrd package can be run on Linux and Mac as well as Windows. This is great when you need to process an Excel file on a Linux server.
We will start out by reading the first Excel file we created in our previous article.
Let’s get started!
Reading an Excel Spreadsheet
In this section, we will look at a function that demonstrates different ways of reading an Excel file. Here’s the code example:
import xlrd #---------------------------------------------------------------------- def open_file(path): """ Open and read an Excel file """ book = xlrd.open_workbook(path) # print number of sheets print book.nsheets # print sheet names print book.sheet_names() # get the first worksheet first_sheet = book.sheet_by_index(0) # read a row print first_sheet.row_values(0) # read a cell cell = first_sheet.cell(0,0) print cell print cell.value # read a row slice print first_sheet.row_slice(rowx=0, start_colx=0, end_colx=2) #---------------------------------------------------------------------- if __name__ == "__main__": path = "test.xls" open_file(path)
Let’s break this down a bit. First we import xlrd and then in our function, we open the Excel workbook that was passed in. The next couple of lines show how to introspect the book. We find out how many worksheets there are in the workbook and we print out their names. Next we extract the first worksheet via the sheet_by_index method. We can read an entire row from the worksheet using the row_values method. If we want to get a particular cell’s value, we can call the cell method and pass it the row and column indexes. Finally we use xlrd’s row_slice method to read a portion of the row. As you can see, this last method accepts a row index and the starting and ending column indexes to determine what to return. The row_slice method returns a list of cell instances.
This makes it very easy to iterate over a group of cells. Here’s a small snippet to demonstrate:
cells = first_sheet.row_slice(rowx=0, start_colx=0, end_colx=2) for cell in cells: print cell.value
The xlrd package supports the following types of cells: text, number (i.e. float), dates (any number format that “looks” like a date), Boolean, error and empty/blank. The package also supports extracting data from named cells, although the project doesn’t support all types of named cells. The reference text is a bit vague on what exactly it does not support though.
If you need to copy cell formatting you will need to download the xlutils package.
Wrapping Up
At this point you should know enough to read most Excel files that were built using Microsoft’s XLS format. There is another package that also supports reading xls/xlsx files called the openpyxl project. You might want to check it out as an alternative.
Related Reading
- xlrd / xlwt / xlutils home page
- Excel reading guide
- Creating Microsoft Excel Spreadsheets with Python and xlwt
I want to extract comments and hyperlinks of cell in .xls file using xlrd library.
Plz let me know the functions and sample code
Thanks in advance.
I don’t think the xlrd library currently supports this. You’ll probably need to use PyWin32. Here’s one solution I found – http://stackoverflow.com/questions/3711625/how-to-read-or-parse-excel-comments-using-python
Hi, Is it possible to run iterations of data in excel using this library? I want to be able to read the content of all the columns in row1 and use it in the script and then in the next without the script stopping do a second execution using the data inside the next row2. In this way i will be able to run a script multiple times using different data that will be stored in excel rows. thanks
Yes, you should be able to do that
Pingback: Mike Driscoll: Top Ten Articles of 2016 | Adrian Tudor Web Designer and Programmer
i want to extract all data from a sheet by name and write it over another sheet by name is it possible?
I’m pretty sure you can use xlrd to check the sheet name before extracting its contents. Then you would use xlwt to write it to another sheet or file.
Please provide a simple example that would be really helpful for me
Pingback: Top 10 Most Read Mouse vs Python Articles of 2019 - The Mouse Vs. The Python