xlsx file is called a
workbook , in this course we will work with xlsx file call 500.xlsx which contain personal data of 1500 peoples (500 from US, 500 from UK and 500 from CA)
workbook contain 1 or many
worksheet . From image you can see 3 worksheets with name : us-500 , uk-500, ca-500.
rows of data. Normally the first row will contain all column name.
cell is specified by it’s
row . For example company name “Printing Dimensions” is specified by row = 2 and column = “C”
Both Python and openpyxl could be installed by Anaconda. Go to download page and download Anaconda for Python 3.6. After download, just double click to install with default option.
After install, open command prompt and typing in
python . The anaconda and python version should show up as below
From python shell, import openpyxl to make sure that it is installed successfully.
In this course I use Sublimetext as Python IDE to writing code (you can choose any other text edition you want). I choose Sublimtext due to :
We could run python right from editor by short cut key “Ctrl + B”
It very fast text editor
Have good theme to high light code
First thing we need to do is import module
This module provide function allow us to to create a workbook or load a workbook from a file. In the first part of this course we will load a existing workbook from a
# loading workbook from file wb = openpyxl.load_workbook('500.xlsx')
After load workbook, we could access current active worksheet with code. Only has one active worksheet at a time, and you could access data only from active worksheet. Default active worksheet is the first sheet.
# get current active worksheet current_active_worksheet = wb.get_active_sheet() print(current_active_worksheet)
To get name of all worksheets
# get all sheet name all_sheet_names = wb.get_sheet_names() print(all_sheet_names)
Now, you can set active sheet by name
current_active_worksheet = wb.get_sheet_by_name('uk-500') print(current_active_worksheet)
Following loop could help to access all sheet
# for loop to access all sheet inside workbook for ws in wb: print(ws.title)
With MS Excel:
One cell could be accessed simply specified by combination column and row. And then actual data could be accessed with
value cell property.
# print out data at column A and row 2 of current worksheet print(ws['A2'].value) # set value for one cell ws['A2'] = 'Tan Pham' # print out new data print(ws['A2'].value)
If we only specify just column, we could access all cells inside that column
# print all data from column A (the first column) for cell in ws['A']: print(cell.value)
If we only specify just row, we could access all cells inside that row
# print all data from row 1, which are columns name for cell in ws['1']: print(cell.value)
Incase we want to scan the whole sheet,
openpyl provide property call
# access all data , row by row for row in ws.rows: for cell in row: print(cell.value) # access all data , column by column for column in ws.columns: for cell in column: print(cell.value)
After doing modify data, to save workbook with name.
# save workbook with name wb.save("500.xlsx")
To create a new empty workbook. A workbook is always created with at least one worksheet.
from openpyxl import Workbook # create a new workbook wb = Workbook() # get the default worksheet ws = wb.active # set worksheet title ws.title = "first sheet"
Cell data is set by combination of column and row.
# set value for first row as column name ws['A1']='first column' ws['B1']='second column' # set value for second row ws['A2']=1 ws['B2']=2