Commit 6bb88039 authored by George Mount
upload files

parent 72943bba
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "quarterly-project",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd # Data analysis\n",
"\n",
"import seaborn as sns #Charts\n",
"import matplotlib.pyplot as plt #Charts\n",
"\n",
"import xlsxwriter #Write to Excel\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "referenced-bradford",
"metadata": {},
"outputs": [],
"source": [
"borough = ['The Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island']\n",
"pop = [1418207, 2559903, 1628706, 2253858, 476143]\n",
"size = [42.10, 70.82, 22.83, 108.53, 58.37]\n",
"\n",
"data = {\"borough\": borough, \"pop\": pop, \"size\": size}\n",
"\n",
"nyc = pd.DataFrame(data)\n",
"\n",
"# Sort from high to low\n",
"nyc = nyc.sort_values(by='pop', ascending=False)\n",
"\n",
"nyc"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "warming-boating",
"metadata": {},
"outputs": [],
"source": [
"# Write to Excel from pandas, limited formatting\n",
"nyc.to_excel('nyc-pandas.xlsx')"
]
},
{
"cell_type": "markdown",
"id": "civilian-tennis",
"metadata": {},
"source": [
"## Writing to `xlsxwriter`\n",
"\n",
"This will let you format cells, add charts, etc. \n",
"\n",
"A few steps to write a `pandas` DataFrame to Excel with `xlsxwriter`:\n",
"\n",
"1. Set `pandas` engine to `xlsxwriter`\n",
"2. Convert DataFrame to `xlsxwriter` object\n",
"3. Get `xlsxwriter` workbook and worksheet objects from DataFrame writer object\n",
"4. Save and close connection.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "boxed-master",
"metadata": {},
"outputs": [],
"source": [
"# 1. Set Pandas engine to xlsxwriter\n",
"writer = pd.ExcelWriter('nyc.xlsx', engine='xlsxwriter')\n",
"\n",
"# 2. Convert the dataframe to an XlsxWriter Excel object.\n",
"nyc.to_excel(writer, sheet_name='Sheet1', index=False)\n",
"\n",
"# 3. Get the xlsxwriter objects from the DataFrame writer object.\n",
"workbook = writer.book\n",
"worksheet = writer.sheets['Sheet1']"
]
},
{
"cell_type": "markdown",
"id": "considered-thanksgiving",
"metadata": {},
"source": [
"If we were to open the workbook now, it would look like this: \n",
"\n",
"\n",
"<img src=\"images/nyc-start.png\" alt=\"NYC worksheet start\" style=\"width: 750px\"/>\n",
"\n",
"Let's make a few improvements, shall we?\n",
"\n",
"1. Widen column `A`\n",
"2. Format column `B` in thousands\n",
"3. Add charts (We'll do one Excel chart, one Python chart... why not?)\n"
]
},
{
"cell_type": "markdown",
"id": "impaired-server",
"metadata": {},
"source": [
"## Format numbers\n",
"\n",
"We can use `xlsxwriter`'s `set_column()` method: \n",
"\n",
"```\n",
"set_column(first_col, last_col, width, cell_format, options)\n",
"```\n",
"\n",
"I will locate the position of each column by name in the DataFrame with the `get_loc()` method from `pandas`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "delayed-windows",
"metadata": {},
"outputs": [],
"source": [
"# Get population index position\n",
"borough_col = nyc.columns.get_loc('borough')\n",
"borough_col\n",
"\n",
"# Python uses zero-based indexing"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "awful-wisconsin",
"metadata": {},
"outputs": [],
"source": [
"# Get population index position\n",
"\n",
"pop_col = nyc.columns.get_loc('pop')\n",
"pop_col"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "attractive-yacht",
"metadata": {},
"outputs": [],
"source": [
"# Re-set width of Borough column\n",
"# No auto-fit feature \n",
"\n",
"worksheet.set_column(borough_col, borough_col, 12)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "supported-weather",
"metadata": {},
"outputs": [],
"source": [
"# Set format of Population format to thousands\n",
"\n",
"thousands_format = workbook.add_format({'num_format':'#,##0'})\n",
"worksheet.set_column(pop_col, pop_col, None, thousands_format)"
]
},
{
"cell_type": "markdown",
"id": "beneficial-humanity",
"metadata": {},
"source": [
"## Add a chart using Excel\n",
"\n",
"1. Add chart type\n",
"2. Add series: `[sheetname, first_row, first_col, last_row, last_col]`\n",
"3. Add chart axes, titles, etc.\n",
"4. Insert chart into Excel"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "involved-corporation",
"metadata": {},
"outputs": [],
"source": [
"nyc_chart = workbook.add_chart({'type': 'column'})\n",
"\n",
"# Get total number of rows\n",
"max_row = nyc.shape[0]\n",
"\n",
"\n",
"# Don't include header data (\"Oth\" row) in the chart\n",
"nyc_chart.add_series({\n",
" 'name': 'Borough',\n",
" 'categories': ['Sheet1', 1, borough_col, max_row, borough_col], \n",
" 'values': ['Sheet1', 1, pop_col, max_row, pop_col],\n",
"})\n",
"\n",
"# Set chart title\n",
"nyc_chart.set_title({'name': 'NYC population by borough'})\n",
"\n",
"\n",
"# Insert the chart into the worksheet.\n",
"worksheet.insert_chart('G2', nyc_chart)\n"
]
},
{
"cell_type": "markdown",
"id": "neutral-potato",
"metadata": {},
"source": [
"## Add a graph using `seaborn`\n",
"\n",
"1. Create plot in Python\n",
"2. Add chart axes, titles, etc.\n",
"3. Save image locally\n",
"4. Insert it into Excel "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "exterior-costs",
"metadata": {},
"outputs": [],
"source": [
"sns.barplot(x='borough', y='pop', data=nyc, color='blue')\n",
"plt.title('NYC population by borough')\n",
"\n",
"\n",
"# Save the image\n",
"plt.savefig('images/nyc-pop.png', dpi = (300))\n",
"\n",
"\n",
"# Add the image to the workbook\n",
"worksheet.insert_image('G20', 'images/nyc-pop.png')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "corresponding-boards",
"metadata": {},
"outputs": [],
"source": [
"# Close workbook\n",
"workbook.close()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "right-lover",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
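The notebook addresses cells in two ways: zero-based `(row, col)` positions in the series lists (`['Sheet1', 1, pop_col, max_row, pop_col]`) and A1-style strings such as `'G2'` in `insert_chart()`. The sketch below shows how the two notations line up. The helper names (`col_to_letters`, `rowcol_to_a1`, `series_range`) are hypothetical and written only for illustration; `xlsxwriter` ships its own converter, `xl_rowcol_to_cell` in `xlsxwriter.utility`, and the ranges it builds internally use absolute references (`$B$2`), which this sketch leaves out.

```python
def col_to_letters(col: int) -> str:
    """Convert a zero-based column index to Excel column letters (0 -> 'A')."""
    if col < 0:
        raise ValueError("column index must be non-negative")
    letters = ""
    col += 1  # switch to one-based before the base-26 conversion
    while col > 0:
        col, remainder = divmod(col - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def rowcol_to_a1(row: int, col: int) -> str:
    """Convert zero-based (row, col) to an A1-style reference."""
    if row < 0:
        raise ValueError("row index must be non-negative")
    # Excel rows are one-based, so row 0 (the header) is row 1 in the sheet
    return f"{col_to_letters(col)}{row + 1}"


def series_range(sheet, first_row, first_col, last_row, last_col):
    """Spell out a [sheet, first_row, first_col, last_row, last_col] list as a range."""
    start = rowcol_to_a1(first_row, first_col)
    end = rowcol_to_a1(last_row, last_col)
    return f"{sheet}!{start}:{end}"


# The chart anchor used in the notebook: row 1, column 6 in zero-based terms
print(rowcol_to_a1(1, 6))

# The population series: rows 1-5 of column 1, skipping the header in row 0
print(series_range("Sheet1", 1, 1, 5, 1))
```

With five boroughs, `max_row` is 5 and `pop_col` is 1, so the population values sit in `B2:B6`, one row below the header that `to_excel()` writes in row 0.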