Part 3: This is the third in a multi-part series on incrementally building a dashboard using Plotly Dash. We add selection items to a dropdown filter that will filter the Folium maps in the dashboard we set up (we dynamically create the maps here).

Image for post
Image for post
Himalayan blue poppy

The data from Inside AirBnB provides publicly available data for AirBnB listings in Seattle, USA. We will use the dashboard framework we created in the first part of this series and include spatial data to explore the locations of the listings.

GOAL


Part 4: This is the fourth installment of a multi-part series on incrementally building a dashboard using Plotly Dash. We discuss data cleaning, data type recognition, formatting and preparation especially with regard to creating a Sankey and Bar chart. The Sankey diagram will show the number and types of properties in a neighborhood

Image for post
Image for post
Himalayan blue poppy

The data from Inside AirBnB provides publicly available data for AirBnB listings in Seattle, USA. We will use the dashboard framework we created in the first part of this series and include spatial data to explore the locations of the listings

Goal

We will be creating a Sankey chart for the dashboard as shown below. We also have a segmented (by neighborhood) bar chart, for each neighborhood group, that shows the same information but in a different style next to it. In the Sankey chart, each link shows number of a certain property type in a certain neighborhood. But it can also show the split of the types of properties within a neighborhood. The aggregates of property types as well as the aggregates of properties within a neighborhood are displayed upon hovering on a node. This is the additional information that the Sankey chart offers over the segmented bar chart, within the same space. …


Part 2: This is the second installment of a multi-part series on incrementally building a dashboard using Plotly Dash. We create a spatial data scatter plot using Folium.

Image for post
Image for post
Himalayan blue poppy

The data from Inside AirBnB provides publicly available data for AirBnB listings in Seattle, USA. We will use the dashboard framework we created in the first part of this series and include spatial data to explore the locations of the listings

GOAL

We are aiming for a dashboard layout as seen below. We will add iFrames to our layout to display Folium maps. These maps are interactive, with hover tooltips that provide more details about each point. On the left we have each point representing a listing. On the right the map shows the aggregate count of listings for neighborhoods within Seattle.
Note: The maps are static (.html files) in this article and will respond dynamically to the filters on the left in the next article. …


Part 1 : This is the first of a multi-part series on incrementally building a dashboard using Plotly Dash. We begin with the scaffolding — the HTML structure for the pages.

Image for post
Image for post
Himalayan blue poppy

Note : Much is owed to Gabriele Albini whose project on building a professional dashboard structure helped me get started with my own journey with Plotly Dash dashboards. I have used elements of the HTML building structure that he has shared, notably the reusable HTML element creating functions such as get_header and get_navbar.

A dataset is rendered far more meaningful when translated into images from which you can reap insights. Contextually organizing these images, further enhances the intuitive value. A dashboard is a curated collection of visuals that relay the maximum information most effectively about the underlying dataset. Each visual speaks volumes about an aspect of the data. With a thoughtful arrangement of visuals, an exploratory route leading to a final conclusion can be provided through a well designed dashboard.
Another aspect of the dashboard that is alluring is that we can offer interactive elements to the user. This allows the user to focus on the segment of data that they are interested in. The segment can be a timespan, an area, a product of interest, a comparison choice, you get the point….Tooltips and panning features in interactive visualization are really helpful in connecting the underlying values with the visual representation. …


Image for post
Image for post

Relevant Scenario

In a company that handles leases, one of the rates that accounting applies to generate journal entries is depreciation rate. The accounting team has been tasked with updating the depreciation rate for assets for various countries beginning January 01, 2020. While they can insert the new rates and set their effective date from January 01, 2020, the team also has to cap the previous rates denoting that the historic rates stored are effective only until midnight of December 31, 2020.

This is where the analytic LEAD function in SQL steps in. It can be used to look into the rows ahead (when rows are ordered in ascending order of effective date) and retrieve the start (effective from date) of the next rate validity window. …


Image for post
Image for post

Relevant Scenario

What if we wanted to fly from the US east coast to Seattle, on the west coast, on a flight that is most likely to arrive on time? From a choice of three cities — New York, Newark and Boston, we would like to find out which airline within each sector (Origin city-Destination combination) contributes to the least number of delayed flights.

Having gathered data for a month’s worth of flight data to our destination Seattle, from these three cities on the east coast that we have our sights on, we can aggregate the total flights for each carrier, and the total delayed flights for each carrier. …


Image for post
Image for post

Relevant Scenario

Merging two or more tables (in a relational database), queries (in Power BI) and data frames (in Pandas) is an every day task. The challenge arises when the columns to be joined on, do not contain exact matches. Excuse the pun, but we may not always get to join Apples to Apples. Sometimes Apples need to be joined to Granny Smith Apples to retrieve the sales of the generic fruit Apple for instance.

Data tables

The data we will consider is hosted within the tables of interest shown below. The table NHL_DATA contains win(s) and loss(es) for teams identified by their popular names. The requirement here is to match up the team to the city they belong to so that wins and losses can be aggregated or averaged by city. And the catch is that the information we have in the CITIES_DATA table lists the teams only with a partial, terse name. For instance, “Anaheim Ducks” belong to the city of Los Angeles and is listed with the name “Ducks” in the CITIES table. …


Image for post
Image for post

Relevant Scenario

This scenario was developed in response to a requirement to translate excel pivoting capabilities into Pandas code so that the automation developed could be applied on a large set of excel files. The raw data in the excel sheet primarily consists of contracts won in each fiscal year paired with their contract amounts. The requirement is to report on the number of contracts in each fiscal year that fall into “bins” or “buckets” of pre-defined contract amounts. Six bins of contract amounts were identified as bins of interest:

1. <10M
2. >=10M<25M
3. >=25M<50M
4. >=50M<75M
5. >=75M<100M
6. >=100M

Three techniques have been provided below which create the output desired using a relational database, Pandas and Power BI- with usage of DAX functions. …


Image for post
Image for post

Relevant Scenario

Consider a table with a column with customer names and another column with a comma-separated list of items bought. This scenario or structure of data is extremely common when you ask for information from the field. It could be a Deal Id in the first column and a list of Subscriptions in the next. It could be Country in the first and a list of States in the next or Parent Company as key and Child (subsidiary companies) in the next column.

To work with the data in any meaningful manner though we do need to separate out the list of comma separated data into individual rows identified by the key element. …


Image for post
Image for post

Relevant Scenario:

Retrieve Products from the Products table that have not had a sale (do not have a row in the Sales table)

Related tables:

The Products and Sales tables are related to each other with the productid key. Inner join will retrieve rows common to both. In an outer join, unmatched rows in one or both tables can be returned with NULLs or blank data returned in the columns of the table not containing the join key.

About

Simi Talkar

Certified Data Analyst (DA-100 Power BI), pursuing Masters in University Of Michigan’s Applied Data Science Program https://www.linkedin.com/in/simi-talkar/

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store