Author: Mayura Deshmukh
Links: Enterprise Solutions, Business Intelligence and Appliance Solutions
The objective of this project was to show the step-by-step process of building a data warehouse and business intelligence solution. Episode 1 discusses some of the questions we were hoping to answer at the end of this project. Episode 11 focuses on answering these questions using Toad Business Intelligence Suite.
Figure 1 explains how a sample Business Intelligence (BI) architecture can be implemented. For more information on setting it up, refer to the white paper Business Intelligence on Dell Quickstart Data Warehouse Appliance Using Toad Business Intelligence Suite.
Figure 1. BI Architecture for Toad BI Suite and QSDW
For more information on the Toad Business Intelligence Suite, or to download the trial version, see Toad Business Intelligence Suite.
Episode 1 of the series talks about getting answers for the following questions:
Using Toad Data Point Modeling tools, we built various views. Smaller datasets are easier to work with. So only the required data is selected, filters are applied, and the results are published to Toad Intelligence Central as snapshots or views as shown in Figure 2 and Figure 3. These snapshots or views are shared with the end users, who will be working on the dataset to build reports, charts, and find patterns.
Figure 2. Publish to Toad Intelligence Central from Toad Data Point
For example, to find the top 10 patent producing companies for a given year we require data from the Application, Mapping and Inventory table. Using Query builder in Data Point, we selected the required columns, and the query builder automatically created a query and displayed the results. These results were published to Toad Intelligence Central and shared with the users who will be working on the visualization using Toad Decision Point.
Using Toad Decision Point, business users can group data using the Dimensional Data View and can view the details of that data in which they are interested in Tabular view. Users can create rollups and easily drill down into data. Toad Decision Point allows you to easily copy graphs into your presentations, export the reports in Microsoft Excel, or publish them to Intelligence Central and share them with other users. Below are some of the reports and charts built using Toad Decision Point. For more details on building these reports refer to the paper Analyzing Patent Data in Dell Quickstart Data Warehouse Appliance Using Toad Business Intelligence Suite
Top 10 Patent producing companies for 2012
Top 10 patent producing cities for 2012
Top patent categories for a given company for 2012 (Dell)
Maximum time before a patent is awarded for 2012 per Patent Type
List of 10 patents for 2012 that took the most time to be awarded
Days taken to grant patent
Plasmids for transforming plant cells
Genetically transformed plants
Monitoring system for a ship's radiated noise
Flight control of missiles
Optical sensing arrangements
Method for amplification and detection of RNA sequences
Polynucleotides comprising intercalating entities
Semiclosed Brayton cycle power system with direct combustion heat transfer
Cloned DNA sequences hybridizable with genomic RNA of lymphadenopathy associated virus (LAV)
Ligation amplification of nucleic acid sequences
Average time it took for patents to be awarded in 2012
Top 10 patent authors for 2012
The most filed patent types in 2012
Number of patents awarded for each type every month/quarter for 2012
Episode 9 discusses removing duplicates from the data, but while building the reports, we realized there are more scenarios we need to consider for cleansing.
Figure 3. Duplicate Records
As shown in Figure 3, for some patents, the inventor has a middle initial listed, whereas for some patents, only the first name and last name are listed. Even though it is the same inventor, two different records are created in the inventor table and will be treated as two different inventors. Similarly, we observed duplicate records for company names spelled incorrectly, or different spellings were used, such as “Dell Products L.P.,” “Dell Products, LP,” or “Dell Products, Inc.”
Due to duplicate records, the data in the above reports may be inaccurate. Cleansing of such duplicates will be considered in the next phase of the project.