BUSINESS INTELLIGENCE VOCABULARY
Actionable insights to
drive growth for business
data science & analytics
Analysis or report created on the spot to analyze the specific business questions that are not clearly answered in a pre-built report.
Ad Hoc Analysis is performed on an existing data warehouse or transactional database and results can be in form of direct output of query or data visualization or static report or a combination of all of them based on business need.
Alternative data refers to information that is obtained from nontraditional sources, which are used by relevant stack holders(e.g investors, hedge funds) to gain an advantage or sentiment of the market.
Is an RDBMS built for the cloud(Amazon Web Services).
- It provides full compatibility with open-sourced databases e.g. MySql & PostgreSQL.
- It is high in performance and cost-effective at the same time.
- It delivers high performance and availability.
Amazon Redshift is a fully managed, highly scalable(up to petabytes) enterprise data warehouse service provided by Amazon Web Services (AWS). Users can easily set up, operate, and scale a data warehouse in the cloud. Columnar storage & advanced compression techniques provide high performance and scalability for data warehousing and business intelligence workloads.
A bar chart or a bar graph is useful to compare the data across different categories to identify patterns or trends.
The bars can be plotted vertically or horizontally; along with clusters or stacked pattern(each bar represent one category(the total value of that category) and the sections within that bar represent subcategories).
Business intelligence is an umbrella term of a set of processes and best practices for data gathering, data analysis, data visualization using best-fit architectures and technologies. BI practices creates a direct impact on data-driven decision-making processes.
In cloud Analysis; business intelligence or data analytics operations(e.g. Collect, Store & Analyze data) are performed over the cloud against the private, or on premises environments, that enable them to access and analyze data from anywhere & at any time. Cloud computing gives higher performance and ROI for large volumes of data and users.
A BI dashboard is a user interface / canvas to represent KPIs(Key Performance Indicators) and other relevant data in a clear and comprehensive way; graphically. It enables users to easily evaluate the current state of their business performance and make informed decisions.
Dashboard acts as a hub for monitoring and analyzing organizational performance and can be accessed through web browsers or mobile devices.
Transactions generate data; those data need to be accurate, consistent, and secure throughout its lifecycle; to achieve that there should be framework, policies and procedures and responsibilities for data. This management of data is known as Data Governance.
Data Lake is a centralized data storage repository that stores any data e.g. structure, semi-structured, or unstructured data in as-is format at any scale and can be organized using metadata tags. Data is transformed at a time of need that can be reporting, visualization, machine learning, and advanced analytics.
Data Mesh is a relatively new architectural approach for data management and analysis where in data ownership is distributed across different business units or domains. Business units or domain are responsible for managing their own data.
A data strategy is a plan that outlines in details; how an organization will use data as a core asset to achieve its business objectives.
It encompasses the policies, procedures, and technologies that are needed to manage and leverage data effectively. A well-defined data strategy is essential for organizations to make informed decisions, drive innovation, and enhance their competitive edge.
DW or DWH or EDW (Enterprise Data Warehouse) is central repository of integrated data from one or more disparate sources that can be analyzed to make more informed decisions.
Data integrated into a data warehouse from transactional systems (e.g. ERP, CRM) relational databases, and other data sources, typically on a regular cadence.
Data is stored in hierarchical dimensions and table format to Data Warehouse.
DAX Data Analysis Expressions (DAX) is a formula programming language used in Microsoft Power BI(Business Intelligence platform from Microsoft) to create measures, calculated columns, aggregations, and custom tables. It also has the capability to transform data.
A dimension table is a database table of a data warehouse, referring to information about measurable events known as facts. Facts(context of the data) are stored in the Fact Table. Facts stored in a dimension table are organized into categories or attributes.
In other words; it is a table of star or snowflake schema of a data warehouse.
Extract, Load and Transform refer to ELT. ETL is process of extracting and loading data from multiple sources to destination as it is. Transformation operation are carried out at the after loading data not before loading data. Later on this data can be used for analysis & reporting.
Extract, Transform and Load refer to ETL. The ETL process is to extract data from multiple sources, transform it as per business requirements, and load it to the central repository known as a data warehouse that can be used for analysis & reporting.
Fact Table is at the center of a data warehouse and stores measurements, metrics, or facts(context of the data) of business transactions. Fact Tables make it easy for analysis and aggregation of data stored from multiple sources.
Factless Fact Tables
A fact table which doesn’t connotation measures, and act as an interaction of dimensions. Factless Fact Tables are bifurcated; one is to capture an event and second is to describe conditions.
Simple example of possible hierarchy widely adopted by by all dimensional data warehouse is date dimension : Year > Quarter > Month > Week > Day
Incremental load is ETL process to load delta(difference) from source to destination. Based on the requirement it can be streaming incremental load or batch incremental load. Incremental load is faster in comparison to full load as it touches less data.
Key Performance Indicators
Key performance Indicator is value measured by organizations to evaluate the effectiveness of key business objectives.
Metadata narrates other data. Detail data of Data e.g. description, structure, references etc.
Online Analytical Processing is compounding methods of data to perform multidimensional analysis of transactional data.
Power BI is Business Intelligence and Data visualization tool from Microsoft Corporation.
Pentaho is Business Intelligence, Data Integration and OLAP and reporting service tool from Hitachi and now known as Hitachi Ventara
Qlik View and Qlik Sence are Business Intelligence and Data Visualization Tools.
The database schema is logical view(skeleton) of all or part of entier relational database.
The snowflake schema is an extended model of star schema where dimension tables are normalized that reduce data redundancy. This schema is commonly used for OLAP data warehouses. Snowflake schema saves storage space but impacts performance while executing the query results.
Star schema is a mature data modeling approach widely adopted by relational data warehouses and optimized for querying and analysis. It requires modelers to classify their model tables as either dimension or fact. Star Schema consists of one or more fact tables indexing any number of dimensional tables. The diagram of relationships between tables resembles a star shape, wherein the fact table is a central table and multiple dimension tables are surrounding it.
Talend is leading open source ETL and Data Integration tool. Talend Open Studio is FREE and Enterprise versions.
Tableau is leading cloud base Business Intelligence and Data Visualization tool.