Understanding data involves understanding differences in data types. This article describes dichotomies, opposites or dualities that help us broaden our view about the types of data that may be important for our organization and in our jobs. These dichotomies highlight different aspects or perspectives that are important in understanding and analyzing data.
Here are some key questions to start with, before considering data:
- What question are you trying to answer, or what problem are you trying to solve? What need are you trying to fill? Starting with the end in mind helps you evaluate data against a baseline need or goal. Without this focus, all data can become seductive and lead to “analysis paralysis.”
- What type of data are you looking for to answer that question or need? In other words, what are the data about? You might want data about a product, about different types of events over time (like growth, sales or employment patterns), or about perceptions people have about something.
Apply these questions to think about a case study you have in mind. Then, walk through the following common data dichotomies to see what might fit.
Quantitative and Qualitative Data
Data can be quantitative, consisting of numerical values that can be measured and analyzed using mathematical and statistical methods, or qualitative, consisting of non-numerical information such as text, images or observations that require interpretation and analysis. These data types differ in their characteristics, collection methods and analysis techniques. Here’s an overview of their differences.
- Quantitative data generally include objective numerical data that can be counted or measured. These data generally capture quantities, measurement and statistical relationships.
- Quantitative data may come from organization records, like employee data or sales and inventory records; or industry reports, like market size and buyer demographics. Data can also be collected using structured methods such as surveys with closed-ended questions, or by gathering concrete measurements.
- Quantitative data are often represented in the form of numbers, in tables, charts or graphs.
- Quantitative data analysis involves statistical techniques such as data summarization, descriptive statistics, and inferential statistics. It aims to identify patterns, trends, correlations or cause-and-effect relationships. For example, these data may provide useful information for business forecasts based on previous results.
Quantitative data are often well organized and formatted according to a predefined structure and schema. In highly structured data, relationships between data elements are clearly defined. These data sources generally follow a consistent and rigid structure, and can be more easily queried, sorted and analyzed using Structured Query Language (SQL) or other database tools.
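As a minimal sketch of how structured, quantitative data can be queried and summarized, the Python example below loads a few made-up sales records into an in-memory SQLite table, then totals and averages them by region. The table name, column names and figures are all hypothetical and chosen only for illustration.

```python
import sqlite3
from statistics import mean, median

# Hypothetical sales records: (region, units_sold). Values are invented.
SALES = [
    ("North", 120),
    ("North", 80),
    ("South", 200),
    ("South", 100),
    ("South", 150),
]


def summarize_by_region(rows):
    """Load rows into an in-memory SQLite table and summarize units per region."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, units_sold INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(units_sold), AVG(units_sold) "
            "FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for region, total, average in summarize_by_region(SALES):
        print(f"{region}: total={total}, average={average:.1f}")

    # Simple descriptive statistics across all records.
    units = [units_sold for _, units_sold in SALES]
    print(f"mean={mean(units)}, median={median(units)}")
```

Because every record follows the same structure, a single query can group and summarize the whole table. The same approach would likely carry over to a real database with a different connection.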
- Qualitative data include descriptive and subjective data that capture qualities, opinions, experiences and behaviors.
- Qualitative data are typically collected through methods such as interviews, observations, focus groups or surveys with open-ended questions. Asking a customer to respond in text boxes is a common way to gather these types of data.
- Qualitative data are often presented in the form of narratives, transcripts, quotes or themes; however, they can also be analyzed and then presented in quantitative form, like a count of satisfied or unsatisfied employees, based on interviews or open-ended survey questions.
- Qualitative data analysis involves interpreting and making sense of the data by identifying patterns, themes, or meanings. It often involves coding, categorizing and interpreting textual or visual data. It can be used to conduct root cause analyses and to support strategic planning.
Qualitative data are often unstructured or semi-structured and may not adhere to a predefined schema or format. They can be generated in real time from various sources such as social media feeds, text documents or multimedia content. This provides greater context for use, but may be challenging to process and analyze using traditional methods. Newer analysis techniques include natural language processing, machine learning and data mining. The analysis may involve identifying patterns, sentiment analysis, topic modeling or extracting insights from the data.
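To illustrate how open-ended responses can be coded into themes and then presented as counts, here is a small Python sketch. It uses simple keyword matching, which is a deliberate simplification and not a substitute for natural language processing or careful human coding. The codebook, keywords and survey responses are all invented for this example.

```python
from collections import Counter

# Hypothetical codebook: theme -> keywords that signal it. A real codebook
# would be developed and refined by analysts reading the responses.
CODEBOOK = {
    "workload": ["overtime", "workload", "busy"],
    "management": ["manager", "supervisor", "leadership"],
    "pay": ["salary", "pay", "raise"],
}

# Invented open-ended survey responses.
RESPONSES = [
    "My manager is supportive, but the workload is heavy.",
    "Too much overtime lately.",
    "I would like a raise.",
    "Leadership communicates well.",
]


def code_response(text, codebook):
    """Return the set of themes whose keywords appear in the text.

    Matching is by substring, so a keyword like "pay" would also match
    "repay"; a real analysis would need more careful matching.
    """
    lowered = text.lower()
    return {
        theme
        for theme, keywords in codebook.items()
        if any(keyword in lowered for keyword in keywords)
    }


def count_themes(responses, codebook):
    """Count how many responses mention each theme."""
    counts = Counter()
    for response in responses:
        counts.update(code_response(response, codebook))
    return counts


if __name__ == "__main__":
    for theme, count in count_themes(RESPONSES, CODEBOOK).most_common():
        print(f"{theme}: {count}")
```

Each response can carry more than one theme, so the counts show how often a theme comes up rather than how many people responded.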
Internal and External Data
Internal data sources refer to data that are generated, collected and stored within an organization. These sources are typically proprietary and specific to the organization. Here are some examples of internal data sources that can be used to answer a variety of business questions:
- Databases: Organizations maintain databases and data warehouses to store structured data, such as customer information, sales data and inventory records.
- Enterprise Resource Planning (ERP) Systems: ERP systems integrate business processes and provide a centralized platform for managing data related to finance, human resources and supply chains. They may also include knowledge bases that help employees do their jobs better.
- Customer Relationship Management (CRM) Systems: CRM systems store data about customers, their interactions, preferences, purchase history and other relevant information.
- Transactional Systems: These systems record and store data related to daily business transactions, such as point-of-sale systems, online payment gateways and order management systems.
- Internal Analytics and Reporting Systems: Organizations often have specialized systems and tools for collecting and analyzing data, generating reports and tracking key performance indicators (KPIs). These may be complex systems customized to meet the unique business setting, or as simple as Excel spreadsheets held by one team.
- Records and File Management Systems: A lot of qualitative data may be held in different types of documents generated using different applications; as such, the file management system itself is a source of data.
External data sources refer to data that are obtained from outside the organization. These sources provide additional insights, context or information that complements internal data. Some examples of external data sources include:
- Publicly Available Datasets: These include government data, census data, public surveys, weather data, economic indicators and other publicly accessible information.
- Third-Party Data Providers: There are various companies and associations that specialize in collecting and aggregating data on specific topics, such as market research, industry data, consumer behavior data or social media data. Organizations can purchase or license this data for their analytical needs.
- Data Exchanges and Marketplaces: Online platforms exist where organizations can exchange or purchase data from other organizations. These platforms facilitate data sharing and collaboration between different entities.
- Partner and Vendor Data: Data obtained from business partners, suppliers, vendors or collaborators can provide valuable insights when integrated with internal data.
It is important to stay compliant with data privacy, security and legal regulations when using both internal and external data sources. Know where the data come from, and how they can be used.
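As a rough sketch of how external data can add context to internal records, the Python example below joins made-up internal sales figures with made-up regional population figures to compute sales per 1,000 residents. The region names, numbers and function name are hypothetical. In practice the external figures might come from a public dataset or a licensed provider.

```python
# Hypothetical internal data: units sold per region (e.g. from a CRM export).
INTERNAL_SALES = {"North": 200, "South": 450, "East": 90}

# Hypothetical external data: population per region (e.g. from a public
# census dataset). "East" is deliberately missing to show a coverage gap.
EXTERNAL_POPULATION = {"North": 50_000, "South": 150_000, "West": 80_000}


def sales_per_thousand(sales, population):
    """Join the two sources on region and compute units sold per 1,000 residents.

    Regions present in only one source are returned separately so that
    gaps are visible instead of silently dropped.
    """
    matched = {
        region: sales[region] * 1000 / population[region]
        for region in sales.keys() & population.keys()
    }
    unmatched = sorted(sales.keys() ^ population.keys())
    return matched, unmatched


if __name__ == "__main__":
    rates, gaps = sales_per_thousand(INTERNAL_SALES, EXTERNAL_POPULATION)
    for region in sorted(rates):
        print(f"{region}: {rates[region]:.1f} units per 1,000 residents")
    print("Regions found in only one source:", gaps)
```

In this invented data, South has the higher raw sales, but North sells more per resident. A comparison like that is only possible once the external figures are added.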
Other Data Opposites
Here are other dichotomies to help you assess the types of data you both have and need to answer your organizational question or problem:
- Primary vs. Secondary: Primary data are collected firsthand for a specific purpose, often through surveys, interviews, experiments or observations. Primary data are often internal data. Secondary data, on the other hand, are pre-existing data collected by someone else for a different purpose and made available for analysis, such as government reports or published studies.
- Historical (Retrospective) vs. Real-time (Prospective): Historical data refer to past records or events, allowing for retrospective analysis and trend identification. Real-time data, on the other hand, are generated and captured in the present moment, providing immediate insights and enabling real-time decision-making. So, historical data help you understand what services or products may be growing or losing popularity over time, whereas real-time data tell you what is needed right now (e.g., through an inventory or ticketing system).
- Privacy vs. Utility: The dichotomy between privacy and utility concerns the trade-off between protecting individuals’ personal information and extracting value or insights from data. Balancing privacy concerns with the utility of data is crucial in ethical and responsible data practices.
These dichotomies provide different dimensions to consider when working with data, highlighting the diversity and complexity of data sources, types and characteristics. Understanding these aspects can help inform data collection, analysis and decision-making processes.
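The privacy versus utility trade-off can be made concrete with a short Python sketch. It reports average satisfaction by department while leaving out names and withholding results for very small groups, since one person's answer could otherwise be inferred. The records, field names and minimum group size are assumptions made for illustration, not a recommended standard.

```python
from collections import defaultdict
from statistics import mean

# Invented employee records; names are direct identifiers.
RECORDS = [
    {"name": "A. Lee", "department": "Sales", "satisfaction": 4},
    {"name": "B. Cruz", "department": "Sales", "satisfaction": 2},
    {"name": "C. Diaz", "department": "Sales", "satisfaction": 3},
    {"name": "D. Kim", "department": "Legal", "satisfaction": 5},
]

# Hypothetical threshold: groups smaller than this are not reported,
# because a small group's average could reveal an individual's answer.
MIN_GROUP_SIZE = 3


def department_averages(records, min_group_size=MIN_GROUP_SIZE):
    """Average satisfaction per department, dropping names and small groups.

    Departments with fewer than min_group_size records map to None.
    """
    scores = defaultdict(list)
    for record in records:
        # Only the department and score are kept; the name is never copied.
        scores[record["department"]].append(record["satisfaction"])
    return {
        department: mean(values) if len(values) >= min_group_size else None
        for department, values in scores.items()
    }


if __name__ == "__main__":
    for department, average in department_averages(RECORDS).items():
        shown = "withheld (group too small)" if average is None else average
        print(f"{department}: {shown}")
```

Raising the minimum group size protects more people but withholds more results, which is the trade-off in miniature.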
Learning About Data
There are several types of training programs you can take to understand data better – many of these subjects are available through Pryor workshops. Pryor’s diverse computer skills courses, “Using Business Analytics to Become a Goal-Oriented Manager” seminar, and “Data-Driven Decision Making and Analysis” seminar are great ways to get started in exploring these areas.
- Basic Data Analysis: A lot of data in workplaces continues to be held in Microsoft® Excel and many systems export data in Excel-compatible tables. Pryor offers several classes and online learning in this popular and powerful software tool.
- Strategic Planning and Indicator or Metrics Development: For most organizations, data are only important and used if they help solve a problem or support a decision. Strong training in strategic planning and metrics development helps fill this need.
- Machine Learning and Big Data Management: Machine learning training is useful for understanding how to build predictive models and make sense of complex datasets. If you’re interested in working with large-scale datasets, training in big data and data engineering can be valuable. Learning languages such as Python and SQL is also useful in this domain, as they help get insights from large data sets.
- Data Visualization: Data visualization training helps you learn how to present data effectively using charts, graphs, and other visual techniques. Such training can teach you how to identify patterns, trends, and outliers in data and communicate insights in a visually compelling manner. Tools like Crystal Reports, Power BI, and Spotfire are often used in data visualization, though strong Microsoft® Excel skills can still go a long way.
- Data Mining and Text Analytics: Data mining and text analytics training can help you uncover patterns and extract valuable insights from structured and unstructured data sources. These techniques include clustering, classification, natural language processing (NLP) and sentiment analysis.
- Data Ethics and Privacy: As data become increasingly important, understanding ethical considerations and privacy issues is crucial. Related training programs focus on data ethics, data governance, data privacy laws and responsible data handling practices.
- Domain-Specific Training: Depending on your field of interest, you may benefit from domain-specific training. For example, if you work in finance or human resources, you can explore training in financial data analysis, or employee data. If you’re in healthcare, training in health informatics or medical data analysis may be relevant. Pryor offers a range of courses in these domains.