Empirical Research: Definition, Methods, Types and Examples

What is Empirical Research

Content Index

  • Empirical research: Definition
  • Empirical research: Origin
  • Quantitative research methods
  • Qualitative research methods
  • Steps for conducting empirical research
  • Empirical research methodology cycle
  • Advantages of empirical research
  • Disadvantages of empirical research
  • Why is there a need for empirical research?

Empirical research is defined as any research in which the conclusions of the study are drawn strictly from concrete, empirical, and therefore verifiable, evidence.

This empirical evidence can be gathered using quantitative market research and qualitative market research methods.

For example: a study is conducted to find out whether listening to happy music in the workplace while working promotes creativity. An experiment is set up using a music website survey, with one set of participants exposed to happy music and another set not listening to music at all, and the subjects are then observed. The results of such a study provide empirical evidence of whether or not happy music promotes creativity.

LEARN ABOUT: Behavioral Research

You may have heard the saying, “I will not believe it unless I see it.” This attitude comes from the ancient empiricists, a fundamental outlook that powered the emergence of science through the medieval and Renaissance periods and laid the foundation of modern science as we know it today. The word itself has its roots in Greek: it is derived from the Greek word empeirikos, which means “experienced.”

In today’s world, the word empirical refers to the collection of data using evidence gathered through observation, experience, or calibrated scientific instruments. All of these origins have one thing in common: a dependence on observation and experimentation to collect data and test it in order to reach conclusions.

LEARN ABOUT: Causal Research

Types and methodologies of empirical research

Empirical research can be conducted and analysed using qualitative or quantitative methods.

  • Quantitative research: Quantitative research methods are used to gather information through numerical data. They are used to quantify opinions, behaviors, or other defined variables. These methods are predetermined and more structured in format. Some of the commonly used methods are surveys, longitudinal studies, polls, etc.
  • Qualitative research: Qualitative research methods are used to gather non-numerical data. They are used to find meanings, opinions, or the underlying reasons from the subjects. These methods are unstructured or semi-structured. The sample size for such research is usually small, and it is a conversational type of method that provides more insight or in-depth information about the problem. Some of the most popular methods are focus groups, interviews, observations, etc.

Data collected from these methods needs to be analyzed. Empirical evidence can be analyzed either quantitatively or qualitatively. Using this analysis, the researcher can answer empirical questions, which have to be clearly defined and answerable with the findings obtained. The type of research design used will vary depending on the field in which it is applied. Many researchers choose to conduct mixed research, combining quantitative and qualitative methods, to better answer questions that cannot be studied in a laboratory setting.

LEARN ABOUT: Qualitative Research Questions and Questionnaires

Quantitative research methods

Quantitative research methods aid in analyzing the empirical evidence gathered. By using these, a researcher can find out whether the hypothesis is supported or not.

  • Survey research: Survey research generally involves a large audience in order to collect a large amount of data. It is a quantitative method with a predetermined set of closed-ended questions that are easy to answer. Because of the simplicity of such a method, high response rates are achieved. It is one of the most commonly used methods for all kinds of research in today’s world.

Previously, surveys were conducted face to face only, perhaps with a recorder. However, with advances in technology and for ease of use, new mediums such as email or social media have emerged.

For example: depletion of energy resources is a growing concern, and hence there is a need for awareness about renewable energy. According to recent studies, fossil fuels still account for around 80% of energy consumption in the United States. Even though the use of green energy is rising every year, certain factors keep the general population from opting for it. In order to understand why, a survey can be conducted to gather opinions from the general population about green energy and the factors that influence their choice to switch to renewable energy. Such a survey can help institutions or governing bodies promote appropriate awareness and incentive schemes to push the use of greener energy.

Learn more: Renewable Energy Survey Template | Descriptive Research vs Correlational Research

  • Experimental research: In experimental research, an experiment is set up and a hypothesis is tested by creating a situation in which one of the variables is manipulated. This is also used to check cause and effect: the researcher observes what happens to the dependent variable when the independent variable is altered or removed. The process for such a method usually involves proposing a hypothesis, experimenting on it, analyzing the findings, and reporting the findings to understand whether they support the theory or not.

For example: a product company is trying to find out why it is not able to capture the market. So the organization makes changes to each of its processes, such as manufacturing, marketing, sales, and operations. Through the experiment it learns that sales training directly impacts the market coverage of its product: if the salesperson is trained well, the product will have better coverage.

  • Correlational research: Correlational research is used to find the relationship between two sets of variables. Regression analysis is generally used to predict outcomes from such data. The correlation found can be positive, negative, or zero (a short analysis sketch follows the example below).

LEARN ABOUT: Level of Analysis

For example: more highly educated individuals tend to get higher-paying jobs. This suggests that higher education enables an individual to secure a high-paying job, while less education leads to lower-paying jobs.
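As an illustration of the correlational approach described above, here is a minimal sketch of how the education-income relationship might be quantified with a Pearson correlation and a simple linear regression. The variable names and numbers are hypothetical, made up purely for demonstration, and do not come from any real survey.

```python
# Minimal sketch of a correlational analysis (hypothetical data).
import numpy as np
from scipy import stats

years_of_education = np.array([10, 12, 12, 14, 16, 16, 18, 20])
annual_income = np.array([28000, 33000, 31000, 40000, 52000, 48000, 61000, 70000])

# Pearson correlation: direction and strength of the linear relationship.
r, p_value = stats.pearsonr(years_of_education, annual_income)
print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")

# Simple linear regression: predict income from years of education.
result = stats.linregress(years_of_education, annual_income)
predicted = result.intercept + result.slope * 15
print(f"Predicted income for 15 years of education: {predicted:,.0f}")
```

A positive r would indicate that income tends to rise with education; it would not, by itself, establish that education causes the higher income.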

  • Longitudinal study: A longitudinal study is used to understand the traits or behavior of a subject under observation by testing the subject repeatedly over a period of time. Data collected with such a method can be qualitative or quantitative in nature.

For example: a study to find out the benefits of exercise. The subjects are asked to exercise every day for a particular period of time, and the results show higher endurance, stamina, and muscle growth. This supports the hypothesis that exercise benefits an individual’s body.
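As an illustration of how data from such a longitudinal design might be analyzed, here is a minimal sketch using hypothetical endurance scores measured on the same subjects before and after the exercise period, compared with a paired t-test. The values are assumptions for demonstration only.

```python
# Minimal sketch of analyzing repeated measurements on the same subjects
# (hypothetical endurance scores before and after an exercise program).
from scipy import stats

endurance_before = [22, 25, 19, 30, 27, 24, 21, 26]
endurance_after = [28, 29, 24, 35, 30, 29, 25, 31]

t_stat, p_value = stats.ttest_rel(endurance_before, endurance_after)

# A small p-value supports (but does not prove) that endurance changed
# over the observation period.
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```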

  • Cross-sectional study: A cross-sectional study is an observational method in which a set of subjects is observed at a single point in time. The participants are chosen in a way that makes them similar in all variables except the one being researched. This type does not enable the researcher to establish a cause-and-effect relationship, since the subjects are not observed over a continuous period. It is widely used in the healthcare sector and the retail industry.

For example: a medical study to find the prevalence of under-nutrition disorders in children of a given population. This will involve looking at a wide range of parameters such as age, ethnicity, location, income, and social background. If a significant number of children from poor families show under-nutrition disorders, the researcher can investigate further. Usually a cross-sectional study is followed by a longitudinal study to find out the exact reason.

  • Causal-comparative research: This method is based on comparison. It is mainly used to find a cause-and-effect relationship between two or more variables.

For example: a researcher measures the productivity of employees at a company that gives its employees breaks during work and compares it with that of employees at a company that gives no breaks at all.

LEARN ABOUT: Action Research

Qualitative research methods

Some research questions need to be analyzed qualitatively, as quantitative methods are not applicable there. In many cases, in-depth information is needed, or a researcher may need to observe the behavior of a target audience, so the results are needed in a descriptive form. Qualitative research results are descriptive rather than predictive. They enable the researcher to build or support theories for future potential quantitative research. In such situations, qualitative research methods are used to derive conclusions that support the theory or hypothesis being studied.

LEARN ABOUT: Qualitative Interview

  • Case study: The case study method is used to find more information by carefully analyzing existing cases. It is very often used in business research or to gather empirical evidence for investigative purposes. It is a way to investigate a problem within its real-life context through existing cases. The researcher has to analyze carefully, making sure the parameters and variables of the existing case match those of the case being investigated. Using the findings from the case study, conclusions can be drawn regarding the topic being studied.

For example: a report describing the solution a company provided to its client, the challenges faced during initiation and deployment, the findings of the case, and the solutions offered for the problems. Most companies use such case studies because they form empirical evidence the company can promote in order to get more business.

  • Observational method: The observational method is a process of observing and gathering data from a target. Since it is a qualitative method, it is time-consuming and very personal. It can be said that the observational research method is part of ethnographic research, which is also used to gather empirical evidence. It is usually a qualitative form of research; however, in some cases it can be quantitative as well, depending on what is being studied.

For example: setting up a study to observe a particular animal in the Amazon rainforest. Such research usually takes a lot of time, as observation has to be carried out for a set period to study the patterns or behavior of the subject. Another example widely used nowadays is observing people shopping in a mall to figure out the buying behavior of consumers.

  • One-on-one interview: This method is purely qualitative and one of the most widely used, because it enables a researcher to obtain precise, meaningful data if the right questions are asked. It is a conversational method in which in-depth data can be gathered depending on where the conversation leads.

For example: a one-on-one interview with the finance minister to gather data on the country’s financial policies and their implications for the public.

  • Focus groups: Focus groups are used when a researcher wants to find answers to why, what, and how questions. A small group is generally chosen for such a method, and it is not necessary to interact with the group in person. A moderator is generally needed when the group is being addressed in person. This method is widely used by product companies to collect data about their brands and products.

For example: a mobile phone manufacturer wanting feedback on the dimensions of one of its models that is yet to be launched. Such studies help the company meet customer demand and position the model appropriately in the market.

  • Text analysis: The text analysis method is relatively new compared to the other types. It is used to analyze social life by going through the images or words used by individuals. In today’s world, with social media playing a major part in everyone’s life, such a method enables the researcher to follow the patterns that relate to the study.

For example: many companies ask customers for detailed feedback on how satisfied they are with the customer support team. Such data enables the researcher to make appropriate decisions to improve the support team.
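As an illustration of a very basic form of text analysis on such feedback, here is a minimal sketch that counts word frequencies across a handful of hypothetical customer comments. Real text analysis typically uses far more sophisticated tooling; the comments below are invented for demonstration.

```python
# Minimal sketch of basic text analysis (hypothetical customer feedback).
import re
from collections import Counter

feedback = [
    "Support team was quick and very helpful",
    "Helpful agent, but the wait time was long",
    "Long wait, not helpful at all",
]

words = []
for comment in feedback:
    words.extend(re.findall(r"[a-z']+", comment.lower()))

# The most frequent words give a rough sense of recurring themes.
print(Counter(words).most_common(5))
```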

Sometimes a combination of the methods is also needed for some questions that cannot be answered using only one type of method especially when a researcher needs to gain a complete understanding of complex subject matter.

We recently published a blog that talks about examples of qualitative data in education ; why don’t you check it out for more ideas?

Learn More: Data Collection Methods: Types & Examples

Steps for conducting empirical research

Since empirical research is based on observation and captured experience, it is important to plan the steps of the experiment and how it will be analyzed. This enables the researcher to resolve problems or obstacles that can occur during the experiment.

Step #1: Define the purpose of the research

This is the step where the researcher has to answer questions like: What exactly do I want to find out? What is the problem statement? Are there any issues in terms of the availability of knowledge, data, time, or resources? Will this research be more beneficial than what it will cost?

Before going ahead, a researcher has to clearly define his purpose for the research and set up a plan to carry out further tasks.

Step #2: Supporting theories and relevant literature

The researcher needs to find out whether there are theories that can be linked to the research problem, and whether any theory can help support the findings. All kinds of relevant literature will help the researcher find out whether others have researched this before and what problems were faced during that research. The researcher will also have to set up assumptions and find out whether there is any history regarding the research problem.

Step #3: Creation of Hypothesis and measurement

Before beginning the actual research, the researcher needs a working hypothesis, an educated guess about the probable result. The researcher has to set up variables, decide the environment for the research, and work out how the variables relate to each other.

The researcher will also need to define the units of measurement and the tolerable degree of error, and find out whether the chosen measurements will be acceptable to others.

Step #4: Methodology, research design and data collection

In this step, the researcher has to define a strategy for conducting the research and set up experiments to collect the data that will allow the hypothesis to be tested. The researcher decides whether an experimental or non-experimental method is needed; the type of research design will vary depending on the field in which the research is being conducted. Last but not least, the researcher has to identify the parameters that will affect the validity of the research design. Data collection is done by choosing appropriate samples depending on the research question, using one of the many available sampling techniques. Once data collection is complete, the researcher has empirical data that needs to be analyzed.

LEARN ABOUT: Best Data Collection Tools

Step #5: Data Analysis and result

Data analysis can be done in two ways: qualitatively and quantitatively. The researcher needs to decide which qualitative or quantitative method is needed, or whether a combination of both is required. Depending on the analysis of the data, the researcher will know whether the hypothesis is supported or rejected. Analyzing the data is the most important step in evaluating the hypothesis.
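To make this step concrete, here is a minimal sketch of how a quantitative analysis might test whether two groups differ, using an independent-samples t-test on the happy-music example from earlier. The group names and creativity scores are hypothetical assumptions for illustration only.

```python
# Minimal sketch of a quantitative analysis step (hypothetical data).
# Creativity scores for a group that listened to happy music vs. a group that did not.
from scipy import stats

music_group = [7.2, 6.8, 8.1, 7.5, 6.9, 7.8, 8.0, 7.1]
no_music_group = [6.1, 6.4, 5.9, 6.8, 6.2, 6.0, 6.7, 6.3]

t_stat, p_value = stats.ttest_ind(music_group, no_music_group)

# A small p-value (conventionally below 0.05) would support the hypothesis
# that the groups differ; it is evidence, not proof.
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print("Hypothesis supported" if p_value < 0.05 else "Hypothesis not supported")
```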

Step #6: Conclusion

A report needs to be written with the findings of the research. The researcher can cite the theories and literature that support the research, and can make suggestions or recommendations for further research on the topic.

Empirical research methodology cycle

A.D. de Groot, a famous Dutch psychologist and chess expert, conducted some of the most notable experiments using chess in the 1940s. During his study, he came up with a cycle that is consistent and is now widely used to conduct empirical research. It consists of five phases, each as important as the next. The empirical cycle captures the process of coming up with hypotheses about how certain subjects work or behave and then testing these hypotheses against empirical data in a systematic and rigorous way. It can be said to characterize the hypothetico-deductive approach to science. The phases of the empirical cycle are as follows.

  • Observation: In this phase, an idea is sparked for proposing a hypothesis, and empirical data is gathered through observation. For example: a particular species of flower blooms in a different color only during a specific season.
  • Induction: Inductive reasoning is then carried out to form a general conclusion from the data gathered through observation. For example: as stated above, it is observed that this species of flower blooms in a different color during a specific season. A researcher may ask, “Does the temperature in that season cause the color change in the flower?” He can assume that is the case; however, this is mere conjecture, and an experiment needs to be set up to support the hypothesis. So he tags a set of flowers kept at a different temperature and observes whether they still change color.
  • Deduction: This phase helps the researcher deduce a conclusion from the experiment. It has to be based on logic and rationality in order to arrive at specific, unbiased results. For example: in the experiment, if the tagged flowers kept at a different temperature do not change color, it can be concluded that temperature plays a role in changing the color of the bloom.
  • Testing: In this phase, the researcher returns to empirical methods to put the hypothesis to the test. The researcher now needs to make sense of the data and therefore uses statistical analysis to determine the relationship between temperature and bloom color (a minimal sketch of such a test follows this list). If most flowers bloom in a different color when exposed to a certain temperature while the others do not, the researcher has found support for the hypothesis. Note that this is not proof, only support for the hypothesis.
  • Evaluation: This phase is often forgotten but is an important one for continuing to gain knowledge. In this phase, the researcher presents the data collected, the supporting argument, and the conclusion. The researcher also states the limitations of the experiment and the hypothesis, and offers suggestions so that others can pick up the work and continue more in-depth research in the future.

LEARN MORE: Population vs Sample
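As an illustration of the testing phase described above, here is a minimal sketch of how the temperature-and-bloom-color relationship could be checked with a chi-square test of independence. The counts are hypothetical assumptions for demonstration, not real observations.

```python
# Minimal sketch of the testing phase (hypothetical counts).
# Rows: warm-season temperature vs. cooler temperature; columns: changed color vs. unchanged.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([
    [42, 8],    # warm-season temperature: changed / unchanged
    [11, 39],   # cooler temperature:      changed / unchanged
])

chi2, p_value, dof, expected = chi2_contingency(observed)

# A small p-value supports (but does not prove) the hypothesis that
# temperature and bloom color are related.
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```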

Advantages of empirical research

There is a reason why empirical research is one of the most widely used methods: it comes with several advantages. Following are a few of them.

  • It is used to authenticate traditional research through various experiments and observations.
  • This research methodology makes the research being conducted more competent and authentic.
  • It enables a researcher to understand dynamic changes as they happen and to adjust the strategy accordingly.
  • The level of control in such research is high, so the researcher can control multiple variables.
  • It plays a vital role in increasing internal validity.

Disadvantages of empirical research

Even though empirical research makes the research more competent and authentic, it does have a few disadvantages. Following are a few of them.

  • Such research needs patience, as it can be very time-consuming. The researcher has to collect data from multiple sources, and quite a few parameters are involved, which makes the research lengthy.
  • Most of the time, a researcher will need to conduct research at different locations or in different environments, which can make it an expensive affair.
  • There are rules governing how experiments can be performed, and hence permissions are needed. It is often very difficult to obtain the permissions required to carry out the different methods of this research.
  • Collection of data can sometimes be a problem, as it has to be collected from a variety of sources through different methods.

LEARN ABOUT:  Social Communication Questionnaire

Why is there a need for empirical research?

Empirical research is important in today’s world because most people believe only in what they can see, hear, or experience. It is used to validate multiple hypotheses, increase human knowledge, and keep advancing various fields.

For example: pharmaceutical companies use empirical research to try out a specific drug on controlled or random groups to study cause and effect. In this way, they test the theories they proposed for the specific drug. Such research is very important, as it can sometimes lead to finding a cure for a disease that has existed for many years. It is useful in science and in many other fields such as history, the social sciences, and business.

LEARN ABOUT: 12 Best Tools for Researchers

As the world advances, empirical research has become critical and a norm in many fields, used to support hypotheses and gain more knowledge. The methods mentioned above are very useful for carrying out such research. However, new methods will keep emerging as the nature of new investigative questions keeps changing.



What is empirical research: Methods, types & examples

Defne Çobanoğlu

Having opinions based on observation is fine sometimes, as is having theories about the subject you want to investigate. However, some theories need to be tested. As Robert Oppenheimer said, “Theory will take you only so far.”

In that case, when you have your research question ready and you want to make sure it is correct, the next step is experimentation, because only then can you test your ideas and collect tangible information. Now, let us start with the empirical research definition:

  • What is empirical research?

Empirical research is a research type in which the aim of the study is to find concrete and provable evidence. The researcher using this method to draw conclusions can use both quantitative and qualitative methods. Unlike theoretical research, empirical research uses scientific experimentation and investigation.

Using experimentation makes sense when you need to have tangible evidence to act on whatever you are planning to do. As the researcher, you can be a marketer who is planning on creating a new ad for the target audience, or you can be an educator who wants the best for the students. No matter how big or small, data gathered from the real world using this research helps break down the question at hand. 

  • When to use empirical research?

Empirical research methods are used when the researcher needs to gather and analyze direct, observable, and measurable data. Research findings are a great way to ground ideas. Here are some situations in which one may need to do empirical research:

1. When quantitative or qualitative data is needed

There are times when a researcher, marketer, or producer needs to gather data on specific research questions to make an informed decision. And the concrete data gathered in the research process gives a good starting point.

2. When you need to test a hypothesis

When you have a hypothesis on a subject, you can test the hypothesis through observation or experiment. Making a planned study is a great way to collect information and test whether or not your hypothesis is correct.

3. When you want to establish causality

Experimental research is a good way to explore whether or not there is a causal relationship between two variables. Researchers usually establish causality by changing one variable and observing whether the dependent variable changes accordingly.

  • Types of empirical research

The aim of empirical research is to collect information about a subject directly from people through experimentation and other data collection methods. The methods, and the data collected, fall into two groups: one collects numerical data, and the other collects opinion-like data. Let us see the difference between these two types:

Quantitative research

Quantitative research methods are used to collect data in a numerical way. Therefore, the results gathered by these methods will be numbers, statistics, charts, etc. The results can be used to quantify behaviors, opinions, and other variables. Quantitative research methods include surveys, questionnaires, and experimental research.

Qualitative research

Qualitative research methods are not used to collect numerical answers; instead, they are used to collect the participants’ reasons, opinions, and other meaningful aspects. Qualitative research methods include case studies, observations, interviews, focus groups, and text analysis.

  • 5 steps to conduct empirical research

Necessary steps for empirical research

When you want to collect direct and concrete data on a subject, empirical research is a great way to go. And, just like every other project and research, it is best to have a clear structure in mind. This is even more important in studies that may take a long time, such as experiments that take years. Let us look at a clear plan on how to do empirical research:

1. Define the research question

The very first step of every study is to have the question you will explore ready, because you do not want to change your mind in the middle of the study after investing time and effort in the experimentation.

2. Go through relevant literature

This is the step where you sit down and do desk research, gathering relevant data and checking whether other researchers have tried to explore similar research questions. If so, you can see how well they were able to answer the question and what kind of difficulties they faced during the research process.

3. Decide on the methodology

Once you are done going through the relevant literature, you can decide which method or methods to use. Appropriate methods include observation, experimentation, surveys, interviews, focus groups, etc.

4. Do data analysis

When you get to this step, it means you have successfully gathered enough data for analysis. Now, all you need to do is examine the data you collected and make an informed analysis.

5. Conclusion

This is the last step, where you are finished with the experimentation and data analysis process. Now, it is time to decide what to do with this information. You can publish a paper and make informed decisions about whatever your goal is.

  • Empirical research methodologies

Some essential methodologies to conduct empirical research

The aim of this type of research is to uncover brand-new evidence and facts. Therefore, the methods should be primary, with data gathered in real life, directly from people. There is more than one method for this goal, and it is up to the researcher which one(s) to use. Let us see the methods of empirical research:

  • Observation

The method of observation is a great way to collect information about people without interference. The researcher can choose the appropriate area, time, or situation and observe people and their interactions with one another. The researcher can be an outside observer, a participant observer, or a full participant.

  • Experimentation

The experimentation process can be done in the real world by intervening in some elements to unify the environment for all participants, or it can be done in a laboratory setting. Experimentation is well suited to changing the variables according to the aim of the study.

  • Case study

The case study method is done by making an in-depth analysis of already existing cases. When the parameters and variables are similar to the research question at hand, it is wise to go through what was researched before.

  • Focus groups

The focus group method is done by using a group of individuals or multiple groups and drawing on their opinions, characteristics, and responses. The researchers gather the data from this group and generalize it to the whole population.

  • Surveys

Surveys are an effective way to gather data directly from people. They are a systematic approach to collecting information. If done in an online setting as an online survey, it is even easier to reach out to people and ask their opinions through open-ended or close-ended questions.

  • Interviews

Interviews are similar to surveys in that you use questions to collect information and people’s opinions. Unlike a survey, this process is done face to face, over a phone call, or in a video call.

  • Advantages of empirical research

Empirical research is effective for many reasons, and helps researchers from numerous fields. Here are some advantages of empirical research to have in mind for your next research:

  • Empirical research improves the internal validity of the study.
  • Empirical evidence gathered from the study is used to authenticate the research question.
  • Collecting provable evidence is important for the success of the study.
  • The researcher is able to make informed decisions based on the data collected using empirical research.
  • Disadvantages of empirical research

After learning about the positive aspects of empirical research, it is time to mention the negative aspects, because this type may not be suitable for everyone, and the researcher should be mindful of its drawbacks. Here are the disadvantages of empirical research:

  • Like other primary research types, a study that includes experimentation will be time-consuming no matter what. It has more steps and variables than secondary research.
  • There are a lot of variables that need to be controlled and considered. Therefore, it may be a challenging task to be mindful of all the details.
  • Doing evidence-based research can be expensive if you need to complete it on a large scale.
  • When you are conducting an experiment, you may need some waivers and permissions.
  • Frequently asked questions about empirical research

Empirical research is one of many research types, and there may be some questions about its similarities to, and differences from, other research types.

Is empirical research qualitative or quantitative?

The data collected by empirical research can be qualitative, quantitative, or a mix of both. It depends on the aim of the researcher and what kind of data is needed and searched for.

Is empirical research the same as quantitative research?

As quantitative research heavily relies on data collection methods of observation and experimentation, it is, in nature, an empirical study. Some professors may even use the terms interchangeably. However, that does not mean that empirical research is only a quantitative one.

What is the difference between theoretical and empirical research?

Empirical studies are based on collecting data to test theories or answer questions, using methods such as observation and experimentation; empirical research therefore relies on finding evidence that backs up theories. Theoretical research, on the other hand, relies on theorizing about empirical research data and trying to make connections and correlations.

What is the difference between conceptual and empirical research?

Conceptual research is about thoughts and ideas and does not involve any kind of experimentation. Empirical research, on the other hand, works with provable data and hard evidence.

What is the difference between empirical vs applied research?

Some scientists may use these two terms interchangeably; however, there is a difference between them. Applied research involves applying theories to solve real-life problems. Empirical research, on the other hand, involves obtaining and analyzing data to test hypotheses and theories.

  • Final words

Empirical research is a good approach when the goal of your study is to find concrete data to work with. You may need to do empirical research when you need to test a theory, establish causality, or gather qualitative or quantitative data. For example, you may be a scientist who wants to know whether certain colors have an effect on people’s moods, or a marketer who wants to test a theory about ad placement on websites.

In both scenarios, you can collect information by using empirical research methods and make informed decisions afterward. These are just two examples of empirical research. This research type can be applied to many areas of work life and the social sciences. Lastly, for all your research needs, you can visit forms.app to use its many useful features and over 1000 form and survey templates!

Defne is a content writer at forms.app. She is also a translator specializing in literary translation. Defne loves reading, writing, and translating professionally and as a hobby. Her expertise lies in survey research, research methodologies, content writing, and translation.


What is Empirical Research? Definition, Methods, Examples

Appinio Research · 09.02.2024 · 36min read


Ever wondered how we gather the facts, unveil hidden truths, and make informed decisions in a world filled with questions? Empirical research holds the key.

In this guide, we'll delve deep into the art and science of empirical research, unraveling its methods, mysteries, and manifold applications. From defining the core principles to mastering data analysis and reporting findings, we're here to equip you with the knowledge and tools to navigate the empirical landscape.

What is Empirical Research?

Empirical research is the cornerstone of scientific inquiry, providing a systematic and structured approach to investigating the world around us. It is the process of gathering and analyzing empirical or observable data to test hypotheses, answer research questions, or gain insights into various phenomena. This form of research relies on evidence derived from direct observation or experimentation, allowing researchers to draw conclusions based on real-world data rather than purely theoretical or speculative reasoning.

Characteristics of Empirical Research

Empirical research is characterized by several key features:

  • Observation and Measurement : It involves the systematic observation or measurement of variables, events, or behaviors.
  • Data Collection : Researchers collect data through various methods, such as surveys, experiments, observations, or interviews.
  • Testable Hypotheses : Empirical research often starts with testable hypotheses that are evaluated using collected data.
  • Quantitative or Qualitative Data : Data can be quantitative (numerical) or qualitative (non-numerical), depending on the research design.
  • Statistical Analysis : Quantitative data often undergo statistical analysis to determine patterns , relationships, or significance.
  • Objectivity and Replicability : Empirical research strives for objectivity, minimizing researcher bias . It should be replicable, allowing other researchers to conduct the same study to verify results.
  • Conclusions and Generalizations : Empirical research generates findings based on data and aims to make generalizations about larger populations or phenomena.

Importance of Empirical Research

Empirical research plays a pivotal role in advancing knowledge across various disciplines. Its importance extends to academia, industry, and society as a whole. Here are several reasons why empirical research is essential:

  • Evidence-Based Knowledge : Empirical research provides a solid foundation of evidence-based knowledge. It enables us to test hypotheses, confirm or refute theories, and build a robust understanding of the world.
  • Scientific Progress : In the scientific community, empirical research fuels progress by expanding the boundaries of existing knowledge. It contributes to the development of theories and the formulation of new research questions.
  • Problem Solving : Empirical research is instrumental in addressing real-world problems and challenges. It offers insights and data-driven solutions to complex issues in fields like healthcare, economics, and environmental science.
  • Informed Decision-Making : In policymaking, business, and healthcare, empirical research informs decision-makers by providing data-driven insights. It guides strategies, investments, and policies for optimal outcomes.
  • Quality Assurance : Empirical research is essential for quality assurance and validation in various industries, including pharmaceuticals, manufacturing, and technology. It ensures that products and processes meet established standards.
  • Continuous Improvement : Businesses and organizations use empirical research to evaluate performance, customer satisfaction , and product effectiveness. This data-driven approach fosters continuous improvement and innovation.
  • Human Advancement : Empirical research in fields like medicine and psychology contributes to the betterment of human health and well-being. It leads to medical breakthroughs, improved therapies, and enhanced psychological interventions.
  • Critical Thinking and Problem Solving : Engaging in empirical research fosters critical thinking skills, problem-solving abilities, and a deep appreciation for evidence-based decision-making.

Empirical research empowers us to explore, understand, and improve the world around us. It forms the bedrock of scientific inquiry and drives progress in countless domains, shaping our understanding of both the natural and social sciences.

How to Conduct Empirical Research?

So, you've decided to dive into the world of empirical research. Let's begin by exploring the crucial steps involved in getting started with your research project.

1. Select a Research Topic

Selecting the right research topic is the cornerstone of a successful empirical study. It's essential to choose a topic that not only piques your interest but also aligns with your research goals and objectives. Here's how to go about it:

  • Identify Your Interests : Start by reflecting on your passions and interests. What topics fascinate you the most? Your enthusiasm will be your driving force throughout the research process.
  • Brainstorm Ideas : Engage in brainstorming sessions to generate potential research topics. Consider the questions you've always wanted to answer or the issues that intrigue you.
  • Relevance and Significance : Assess the relevance and significance of your chosen topic. Does it contribute to existing knowledge? Is it a pressing issue in your field of study or the broader community?
  • Feasibility : Evaluate the feasibility of your research topic. Do you have access to the necessary resources, data, and participants (if applicable)?

2. Formulate Research Questions

Once you've narrowed down your research topic, the next step is to formulate clear and precise research questions . These questions will guide your entire research process and shape your study's direction. To create effective research questions:

  • Specificity : Ensure that your research questions are specific and focused. Vague or overly broad questions can lead to inconclusive results.
  • Relevance : Your research questions should directly relate to your chosen topic. They should address gaps in knowledge or contribute to solving a particular problem.
  • Testability : Ensure that your questions are testable through empirical methods. You should be able to gather data and analyze it to answer these questions.
  • Avoid Bias : Craft your questions in a way that avoids leading or biased language. Maintain neutrality to uphold the integrity of your research.

3. Review Existing Literature

Before you embark on your empirical research journey, it's essential to immerse yourself in the existing body of literature related to your chosen topic. This step, often referred to as a literature review, serves several purposes:

  • Contextualization : Understand the historical context and current state of research in your field. What have previous studies found, and what questions remain unanswered?
  • Identifying Gaps : Identify gaps or areas where existing research falls short. These gaps will help you formulate meaningful research questions and hypotheses.
  • Theory Development : If your study is theoretical, consider how existing theories apply to your topic. If it's empirical, understand how previous studies have approached data collection and analysis.
  • Methodological Insights : Learn from the methodologies employed in previous research. What methods were successful, and what challenges did researchers face?

4. Define Variables

Variables are fundamental components of empirical research. They are the factors or characteristics that can change or be manipulated during your study. Properly defining and categorizing variables is crucial for the clarity and validity of your research. Here's what you need to know:

  • Independent Variables : These are the variables that you, as the researcher, manipulate or control. They are the "cause" in cause-and-effect relationships.
  • Dependent Variables : Dependent variables are the outcomes or responses that you measure or observe. They are the "effect" influenced by changes in independent variables.
  • Operational Definitions : To ensure consistency and clarity, provide operational definitions for your variables. Specify how you will measure or manipulate each variable.
  • Control Variables : In some studies, controlling for other variables that may influence your dependent variable is essential. These are known as control variables.

Understanding these foundational aspects of empirical research will set a solid foundation for the rest of your journey. Now that you've grasped the essentials of getting started, let's delve deeper into the intricacies of research design.

Empirical Research Design

Now that you've selected your research topic, formulated research questions, and defined your variables, it's time to delve into the heart of your empirical research journey – research design . This pivotal step determines how you will collect data and what methods you'll employ to answer your research questions. Let's explore the various facets of research design in detail.

Types of Empirical Research

Empirical research can take on several forms, each with its own unique approach and methodologies. Understanding the different types of empirical research will help you choose the most suitable design for your study. Here are some common types:

  • Experimental Research : In this type, researchers manipulate one or more independent variables to observe their impact on dependent variables. It's highly controlled and often conducted in a laboratory setting.
  • Observational Research : Observational research involves the systematic observation of subjects or phenomena without intervention. Researchers are passive observers, documenting behaviors, events, or patterns.
  • Survey Research : Surveys are used to collect data through structured questionnaires or interviews. This method is efficient for gathering information from a large number of participants.
  • Case Study Research : Case studies focus on in-depth exploration of one or a few cases. Researchers gather detailed information through various sources such as interviews, documents, and observations.
  • Qualitative Research : Qualitative research aims to understand behaviors, experiences, and opinions in depth. It often involves open-ended questions, interviews, and thematic analysis.
  • Quantitative Research : Quantitative research collects numerical data and relies on statistical analysis to draw conclusions. It involves structured questionnaires, experiments, and surveys.

Your choice of research type should align with your research questions and objectives. Experimental research, for example, is ideal for testing cause-and-effect relationships, while qualitative research is more suitable for exploring complex phenomena.

Experimental Design

Experimental research is a systematic approach to studying causal relationships. It's characterized by the manipulation of one or more independent variables while controlling for other factors. Here are some key aspects of experimental design:

  • Control and Experimental Groups : Participants are randomly assigned to either a control group or an experimental group. The independent variable is manipulated for the experimental group but not for the control group.
  • Randomization : Randomization is crucial to eliminate bias in group assignment. It ensures that each participant has an equal chance of being in either group.
  • Hypothesis Testing : Experimental research often involves hypothesis testing. Researchers formulate hypotheses about the expected effects of the independent variable and use statistical analysis to test these hypotheses.
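To make the random-assignment step above concrete, here is a minimal sketch of randomly splitting participants into control and experimental groups. The participant IDs, group size, and seed are hypothetical assumptions for illustration only.

```python
# Minimal sketch of random assignment to control and experimental groups
# (hypothetical participant IDs; a real study would use its own roster).
import random

participants = [f"P{i:03d}" for i in range(1, 41)]  # 40 hypothetical participants

random.seed(42)          # fixed seed only so the example is reproducible
random.shuffle(participants)

midpoint = len(participants) // 2
control_group = participants[:midpoint]        # independent variable not manipulated
experimental_group = participants[midpoint:]   # independent variable manipulated

print(len(control_group), len(experimental_group))
```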

Observational Design

Observational research entails careful and systematic observation of subjects or phenomena. It's advantageous when you want to understand natural behaviors or events. Key aspects of observational design include:

  • Participant Observation : Researchers immerse themselves in the environment they are studying. They become part of the group being observed, allowing for a deep understanding of behaviors.
  • Non-Participant Observation : In non-participant observation, researchers remain separate from the subjects. They observe and document behaviors without direct involvement.
  • Data Collection Methods : Observational research can involve various data collection methods, such as field notes, video recordings, photographs, or coding of observed behaviors.

Survey Design

Surveys are a popular choice for collecting data from a large number of participants. Effective survey design is essential to ensure the validity and reliability of your data. Consider the following:

  • Questionnaire Design : Create clear and concise questions that are easy for participants to understand. Avoid leading or biased questions.
  • Sampling Methods : Decide on the appropriate sampling method for your study, whether it's random, stratified, or convenience sampling.
  • Data Collection Tools : Choose the right tools for data collection, whether it's paper surveys, online questionnaires, or face-to-face interviews.

Case Study Design

Case studies are an in-depth exploration of one or a few cases to gain a deep understanding of a particular phenomenon. Key aspects of case study design include:

  • Single Case vs. Multiple Case Studies : Decide whether you'll focus on a single case or multiple cases. Single case studies are intensive and allow for detailed examination, while multiple case studies provide comparative insights.
  • Data Collection Methods : Gather data through interviews, observations, document analysis, or a combination of these methods.

Qualitative vs. Quantitative Research

In empirical research, you'll often encounter the distinction between qualitative and quantitative research . Here's a closer look at these two approaches:

  • Qualitative Research : Qualitative research seeks an in-depth understanding of human behavior, experiences, and perspectives. It involves open-ended questions, interviews, and the analysis of textual or narrative data. Qualitative research is exploratory and often used when the research question is complex and requires a nuanced understanding.
  • Quantitative Research : Quantitative research collects numerical data and employs statistical analysis to draw conclusions. It involves structured questionnaires, experiments, and surveys. Quantitative research is ideal for testing hypotheses and establishing cause-and-effect relationships.

Understanding the various research design options is crucial in determining the most appropriate approach for your study. Your choice should align with your research questions, objectives, and the nature of the phenomenon you're investigating.

Data Collection for Empirical Research

Now that you've established your research design, it's time to roll up your sleeves and collect the data that will fuel your empirical research. Effective data collection is essential for obtaining accurate and reliable results.

Sampling Methods

Sampling methods are critical in empirical research, as they determine the subset of individuals or elements from your target population that you will study. Here are some standard sampling methods:

  • Random Sampling : Random sampling ensures that every member of the population has an equal chance of being selected. It minimizes bias and is often used in quantitative research.
  • Stratified Sampling : Stratified sampling involves dividing the population into subgroups or strata based on specific characteristics (e.g., age, gender, location). Samples are then randomly selected from each stratum, ensuring representation of all subgroups.
  • Convenience Sampling : Convenience sampling involves selecting participants who are readily available or easily accessible. While it's convenient, it may introduce bias and limit the generalizability of results.
  • Snowball Sampling : Snowball sampling is instrumental when studying hard-to-reach or hidden populations. One participant leads you to another, creating a "snowball" effect. This method is common in qualitative research.
  • Purposive Sampling : In purposive sampling, researchers deliberately select participants who meet specific criteria relevant to their research questions. It's often used in qualitative studies to gather in-depth information.

The choice of sampling method depends on the nature of your research, available resources, and the degree of precision required. It's crucial to carefully consider your sampling strategy to ensure that your sample accurately represents your target population.
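To illustrate two of the sampling methods above, here is a minimal sketch, using pandas and a hypothetical respondent table, of simple random sampling and stratified sampling by gender. The column names, population size, and sampling fractions are assumptions for demonstration only.

```python
# Minimal sketch of random and stratified sampling (hypothetical data).
import pandas as pd

# Hypothetical population frame: 1,000 respondents with a gender attribute.
population = pd.DataFrame({
    "respondent_id": range(1, 1001),
    "gender": ["female", "male"] * 500,
})

# Simple random sampling: every respondent has an equal chance of selection.
random_sample = population.sample(n=100, random_state=42)

# Stratified sampling: take 10% from each gender stratum to preserve proportions.
stratified_sample = (
    population.groupby("gender", group_keys=False)
    .apply(lambda stratum: stratum.sample(frac=0.10, random_state=42))
)

print(len(random_sample), len(stratified_sample))
```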

Data Collection Instruments

Data collection instruments are the tools you use to gather information from your participants or sources. These instruments should be designed to capture the data you need accurately. Here are some popular data collection instruments:

  • Questionnaires : Questionnaires consist of structured questions with predefined response options. When designing questionnaires, consider the clarity of questions, the order of questions, and the response format (e.g., Likert scale , multiple-choice).
  • Interviews : Interviews involve direct communication between the researcher and participants. They can be structured (with predetermined questions) or unstructured (open-ended). Effective interviews require active listening and probing for deeper insights.
  • Observations : Observations entail systematically and objectively recording behaviors, events, or phenomena. Researchers must establish clear criteria for what to observe, how to record observations, and when to observe.
  • Surveys : Surveys are a common data collection instrument for quantitative research. They can be administered through various means, including online surveys, paper surveys, and telephone surveys.
  • Documents and Archives : In some cases, data may be collected from existing documents, records, or archives. Ensure that the sources are reliable, relevant, and properly documented.

To streamline your process and gather insights with precision and efficiency, consider leveraging innovative tools like Appinio . With Appinio's intuitive platform, you can harness the power of real-time consumer data to inform your research decisions effectively. Whether you're conducting surveys, interviews, or observations, Appinio empowers you to define your target audience, collect data from diverse demographics, and analyze results seamlessly.

By incorporating Appinio into your data collection toolkit, you can unlock a world of possibilities and elevate the impact of your empirical research. Ready to revolutionize your approach to data collection?


Data Collection Procedures

Data collection procedures outline the step-by-step process for gathering data. These procedures should be meticulously planned and executed to maintain the integrity of your research.

  • Training : If you have a research team, ensure that they are trained in data collection methods and protocols. Consistency in data collection is crucial.
  • Pilot Testing : Before launching your data collection, conduct a pilot test with a small group to identify any potential problems with your instruments or procedures. Make necessary adjustments based on feedback.
  • Data Recording : Establish a systematic method for recording data. This may include timestamps, codes, or identifiers for each data point.
  • Data Security : Safeguard the confidentiality and security of collected data. Ensure that only authorized individuals have access to the data.
  • Data Storage : Properly organize and store your data in a secure location, whether in physical or digital form. Back up data to prevent loss.

Ethical Considerations

Ethical considerations are paramount in empirical research, as they ensure the well-being and rights of participants are protected.

  • Informed Consent : Obtain informed consent from participants, providing clear information about the research purpose, procedures, risks, and their right to withdraw at any time.
  • Privacy and Confidentiality : Protect the privacy and confidentiality of participants. Ensure that data is anonymized and sensitive information is kept confidential.
  • Beneficence : Ensure that your research benefits participants and society while minimizing harm. Consider the potential risks and benefits of your study.
  • Honesty and Integrity : Conduct research with honesty and integrity. Report findings accurately and transparently, even if they are not what you expected.
  • Respect for Participants : Treat participants with respect, dignity, and sensitivity to cultural differences. Avoid any form of coercion or manipulation.
  • Institutional Review Board (IRB) : If required, seek approval from an IRB or ethics committee before conducting your research, particularly when working with human participants.

Adhering to ethical guidelines is not only essential for the ethical conduct of research but also crucial for the credibility and validity of your study. Ethical research practices build trust between researchers and participants and contribute to the advancement of knowledge with integrity.

With a solid understanding of data collection, including sampling methods, instruments, procedures, and ethical considerations, you are now well-equipped to gather the data needed to answer your research questions.

Empirical Research Data Analysis

Now comes the exciting phase of data analysis, where the raw data you've diligently collected starts to yield insights and answers to your research questions. We will explore the various aspects of data analysis, from preparing your data to drawing meaningful conclusions through statistics and visualization.

Data Preparation

Data preparation is the crucial first step in data analysis. It involves cleaning, organizing, and transforming your raw data into a format that is ready for analysis. Effective data preparation ensures the accuracy and reliability of your results.

  • Data Cleaning : Identify and rectify errors, missing values, and inconsistencies in your dataset. This may involve correcting typos, removing outliers, and imputing missing data.
  • Data Coding : Assign numerical values or codes to categorical variables to make them suitable for statistical analysis. For example, converting "Yes" and "No" to 1 and 0.
  • Data Transformation : Transform variables as needed to meet the assumptions of the statistical tests you plan to use. Common transformations include logarithmic or square root transformations.
  • Data Integration : If your data comes from multiple sources, integrate it into a unified dataset, ensuring that variables match and align.
  • Data Documentation : Maintain clear documentation of all data preparation steps, as well as the rationale behind each decision. This transparency is essential for replicability.

Effective data preparation lays the foundation for accurate and meaningful analysis. It allows you to trust the results that will follow in the subsequent stages.
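For a concrete, if simplified, picture of these steps, here is a minimal pandas sketch covering cleaning, coding, and transformation; the column names and values (age, uses_product, income) are hypothetical placeholders, not a real dataset.

```python
# A minimal data-preparation sketch with pandas: cleaning, coding, and transforming.
# The column names and values are illustrative assumptions, not a real dataset.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [34, None, 29, 41, 29],
    "uses_product": ["Yes", "No", "Yes", "Yes", "No"],
    "income": [32000, 45000, 28000, 150000, 39000],
})

# Data cleaning: drop exact duplicates and impute missing ages with the median.
df = df.drop_duplicates()
df["age"] = df["age"].fillna(df["age"].median())

# Data coding: convert a categorical "Yes"/"No" answer to 1/0.
df["uses_product"] = df["uses_product"].map({"Yes": 1, "No": 0})

# Data transformation: log-transform the skewed income variable.
df["log_income"] = np.log1p(df["income"])

print(df)
```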

Descriptive Statistics

Descriptive statistics help you summarize and make sense of your data by providing a clear overview of its key characteristics. These statistics are essential for understanding the central tendencies, variability, and distribution of your variables. Descriptive statistics include:

  • Measures of Central Tendency : These include the mean (average), median (middle value), and mode (most frequent value). They help you understand the typical or central value of your data.
  • Measures of Dispersion : Measures like the range, variance, and standard deviation provide insights into the spread or variability of your data points.
  • Frequency Distributions : Creating frequency distributions or histograms allows you to visualize the distribution of your data across different values or categories.

Descriptive statistics provide the initial insights needed to understand your data's basic characteristics, which can inform further analysis.
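A quick sketch with Python's standard statistics module shows how these summary measures are computed for a small, made-up set of scores.

```python
# A short sketch of descriptive statistics for a numeric variable.
# The sample values are invented for illustration.
import statistics

scores = [72, 85, 90, 66, 85, 78, 92, 85, 70, 88]

print("mean:", statistics.mean(scores))        # central tendency: average
print("median:", statistics.median(scores))    # central tendency: middle value
print("mode:", statistics.mode(scores))        # central tendency: most frequent value
print("range:", max(scores) - min(scores))     # dispersion: spread of values
print("std dev:", statistics.stdev(scores))    # dispersion: sample standard deviation
```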

Inferential Statistics

Inferential statistics take your analysis to the next level by allowing you to make inferences or predictions about a larger population based on your sample data. These methods help you test hypotheses and draw meaningful conclusions. Key concepts in inferential statistics include:

  • Hypothesis Testing : Hypothesis tests (e.g., t-tests, chi-squared tests) help you determine whether observed differences or associations in your data are statistically significant or occurred by chance.
  • Confidence Intervals : Confidence intervals provide a range within which population parameters (e.g., population mean) are likely to fall based on your sample data.
  • Regression Analysis : Regression models (linear, logistic, etc.) help you explore relationships between variables and make predictions.
  • Analysis of Variance (ANOVA) : ANOVA tests are used to compare means between multiple groups, allowing you to assess whether differences are statistically significant.


Inferential statistics are powerful tools for drawing conclusions from your data and assessing the generalizability of your findings to the broader population.
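As a hedged sketch of how such tests are typically run in practice, the snippet below applies an independent-samples t-test and a chi-squared test of independence with SciPy; the two groups of scores and the 2x2 contingency table are invented for illustration.

```python
# A sketch of two common inferential tests with SciPy, using invented data.
from scipy import stats

group_a = [72, 85, 90, 66, 85, 78, 92, 85, 70, 88]
group_b = [65, 70, 74, 68, 79, 72, 80, 69, 75, 71]

# Independent-samples t-test: are the two group means significantly different?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# Chi-squared test of independence on a 2x2 contingency table
# (e.g., treatment vs. control against improved vs. not improved).
table = [[30, 10], [18, 22]]
chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

A p-value below your chosen significance level (commonly 0.05) suggests the observed difference or association is unlikely to be due to chance alone.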

Qualitative Data Analysis

Qualitative data analysis is employed when working with non-numerical data, such as text, interviews, or open-ended survey responses. It focuses on understanding the underlying themes, patterns, and meanings within qualitative data. Qualitative analysis techniques include:

  • Thematic Analysis : Identifying and analyzing recurring themes or patterns within textual data.
  • Content Analysis : Categorizing and coding qualitative data to extract meaningful insights.
  • Grounded Theory : Developing theories or frameworks based on emergent themes from the data.
  • Narrative Analysis : Examining the structure and content of narratives to uncover meaning.

Qualitative data analysis provides a rich and nuanced understanding of complex phenomena and human experiences.
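Much of this work is interpretive and done by human coders, but a simplified sketch can illustrate the mechanics of coding text against a codebook. The responses and keyword codebook below are invented, and real thematic analysis is an iterative, judgment-driven process rather than simple keyword matching.

```python
# A simplified sketch of coding open-ended responses against a keyword codebook
# and tallying theme frequencies. Responses and codebook are illustrative only.
from collections import Counter

responses = [
    "The price was too high for what you get.",
    "Support was friendly and solved my problem quickly.",
    "Delivery took two weeks, far too long.",
    "Great value, and the support team was helpful.",
]

codebook = {
    "cost": ["price", "value", "expensive"],
    "service": ["support", "friendly", "helpful"],
    "delivery": ["delivery", "shipping", "late"],
}

theme_counts = Counter()
for text in responses:
    lowered = text.lower()
    for theme, keywords in codebook.items():
        if any(keyword in lowered for keyword in keywords):
            theme_counts[theme] += 1

print(theme_counts)  # tallies of how many responses touch each theme
```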

Data Visualization

Data visualization is the art of representing data graphically to make complex information more understandable and accessible. Effective data visualization can reveal patterns, trends, and outliers in your data. Common types of data visualization include:

  • Bar Charts and Histograms : Bar charts compare counts or values across categories, while histograms display the distribution of numerical data.
  • Line Charts : Ideal for showing trends and changes in data over time.
  • Scatter Plots : Visualize relationships and correlations between two variables.
  • Pie Charts : Display the composition of a whole in terms of its parts.
  • Heatmaps : Depict patterns and relationships in multidimensional data through color-coding.
  • Box Plots : Provide a summary of the data distribution, including outliers.
  • Interactive Dashboards : Create dynamic visualizations that allow users to explore data interactively.

Data visualization not only enhances your understanding of the data but also serves as a powerful communication tool to convey your findings to others.
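The short matplotlib sketch below renders a bar chart, a line chart, and a scatter plot side by side from invented monthly figures, as a starting point you could adapt to your own variables.

```python
# A minimal matplotlib sketch of three common chart types, using invented data.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May"]
sales = [120, 135, 128, 150, 162]
ad_spend = [10, 12, 11, 15, 17]

fig, axes = plt.subplots(1, 3, figsize=(12, 3))

axes[0].bar(months, sales)               # bar chart: comparison across categories
axes[0].set_title("Sales by month")

axes[1].plot(months, sales, marker="o")  # line chart: trend over time
axes[1].set_title("Sales trend")

axes[2].scatter(ad_spend, sales)         # scatter plot: relationship between variables
axes[2].set_title("Ad spend vs. sales")
axes[2].set_xlabel("Ad spend")
axes[2].set_ylabel("Sales")

fig.tight_layout()
plt.show()
```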

As you embark on the data analysis phase of your empirical research, remember that the specific methods and techniques you choose will depend on your research questions, data type, and objectives. Effective data analysis transforms raw data into valuable insights, bringing you closer to the answers you seek.

How to Report Empirical Research Results?

At this stage, you get to share your empirical research findings with the world. Effective reporting and presentation of your results are crucial for communicating your research's impact and insights.

1. Write the Research Paper

Writing a research paper is the culmination of your empirical research journey. It's where you synthesize your findings, provide context, and contribute to the body of knowledge in your field.

  • Title and Abstract : Craft a clear and concise title that reflects your research's essence. The abstract should provide a brief summary of your research objectives, methods, findings, and implications.
  • Introduction : In the introduction, introduce your research topic, state your research questions or hypotheses, and explain the significance of your study. Provide context by discussing relevant literature.
  • Methods : Describe your research design, data collection methods, and sampling procedures. Be precise and transparent, allowing readers to understand how you conducted your study.
  • Results : Present your findings in a clear and organized manner. Use tables, graphs, and statistical analyses to support your results. Avoid interpreting your findings in this section; focus on the presentation of raw data.
  • Discussion : Interpret your findings and discuss their implications. Relate your results to your research questions and the existing literature. Address any limitations of your study and suggest avenues for future research.
  • Conclusion : Summarize the key points of your research and its significance. Restate your main findings and their implications.
  • References : Cite all sources used in your research following a specific citation style (e.g., APA, MLA, Chicago). Ensure accuracy and consistency in your citations.
  • Appendices : Include any supplementary material, such as questionnaires, data coding sheets, or additional analyses, in the appendices.

Writing a research paper is a skill that improves with practice. Ensure clarity, coherence, and conciseness in your writing to make your research accessible to a broader audience.

2. Create Visuals and Tables

Visuals and tables are powerful tools for presenting complex data in an accessible and understandable manner.

  • Clarity : Ensure that your visuals and tables are clear and easy to interpret. Use descriptive titles and labels.
  • Consistency : Maintain consistency in formatting, such as font size and style, across all visuals and tables.
  • Appropriateness : Choose the most suitable visual representation for your data. Bar charts, line graphs, and scatter plots work well for different types of data.
  • Simplicity : Avoid clutter and unnecessary details. Focus on conveying the main points.
  • Accessibility : Make sure your visuals and tables are accessible to a broad audience, including those with visual impairments.
  • Captions : Include informative captions that explain the significance of each visual or table.

Compelling visuals and tables enhance the reader's understanding of your research and can be the key to conveying complex information efficiently.

3. Interpret Findings

Interpreting your findings is where you bridge the gap between data and meaning. It's your opportunity to provide context, discuss implications, and offer insights. When interpreting your findings:

  • Relate to Research Questions : Discuss how your findings directly address your research questions or hypotheses.
  • Compare with Literature : Analyze how your results align with or deviate from previous research in your field. What insights can you draw from these comparisons?
  • Discuss Limitations : Be transparent about the limitations of your study. Address any constraints, biases, or potential sources of error.
  • Practical Implications : Explore the real-world implications of your findings. How can they be applied or inform decision-making?
  • Future Research Directions : Suggest areas for future research based on the gaps or unanswered questions that emerged from your study.

Interpreting findings goes beyond simply presenting data; it's about weaving a narrative that helps readers grasp the significance of your research in the broader context.

With your research paper written, structured, and enriched with visuals, and your findings expertly interpreted, you are now prepared to communicate your research effectively. Sharing your insights and contributing to the body of knowledge in your field is a significant accomplishment in empirical research.

Examples of Empirical Research

To solidify your understanding of empirical research, let's delve into some real-world examples across different fields. These examples will illustrate how empirical research is applied to gather data, analyze findings, and draw conclusions.

Social Sciences

In the realm of social sciences, consider a sociological study exploring the impact of socioeconomic status on educational attainment. Researchers gather data from a diverse group of individuals, including their family backgrounds, income levels, and academic achievements.

Through statistical analysis, they can identify correlations and trends, revealing whether individuals from lower socioeconomic backgrounds are less likely to attain higher levels of education. This empirical research helps shed light on societal inequalities and informs policymakers on potential interventions to address disparities in educational access.

Environmental Science

Environmental scientists often employ empirical research to assess the effects of environmental changes. For instance, researchers studying the impact of climate change on wildlife might collect data on animal populations, weather patterns, and habitat conditions over an extended period.

By analyzing this empirical data, they can identify correlations between climate fluctuations and changes in wildlife behavior, migration patterns, or population sizes. This empirical research is crucial for understanding the ecological consequences of climate change and informing conservation efforts.

Business and Economics

In the business world, empirical research is essential for making data-driven decisions. Consider a market research study conducted by a business seeking to launch a new product. They collect data through surveys, focus groups, and consumer behavior analysis.

By examining this empirical data, the company can gauge consumer preferences, demand, and potential market size. Empirical research in business helps guide product development, pricing strategies, and marketing campaigns, increasing the likelihood of a successful product launch.

Psychology

Psychological studies frequently rely on empirical research to understand human behavior and cognition. For instance, a psychologist interested in examining the impact of stress on memory might design an experiment. Participants are exposed to stress-inducing situations, and their memory performance is assessed through various tasks.

By analyzing the data collected, the psychologist can determine whether stress has a significant effect on memory recall. This empirical research contributes to our understanding of the complex interplay between psychological factors and cognitive processes.

These examples highlight the versatility and applicability of empirical research across diverse fields. Whether in medicine, social sciences, environmental science, business, or psychology, empirical research serves as a fundamental tool for gaining insights, testing hypotheses, and driving advancements in knowledge and practice.

Conclusion for Empirical Research

Empirical research is a powerful tool for gaining insights, testing hypotheses, and making informed decisions. By following the steps outlined in this guide, you've learned how to select research topics, collect data, analyze findings, and effectively communicate your research to the world. Remember, empirical research is a journey of discovery, and each step you take brings you closer to a deeper understanding of the world around you. Whether you're a scientist, a student, or someone curious about the process, the principles of empirical research empower you to explore, learn, and contribute to the ever-expanding realm of knowledge.

How to Collect Data for Empirical Research?

Introducing Appinio , the real-time market research platform revolutionizing how companies gather consumer insights for their empirical research endeavors. With Appinio, you can conduct your own market research in minutes, gaining valuable data to fuel your data-driven decisions.

Appinio is more than just a market research platform; it's a catalyst for transforming the way you approach empirical research, making it exciting, intuitive, and seamlessly integrated into your decision-making process.

Here's why Appinio is the go-to solution for empirical research:

  • From Questions to Insights in Minutes : With Appinio's streamlined process, you can go from formulating your research questions to obtaining actionable insights in a matter of minutes, saving you time and effort.
  • Intuitive Platform for Everyone : No need for a PhD in research; Appinio's platform is designed to be intuitive and user-friendly, ensuring that anyone can navigate and utilize it effectively.
  • Rapid Response Times : With an average field time of under 23 minutes for 1,000 respondents, Appinio delivers rapid results, allowing you to gather data swiftly and efficiently.
  • Global Reach with Targeted Precision : With access to over 90 countries and the ability to define target groups based on 1200+ characteristics, Appinio empowers you to reach your desired audience with precision and ease.



Purdue University


Research: Overview & Approaches


Introduction to Empirical Research


  • Introductory Video : This video covers what empirical research is, what kinds of questions and methods empirical researchers use, and some tips for finding empirical research articles in your discipline.

Video Tutorial

  • Guided Search: Finding Empirical Research Articles. This is a hands-on tutorial that will allow you to use your own search terms to find resources.

Google Scholar Search: Examples of Empirical Research Articles

  • Study on radiation transfer in human skin for cosmetics
  • Long-Term Mobile Phone Use and the Risk of Vestibular Schwannoma: A Danish Nationwide Cohort Study
  • Emissions Impacts and Benefits of Plug-In Hybrid Electric Vehicles and Vehicle-to-Grid Services
  • Review of design considerations and technological challenges for successful development and deployment of plug-in hybrid electric vehicles
  • Endocrine disrupters and human health: could oestrogenic chemicals in body care cosmetics adversely affect breast cancer incidence in women?


Empirical Research: A Comprehensive Guide for Academics 


Empirical research relies on gathering and studying real, observable data. The term ’empirical’ comes from the Greek word ’empeirikos,’ meaning ‘experienced’ or ‘based on experience.’ So, what is empirical research? Instead of using theories or opinions, empirical research depends on real data obtained through direct observation or experimentation. 

Why Empirical Research?

Empirical research plays a key role in checking or improving current theories, providing a systematic way to grow knowledge across different areas. By focusing on objectivity, it makes research findings more trustworthy, which is critical in research fields like medicine, psychology, economics, and public policy. In the end, the strengths of empirical research lie in deepening our awareness of the world and improving our capacity to tackle problems wisely. 1,2  

Qualitative and Quantitative Methods

There are two main types of empirical research methods – qualitative and quantitative. 3,4 Qualitative research delves into intricate phenomena using non-numerical data, such as interviews or observations, to offer in-depth insights into human experiences. In contrast, quantitative research analyzes numerical data to spot patterns and relationships, aiming for objectivity and the ability to apply findings to a wider context. 

Steps for Conducting Empirical Research

When it comes to conducting research, there are some simple steps that researchers can follow. 5,6  

  • Create Research Hypothesis:  Clearly state the specific question you want to answer or the hypothesis you want to explore in your study. 
  • Examine Existing Research:  Read and study existing research on your topic. Understand what’s already known, identify existing gaps in knowledge, and create a framework for your own study based on what you learn. 
  • Plan Your Study:  Decide how you’ll conduct your research—whether through qualitative methods, quantitative methods, or a mix of both. Choose suitable techniques like surveys, experiments, interviews, or observations based on your research question. 
  • Develop Research Instruments:  Create reliable data collection tools, such as surveys or questionnaires, to help you gather data. Ensure these tools are well designed and effective. 
  • Collect Data:  Systematically gather the information you need for your research according to your study design and protocols using the chosen research methods. 
  • Data Analysis:  Analyze the collected data using suitable statistical or qualitative methods that align with your research question and objectives. 
  • Interpret Results:  Understand and explain the significance of your analysis results in the context of your research question or hypothesis. 
  • Draw Conclusions:  Summarize your findings and draw conclusions based on the evidence. Acknowledge any study limitations and propose areas for future research. 

Advantages of Empirical Research

Empirical research is valuable because it stays objective by relying on observable data, lessening the impact of personal biases. This objectivity boosts the trustworthiness of research findings. Also, using precise quantitative methods helps in accurate measurement and statistical analysis. This precision ensures researchers can draw reliable conclusions from numerical data, strengthening our understanding of the studied phenomena. 4  

Disadvantages of Empirical Research

While empirical research has notable strengths, researchers must also be aware of its limitations when deciding on the right research method for their study. 4 One significant drawback of empirical research is the risk of oversimplifying complex phenomena, especially when relying solely on quantitative methods. These methods may struggle to capture the richness and nuances present in certain social, cultural, or psychological contexts. Another challenge is the potential for confounding variables or biases during data collection, impacting result accuracy.

Tips for Empirical Writing

In empirical research, the writing is usually done in research papers, articles, or reports. Empirical writing follows a set structure, and each section has a specific role. Here are some tips for your empirical writing. 7

  • Define Your Objectives:  When you write about your research, start by making your goals clear. Explain what you want to find out or prove in a simple and direct way. This helps guide your research and lets others know what you have set out to achieve. 
  • Be Specific in Your Literature Review:  In the part where you talk about what others have studied before you, focus on research that directly relates to your research question. Keep it short and pick studies that help explain why your research is important. This part sets the stage for your work. 
  • Explain Your Methods Clearly : When you talk about how you did your research (Methods), explain it in detail. Be clear about your research plan, who took part, and what you did; this helps others understand and trust your study. Also, be honest about any rules you follow to make sure your study is ethical and reproducible. 
  • Share Your Results Clearly : After doing your empirical research, share what you found in a simple way. Use tables or graphs to make it easier for your audience to understand your research. Also, talk about any numbers you found and clearly state if they are important or not. Ensure that others can see why your research findings matter. 
  • Talk About What Your Findings Mean:  In the part where you discuss your research results, explain what they mean. Discuss why your findings are important and if they connect to what others have found before. Be honest about any problems with your study and suggest ideas for more research in the future. 
  • Wrap It Up Clearly:  Finally, end your empirical research paper by summarizing what you found and why it’s important. Remind everyone why your study matters. Keep your writing clear and fix any mistakes before you share it. Ask someone you trust to read it and give you feedback before you finish. 

References:  

  • Empirical Research in the Social Sciences and Education, Penn State University Libraries. Available online at  https://guides.libraries.psu.edu/emp  
  • How to conduct empirical research, Emerald Publishing. Available online at  https://www.emeraldgrouppublishing.com/how-to/research-methods/conduct-empirical-research  
  • Empirical Research: Quantitative & Qualitative, Arrendale Library, Piedmont University. Available online at  https://library.piedmont.edu/empirical-research  
  • Bouchrika, I.  What Is Empirical Research? Definition, Types & Samples  in 2024. Research.com, January 2024. Available online at  https://research.com/research/what-is-empirical-research  
  • Quantitative and Empirical Research vs. Other Types of Research. California State University, April 2023. Available online at  https://libguides.csusb.edu/quantitative  
  • Empirical Research, Definitions, Methods, Types and Examples, Studocu.com website. Available online at  https://www.studocu.com/row/document/uganda-christian-university/it-research-methods/emperical-research-definitions-methods-types-and-examples/55333816  
  • Writing an Empirical Paper in APA Style. Psychology Writing Center, University of Washington. Available online at  https://psych.uw.edu/storage/writing_center/APApaper.pdf  

Paperpal is an AI writing assistant that helps academics write better and faster with real-time suggestions for in-depth language and grammar correction. Trained on millions of research manuscripts enhanced by professional academic editors, Paperpal delivers human precision at machine speed.

Try it for free or upgrade to  Paperpal Prime , which unlocks unlimited access to premium features like academic translation, paraphrasing, contextual synonyms, consistency checks and more. It’s like always having a professional academic editor by your side! Go beyond limitations and experience the future of academic writing.  Get Paperpal Prime now at just US$19 a month!  



Identify Empirical Research Articles


Getting started

According to the APA, empirical research is defined as the following: "Study based on facts, systematic observation, or experiment, rather than theory or general philosophical principle." Empirical research articles are generally located in scholarly, peer-reviewed journals and often follow a specific layout known as IMRaD:

  • Introduction : Provides a theoretical framework and might discuss previous studies related to the topic at hand.
  • Methodology : Describes the analytical tools used, the research process, and the populations included.
  • Results : Sometimes referred to as findings; typically includes statistical data.
  • Discussion : Sometimes known as the conclusion to the study; usually describes what was learned and how the results can impact future practices.

In addition to IMRaD, it's important to see a conclusion and references that can back up the author's claims.

Characteristics to look for

In addition to the IMRaD format mentioned above, empirical research articles contain several key characteristics for identification purposes:

  • Empirical research articles are often substantial in length, usually eight to thirty pages.
  • You should see data of some kind, such as graphs, charts, or statistical analysis.
  • There is always a bibliography found at the end of the article.

Publications

Empirical research articles can be found in scholarly or academic journals. These types of journals are often referred to as "peer-reviewed" publications; this means qualified members of an academic discipline review and evaluate an academic paper's suitability for publication. 

The CRAAP Checklist should be utilized to help you examine the currency, relevancy, authority, accuracy, and purpose of an information resource. This checklist was developed by California State University's Meriam Library . 

This page has been adapted from the Sociology Research Guide: Identify Empirical Articles at Cal State Fullerton Pollak Library.


Penn State University Libraries

Empirical Research in the Social Sciences and Education


Introduction: What is Empirical Research?

Empirical research is based on observed and measured phenomena and derives knowledge from actual experience rather than from theory or belief. 

How do you know if a study is empirical? Read the subheadings within the article, book, or report and look for a description of the research "methodology."  Ask yourself: Could I recreate this study and test these results?

Key characteristics to look for:

  • Specific research questions to be answered
  • Definition of the population, behavior, or phenomena being studied
  • Description of the process used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys)

Another hint: some scholarly journals use a specific layout, called the "IMRaD" format, to communicate empirical research findings. Such articles typically have 4 components:

  • Introduction: sometimes called "literature review" -- what is currently known about the topic -- usually includes a theoretical framework and/or discussion of previous studies
  • Methodology: sometimes called "research design" -- how to recreate the study -- usually describes the population, research process, and analytical tools used in the present study
  • Results: sometimes called "findings" -- what was learned through the study -- usually appears as statistical data or as substantial quotations from research participants
  • Discussion: sometimes called "conclusion" or "implications" -- why the study is important -- usually describes how the research results influence professional practices or future studies

Reading and Evaluating Scholarly Materials

Reading research can be a challenge. However, the tutorials and videos below can help. They explain what scholarly articles look like, how to read them, and how to evaluate them:

  • CRAAP Checklist A frequently-used checklist that helps you examine the currency, relevance, authority, accuracy, and purpose of an information source.
  • IF I APPLY A newer model of evaluating sources which encourages you to think about your own biases as a reader, as well as concerns about the item you are reading.
  • Credo Video: How to Read Scholarly Materials (4 min.)
  • Credo Tutorial: How to Read Scholarly Materials
  • Credo Tutorial: Evaluating Information
  • Credo Video: Evaluating Statistics (4 min.)
  • Credo Tutorial: Evaluating for Diverse Points of View

Psychology Research Guide




Empirical research is published in books and in scholarly, peer-reviewed journals. Keep in mind that most library databases do not offer straightforward ways to identify empirical research.

Finding Empirical Research in PsycINFO

  • PsycInfo : Use the "Advanced Search," type your keywords into the search boxes, scroll down the page to "Methodology" and choose "Empirical Study," apply any other limits (such as publication date) if needed, then click the "Search" button.

Finding Empirical Research in PubMed

  • PubMed : One technique is to limit your search results after you perform a search: type in your keywords, click the "Search" button, and then, under "Article Types" to the left of your results, check off the types of studies that interest you. Another alternative is to construct a more sophisticated search: from PubMed's main screen, click the "Advanced" link underneath the search box; on the Advanced Search Builder screen, type your keywords into the search boxes; change one of the empty boxes from "All Fields" to "Publication Type"; click "Show Index List" to the right of Publication Type and choose a methodology that interests you (hold "Ctrl" or "⌘" to select more than one); then click the "Search" button. A scripted alternative is sketched below.
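If you prefer to search PubMed from a script rather than through the web interface, here is a hedged sketch using Biopython's Entrez wrapper around NCBI's E-utilities; the e-mail address and query terms are placeholders, and the Publication Type tag should be adjusted to the methodology you are after.

```python
# A sketch of searching PubMed programmatically via NCBI's E-utilities,
# using Biopython's Entrez module. Email and query terms are placeholders.
from Bio import Entrez

Entrez.email = "you@example.org"  # NCBI asks for a contact address

query = '(stress AND memory) AND "Randomized Controlled Trial"[Publication Type]'
handle = Entrez.esearch(db="pubmed", term=query, retmax=20)
record = Entrez.read(handle)
handle.close()

print("Matching articles:", record["Count"])
print("PubMed IDs:", record["IdList"])
```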

Finding Empirical Research in Library OneSearch & Google Scholar

These tools do not have a method for locating empirical research. Using "empirical" as a keyword will find some studies, but miss many others. Consider using one of the more specialized databases above.

  • Library OneSearch
  • Google Scholar

What is Peer Review?

Peer review refers to the process where authors who are doing research submit a paper they have written to a journal. The journal editor then sends the article to the author's peers (researchers and scholars) who are in the same discipline for review. The reviewers determine if the article should be published based on the quality of the research, including the validity of the data, the conclusions the authors draw, and the originality of the research. This process is important because it validates the research and gives it a sort of "seal of approval" from others in the research community.

Identifying Whether a Journal is Peer-Reviewed

One of the best places to find out if a journal is peer-reviewed is to go to the journal website.

Most publishers have a website for a journal that tells you about the journal, how authors can submit an article, and what the process is for getting published.

If you find the journal website, look for the link that says information for authors, instructions for authors, submitting an article or something similar.

Finding Peer-Reviewed Articles

Start in a library database. Look for a peer-review or scholarly filter.

  • PsycInfo Most comprehensive database of psychology. Filters allow you to limit by methodology. Articles without full-text can be requested via Interlibrary loan.
  • Library OneSearch Search almost all the library resources. Look for a peer-review filter on the left.


Empirical & Non-Empirical Research


Introduction: What is Empirical Research?


Empirical research  is based on phenomena that can be observed and measured. Empirical research derives knowledge from actual experience rather than from theory or belief. 

Key characteristics of empirical research include:

  • Specific research questions to be answered;
  • Definitions of the population, behavior, or phenomena being studied;
  • Description of the methodology or research design used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys);
  • Two basic research processes or methods in empirical research: quantitative methods and qualitative methods (see the rest of the guide for more about these methods).

(based on the original from the Connelly Library of La Salle University)


Empirical Research: Qualitative vs. Quantitative


Quantitative Research

A quantitative research project is characterized by having a population about which the researcher wants to draw conclusions, but it is not possible to collect data on the entire population.

  • For an observational study, it is necessary to select a proper, statistical random sample and to use methods of statistical inference to draw conclusions about the population. 
  • For an experimental study, it is necessary to have a random assignment of subjects to experimental and control groups in order to use methods of statistical inference.

Statistical methods are used in all three stages of a quantitative research project.

For observational studies, the data are collected using statistical sampling theory. Then, the sample data are analyzed using descriptive statistical analysis. Finally, generalizations are made from the sample data to the entire population using statistical inference.

For experimental studies, the subjects are allocated to experimental and control group using randomizing methods. Then, the experimental data are analyzed using descriptive statistical analysis. Finally, just as for observational data, generalizations are made to a larger population.

Iversen, G. (2004). Quantitative research . In M. Lewis-Beck, A. Bryman, & T. Liao (Eds.), Encyclopedia of social science research methods . (pp. 897-898). Thousand Oaks, CA: SAGE Publications, Inc.

Qualitative Research

What makes a work deserving of the label qualitative research is the demonstrable effort to produce richly and relevantly detailed descriptions and particularized interpretations of people and the social, linguistic, material, and other practices and events that shape and are shaped by them.

Qualitative research typically includes, but is not limited to, discerning the perspectives of these people, or what is often referred to as the actor’s point of view. Although both philosophically and methodologically a highly diverse entity, qualitative research is marked by certain defining imperatives that include its case (as opposed to its variable) orientation, sensitivity to cultural and historical context, and reflexivity. 

In its many guises, qualitative research is a form of empirical inquiry that typically entails some form of purposive sampling for information-rich cases; in-depth interviews and open-ended interviews, lengthy participant/field observations, and/or document or artifact study; and techniques for analysis and interpretation of data that move beyond the data generated and their surface appearances. 

Sandelowski, M. (2004).  Qualitative research . In M. Lewis-Beck, A. Bryman, & T. Liao (Eds.),  Encyclopedia of social science research methods . (pp. 893-894). Thousand Oaks, CA: SAGE Publications, Inc.


What is Empirical Research Study? [Examples & Method]

busayo.longe

The bulk of human decisions relies on evidence, that is, what can be measured or proven as valid. In choosing between plausible alternatives, individuals are more likely to tilt towards the option that is proven to work, and this is the same approach adopted in empirical research. 

In empirical research, the researcher arrives at outcomes by testing his or her empirical evidence using qualitative or quantitative methods of observation, as determined by the nature of the research. An empirical research study is set apart from other research approaches by its methodology and features hence; it is important for every researcher to know what constitutes this investigation method. 

What is Empirical Research? 

Empirical research is a type of research methodology that makes use of verifiable evidence in order to arrive at research outcomes. In other words, this  type of research relies solely on evidence obtained through observation or scientific data collection methods. 

Empirical research can be carried out using qualitative or quantitative observation methods, depending on the data sample, that is, quantifiable data or non-numerical data. Unlike theoretical research, which depends on preconceived notions about the research variables, empirical research carries out a scientific investigation to measure the experimental probability of the research variables.

Characteristics of Empirical Research

  • Research Questions

An empirical research study begins with a set of research questions that guide the investigation. In many cases, these research questions constitute the research hypothesis, which is tested using qualitative or quantitative methods as dictated by the nature of the research.

In an empirical research study, the research questions are built around the core of the research, that is, the central issue which the research seeks to resolve. They also determine the course of the research by highlighting the specific objectives and aims of the systematic investigation. 

  • Definition of the Research Variables

The research variables are clearly defined in terms of their population, types, characteristics, and behaviors. In other words, the data sample is clearly delimited and placed within the context of the research. 

  • Description of the Research Methodology

An empirical research study also clearly outlines the methods adopted in the systematic investigation. Here, the research process is described in detail, including the selection criteria for the data sample, the qualitative or quantitative research methods used, and the testing instruments. 

An empirical research report is usually divided into four parts: the introduction, methodology, findings, and discussion. The introduction provides the background of the empirical study, while the methodology describes the research design, processes, and tools for the systematic investigation. 

The findings refer to the research outcomes and they can be outlined as statistical data or in the form of information obtained through the qualitative observation of research variables. The discussions highlight the significance of the study and its contributions to knowledge. 

Uses of Empirical Research

Without any doubt, empirical research is one of the most useful methods of systematic investigation. It can be used for validating multiple research hypotheses in different fields including Law, Medicine, and Anthropology. 

  • Empirical Research in Law : In Law, empirical research is used to study institutions, rules, procedures, and personnel of the law, with a view to understanding how they operate and what effects they have. It makes use of direct methods rather than secondary sources, and this helps you to arrive at more valid conclusions.
  • Empirical Research in Medicine : In medicine, empirical research is used to test and validate multiple hypotheses and increase human knowledge.
  • Empirical Research in Anthropology : In anthropology, empirical research is used as an evidence-based systematic method of inquiry into patterns of human behaviors and cultures. This helps to validate and advance human knowledge.

The Empirical Research Cycle

The empirical research cycle is a 5-phase cycle that outlines the systematic processes for conducting an empirical research study. It was developed by the Dutch psychologist A.D. de Groot in the 1940s, and it outlines 5 important stages that can be viewed as deductive approaches to empirical research. 

In the empirical research methodological cycle, all processes are interconnected and none of the processes is more important than the other. This cycle clearly outlines the different phases involved in generating the research hypotheses and testing these hypotheses systematically using the empirical data. 

  • Observation: This is the process of gathering empirical data for the research. At this stage, the researcher gathers relevant empirical data using qualitative or quantitative observation methods, and this goes ahead to inform the research hypotheses.
  • Induction: At this stage, the researcher makes use of inductive reasoning in order to arrive at a general probable research conclusion based on his or her observation. The researcher generates a general assumption that attempts to explain the empirical data and s/he goes on to observe the empirical data in line with this assumption.
  • Deduction: This is the deductive reasoning stage. This is where the researcher generates hypotheses by applying logic and rationality to his or her observation.
  • Testing: Here, the researcher puts the hypotheses to test using qualitative or quantitative research methods. In the testing stage, the researcher combines relevant instruments of systematic investigation with empirical methods in order to arrive at objective results that support or negate the research hypotheses.
  • Evaluation: Evaluation is the final stage in an empirical research study. Here, the researcher outlines the empirical data, the research findings, the supporting arguments, and any challenges encountered during the research process.

This information is useful for further research. 


Examples of Empirical Research 

  • An empirical research study can be carried out to determine if listening to happy music improves the mood of individuals. The researcher may need to conduct an experiment that involves exposing individuals to happy music to see if this improves their moods.

The findings from such an experiment will provide empirical evidence that confirms or refutes the hypotheses. 

  • An empirical research study can also be carried out to determine the effects of a new drug on specific groups of people. The researcher may expose the research subjects to controlled quantities of the drug and observe the effects over a specific period of time to gather empirical data.
  • Another example of empirical research is measuring the levels of noise pollution found in an urban area to determine the average levels of sound exposure experienced by its inhabitants. Here, the researcher may have to administer questionnaires or carry out a survey in order to gather relevant data based on the experiences of the research subjects.
  • Empirical research can also be carried out to determine the relationship between seasonal migration and the body mass of flying birds. A researcher may need to observe the birds and carry out necessary observation and experimentation in order to arrive at objective outcomes that answer the research question.

Empirical Research Data Collection Methods

Empirical data can be gathered using qualitative and quantitative data collection methods. Quantitative data collection methods are used for numerical data gathering while qualitative data collection processes are used to gather empirical data that cannot be quantified, that is, non-numerical data. 

The following are common methods of gathering data in empirical research

  • Survey/ Questionnaire

A survey is a method of data gathering that is typically employed by researchers to gather large sets of data from a specific number of respondents with regard to a research subject. This method of data gathering is often used for quantitative data collection, although it can also be deployed during qualitative research.

A survey contains a set of questions that can range from close-ended to open-ended questions together with other question types that revolve around the research subject. A survey can be administered physically or with the use of online data-gathering platforms like Formplus. 

  • Experiment

Empirical data can also be collected by carrying out an experiment. An experiment is a controlled simulation in which one or more of the research variables is manipulated using a set of interconnected processes in order to confirm or refute the research hypotheses.

An experiment is a useful method of measuring causality; that is cause and effect between dependent and independent variables in a research environment. It is an integral data gathering method in an empirical research study because it involves testing calculated assumptions in order to arrive at the most valid data and research outcomes. 

  • Case Study

The case study method is another common data-gathering method in an empirical research study. It involves sifting through and analyzing relevant cases and real-life experiences about the research subject or research variables in order to discover in-depth information that can serve as empirical data.

  • Observation

The observational method is a method of qualitative data gathering that requires the researcher to study the behaviors of research variables in their natural environments in order to gather relevant information that can serve as empirical data.

How to collect Empirical Research Data with a Questionnaire

With Formplus, you can create a survey or questionnaire for collecting empirical data from your research subjects. Formplus also offers multiple form sharing options so that you can share your empirical research survey to research subjects via a variety of methods.

Here is a step-by-step guide of how to collect empirical data using Formplus:

Sign in to Formplus


In the Formplus builder, you can easily create your empirical research survey by dragging and dropping preferred fields into your form. To access the Formplus builder, you will need to create an account on Formplus. 

Once you do this, sign in to your account and click on “Create Form ” to begin. 


Edit Form Title

Click on the field provided to input your form title, for example, “Empirical Research Survey”.


Edit Form  

  • Click on the edit button to edit the form.
  • Add Fields: Drag and drop preferred form fields into your form in the Formplus builder inputs column. There are several field input options for survey forms in the Formplus builder.
  • Edit fields
  • Click on “Save”
  • Preview form.


Customize Form

Formplus allows you to add unique features to your empirical research survey form. You can personalize your survey using various customization options. Here, you can add background images, your organization’s logo, and use other styling options. You can also change the display theme of your form. 


  • Share your Form Link with Respondents

Formplus offers multiple form sharing options which enables you to easily share your empirical research survey form with respondents. You can use the direct social media sharing buttons to share your form link to your organization’s social media pages. 

You can send out your survey form as email invitations to your research subjects too. If you wish, you can share your form’s QR code or embed it on your organization’s website for easy access. 


Empirical vs Non-Empirical Research

Empirical and non-empirical research are common methods of systematic investigation employed by researchers. Unlike empirical research that tests hypotheses in order to arrive at valid research outcomes, non-empirical research theorizes the logical assumptions of research variables. 

Definition: Empirical research is a research approach that makes use of evidence-based data while non-empirical research is a research approach that makes use of theoretical data. 

Method: In empirical research, the researcher arrives at valid outcomes by mainly observing research variables, creating a hypothesis and experimenting on research variables to confirm or refute the hypothesis. In non-empirical research, the researcher relies on inductive and deductive reasoning to theorize logical assumptions about the research subjects.

The major difference between the two methodologies is that while assumptions are tested in empirical research, they are entirely theorized in non-empirical research.

Data Sample: Empirical research makes use of empirical data while non-empirical research does not make use of empirical data. Empirical data refers to information that is gathered through experience or observation. 

Unlike empirical research, theoretical or non-empirical research does not rely on data gathered through evidence. Rather, it works with logical assumptions and beliefs about the research subject. 

Data Collection Methods : Empirical research makes use of quantitative and qualitative data gathering methods which may include surveys, experiments, and methods of observation. This helps the researcher to gather empirical data, that is, data backed by evidence.  

Non-empirical research, on the other hand, does not make use of qualitative or quantitative methods of data collection . Instead, the researcher gathers relevant data through critical studies, systematic review and meta-analysis. 

Advantages of Empirical Research 

  • Empirical research is flexible. In this type of systematic investigation, the researcher can adjust the research methodology including the data sample size, data gathering methods plus the data analysis methods as necessitated by the research process.
  • It helps the researcher to understand how research outcomes can be influenced by different research environments.
  • Empirical research study helps the researcher to develop relevant analytical and observation skills that can be useful in dynamic research contexts.
  • This type of research approach allows the researcher to control multiple research variables in order to arrive at the most relevant research outcomes.
  • Empirical research is widely considered as one of the most authentic and competent research designs.
  • It improves the internal validity of traditional research using a variety of experiments and research observation methods.

Disadvantages of Empirical Research 

  • An empirical research study is time-consuming because the researcher needs to gather the empirical data from multiple resources which typically takes a lot of time.
  • It is not a cost-effective research approach. Usually, this method of research incurs a lot of cost because of the monetary demands of the field research.
  • It may be difficult to gather the needed empirical data sample because of the multiple data gathering methods employed in an empirical research study.
  • It may be difficult to gain access to some communities and firms during the data gathering process and this can affect the validity of the research.
  • The report from an empirical research study is intensive and can be very lengthy in nature.

Conclusion 

Empirical research is an important method of systematic investigation because it gives the researcher the opportunity to test the validity of different assumptions, in the form of hypotheses, before arriving at any findings. Hence, it is considered a more valid research approach. 

There are different quantitative and qualitative methods of data gathering employed during an empirical research study, based on the purpose of the research; these include surveys, experiments, and various observational methods. Surveys are one of the most common methods of empirical data collection, and they can be administered online or physically. 

You can use Formplus to create and administer your online empirical research survey. Formplus allows you to create survey forms that you can share with target respondents in order to obtain valuable feedback about your research context, question or subject. 

In the form builder, you can add different fields to your survey form and you can also modify these form fields to suit your research process. Sign up to Formplus to access the form builder and start creating powerful online empirical research survey forms. 


Empirical Research: What is Empirical Research?


Introduction

Empirical research is based on observed and measured phenomena and derives knowledge from actual experience rather than from theory or belief. 

How do you know if a study is empirical? Read the subheadings within the article, book, or report and look for a description of the research "methodology." Ask yourself: Could I recreate this study and test these results?

Key characteristics to look for:

  • Specific research questions to be answered
  • Definition of the population, behavior, or phenomena being studied
  • Description of the process used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys)

Another hint: some scholarly journals use a specific layout, called the "IMRaD" format (Introduction – Method – Results – and – Discussion), to communicate empirical research findings. Such articles typically have 4 components:

  • Introduction : sometimes called "literature review" -- what is currently known about the topic -- usually includes a theoretical framework and/or discussion of previous studies
  • Methodology : sometimes called "research design" -- how to recreate the study -- usually describes the population, research process, and analytical tools
  • Results : sometimes called "findings" -- what was learned through the study -- usually appears as statistical data or as substantial quotations from research participants
  • Discussion : sometimes called "conclusion" or "implications" -- why the study is important -- usually describes how the research results influence professional practices or future studies


Empirical research  is published in books and in  scholarly, peer-reviewed journals .

Make sure to select the  peer-review box  within each database!



Empirical evidence: A definition

Empirical evidence is information that is acquired by observation or experimentation.


Empirical evidence is information acquired by observation or experimentation. Scientists record and analyze this data. The process is a central part of the scientific method , leading to the proving or disproving of a hypothesis and our better understanding of the world as a result.

Empirical evidence might be obtained through experiments that seek to provide a measurable or observable reaction, trials that repeat an experiment to test its efficacy (such as a drug trial, for instance) or other forms of data gathering against which a hypothesis can be tested and reliably measured. 

"If a statement is about something that is itself observable, then the empirical testing can be direct. We just have a look to see if it is true. For example, the statement, 'The litmus paper is pink', is subject to direct empirical testing," wrote Peter Kosso in " A Summary of Scientific Method " (Springer, 2011).

"Science is most interesting and most useful to us when it is describing the unobservable things like atoms , germs , black holes , gravity , the process of evolution as it happened in the past, and so on," wrote Kosso. Scientific theories , meaning theories about nature that are unobservable, cannot be proven by direct empirical testing, but they can be tested indirectly, according to Kosso. "The nature of this indirect evidence, and the logical relation between evidence and theory, are the crux of scientific method," wrote Kosso.

The scientific method

The scientific method begins with scientists forming questions, or hypotheses , and then acquiring the knowledge through observations and experiments to either support or disprove a specific theory. "Empirical" means "based on observation or experience," according to the Merriam-Webster Dictionary . Empirical research is the process of finding empirical evidence. Empirical data is the information that comes from the research.

Before any pieces of empirical data are collected, scientists carefully design their research methods to ensure the accuracy, quality and integrity of the data. If there are flaws in the way that empirical data is collected, the research will not be considered valid.

The scientific method often involves lab experiments that are repeated over and over, and these experiments result in quantitative data in the form of numbers and statistics. However, that is not the only process used for gathering information to support or refute a theory. 

This methodology mostly applies to the natural sciences. "The role of empirical experimentation and observation is negligible in mathematics compared to natural sciences such as psychology, biology or physics," wrote Mark Chang, an adjunct professor at Boston University, in " Principles of Scientific Methods " (Chapman and Hall, 2017).

"Empirical evidence includes measurements or data collected through direct observation or experimentation," said Jaime Tanner, a professor of biology at Marlboro College in Vermont. There are two research methods used to gather empirical measurements and data: qualitative and quantitative.

Qualitative research, often used in the social sciences, examines the reasons behind human behavior, according to the National Center for Biotechnology Information (NCBI) . It involves data that can be found using the human senses . This type of research is often done in the beginning of an experiment. "When combined with quantitative measures, qualitative study can give a better understanding of health related issues," wrote Dr. Sanjay Kalra for NCBI.

Quantitative research involves methods that are used to collect numerical data and analyze it using statistical methods. "Quantitative research methods emphasize objective measurements and the statistical, mathematical, or numerical analysis of data collected through polls, questionnaires, and surveys, or by manipulating pre-existing statistical data using computational techniques," according to LeTourneau University . This type of research is often used at the end of an experiment to refine and test the previous research.
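As a small illustration of the kind of numerical summary quantitative analysis produces, the sketch below (Python standard library only) computes descriptive statistics and an approximate 95% confidence interval for a set of made-up survey ratings. The data and the use of 1.96 as the critical value are assumptions for demonstration, not results from any real survey.

```python
# Illustrative sketch: summarising hypothetical survey ratings with
# descriptive statistics and an approximate confidence interval.
import math
import statistics

ratings = [4, 5, 3, 4, 5, 4, 2, 5, 4, 3, 4, 5]    # made-up 1-5 Likert responses

n = len(ratings)
mean = statistics.mean(ratings)
sd = statistics.stdev(ratings)                     # sample standard deviation
sem = sd / math.sqrt(n)                            # standard error of the mean

# Approximate 95% confidence interval; 1.96 assumes a normal sampling
# distribution, and for small samples a t-critical value would be better.
ci_low, ci_high = mean - 1.96 * sem, mean + 1.96 * sem

print(f"n = {n}, mean = {mean:.2f}, sd = {sd:.2f}")
print(f"approx. 95% CI for the mean: ({ci_low:.2f}, {ci_high:.2f})")
```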


Identifying empirical evidence

Identifying empirical evidence in another researcher's experiments can sometimes be difficult. According to the Pennsylvania State University Libraries , there are some things one can look for when determining if evidence is empirical:

  • Can the experiment be recreated and tested?
  • Does the experiment have a statement about the methodology, tools and controls used?
  • Is there a definition of the group or phenomena being studied?

The objective of science is that all empirical data that has been gathered through observation, experience and experimentation is without bias. The strength of any scientific research depends on the ability to gather and analyze empirical data in the most unbiased and controlled fashion possible. 

However, in the 1960s, scientific historian and philosopher Thomas Kuhn promoted the idea that scientists can be influenced by prior beliefs and experiences, according to the Center for the Study of Language and Information . 


"Missing observations or incomplete data can also cause bias in data analysis, especially when the missing mechanism is not random," wrote Chang.

Because scientists are human and prone to error, empirical data is often gathered by multiple scientists who independently replicate experiments. This also guards against scientists who unconsciously, or in rare cases consciously, veer from the prescribed research parameters, which could skew the results.

The recording of empirical data is also crucial to the scientific method, as science can only be advanced if data is shared and analyzed. Peer review of empirical data is essential to protect against bad science, according to the University of California .

Empirical law vs. scientific law

Empirical laws and scientific laws are often the same thing. "Laws are descriptions — often mathematical descriptions — of natural phenomenon," Peter Coppinger, associate professor of biology and biomedical engineering at the Rose-Hulman Institute of Technology, told Live Science. 

Empirical laws are scientific laws that can be proven or disproved using observations or experiments, according to the Merriam-Webster Dictionary . So, as long as a scientific law can be tested using experiments or observations, it is considered an empirical law.

Empirical, anecdotal and logical evidence

Empirical, anecdotal and logical evidence should not be confused. They are separate types of evidence that can be used to try to prove or disprove an idea or claim.

Logical evidence is used to prove or disprove an idea using logic. Deductive reasoning may be used to come to a conclusion to provide logical evidence. For example, "All men are mortal. Harold is a man. Therefore, Harold is mortal."

Anecdotal evidence consists of stories that have been experienced by a person that are told to prove or disprove a point. For example, many people have told stories about their alien abductions to prove that aliens exist. Often, a person's anecdotal evidence cannot be proven or disproven. 

Additional resources and reading

There are some things in nature that science is still working to build evidence for, such as the hunt to explain consciousness .

Meanwhile, in other scientific fields, efforts are still being made to improve research methods, such as the plan by some psychologists to fix the science of psychology .

" A Summary of Scientific Method " by Peter Kosso (Springer, 2011)

"Empirical" Merriam-Webster Dictionary

" Principles of Scientific Methods " by Mark Chang (Chapman and Hall, 2017)

"Qualitative research" by Dr. Sanjay Kalra National Center for Biotechnology Information (NCBI)

"Quantitative Research and Analysis: Quantitative Methods Overview" LeTourneau University

"Empirical Research in the Social Sciences and Education" Pennsylvania State University Libraries

"Thomas Kuhn" Center for the Study of Language and Information

"Misconceptions about science" University of California



How to... Conduct empirical research


Empirical research is research that is based on observation and measurement of phenomena, as directly experienced by the researcher. The data thus gathered may be compared against a theory or hypothesis, but the results are still based on real life experience. The data gathered is all primary data, although secondary data from a literature review may form the theoretical background.

On this page

  • What is empirical research?
  • The research question
  • The theoretical framework
  • Sampling techniques
  • Design of the research

  • Methods of empirical research
  • Techniques of data collection & analysis
  • Reporting the findings of empirical research
  • Further information

What is empirical research?

Typically, empirical research embodies the following elements:

  • A  research question , which will determine research objectives.
  • A particular and planned  design  for the research, which will depend on the question and which will find ways of answering it with appropriate use of resources.
  • The gathering of  primary data , which is then analysed.
  • A particular  methodology  for collecting and analysing the data, such as an experiment or survey.
  • The limitation of the data to a particular group, area or time scale, known as a sample: for example, a specific number of employees of a particular company type, or all users of a library over a given time scale. The sample should be somehow representative of a wider population.
  • The ability to  recreate  the study and test the results. This is known as  reliability .
  • The ability to  generalise  from the findings to a larger sample and to other situations.

The research question

The starting point for your research should be your research question. This should be a formulation of the issue which is at the heart of the area which you are researching, which has the right degree of breadth and depth to make the research feasible within your resources. The following points are useful to remember when coming up with your research question, or RQ:

  • Ideas for your RQ can come from a number of sources, for example:
    • your doctoral thesis;
    • reading the relevant literature in journals, especially literature reviews, which are good at giving an overview and at spotting interesting conceptual developments;
    • looking at research priorities of funding bodies, professional institutes etc.;
    • going to conferences;
    • looking out for calls for papers;
    • developing a dialogue with other researchers in your area.
  • To narrow down your research topic, brainstorm ideas around it, possibly with your colleagues if you have decided to collaborate, noting all the questions down.
  • Come up with a "general focus" question; then develop some other more specific ones.
  • Check your questions to make sure that:
    • they are not too broad;
    • they are not so narrow as to yield uninteresting results;
    • the research they entail will be covered by your resources, i.e. you will have sufficient time and money;
    • there is sufficient background literature on the topic;
    • you can carry out appropriate field research;
    • you have stated your question in the simplest possible way.

Let's look at some examples:

Bisking et al. examine whether or not gender has an influence on disciplinary action in their article  Does the sex of the leader and subordinate influence a leader's disciplinary decisions?  ( Management Decision , Volume 41 Number 10) and come up with the following series of inter-related questions:

  • Given the same infraction, would a male leader impose the same disciplinary action on male and female subordinates?
  • Given the same infraction, would a female leader impose the same disciplinary action on male and female subordinates?
  • Given the same infraction, would a female leader impose the same disciplinary action on female subordinates as a male leader would on male subordinates?
  • Given the same infraction, would a female leader impose the same disciplinary action on male subordinates as a male leader would on female subordinates?
  • Given the same infraction, would a male and female leader impose the same disciplinary action on male subordinates?
  • Given the same infraction, would a male and female leader impose the same disciplinary action on female subordinates?
  • Do female and male leaders impose the same discipline on subordinates regardless of the type of infraction?
  • Is it possible to predict how female and male leaders will impose disciplinary actions based on their respective BSRI femininity and masculinity scores?

Motion et al. examined co-branding in  Equity in Corporate Co-branding  ( European Journal of Marketing , Volume 37 Number 7/8) and came up with the following RQs:

RQ1:  What objectives underpinned the corporate brand?

RQ2:  How were brand values deployed to establish the corporate co-brand within particular discourse contexts?

RQ3:  How was the desired rearticulation promoted to shareholders?

RQ4:  What are the sources of corporate co-brand equity?

Note, the above two examples state the RQs very explicitly; sometimes the RQ is implicit:

Qun G. Jiao and Anthony J. Onwuegbuzie are library researchers who examined the question "What is the relationship between library anxiety and social interdependence?" in a number of articles; see  Dimensions of library anxiety and social interdependence: implications for library services  ( Library Review , Volume 51 Number 2).

Or sometimes the RQ is stated as a general objective:

Ying Fan describes outsourcing in British companies in  Strategic outsourcing: evidence from British companies  ( Marketing Intelligence & Planning , Volume 18 Number 4) and states his research question as an objective:

The main objective of the research was to explore the two key areas in the outsourcing process, namely:

  • pre-outsourcing decision process; and
  • post-outsourcing supplier management.

or as a proposition:

Karin Klenke explores issues of gender in management decisions in  Gender influences in decision-making processes in top management teams   ( Management Decision , Volume 41 Number 10).

Given the exploratory nature of this research, no specific hypotheses were formulated. Instead, the following general propositions are postulated:

P1.  Female and male members of TMTs exercise different types of power in the strategic decision making process.

P2.  Female and male members of TMTs differ in the extent in which they employ political savvy in the strategic decision making process.

P3.  Male and female members of TMTs manage conflict in strategic decision making situations differently.

P4.  Female and male members of TMTs utilise different types of trust in the decision making process.

Sometimes, the theoretical underpinning (see next section) of the research leads you to formulate a hypothesis rather than a question:

Martin et al. explored the effect of fast-forwarding of ads (called zipping) in  Remote control marketing: how ad fast-forwarding and ad repetition affect consumers  ( Marketing Intelligence & Planning , Volume 20 Number 1), and their research explores the following hypotheses:

The influence of zipping

H1. Individuals viewing advertisements played at normal speed will exhibit higher ad recall and recognition than those who view zipped advertisements.

Ad repetition effects

H2. Individuals viewing a repeated advertisement will exhibit higher ad recall and recognition than those who see an advertisement once.

Zipping and ad repetition

H3. Individuals viewing zipped, repeated advertisements will exhibit higher ad recall and recognition than those who see a normal speed advertisement that is played once.

The theoretical framework

Empirical research is not divorced from theoretical considerations; and a consideration of theory should form one of the starting points of your research. This applies particularly in the case of management research which by its very nature is practical and applied to the real world. The link between research and theory is symbiotic: theory should inform research, and the findings of research should inform theory.

There are a number of different theoretical perspectives; if you are unfamiliar with them, we suggest that you look at any good research methods textbook for a full account (see Further information), but this page contains notes on the following:

  • Positivism
  • Empiricism
  • Interpretivism
  • Realism

Positivism

This is the approach of the natural sciences, emphasising total objectivity and independence on the part of the researcher, a highly scientific methodology, with data being collected in a value-free manner and using quantitative techniques with some statistical measures of analysis. It assumes that there are 'independent facts' in the social world as in the natural world. The object is to generalise from what has been observed and hence add to the body of theory.

Empiricism

Very similar to positivism in that it has a strong reliance on objectivity and quantitative methods of data collection, but with less of a reliance on theory. There is an emphasis on data and facts in their own right; they do not need to be linked to theory.

Interpretivism

This view criticises positivism as being inappropriate for the social world of business and management which is dominated by people rather than the laws of nature and hence has an inevitable subjective element as people will have different interpretations of situations and events. The business world can only be understood through people's interpretation. This view is more likely to emphasise qualitative methods such as participant observation, focus groups and semi-structured interviewing.

 
Quantitative methods:

  • involve the researcher as, ideally, an objective observer;
  • may focus on cause and effect;
  • require a hypothesis;
  • have the disadvantage that they may force people into categories, and cannot go into much depth about subjects and issues.

Qualitative methods:

  • require more involvement and interpretation on the part of the researcher;
  • focus on understanding phenomena in their social, institutional, political and economic context;
  • require a research question rather than a hypothesis;
  • have the disadvantage that they focus on a few individuals, and may therefore be difficult to generalise from.

Realism

While reality exists independently of human experience, people are not like objects in the natural world but are subject to social influences and processes. Like  empiricism  and  positivism , this emphasises the importance of explanation, but is also concerned with the social world and with its underlying structures.

Inductive and deductive approaches

At what point in your research you bring in a theoretical perspective will depend on whether you choose an:

  • Inductive approach  – collect the data, then develop the theory.
  • Deductive approach  – assume a theoretical position then test it against the data.
The inductive approach:

  • is more usually linked with an interpretivist approach;
  • is more likely to use qualitative methods, such as interviewing and observation, with a more flexible structure;
  • does not simply look at cause and effect, but at people's perceptions of events and at the context of the research;
  • builds theory after collection of the data;
  • is more likely to use an in-depth study of a smaller sample;
  • is less likely to be concerned with generalisation (a danger is that no patterns emerge);
  • stresses researcher involvement.

The deductive approach:

  • is more usually linked with the positivist approach;
  • is more likely to use quantitative methods, such as experiments and questionnaires, and a highly structured methodology with controls;
  • is the more scientific method, concerned with cause and effect and the relationship between variables;
  • starts from a theoretical perspective and develops a hypothesis which is tested against the data;
  • is more likely to use a larger sample;
  • is concerned with generalisation;
  • stresses the independence of the researcher.

It should be emphasised that none of the above approaches are mutually exclusive and can be used in combination.

Sampling techniques

Sampling may be done in one of several ways (a brief illustrative sketch follows the list):

  • On a random basis – a given number is selected completely at random.
  • On a systematic basis – every nth element of the population is selected.
  • On a stratified random basis – the population is divided into segments; for example, in a university you could divide the population into academic, administrative, and academic-related staff. A random sample from each group is then selected.
  • On a cluster basis – a particular subgroup is chosen at random.
  • Convenience – being present at a particular time, e.g. at lunch in the canteen.
  • Purposive – people are selected deliberately because their views are relevant to the issue concerned.
  • Quota – the assumption is made that there are subgroups in the population, and a quota of respondents is chosen to reflect this diversity.
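The sketch below (Python) illustrates how three of these techniques, random, systematic and stratified random sampling, might be implemented against a hypothetical sampling frame. The population, strata and sample size are invented for demonstration and loosely mirror the university example above.

```python
# Illustrative sketch of three sampling techniques applied to a
# hypothetical population of 1,000 staff IDs (all values are made up).
import random

population = list(range(1000))          # hypothetical sampling frame
sample_size = 50

# Random sampling: every member has an equal chance of selection.
random_sample = random.sample(population, sample_size)

# Systematic sampling: every nth element after a random starting point.
step = len(population) // sample_size
start = random.randrange(step)
systematic_sample = population[start::step]

# Stratified random sampling: divide the population into segments (strata)
# and draw a proportional random sample from each.
strata = {
    "academic": population[:500],
    "administrative": population[500:800],
    "academic_related": population[800:],
}
stratified_sample = []
for name, members in strata.items():
    share = round(sample_size * len(members) / len(population))
    stratified_sample.extend(random.sample(members, share))

print(len(random_sample), len(systematic_sample), len(stratified_sample))
```

Convenience, purposive and quota sampling are not probability-based, so they depend on the researcher's judgement rather than on a selection rule of this kind.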

Useful articles

Richard Laughlin in  Empirical research in accounting: alternative approaches and a case for "middle-range" thinking  provides an interesting general overview of the different perspectives on theory and methodology as applied to accounting. ( Accounting, Auditing & Accountability Journal,  Volume 8 Number 1).

D. Tranfield and K. Starkey in  The Nature, Social Organization and Promotion of Management Research: Towards Policy  look at the relationship between theory and practice in management research, and develop a number of analytical frameworks, including looking at Becher's conceptual schema for disciplines and Gibbons et al.'s taxonomy of knowledge production systems. ( British Journal of Management , vol. 9, no. 4 – abstract only).

Design of the research

Research design is about how you go about answering your question: what strategy you adopt, and what methods you use to achieve your results. In particular you should ask yourself... 



Empirical Research: Defining, Identifying, & Finding

Searching for empirical research.


Where Do I Find Empirical Research?


Because empirical research refers to the method of investigation rather than a method of publication, it can be published in a number of places. In many disciplines empirical research is most commonly published in scholarly, peer-reviewed journals . Putting empirical research through the peer review process helps ensure that the research is high quality. 

Finding Peer-Reviewed Articles

You can find peer-reviewed articles in a general web search along with a lot of other types of sources. However, these specialized tools are more likely to find peer-reviewed articles:

  • Library databases
  • Academic search engines such as Google Scholar

Common Types of Articles That Are Not Empirical

However, just finding an article in a peer-reviewed journal is not enough to say it is empirical, since not all the articles in a peer-reviewed journal will be empirical research or even peer reviewed. Knowing how to quickly identify some types of non-empirical research articles in peer-reviewed journals can help speed up your search. 

  • Theoretical or conceptual articles: peer-reviewed articles that systematically discuss and propose abstract concepts and methods for a field without primary data collection.
    • Example: Grosser, K. & Moon, J. (2019). CSR and feminist organization studies: Towards an integrated theorization for the analysis of gender issues .
  • Literature review articles: peer-reviewed articles that systematically describe, summarize, and often categorize and evaluate previous research on a topic without collecting new data.
    • Example: Heuer, S. & Willer, R. (2020). How is quality of life assessed in people with dementia? A systematic literature review and a primer for speech-language pathologists .
    • Note: empirical research articles will have a literature review section as part of the Introduction , but in an empirical research article the literature review exists to give context to the empirical research, which is the primary focus of the article. In a literature review article, the literature review is the focus. 
    • While these articles are not empirical, they are often a great source of information on previous empirical research on a topic, with citations to find that research.
  • Opinion pieces: non-peer-reviewed articles where the authors discuss their thoughts on a particular topic without data collection and a systematic method. There are a few differences between these types of articles:
    • Editorials: written by the editors or guest editors of the journal. 
      • Example:  Naples, N. A., Mauldin, L., & Dillaway, H. (2018). From the guest editors: Gender, disability, and intersectionality .
    • Commentaries: written by guest authors. The journal may have a non-peer-reviewed process for authors to submit these articles, and the editors of the journal may invite authors to write opinion articles.
      • Example: García, J. J.-L., & Sharif, M. Z. (2015). Black lives matter: A commentary on racism and public health . 
    • Letters to the editor: written by the readers of a journal, often in response to an article previously published in the journal.
      • Example: Nathan, M. (2013). Letters: Perceived discrimination and racial/ethnic disparities in youth problem behaviors . 
  • Reviews: non-peer-reviewed articles that describe and evaluate books, products, services, and other things the audience of the journal would be interested in. 
    • Example: Robinson, R. & Green, J. M. (2020). Book review: Microaggressions and traumatic stress: Theory, research, and clinical treatment .

How do I find more empirical research in my search?

Even once you know how to recognize empirical research and where it is published, it would be nice to improve your search results so that more empirical research shows up for your topic.

There are two major ways to find the empirical research in a database search:

  • Use built-in database tools to limit results to empirical research.
  • Include search terms that help identify empirical research.

Methodical Basics of Empirical Research


Hans E. Fischer (Universität Duisburg-Essen), William Boone (Miami University) & Heiko Krabbe (Ruhr-Universität Bochum)

Part of the book series: Challenges in Physics Education (CPE)


To be up to date, teachers should be able to follow current research on teaching and learning in physics. Therefore, they have to be able to assess if the results presented in publications are meaningful and trustworthy. In this chapter, we detail requirements research studies should comply with in order that measuring results, conclusions and generalisations can be trusted. In the case of the calculation of means for example, the requirements of such calculations are familiar to physics teachers. However, when considering more complicated statistics, for example the meaning of correlations or Rasch analysis, physics teachers may be less familiar with such statistical topics. To identify the relations between numerous variables, for example, the effect of migration background, social status and cognitive abilities on school success, more complex mathematical models are necessary. The starting point for all empirical investigations must be a valid theoretical model. Such a model should include both the variables considered important and the design of a study.





About this chapter

Fischer, H.E., Boone, W., Krabbe, H. (2021). Methodical Basics of Empirical Research. In: Fischer, H.E., Girwidz, R. (eds) Physics Education. Challenges in Physics Education. Springer, Cham. https://doi.org/10.1007/978-3-030-87391-2_16



What is Qualitative in Qualitative Research

Patrik Aspers and Ugo Corte

1 Department of Sociology, Uppsala University, Uppsala, Sweden

2 Seminar for Sociology, Universität St. Gallen, St. Gallen, Switzerland

3 Department of Media and Social Sciences, University of Stavanger, Stavanger, Norway

What is qualitative research? If we look for a precise definition of qualitative research, and specifically for one that addresses its distinctive feature of being “qualitative,” the literature is meager. In this article we systematically search, identify and analyze a sample of 89 sources using or attempting to define the term “qualitative.” Then, drawing on ideas we find scattered across existing work, and based on Becker’s classic study of marijuana consumption, we formulate and illustrate a definition that tries to capture its core elements. We define qualitative research as an iterative process in which improved understanding to the scientific community is achieved by making new significant distinctions resulting from getting closer to the phenomenon studied. This formulation is developed as a tool to help improve research designs while stressing that a qualitative dimension is present in quantitative work as well. Additionally, it can facilitate teaching, communication between researchers, diminish the gap between qualitative and quantitative researchers, help to address critiques of qualitative methods, and be used as a standard of evaluation of qualitative research.

If we assume that there is something called qualitative research, what exactly is this qualitative feature? And how could we evaluate qualitative research as good or not? Is it fundamentally different from quantitative research? In practice, most active qualitative researchers working with empirical material intuitively know what is involved in doing qualitative research, yet perhaps surprisingly, a clear definition addressing its key feature is still missing.

To address the question of what is qualitative we turn to the accounts of “qualitative research” in textbooks and also in empirical work. In his classic, explorative, interview study of deviance Howard Becker ( 1963 ) asks ‘How does one become a marijuana user?’ In contrast to pre-dispositional and psychological-individualistic theories of deviant behavior, Becker’s inherently social explanation contends that becoming a user of this substance is the result of a three-phase sequential learning process. First, potential users need to learn how to smoke it properly to produce the “correct” effects. If not, they are likely to stop experimenting with it. Second, they need to discover the effects associated with it; in other words, to get “high,” individuals not only have to experience what the drug does, but also to become aware that those sensations are related to using it. Third, they require learning to savor the feelings related to its consumption – to develop an acquired taste. Becker, who played music himself, gets close to the phenomenon by observing, taking part, and by talking to people consuming the drug: “half of the fifty interviews were conducted with musicians, the other half covered a wide range of people, including laborers, machinists, and people in the professions” (Becker 1963 :56).

Another central aspect derived through the common-to-all-research interplay between induction and deduction (Becker 2017 ), is that during the course of his research Becker adds scientifically meaningful new distinctions in the form of three phases—distinctions, or findings if you will, that strongly affect the course of his research: its focus, the material that he collects, and which eventually impact his findings. Each phase typically unfolds through social interaction, and often with input from experienced users in “a sequence of social experiences during which the person acquires a conception of the meaning of the behavior, and perceptions and judgments of objects and situations, all of which make the activity possible and desirable” (Becker 1963 :235). In this study the increased understanding of smoking dope is a result of a combination of the meaning of the actors, and the conceptual distinctions that Becker introduces based on the views expressed by his respondents. Understanding is the result of research and is due to an iterative process in which data, concepts and evidence are connected with one another (Becker 2017 ).

Indeed, there are many definitions of qualitative research, but if we look for a definition that addresses its distinctive feature of being “qualitative,” the literature across the broad field of social science is meager. The main reason behind this article lies in the paradox, which, to put it bluntly, is that researchers act as if they know what it is, but they cannot formulate a coherent definition. Sociologists and others will of course continue to conduct good studies that show the relevance and value of qualitative research addressing scientific and practical problems in society. However, our paper is grounded in the idea that providing a clear definition will help us improve the work that we do. Among researchers who practice qualitative research there is clearly much knowledge. We suggest that a definition makes this knowledge more explicit. If the first rationale for writing this paper refers to the “internal” aim of improving qualitative research, the second refers to the increased “external” pressure that especially many qualitative researchers feel; pressure that comes both from society as well as from other scientific approaches. There is a strong core in qualitative research, and leading researchers tend to agree on what it is and how it is done. Our critique is not directed at the practice of qualitative research, but we do claim that the type of systematic work we do has not yet been done, and that it is useful to improve the field and its status in relation to quantitative research.

The literature on the “internal” aim of improving, or at least clarifying qualitative research is large, and we do not claim to be the first to notice the vagueness of the term “qualitative” (Strauss and Corbin 1998 ). Also, others have noted that there is no single definition of it (Long and Godfrey 2004 :182), that there are many different views on qualitative research (Denzin and Lincoln 2003 :11; Jovanović 2011 :3), and that more generally, we need to define its meaning (Best 2004 :54). Strauss and Corbin ( 1998 ), for example, as well as Nelson et al. (1992:2 cited in Denzin and Lincoln 2003 :11), and Flick ( 2007 :ix–x), have recognized that the term is problematic: “Actually, the term ‘qualitative research’ is confusing because it can mean different things to different people” (Strauss and Corbin 1998 :10–11). Hammersley has discussed the possibility of addressing the problem, but states that “the task of providing an account of the distinctive features of qualitative research is far from straightforward” ( 2013 :2). This confusion, as he has recently further argued (Hammersley 2018 ), is also salient in relation to ethnography where different philosophical and methodological approaches lead to a lack of agreement about what it means.

Others (e.g. Hammersley 2018; Fine and Hancock 2017) have also identified the threat to qualitative research that comes from external forces, seen from the point of view of "qualitative research." This threat can be further divided into that which comes from inside academia, such as the critique voiced by "quantitative research," and that which comes from outside academia, including, for example, New Public Management. Hammersley (2018), zooming in on one type of qualitative research, ethnography, has argued that it is under threat. Similarly to Fine (2003), and before him Gans (1999), he writes that ethnography has acquired a range of meanings and comes in many different versions, these often reflecting sharply divergent epistemological orientations. And already more than twenty years ago, while reviewing Denzin and Lincoln's Handbook of Qualitative Methods, Fine argued:

While this increasing centrality [of qualitative research] might lead one to believe that consensual standards have developed, this belief would be misleading. As the methodology becomes more widely accepted, querulous challengers have raised fundamental questions that collectively have undercut the traditional models of how qualitative research is to be fashioned and presented (1995:417).

According to Hammersley, there are today "serious threats to the practice of ethnographic work, on almost any definition" (2018:1). He lists five external threats: (1) that social research must be accountable and able to show its impact on society; (2) the current emphasis on "big data" and the emphasis on quantitative data and evidence; (3) the labor market pressure in academia that leaves less time for fieldwork (see also Fine and Hancock 2017); (4) problems of access to fields; and (5) the increased ethical scrutiny of projects, to which ethnography is particularly exposed. Hammersley discusses some more or less insufficient existing definitions of ethnography.

The current situation, as Hammersley and others note—and in relation not only to ethnography but also qualitative research in general, and as our empirical study shows—is not just unsatisfactory, it may even be harmful for the entire field of qualitative research, and does not help social science at large. We suggest that the lack of clarity of qualitative research is a real problem that must be addressed.

Towards a Definition of Qualitative Research

Seen in an historical light, what is today called qualitative, or sometimes ethnographic, interpretative research – or a number of other terms – has more or less always existed. At the time the founders of sociology – Simmel, Weber, Durkheim and, before them, Marx – were writing, and during the era of the Methodenstreit (“dispute about methods”) in which the German historical school emphasized scientific methods (cf. Swedberg 1990 ), we can at least speak of qualitative forerunners.

Perhaps the most extended discussion of what later became known as qualitative methods in a classic work is Bronisław Malinowski's (1922) Argonauts of the Western Pacific, although even this study does not explicitly address the meaning of "qualitative." In Weber's ([1921–22] 1978) work we find a tension between scientific explanations that are based on observation and quantification and interpretative research (see also Lazarsfeld and Barton 1982).

If we look through major sociology journals like the American Sociological Review, American Journal of Sociology, or Social Forces we will not find the term qualitative sociology before the 1970s. And certainly before then much of what we consider qualitative classics in sociology, like Becker's study (1963), had already been produced. Indeed, the Chicago School often combined qualitative and quantitative data within the same study (Fine 1995). Our point is that, before a disciplinary self-awareness emerged, the term quantitative preceded qualitative, and the articulation of the former was a political move to claim scientific status (Denzin and Lincoln 2005). In the US, World War II seems to have sparked a critique of sociological work, including "qualitative work," that did not follow the scientific canon (Rawls 2018), which was underpinned by a scientifically oriented and value-free philosophy of science. As a result the attempts and practice of integrating qualitative and quantitative sociology at Chicago lost ground to sociology that was more oriented to surveys and quantitative work at Columbia under Merton-Lazarsfeld. The quantitative tradition was also able to present textbooks (Lundberg 1951) that facilitated the use of this approach and its "methods." The practices of the qualitative tradition, by and large, remained tacit or were part of the mentoring transferred from the renowned masters to their students.

This glimpse into history leads us back to the lack of a coherent account condensed in a definition of qualitative research. Many of the attempts to define the term do not meet the requirements of a proper definition: A definition should be clear, avoid tautology, demarcate its domain in relation to the environment, and ideally only use words in its definiens that themselves are not in need of definition (Hempel 1966 ). A definition can enhance precision and thus clarity by identifying the core of the phenomenon. Preferably, a definition should be short. The typical definition we have found, however, is an ostensive definition, which indicates what qualitative research is about without informing us about what it actually is :

Qualitative research is multimethod in focus, involving an interpretative, naturalistic approach to its subject matter. This means that qualitative researchers study things in their natural settings, attempting to make sense of, or interpret, phenomena in terms of the meanings people bring to them. Qualitative research involves the studied use and collection of a variety of empirical materials – case study, personal experience, introspective, life story, interview, observational, historical, interactional, and visual texts – that describe routine and problematic moments and meanings in individuals’ lives. (Denzin and Lincoln 2005 :2)

Flick claims that the label “qualitative research” is indeed used as an umbrella for a number of approaches ( 2007 :2–4; 2002 :6), and it is not difficult to identify research fitting this designation. Moreover, whatever it is, it has grown dramatically over the past five decades. In addition, courses have been developed, methods have flourished, arguments about its future have been advanced (for example, Denzin and Lincoln 1994) and criticized (for example, Snow and Morrill 1995 ), and dedicated journals and books have mushroomed. Most social scientists have a clear idea of research and how it differs from journalism, politics and other activities. But the question of what is qualitative in qualitative research is either eluded or eschewed.

We maintain that this lacuna hinders systematic knowledge production based on qualitative research. Paul Lazarsfeld noted the lack of "codification" as early as 1955 when he reviewed 100 qualitative studies in order to offer a codification of the practices (Lazarsfeld and Barton 1982:239). Since then many texts on "qualitative research" and its methods have been published, including recent attempts (Goertz and Mahoney 2012) similar to Lazarsfeld's. These studies have tried to extract what is qualitative by looking at the large number of empirical "qualitative" studies. Our novel strategy complements these endeavors by taking another approach, looking at the attempts to codify these practices in the form of a definition and, to a minor extent, taking Becker's study as an exemplar of what qualitative researchers actually do and of what the characteristic of being "qualitative" denotes and implies. We claim that qualitative researchers, if there is such a thing as "qualitative research," should be able to codify their practices in a condensed, yet general way expressed in language.

Lingering problems of "generalizability" and "how many cases do I need" (Small 2009) are blocking advancement – in this line of work qualitative approaches are said to differ considerably from quantitative ones, while some of the former unsuccessfully mimic principles related to the latter (Small 2009). Additionally, quantitative researchers sometimes unfairly criticize the former based on their own quality criteria. Scholars like Goertz and Mahoney (2012) have successfully focused on the different norms and practices beyond what they argue are essentially two different cultures: those working with either qualitative or quantitative methods. Instead, similarly to Becker (2017), who has recently questioned the usefulness of the distinction between qualitative and quantitative research, we focus on similarities.

The current situation also impedes both students and researchers in focusing their studies and understanding each other's work (Lazarsfeld and Barton 1982:239). A third consequence is providing an opening for critiques by scholars operating within different traditions (Valsiner 2000:101). A fourth issue is that the "implicit use of methods in qualitative research makes the field far less standardized than the quantitative paradigm" (Goertz and Mahoney 2012:9). Relatedly, the National Science Foundation in the US organized two workshops in 2004 and 2005 to address the scientific foundations of qualitative research, involving strategies to improve it and to develop standards of evaluation in qualitative research. However, a specific focus on its distinguishing feature of being "qualitative," while implicitly acknowledged, was discussed only briefly (for example, Best 2004).

In 2014 a theme issue was published in this journal on "Methods, Materials, and Meanings: Designing Cultural Analysis," discussing central issues in (cultural) qualitative research (Berezin 2014; Biernacki 2014; Glaeser 2014; Lamont and Swidler 2014; Spillman 2014). We agree with many of the arguments put forward, such as the risk of methodological tribalism, and that we should not waste energy on debating methods separated from research questions. Nonetheless, a clarification of the relation to what is called "quantitative research" is of utmost importance to avoid misunderstandings and misguided debates between "qualitative" and "quantitative" researchers. Our strategy means that researchers, whether "qualitative" or "quantitative," may in their actual practice combine qualitative and quantitative work.

In this article we accomplish three tasks. First, we systematically survey the literature for meanings of qualitative research by looking at how researchers have defined it. Drawing upon existing knowledge we find that the different meanings and ideas of qualitative research are not yet coherently integrated into one satisfactory definition. Next, we advance our contribution by offering a definition of qualitative research and illustrate its meaning and use partially by expanding on the brief example introduced earlier related to Becker’s work ( 1963 ). We offer a systematic analysis of central themes of what researchers consider to be the core of “qualitative,” regardless of style of work. These themes – which we summarize in terms of four keywords: distinction, process, closeness, improved understanding – constitute part of our literature review, in which each one appears, sometimes with others, but never all in the same definition. They serve as the foundation of our contribution. Our categories are overlapping. Their use is primarily to organize the large amount of definitions we have identified and analyzed, and not necessarily to draw a clear distinction between them. Finally, we continue the elaboration discussed above on the advantages of a clear definition of qualitative research.

In a hermeneutic fashion we propose that there is something meaningful that deserves to be labelled “qualitative research” (Gadamer 1990 ). To approach the question “What is qualitative in qualitative research?” we have surveyed the literature. In conducting our survey we first traced the word’s etymology in dictionaries, encyclopedias, handbooks of the social sciences and of methods and textbooks, mainly in English, which is common to methodology courses. It should be noted that we have zoomed in on sociology and its literature. This discipline has been the site of the largest debate and development of methods that can be called “qualitative,” which suggests that this field should be examined in great detail.

In an ideal situation we should expect that one good definition, or at least some common ideas, would have emerged over the years. This common core of qualitative research should be so accepted that it would appear in at least some textbooks. Since this is not what we found, we decided to pursue an inductive approach to capture maximal variation in the field of qualitative research; we searched in a selection of handbooks, textbooks, book chapters, and books, to which we added the analysis of journal articles. Our sample comprises a total of 89 references.

In practice we focused on the discipline that has had a clear discussion of methods, namely sociology. We also conducted a broad search in the JSTOR database to identify scholarly sociology articles published between 1998 and 2017 in English with a focus on defining or explaining qualitative research. We specifically zoom in on this time frame because we would expect this more mature period to have produced clear discussions on the meaning of qualitative research. To find these articles we combined a number of keywords to search the content and/or the title: qualitative (which was always included), definition, empirical, research, methodology, studies, fieldwork, interview and observation.
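To make the keyword strategy more concrete, the short Python sketch below generates candidate search strings that always pair the anchor term "qualitative" with one secondary keyword for both title and full-text searches. It is only an illustration: the field prefixes and boolean syntax are assumptions, not the exact JSTOR queries used in the study.

```python
from itertools import combinations

# Keywords named in the search strategy above; "qualitative" was always included.
ANCHOR = "qualitative"
SECONDARY = ["definition", "empirical", "research", "methodology",
             "studies", "fieldwork", "interview", "observation"]

def build_queries(max_extra=1):
    """Generate boolean search strings combining the anchor term with
    one or more secondary keywords, for both title and full-text fields.
    The field labels 'ti' and 'tx' are placeholders, not real JSTOR syntax."""
    queries = []
    for k in range(1, max_extra + 1):
        for combo in combinations(SECONDARY, k):
            terms = " AND ".join([ANCHOR, *combo])
            queries.append(f"ti:({terms}) OR tx:({terms})")
    return queries

if __name__ == "__main__":
    for query in build_queries():
        print(query)
```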

As a second phase of our research we searched within nine major sociological journals ( American Journal of Sociology , Sociological Theory , American Sociological Review , Contemporary Sociology , Sociological Forum , Sociological Theory , Qualitative Research , Qualitative Sociology and Qualitative Sociology Review ) for articles also published during the past 19 years (1998–2017) that had the term “qualitative” in the title and attempted to define qualitative research.

Lastly we picked two additional journals, Qualitative Research and Qualitative Sociology, in which we could expect to find texts addressing the notion of "qualitative." From Qualitative Research we chose Volume 14, Issue 6, December 2014, and from Qualitative Sociology we chose Volume 36, Issue 2, June 2017. Within each of these we selected the first article; then we picked the second article from the issue three issues prior. Again we went back another three issues and investigated article number three. Finally we went back another three issues and perused article number four. This selection procedure was used to get a manageable sample for the analysis.
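The stepping rule for picking articles can be written out as a small deterministic procedure. The Python sketch below only illustrates that rule; the running issue index is hypothetical, since actual issues are identified by volume and number.

```python
def select_articles(start_issue_index, step_back=3, n_picks=4):
    """Apply the selection rule described above: article 1 from the starting
    issue, article 2 from the issue three issues earlier, article 3 from
    another three issues back, and article 4 after a further three issues."""
    picks = []
    issue = start_issue_index
    for article_position in range(1, n_picks + 1):
        picks.append({"issue_index": issue, "article": article_position})
        issue -= step_back  # move back three issues for the next pick
    return picks

# Hypothetical example: a journal whose issues are numbered consecutively.
for pick in select_articles(start_issue_index=40):
    print(pick)
# issue 40 -> article 1, issue 37 -> article 2, issue 34 -> article 3, issue 31 -> article 4
```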

The coding process of the 89 references we gathered in our selected review began soon after the first round of material was gathered, and we reduced the complexity created by our maximum variation sampling (Snow and Anderson 1993:22) to four different categories within which questions on the nature and properties of qualitative research were discussed. We call them: Qualitative and Quantitative Research, Qualitative Research, Fieldwork, and Grounded Theory. This – which may appear as an illogical grouping – merely reflects the "context" in which the matter of "qualitative" is discussed. While the selection process of the material – books and articles – was informed by pre-knowledge, we used an inductive strategy to code the material. When studying our material, we identified four central notions related to "qualitative" that appear in various combinations in the literature and that indicate what the core of qualitative research is. We have labeled them: "distinctions," "process," "closeness," and "improved understanding." During the research process the categories and notions were improved, refined, changed, and reordered. The coding ended when a sense of saturation in the material arose. In the presentation below all quotations and references come from our empirical material of texts on qualitative research.

Analysis – What is Qualitative Research?

In this section we describe the four categories we identified in the coding, how they differently discuss qualitative research, as well as their overall content. Some salient quotations are selected to represent the type of text sorted under each of the four categories. What we present are examples from the literature.

Qualitative and Quantitative

This analytic category comprises quotations comparing qualitative and quantitative research, a distinction that is frequently used (Brown 2010 :231); in effect this is a conceptual pair that structures the discussion and that may be associated with opposing interests. While the general goal of quantitative and qualitative research is the same – to understand the world better – their methodologies and focus in certain respects differ substantially (Becker 1966 :55). Quantity refers to that property of something that can be determined by measurement. In a dictionary of Statistics and Methodology we find that “(a) When referring to *variables, ‘qualitative’ is another term for *categorical or *nominal. (b) When speaking of kinds of research, ‘qualitative’ refers to studies of subjects that are hard to quantify, such as art history. Qualitative research tends to be a residual category for almost any kind of non-quantitative research” (Stiles 1998:183). But it should be obvious that one could employ a quantitative approach when studying, for example, art history.

The same dictionary states that quantitative is “said of variables or research that can be handled numerically, usually (too sharply) contrasted with *qualitative variables and research” (Stiles 1998:184). From a qualitative perspective “quantitative research” is about numbers and counting, and from a quantitative perspective qualitative research is everything that is not about numbers. But this does not say much about what is “qualitative.” If we turn to encyclopedias we find that in the 1932 edition of the Encyclopedia of the Social Sciences there is no mention of “qualitative.” In the Encyclopedia from 1968 we can read:

Qualitative Analysis. For methods of obtaining, analyzing, and describing data, see [the various entries:] CONTENT ANALYSIS; COUNTED DATA; EVALUATION RESEARCH, FIELD WORK; GRAPHIC PRESENTATION; HISTORIOGRAPHY, especially the article on THE RHETORIC OF HISTORY; INTERVIEWING; OBSERVATION; PERSONALITY MEASUREMENT; PROJECTIVE METHODS; PSYCHOANALYSIS, article on EXPERIMENTAL METHODS; SURVEY ANALYSIS, TABULAR PRESENTATION; TYPOLOGIES. (Vol. 13:225)

Some, like Alford, divide researchers into methodologists or, in his words, “quantitative and qualitative specialists” (Alford 1998 :12). Qualitative research uses a variety of methods, such as intensive interviews or in-depth analysis of historical materials, and it is concerned with a comprehensive account of some event or unit (King et al. 1994 :4). Like quantitative research it can be utilized to study a variety of issues, but it tends to focus on meanings and motivations that underlie cultural symbols, personal experiences, phenomena and detailed understanding of processes in the social world. In short, qualitative research centers on understanding processes, experiences, and the meanings people assign to things (Kalof et al. 2008 :79).

Others simply say that qualitative methods are inherently unscientific (Jovanović 2011:19). Hood, for instance, argues that words are intrinsically less precise than numbers, and that they are therefore more prone to subjective analysis, leading to biased results (Hood 2006:219). Qualitative methodologists have raised concerns over the limitations of quantitative templates (Brady et al. 2004:4). Scholars such as King et al. (1994), for instance, argue that non-statistical research can produce more reliable results if researchers pay attention to the rules of scientific inference commonly stated in quantitative research. Also, researchers such as Becker (1966:59; 1970:42–43) have asserted that, if conducted properly, qualitative research, and in particular ethnographic field methods, can lead to more accurate results than quantitative studies, in particular survey research and laboratory experiments.

Some researchers, such as Kalof, Dan, and Dietz ( 2008 :79) claim that the boundaries between the two approaches are becoming blurred, and Small ( 2009 ) argues that currently much qualitative research (especially in North America) tries unsuccessfully and unnecessarily to emulate quantitative standards. For others, qualitative research tends to be more humanistic and discursive (King et al. 1994 :4). Ragin ( 1994 ), and similarly also Becker, ( 1996 :53), Marchel and Owens ( 2007 :303) think that the main distinction between the two styles is overstated and does not rest on the simple dichotomy of “numbers versus words” (Ragin 1994 :xii). Some claim that quantitative data can be utilized to discover associations, but in order to unveil cause and effect a complex research design involving the use of qualitative approaches needs to be devised (Gilbert 2009 :35). Consequently, qualitative data are useful for understanding the nuances lying beyond those processes as they unfold (Gilbert 2009 :35). Others contend that qualitative research is particularly well suited both to identify causality and to uncover fine descriptive distinctions (Fine and Hallett 2014 ; Lichterman and Isaac Reed 2014 ; Katz 2015 ).

There are other ways to separate these two traditions, including normative statements about what qualitative research should be (that is, better or worse than quantitative approaches, concerned with scientific approaches to societal change or vice versa; Snow and Morrill 1995; Denzin and Lincoln 2005), or whether it should develop falsifiable statements (Best 2004).

We propose that quantitative research is largely concerned with pre-determined variables (Small 2008 ); the analysis concerns the relations between variables. These categories are primarily not questioned in the study, only their frequency or degree, or the correlations between them (cf. Franzosi 2016 ). If a researcher studies wage differences between women and men, he or she works with given categories: x number of men are compared with y number of women, with a certain wage attributed to each person. The idea is not to move beyond the given categories of wage, men and women; they are the starting point as well as the end point, and undergo no “qualitative change.” Qualitative research, in contrast, investigates relations between categories that are themselves subject to change in the research process. Returning to Becker’s study ( 1963 ), we see that he questioned pre-dispositional theories of deviant behavior working with pre-determined variables such as an individual’s combination of personal qualities or emotional problems. His take, in contrast, was to understand marijuana consumption by developing “variables” as part of the investigation. Thereby he presented new variables, or as we would say today, theoretical concepts, but which are grounded in the empirical material.
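As a minimal illustration of what working with pre-determined variables looks like in practice, the Python sketch below computes a group mean difference in wages with fixed categories. The figures are invented for the example and have nothing to do with Becker's material or any actual wage data.

```python
from statistics import mean

# Toy records: the category ("gender") and the variable ("wage") are fixed
# before the analysis starts and are never revised during it.
records = [
    {"gender": "woman", "wage": 2900},
    {"gender": "woman", "wage": 3100},
    {"gender": "man", "wage": 3200},
    {"gender": "man", "wage": 3400},
]

def mean_wage_by_group(rows):
    """Group wages by the pre-determined category and return group means."""
    groups = {}
    for row in rows:
        groups.setdefault(row["gender"], []).append(row["wage"])
    return {group: mean(wages) for group, wages in groups.items()}

by_group = mean_wage_by_group(records)
print(by_group)                                          # {'woman': 3000, 'man': 3300}
print("wage gap:", by_group["man"] - by_group["woman"])  # 300
# The analysis only relates the given categories to one another; the
# categories themselves undergo no "qualitative change".
```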

Qualitative Research

This category contains quotations that refer to descriptions of qualitative research without making comparisons with quantitative research. Researchers such as Denzin and Lincoln, who have written a series of influential handbooks on qualitative methods (1994; Denzin and Lincoln 2003 ; 2005 ), citing Nelson et al. (1992:4), argue that because qualitative research is “interdisciplinary, transdisciplinary, and sometimes counterdisciplinary” it is difficult to derive one single definition of it (Jovanović 2011 :3). According to them, in fact, “the field” is “many things at the same time,” involving contradictions, tensions over its focus, methods, and how to derive interpretations and findings ( 2003 : 11). Similarly, others, such as Flick ( 2007 :ix–x) contend that agreeing on an accepted definition has increasingly become problematic, and that qualitative research has possibly matured different identities. However, Best holds that “the proliferation of many sorts of activities under the label of qualitative sociology threatens to confuse our discussions” ( 2004 :54). Atkinson’s position is more definite: “the current state of qualitative research and research methods is confused” ( 2005 :3–4).

Qualitative research is about interpretation (Blumer 1969 ; Strauss and Corbin 1998 ; Denzin and Lincoln 2003 ), or Verstehen [understanding] (Frankfort-Nachmias and Nachmias 1996 ). It is “multi-method,” involving the collection and use of a variety of empirical materials (Denzin and Lincoln 1998; Silverman 2013 ) and approaches (Silverman 2005 ; Flick 2007 ). It focuses not only on the objective nature of behavior but also on its subjective meanings: individuals’ own accounts of their attitudes, motivations, behavior (McIntyre 2005 :127; Creswell 2009 ), events and situations (Bryman 1989) – what people say and do in specific places and institutions (Goodwin and Horowitz 2002 :35–36) in social and temporal contexts (Morrill and Fine 1997). For this reason, following Weber ([1921-22] 1978), it can be described as an interpretative science (McIntyre 2005 :127). But could quantitative research also be concerned with these questions? Also, as pointed out below, does all qualitative research focus on subjective meaning, as some scholars suggest?

Others also distinguish qualitative research by claiming that it collects data using a naturalistic approach (Denzin and Lincoln 2005 :2; Creswell 2009 ), focusing on the meaning actors ascribe to their actions. But again, does all qualitative research need to be collected in situ? And does qualitative research have to be inherently concerned with meaning? Flick ( 2007 ), referring to Denzin and Lincoln ( 2005 ), mentions conversation analysis as an example of qualitative research that is not concerned with the meanings people bring to a situation, but rather with the formal organization of talk. Still others, such as Ragin ( 1994 :85), note that qualitative research is often (especially early on in the project, we would add) less structured than other kinds of social research – a characteristic connected to its flexibility and that can lead both to potentially better, but also worse results. But is this not a feature of this type of research, rather than a defining description of its essence? Wouldn’t this comment also apply, albeit to varying degrees, to quantitative research?

In addition, Strauss (2003), along with others such as Alvesson and Kärreman (2011:10–76), argues that qualitative researchers struggle to capture and represent complex phenomena partially because they tend to collect a large amount of data. While his analysis is correct at some points – "It is necessary to do detailed, intensive, microscopic examination of the data in order to bring out the amazing complexity of what lies in, behind, and beyond those data" (Strauss 2003:10) – much of his analysis concerns the supposed focus of qualitative research and its challenges, rather than exactly what it is about. But even in this instance it would be a weak case to argue that these are strictly the defining features of qualitative research. Some researchers seem to focus on the approach or the methods used, or even on the way material is analyzed. Several researchers stress the naturalistic assumption of investigating the world, suggesting that meaning and interpretation appear to be a core matter of qualitative research.

We can also see that in this category there is no consensus about specific qualitative methods or about qualitative data. Many emphasize interpretation, but quantitative research, too, involves interpretation; the results of a regression analysis, for example, certainly have to be interpreted, and the form of meta-analysis that factor analysis provides indeed requires interpretation. However, there is no interpretation of quantitative raw data, i.e., numbers in tables. One common thread is that qualitative researchers have to get to grips with their data in order to understand what is being studied in great detail, irrespective of the type of empirical material that is being analyzed. This observation is connected to the fact that qualitative researchers routinely make several adjustments of focus and research design as their studies progress, in many cases until the very end of the project (Kalof et al. 2008). If you, like Becker, do not start out with a detailed theory, adjustments such as the emergence and refinement of research questions will occur during the research process. We have thus found a number of useful reflections about qualitative research scattered across different sources, but none of them effectively describes the defining characteristics of this approach.

Although qualitative research does not appear to be defined in terms of a specific method, it is certainly common that fieldwork, i.e., research that entails that the researcher spends considerable time in the field being studied and uses the knowledge gained as data, is seen as emblematic of, or even identical to, qualitative research. But because we understand that fieldwork tends to focus primarily on the collection and analysis of qualitative data, we expected to find within it discussions on the meaning of "qualitative." Again, this was not the case.

Instead, we found material on the history of this approach (for example, Frankfort-Nachmias and Nachmias 1996 ; Atkinson et al. 2001), including how it has changed; for example, by adopting a more self-reflexive practice (Heyl 2001), as well as the different nomenclature that has been adopted, such as fieldwork, ethnography, qualitative research, naturalistic research, participant observation and so on (for example, Lofland et al. 2006 ; Gans 1999 ).

We retrieved definitions of ethnography, such as “the study of people acting in the natural courses of their daily lives,” involving a “resocialization of the researcher” (Emerson 1988 :1) through intense immersion in others’ social worlds (see also examples in Hammersley 2018 ). This may be accomplished by direct observation and also participation (Neuman 2007 :276), although others, such as Denzin ( 1970 :185), have long recognized other types of observation, including non-participant (“fly on the wall”). In this category we have also isolated claims and opposing views, arguing that this type of research is distinguished primarily by where it is conducted (natural settings) (Hughes 1971:496), and how it is carried out (a variety of methods are applied) or, for some most importantly, by involving an active, empathetic immersion in those being studied (Emerson 1988 :2). We also retrieved descriptions of the goals it attends in relation to how it is taught (understanding subjective meanings of the people studied, primarily develop theory, or contribute to social change) (see for example, Corte and Irwin 2017 ; Frankfort-Nachmias and Nachmias 1996 :281; Trier-Bieniek 2012 :639) by collecting the richest possible data (Lofland et al. 2006 ) to derive “thick descriptions” (Geertz 1973 ), and/or to aim at theoretical statements of general scope and applicability (for example, Emerson 1988 ; Fine 2003 ). We have identified guidelines on how to evaluate it (for example Becker 1996 ; Lamont 2004 ) and have retrieved instructions on how it should be conducted (for example, Lofland et al. 2006 ). For instance, analysis should take place while the data gathering unfolds (Emerson 1988 ; Hammersley and Atkinson 2007 ; Lofland et al. 2006 ), observations should be of long duration (Becker 1970 :54; Goffman 1989 ), and data should be of high quantity (Becker 1970 :52–53), as well as other questionable distinctions between fieldwork and other methods:

Field studies differ from other methods of research in that the researcher performs the task of selecting topics, decides what questions to ask, and forges interest in the course of the research itself . This is in sharp contrast to many ‘theory-driven’ and ‘hypothesis-testing’ methods. (Lofland and Lofland 1995 :5)

But could not, for example, a strictly interview-based study be carried out with the same amount of flexibility, such as sequential interviewing (for example, Small 2009 )? Once again, are quantitative approaches really as inflexible as some qualitative researchers think? Moreover, this category stresses the role of the actors’ meaning, which requires knowledge and close interaction with people, their practices and their lifeworld.

It is clear that field studies – which are seen by some as the “gold standard” of qualitative research – are nonetheless only one way of doing qualitative research. There are other methods, but it is not clear why some are more qualitative than others, or why they are better or worse. Fieldwork is characterized by interaction with the field (the material) and understanding of the phenomenon that is being studied. In Becker’s case, he had general experience from fields in which marihuana was used, based on which he did interviews with actual users in several fields.

Grounded Theory

Another major category we identified in our sample is Grounded Theory. We found descriptions of it most clearly in Glaser and Strauss’ ([1967] 2010 ) original articulation, Strauss and Corbin ( 1998 ) and Charmaz ( 2006 ), as well as many other accounts of what it is for: generating and testing theory (Strauss 2003 :xi). We identified explanations of how this task can be accomplished – such as through two main procedures: constant comparison and theoretical sampling (Emerson 1998:96), and how using it has helped researchers to “think differently” (for example, Strauss and Corbin 1998 :1). We also read descriptions of its main traits, what it entails and fosters – for instance, an exceptional flexibility, an inductive approach (Strauss and Corbin 1998 :31–33; 1990; Esterberg 2002 :7), an ability to step back and critically analyze situations, recognize tendencies towards bias, think abstractly and be open to criticism, enhance sensitivity towards the words and actions of respondents, and develop a sense of absorption and devotion to the research process (Strauss and Corbin 1998 :5–6). Accordingly, we identified discussions of the value of triangulating different methods (both using and not using grounded theory), including quantitative ones, and theories to achieve theoretical development (most comprehensively in Denzin 1970 ; Strauss and Corbin 1998 ; Timmermans and Tavory 2012 ). We have also located arguments about how its practice helps to systematize data collection, analysis and presentation of results (Glaser and Strauss [1967] 2010 :16).

Grounded theory offers a systematic approach which requires researchers to get close to the field; closeness is a requirement of identifying questions and developing new concepts or making further distinctions with regard to old concepts. In contrast to other qualitative approaches, grounded theory emphasizes the detailed coding process, and the numerous fine-tuned distinctions that the researcher makes during the process. Within this category, too, we could not find a satisfying discussion of the meaning of qualitative research.

Defining Qualitative Research

In sum, our analysis shows that some notions reappear in the discussion of qualitative research, such as understanding, interpretation, "getting close" and making distinctions. These notions capture aspects of what we think is "qualitative." However, a comprehensive definition that is useful and that can further develop the field is lacking, and not even a clear picture of its essential elements appears. In other words, no definition emerges from our data, and in our research process we have moved back and forth between our empirical data and the attempt to present a definition. Our concrete strategy, as stated above, is to relate qualitative and quantitative research, or more specifically, qualitative and quantitative work. We use an ideal-typical notion of quantitative research which relies on taken-for-granted and numbered variables. This means that the data consist of variables on different scales, such as ordinal, but frequently ratio and absolute scales, and that the representation of the numbers to the variables, i.e. the justification of the assignment of numbers to an object or phenomenon, is not questioned, though the validity may be questioned. In this section we return to the notion of quality and try to clarify it while presenting our contribution.

Broadly, research refers to the activity performed by people trained to obtain knowledge through systematic procedures. Notions such as “objectivity” and “reflexivity,” “systematic,” “theory,” “evidence” and “openness” are here taken for granted in any type of research. Next, building on our empirical analysis we explain the four notions that we have identified as central to qualitative work: distinctions, process, closeness, and improved understanding. In discussing them, ultimately in relation to one another, we make their meaning even more precise. Our idea, in short, is that only when these ideas that we present separately for analytic purposes are brought together can we speak of qualitative research.

Distinctions

We believe that the possibility of making new distinctions is one of the defining characteristics of qualitative research. It clearly sets it apart from quantitative analysis, which works with taken-for-granted variables, although, as mentioned, meta-analyses such as factor analysis may result in new variables. "Quality" refers essentially to distinctions, as already pointed out by Aristotle. He discusses the term "qualitative," commenting: "By a quality I mean that in virtue of which things are said to be qualified somehow" (Aristotle 1984:14). Quality is about what something is or has, which means that the distinction from its environment is crucial. We see qualitative research as a process in which significant new distinctions are made to the scholarly community; to make distinctions is a key aspect of obtaining new knowledge; a point, as we will see, that also has implications for "quantitative research." The notion of being "significant" is paramount. New distinctions by themselves are not enough; just adding concepts only increases complexity without furthering our knowledge. The significance of new distinctions is judged against the communal knowledge of the research community. To enable this discussion and these judgements, central elements of rational discussion are required (cf. Habermas [1981] 1987; Davidsson [1988] 2001) to identify what is new and relevant scientific knowledge. Relatedly, Ragin alludes to the idea of new and useful knowledge at a more concrete level: "Qualitative methods are appropriate for in-depth examination of cases because they aid the identification of key features of cases. Most qualitative methods enhance data" (1994:79). When Becker (1963) studied deviant behavior and investigated how people became marihuana smokers, he made distinctions between the ways in which people learned how to smoke. This is a classic example of how the strategy of "getting close" to the material, for example the text, people or pictures that are subject to analysis, may enable researchers to obtain deeper insight and new knowledge by making distinctions – in this instance on the initial notion of learning how to smoke. Others have stressed the making of distinctions in relation to coding or theorizing. Emerson et al. (1995), for example, hold that "qualitative coding is a way of opening up avenues of inquiry," meaning that the researcher identifies and develops concepts and analytic insights through close examination of and reflection on data (Emerson et al. 1995:151). Goodwin and Horowitz highlight making distinctions in relation to theory-building, writing: "Close engagement with their cases typically requires qualitative researchers to adapt existing theories or to make new conceptual distinctions or theoretical arguments to accommodate new data" (2002:37). In ideal-typical quantitative research only existing and, so to speak, given variables would be used. If this is the case, no new distinctions are made. But would not many "quantitative" researchers also make new distinctions?

Process

Process does not merely suggest that research takes time. It mainly implies that qualitatively new knowledge results from a process that involves several phases, and above all iteration. Qualitative research is about oscillation between theory and evidence, analysis and generating material, between first- and second-order constructs (Schütz 1962:59), between getting in contact with something, finding sources, becoming deeply familiar with a topic, and then distilling and communicating some of its essential features. The main point is that the categories that the researcher uses, and perhaps takes for granted at the beginning of the research process, usually undergo qualitative changes resulting from what is found. Becker describes how he tested hypotheses and let the jargon of the users develop into theoretical concepts. This happens over time while the study is being conducted, exemplifying what we mean by process.

In the research process, a pilot-study may be used to get a first glance of, for example, the field, how to approach it, and what methods can be used, after which the method and theory are chosen or refined before the main study begins. Thus, the empirical material is often central from the start of the project and frequently leads to adjustments by the researcher. Likewise, during the main study categories are not fixed; the empirical material is seen in light of the theory used, but it is also given the opportunity to kick back, thereby resisting attempts to apply theoretical straightjackets (Becker 1970 :43). In this process, coding and analysis are interwoven, and thus are often important steps for getting closer to the phenomenon and deciding what to focus on next. Becker began his research by interviewing musicians close to him, then asking them to refer him to other musicians, and later on doubling his original sample of about 25 to include individuals in other professions (Becker 1973:46). Additionally, he made use of some participant observation, documents, and interviews with opiate users made available to him by colleagues. As his inductive theory of deviance evolved, Becker expanded his sample in order to fine tune it, and test the accuracy and generality of his hypotheses. In addition, he introduced a negative case and discussed the null hypothesis ( 1963 :44). His phasic career model is thus based on a research design that embraces processual work. Typically, process means to move between “theory” and “material” but also to deal with negative cases, and Becker ( 1998 ) describes how discovering these negative cases impacted his research design and ultimately its findings.

Obviously, all research is process-oriented to some degree. The point is that the ideal-typical quantitative process does not imply change of the data or iteration between data, evidence, hypotheses, empirical work, and theory. The data, quantified variables, are in most cases fixed. Merging of data, which of course can be done in a quantitative research process, does not mean new data. New hypotheses are frequently tested, but the raw data are often "the same." Obviously, over time new datasets are made available and put into use.

Closeness

Another characteristic that is emphasized in our sample is that qualitative researchers – and in particular ethnographers – can, or, as Goffman (1989) put it, ought to, get closer to the phenomenon being studied and their data than quantitative researchers (for example, Silverman 2009:85). Put differently, essentially because of their methods qualitative researchers get into direct close contact with those being investigated and/or the material, such as texts, being analyzed. Becker started out his interview study, as we noted, by talking to those he knew in the field of music to get closer to the phenomenon he was studying. By conducting interviews he got even closer. Had he done more observations, he would undoubtedly have got even closer to the field.

Additionally, ethnographers' design enables researchers to follow the field over time, and the research they do is almost by definition longitudinal, though the time spent in the field obviously differs between studies. The general characteristic of closeness over time maximizes the chances of unexpected events, new data (related, for example, to archival research as additional sources, and for ethnography to situations not necessarily previously thought of as instrumental – what Mannay and Morgan (2015) term the "waiting field"), serendipity (Merton and Barber 2004; Åkerström 2013), and possibly reactivity, as well as the opportunity to observe disrupted patterns that translate into exemplars of negative cases. Two classic examples of this are Becker's finding of what medical students call "crocks" (Becker et al. 1961:317), and Geertz's (1973) study of "deep play" in Balinese society.

By getting and staying so close to their data – be it pictures, text or humans interacting (Becker was himself a musician) – for a long time, as the research progressively focuses, qualitative researchers are prompted to continually test their hunches, presuppositions and hypotheses. They test them against a reality that often (but certainly not always), and practically, as well as metaphorically, talks back, whether by validating them, or disqualifying their premises – correctly, as well as incorrectly (Fine 2003 ; Becker 1970 ). This testing nonetheless often leads to new directions for the research. Becker, for example, says that he was initially reading psychological theories, but when facing the data he develops a theory that looks at, you may say, everything but psychological dispositions to explain the use of marihuana. Especially researchers involved with ethnographic methods have a fairly unique opportunity to dig up and then test (in a circular, continuous and temporal way) new research questions and findings as the research progresses, and thereby to derive previously unimagined and uncharted distinctions by getting closer to the phenomenon under study.

Let us stress that getting close is by no means restricted to ethnography. The notion of hermeneutic circle and hermeneutics as a general way of understanding implies that we must get close to the details in order to get the big picture. This also means that qualitative researchers can literally also make use of details of pictures as evidence (cf. Harper 2002). Thus, researchers may get closer both when generating the material or when analyzing it.

Quantitative research, we maintain, in the ideal-typical representation cannot get closer to the data. The data is essentially numbers in tables making up the variables (Franzosi 2016 :138). The data may originally have been “qualitative,” but once reduced to numbers there can only be a type of “hermeneutics” about what the number may stand for. The numbers themselves, however, are non-ambiguous. Thus, in quantitative research, interpretation, if done, is not about the data itself—the numbers—but what the numbers stand for. It follows that the interpretation is essentially done in a more “speculative” mode without direct empirical evidence (cf. Becker 2017 ).

Improved Understanding

While distinction, process and getting closer refer to the qualitative work of the researcher, improved understanding refers to the conditions and the outcome of this work. Understanding cuts deeper than explanation, which to some may mean a causally verified correlation between variables. The notion of explanation presupposes the notion of understanding since explanation does not include an idea of how knowledge is gained (Manicas 2006:15). Understanding, we argue, is the core concept of what we call the outcome of the process when research has made use of all the other elements that were integrated in the research. Understanding, then, has a special status in qualitative research since it refers both to the conditions of knowledge and the outcome of the process. Understanding can to some extent be seen as the condition of explanation and occurs in a process of interpretation, which naturally refers to meaning (Gadamer 1990). It is fundamentally connected to knowing, and to the knowing of how to do things (Heidegger [1927] 2001). Conceptually, the term hermeneutics is used to account for this process. Heidegger (1988) ties hermeneutics to human being and holds that it cannot be separated from the understanding of being. Here we use it in a broader sense, more connected to method in general (cf. Seiffert 1992). The abovementioned aspects – for example, "objectivity" and "reflexivity" – of the approach are conditions of scientific understanding. Understanding is the result of a circular process and means that the parts are understood in light of the whole, and vice versa. Understanding presupposes pre-understanding, or in other words, some knowledge of the phenomenon studied. This pre-understanding, even in the form of prejudices, is questioned in the qualitative research process, which we see as iterative, and gradually or suddenly changes due to the iteration of data, evidence and concepts. However, qualitative research generates understanding in the iterative process when the researcher gets closer to the data, e.g., by going back and forth between field and analysis in a process that generates new data that changes the evidence, and, ultimately, the findings. Questioning, that is, asking questions and putting what one assumes (prejudices and presumptions) into question, is central to understanding something (Heidegger [1927] 2001; Gadamer 1990:368–384). We propose that this iterative process, in which understanding occurs, is characteristic of qualitative research.

Improved understanding means that we obtain scientific knowledge of something that we as a scholarly community did not know before, or that we get to know something better. It means that we understand more about how parts are related to one another, and to other things we already understand (see also Fine and Hallett 2014 ). Understanding is an important condition for qualitative research. It is not enough to identify correlations, make distinctions, and work in a process in which one gets close to the field or phenomena. Understanding is accomplished when the elements are integrated in an iterative process.

It is, moreover, possible to understand many things, and researchers, just like children, may come to understand new things every day as they engage with the world. This subjective condition of understanding – namely, that a person gains a better understanding of something – is easily met. To be qualified as "scientific," the understanding must be general and useful to many; it must be public. But even this generally accessible understanding is not enough in order to speak of "scientific understanding." Though we as a collective can increase understanding of virtually everything in all potential directions, also as a result of qualitative work, we refrain from this "objective" way of understanding, which has no means of discriminating between what we gain in understanding. Scientific understanding means that it is deemed relevant from the scientific horizon (compare Schütz 1962: 35–38, 46, 63), and that it rests on the pre-understanding that scientists have and must have in order to understand. In other words, the understanding gained must be deemed useful by other researchers, so that they can build on it. We thus see understanding from a pragmatic, rather than a subjective or objective, perspective. Improved understanding is related to the question(s) at hand. Understanding, in order to represent an improvement, must be an improvement in relation to the existing body of knowledge of the scientific community (James [1907] 1955). Scientific understanding is, by definition, collective, as expressed in Weber's famous note on objectivity, namely that scientific work aims at truths "which … can claim, even for a Chinese, the validity appropriate to an empirical analysis" ([1904] 1949: 59). By qualifying "improved understanding" we argue that it is a general defining characteristic of qualitative research. Becker's (1966) study and other research on deviant behavior increased our understanding of the social learning processes of how individuals start a behavior. It also added new knowledge about the labeling of deviant behavior as a social process. Few studies, of course, make the same large contribution as Becker's, but they are nonetheless qualitative research.

Understanding in the phenomenological sense, which is a hallmark of qualitative research, we argue, requires meaning and this meaning is derived from the context, and above all the data being analyzed. The ideal-typical quantitative research operates with given variables with different numbers. This type of material is not enough to establish meaning at the level that truly justifies understanding. In other words, many social science explanations offer ideas about correlations or even causal relations, but this does not mean that the meaning at the level of the data analyzed, is understood. This leads us to say that there are indeed many explanations that meet the criteria of understanding, for example the explanation of how one becomes a marihuana smoker presented by Becker. However, we may also understand a phenomenon without explaining it, and we may have potential explanations, or better correlations, that are not really understood.

We may speak more generally of quantitative research and its data to clarify what we see as an important distinction. The "raw data" that quantitative research, as an ideal-typical activity, refers to is not available for further analysis; the numbers, once created, are not to be questioned (Franzosi 2016: 138). If the researcher is to do "more" or to "change" something, this will be done by conjectures based on theoretical knowledge or on the researcher's lifeworld. Both qualitative and quantitative research are based on the lifeworld, and all researchers use prejudices and pre-understanding in the research process. This idea is present in the works of Heidegger (2001) and Heisenberg (cited in Franzosi 2010: 619). Qualitative research, as we argued, involves the interaction and questioning of concepts (theory), data, and evidence.

Ragin (2004: 22) points out that "a good definition of qualitative research should be inclusive and should emphasize its key strengths and features, not what it lacks (for example, the use of sophisticated quantitative techniques)." We define qualitative research as an iterative process in which improved understanding for the scientific community is achieved by making new significant distinctions resulting from getting closer to the phenomenon studied. Qualitative research, as defined here, is consequently a combination of two criteria: (i) how to do things, namely generating and analyzing empirical material in an iterative process in which one gets closer by making distinctions, and (ii) the outcome, namely improved understanding that is novel to the scholarly community. Is our definition applicable to our own study? In this study we have closely read the empirical material that we generated, and the novel distinction of the notion "qualitative research" is the outcome of an iterative process, involving both deduction and induction, in which we identified the categories that we analyzed. We thus claim to meet the first criterion, "how to do things." The second criterion can only be judged by us in a partial way, namely whether the "outcome," in concrete form the definition, improves the understanding of others in the scientific community.

We have defined qualitative research, or qualitative scientific work, in relation to quantitative scientific work. Given this definition, qualitative research is about questioning the pre-given (taken for granted) variables, but it is thus also about making new distinctions of any type of phenomenon, for example, by coining new concepts, including the identification of new variables. This process, as we have discussed, is carried out in relation to empirical material, previous research, and thus in relation to theory. Theory and previous research cannot be escaped or bracketed. According to hermeneutic principles all scientific work is grounded in the lifeworld, and as social scientists we can thus never fully bracket our pre-understanding.

We have proposed that quantitative research, as an ideal type, is concerned with pre-determined variables (Small 2008). Variables are epistemically fixed, but can vary in terms of dimensions, such as frequency or number. Age is an example; as a variable it can take on different numbers. In relation to quantitative research, qualitative research does not reduce its material to numbers and variables. If this is done, the process of getting closer comes to a halt, the researcher becomes more distanced from her data, and it is no longer possible to make new distinctions that increase our understanding. We have above discussed the components of our definition in relation to quantitative research. Our conclusion is that in the research that is called quantitative there are frequent and necessary qualitative elements.

Further, comparative empirical research on researchers primarily working with "quantitative" approaches and those working with "qualitative" approaches would, we propose, perhaps show that there are many similarities in the practices of these two approaches. This is not to deny dissimilarities, or the different epistemic and ontic presuppositions that may be more or less strongly associated with the two strands (see Goertz and Mahoney 2012). Our point is nonetheless that prejudices and preconceptions about researchers are unproductive, that, as other researchers have argued, differences may be exaggerated (e.g., Becker 1996: 53, 2017; Marchel and Owens 2007: 303; Ragin 1994), and that a qualitative dimension is present in both kinds of work.

Several things follow from our findings. The most important result is the relation to quantitative research. In our analysis we have separated qualitative research from quantitative research. The point is not to label individual researchers, methods, projects, or works as either "quantitative" or "qualitative." By analyzing, i.e., taking apart, the notions of quantitative and qualitative, we hope to have shown the elements of qualitative research. Our definition captures these elements and how they, when combined in practice, generate understanding. As many of the quotations we have used suggest, one conclusion of our study is that qualitative approaches are not inherently connected with a specific method. Put differently, none of the methods that are frequently labelled "qualitative," such as interviews or participant observation, are inherently "qualitative." What matters, given our definition, is whether one works qualitatively or quantitatively in the research process, until the results are produced. Consequently, our analysis also suggests that researchers working with what in the literature, and in jargon, is often called "quantitative research" are almost bound to make use of what we have identified as qualitative elements in any research project. Our findings also suggest that many "quantitative" researchers are, at least to some extent, engaged in qualitative work, such as when research questions are developed, variables are constructed and combined, and hypotheses are formulated. Furthermore, a research project may hover between "qualitative" and "quantitative," or start out as "qualitative" and later move into a "quantitative" phase (a distinct strategy that is not the same as "mixed methods" or simply combining induction and deduction). More generally speaking, the categories of "qualitative" and "quantitative" unfortunately often cover up practices, and they may lead to "camps" of researchers opposing one another. For example, regardless of whether the researcher is primarily oriented to "quantitative" or "qualitative" research, the role of theory is neglected (cf. Swedberg 2017). Our results open up for an interaction characterized not by differences, but by different emphases and similarities.

Let us take two examples to briefly indicate how qualitative elements can fruitfully be combined with quantitative. Franzosi ( 2010 ) has discussed the relations between quantitative and qualitative approaches, and more specifically the relation between words and numbers. He analyzes texts and argues that scientific meaning cannot be reduced to numbers. Put differently, the meaning of the numbers is to be understood by what is taken for granted, and what is part of the lifeworld (Schütz 1962 ). Franzosi shows how one can go about using qualitative and quantitative methods and data to address scientific questions analyzing violence in Italy at the time when fascism was rising (1919–1922). Aspers ( 2006 ) studied the meaning of fashion photographers. He uses an empirical phenomenological approach, and establishes meaning at the level of actors. In a second step this meaning, and the different ideal-typical photographers constructed as a result of participant observation and interviews, are tested using quantitative data from a database; in the first phase to verify the different ideal-types, in the second phase to use these types to establish new knowledge about the types. In both of these cases—and more examples can be found—authors move from qualitative data and try to keep the meaning established when using the quantitative data.

A second main result of our study is that a definition, and we have provided one, offers a way for researchers to clarify, and even evaluate, what is done. Hence, our definition can guide researchers and students, informing them how to think about the concrete research problems they face, and showing what it means to get closer in a process in which new distinctions are made. The definition can also be used to evaluate results, given that it is a standard of evaluation (cf. Hammersley 2007): to see whether new distinctions are made and whether this improves our understanding of what is researched, in addition to evaluating how the research was conducted. By making explicit what qualitative research is, it becomes easier to communicate findings, and it becomes much harder to fly under the radar with substandard research, since there are standards of evaluation that make it easier to separate "good" from "not so good" qualitative research.

To conclude, our analysis, which ends with a definition of qualitative research, can thus address both the "internal" issue of what qualitative research is and the "external" critiques that make it harder to do qualitative research, critiques to which both pressure from quantitative methods and general changes in society contribute.

Acknowledgements

Financial support for this research was provided by the European Research Council, CEV (263699). The authors are grateful to Susann Krieglsteiner for assistance in collecting the data. The paper has benefitted from the many useful comments by the three reviewers and the editor, comments by members of the Uppsala Laboratory of Economic Sociology, as well as by Jukka Gronow, Sebastian Kohl, Marcin Serafin, Richard Swedberg, Anders Vassenden and Turid Rødne.

Biographies

Patrik Aspers is professor of sociology at the Department of Sociology, Uppsala University and Universität St. Gallen. His main focus is economic sociology, and in particular, markets. He has published numerous articles and books, including Orderly Fashion (Princeton University Press 2010), Markets (Polity Press 2011) and Re-Imagining Economic Sociology (edited with N. Dodd, Oxford University Press 2015). His book Ethnographic Methods (in Swedish) has already gone through several editions.

Ugo Corte is associate professor of sociology at the Department of Media and Social Sciences, University of Stavanger. His research has been published in journals such as Social Psychology Quarterly, Sociological Theory, Teaching Sociology, and Music and Arts in Action. As an ethnographer he is working on a book on the social world of big-wave surfing.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Patrik Aspers, Email: [email protected] .

Ugo Corte, Email: [email protected] .

  • Åkerström M. Curiosity and serendipity in qualitative research. Qualitative Sociology Review. 2013; 9 (2):10–18. [ Google Scholar ]
  • Alford, Robert R. 1998. The craft of inquiry. Theories, methods, evidence . Oxford: Oxford University Press.
  • Alvesson M, Kärreman D. Qualitative research and theory development . Mystery as method . London: SAGE Publications; 2011. [ Google Scholar ]
  • Aspers, Patrik. 2006. Markets in Fashion, A Phenomenological Approach. London Routledge.
  • Atkinson P. Qualitative research. Unity and diversity. Forum: Qualitative Social Research. 2005; 6 (3):1–15. [ Google Scholar ]
  • Becker HS. Outsiders. Studies in the sociology of deviance . New York: The Free Press; 1963. [ Google Scholar ]
  • Becker HS. Whose side are we on? Social Problems. 1966; 14 (3):239–247. [ Google Scholar ]
  • Becker HS. Sociological work. Method and substance. New Brunswick: Transaction Books; 1970. [ Google Scholar ]
  • Becker HS. The epistemology of qualitative research. In: Richard J, Anne C, Shweder RA, editors. Ethnography and human development. Context and meaning in social inquiry. Chicago: University of Chicago Press; 1996. pp. 53–71. [ Google Scholar ]
  • Becker HS. Tricks of the trade. How to think about your research while you're doing it. Chicago: University of Chicago Press; 1998. [ Google Scholar ]
  • Becker, Howard S. 2017. Evidence. Chicago: University of Chicago Press.
  • Becker H, Geer B, Hughes E, Strauss A. Boys in White, student culture in medical school. New Brunswick: Transaction Publishers; 1961. [ Google Scholar ]
  • Berezin M. How do we know what we mean? Epistemological dilemmas in cultural sociology. Qualitative Sociology. 2014; 37 (2):141–151. [ Google Scholar ]
  • Best, Joel. 2004. Defining qualitative research. In Workshop on Scientific Foundations of Qualitative Research, eds. Charles Ragin, Joanne Nagel, and Patricia White, 53–54. http://www.nsf.gov/pubs/2004/nsf04219/nsf04219.pdf .
  • Biernacki R. Humanist interpretation versus coding text samples. Qualitative Sociology. 2014; 37 (2):173–188. [ Google Scholar ]
  • Blumer H. Symbolic interactionism: Perspective and method. Berkeley: University of California Press; 1969. [ Google Scholar ]
  • Brady H, Collier D, Seawright J. Refocusing the discussion of methodology. In: Henry B, David C, editors. Rethinking social inquiry. Diverse tools, shared standards. Lanham: Rowman and Littlefield; 2004. pp. 3–22. [ Google Scholar ]
  • Brown AP. Qualitative method and compromise in applied social research. Qualitative Research. 2010; 10 (2):229–248. [ Google Scholar ]
  • Charmaz K. Constructing grounded theory. London: Sage; 2006. [ Google Scholar ]
  • Corte, Ugo, and Katherine Irwin. 2017. “The Form and Flow of Teaching Ethnographic Knowledge: Hands-on Approaches for Learning Epistemology” Teaching Sociology 45(3): 209-219.
  • Creswell JW. Research design. Qualitative, quantitative, and mixed method approaches. 3. Thousand Oaks: SAGE Publications; 2009. [ Google Scholar ]
  • Davidson D. The myth of the subjective. In: Davidson D, editor. Subjective, intersubjective, objective. Oxford: Oxford University Press; 1988. pp. 39–52. [ Google Scholar ]
  • Denzin NK. The research act: A theoretical introduction to sociological methods. Chicago: Aldine Publishing Company; 1970. [ Google Scholar ]
  • Denzin NK, Lincoln YS. Introduction. The discipline and practice of qualitative research. In: Denzin NK, Lincoln YS, editors. Collecting and interpreting qualitative materials. Thousand Oaks: SAGE Publications; 2003. pp. 1–45. [ Google Scholar ]
  • Denzin NK, Lincoln YS. Introduction. The discipline and practice of qualitative research. In: Denzin NK, Lincoln YS, editors. The Sage handbook of qualitative research. Thousand Oaks: SAGE Publications; 2005. pp. 1–32. [ Google Scholar ]
  • Emerson RM, editor. Contemporary field research. A collection of readings. Prospect Heights: Waveland Press; 1988. [ Google Scholar ]
  • Emerson RM, Fretz RI, Shaw LL. Writing ethnographic fieldnotes. Chicago: University of Chicago Press; 1995. [ Google Scholar ]
  • Esterberg KG. Qualitative methods in social research. Boston: McGraw-Hill; 2002. [ Google Scholar ]
  • Fine, Gary Alan. 1995. Review of “handbook of qualitative research.” Contemporary Sociology 24 (3): 416–418.
  • Fine, Gary Alan. 2003. “ Toward a Peopled Ethnography: Developing Theory from Group Life.” Ethnography . 4(1):41-60.
  • Fine GA, Hancock BH. The new ethnographer at work. Qualitative Research. 2017; 17 (2):260–268. [ Google Scholar ]
  • Fine GA, Hallett T. Stranger and stranger: Creating theory through ethnographic distance and authority. Journal of Organizational Ethnography. 2014; 3 (2):188–203. [ Google Scholar ]
  • Flick U. Qualitative research. State of the art. Social Science Information. 2002; 41 (1):5–24. [ Google Scholar ]
  • Flick U. Designing qualitative research. London: SAGE Publications; 2007. [ Google Scholar ]
  • Frankfort-Nachmias C, Nachmias D. Research methods in the social sciences. 5. London: Edward Arnold; 1996. [ Google Scholar ]
  • Franzosi R. Sociology, narrative, and the quality versus quantity debate (Goethe versus Newton): Can computer-assisted story grammars help us understand the rise of Italian fascism (1919- 1922)? Theory and Society. 2010; 39 (6):593–629. [ Google Scholar ]
  • Franzosi R. From method and measurement to narrative and number. International journal of social research methodology. 2016; 19 (1):137–141. [ Google Scholar ]
  • Gadamer, Hans-Georg. 1990. Wahrheit und Methode, Grundzüge einer philosophischen Hermeneutik . Band 1, Hermeneutik. Tübingen: J.C.B. Mohr.
  • Gans H. Participant Observation in an Age of “Ethnography” Journal of Contemporary Ethnography. 1999; 28 (5):540–548. [ Google Scholar ]
  • Geertz C. The interpretation of cultures. New York: Basic Books; 1973. [ Google Scholar ]
  • Gilbert N. Researching social life. 3. London: SAGE Publications; 2009. [ Google Scholar ]
  • Glaeser A. Hermeneutic institutionalism: Towards a new synthesis. Qualitative Sociology. 2014; 37 :207–241. [ Google Scholar ]
  • Glaser, Barney G., and Anselm L. Strauss. [1967] 2010. The discovery of grounded theory. Strategies for qualitative research. Hawthorne: Aldine.
  • Goertz G, Mahoney J. A tale of two cultures: Qualitative and quantitative research in the social sciences. Princeton: Princeton University Press; 2012. [ Google Scholar ]
  • Goffman E. On fieldwork. Journal of Contemporary Ethnography. 1989; 18 (2):123–132. [ Google Scholar ]
  • Goodwin J, Horowitz R. Introduction. The methodological strengths and dilemmas of qualitative sociology. Qualitative Sociology. 2002; 25 (1):33–47. [ Google Scholar ]
  • Habermas, Jürgen. [1981] 1987. The theory of communicative action . Oxford: Polity Press.
  • Hammersley M. The issue of quality in qualitative research. International Journal of Research & Method in Education. 2007; 30 (3):287–305. [ Google Scholar ]
  • Hammersley, Martyn. 2013. What is qualitative research? Bloomsbury Publishing.
  • Hammersley M. What is ethnography? Can it survive should it? Ethnography and Education. 2018; 13 (1):1–17. [ Google Scholar ]
  • Hammersley M, Atkinson P. Ethnography . Principles in practice . London: Tavistock Publications; 2007. [ Google Scholar ]
  • Heidegger M. Sein und Zeit. Tübingen: Max Niemeyer Verlag; 2001. [ Google Scholar ]
  • Heidegger, Martin. [1923] 1988. Ontologie. Hermeneutik der Faktizität, Gesamtausgabe II. Abteilung: Vorlesungen 1919–1944, Band 63. Frankfurt am Main: Vittorio Klostermann.
  • Hempel CG. Philosophy of the natural sciences. Upper Saddle River: Prentice Hall; 1966. [ Google Scholar ]
  • Hood JC. Teaching against the text. The case of qualitative methods. Teaching Sociology. 2006; 34 (3):207–223. [ Google Scholar ]
  • James W. Pragmatism. New York: Meridian Books; 1907. [ Google Scholar ]
  • Jovanović G. Toward a social history of qualitative research. History of the Human Sciences. 2011; 24 (2):1–27. [ Google Scholar ]
  • Kalof L, Dan A, Dietz T. Essentials of social research. London: Open University Press; 2008. [ Google Scholar ]
  • Katz J. Situational evidence: Strategies for causal reasoning from observational field notes. Sociological Methods & Research. 2015; 44 (1):108–144. [ Google Scholar ]
  • King G, Keohane RO, Verba S. Designing social inquiry. Scientific inference in qualitative research. Princeton: Princeton University Press; 1994. [ Google Scholar ]
  • Lamont M. Evaluating qualitative research: Some empirical findings and an agenda. In: Lamont M, White P, editors. Report from workshop on interdisciplinary standards for systematic qualitative research. Washington, DC: National Science Foundation; 2004. pp. 91–95. [ Google Scholar ]
  • Lamont M, Swidler A. Methodological pluralism and the possibilities and limits of interviewing. Qualitative Sociology. 2014; 37 (2):153–171. [ Google Scholar ]
  • Lazarsfeld P, Barton A. Some functions of qualitative analysis in social research. In: Kendall P, editor. The varied sociology of Paul Lazarsfeld. New York: Columbia University Press; 1982. pp. 239–285. [ Google Scholar ]
  • Lichterman, Paul, and Isaac Reed. 2014. Theory and contrastive explanation in ethnography. Sociological Methods & Research. Prepublished 27 October 2014; 10.1177/0049124114554458.
  • Lofland J, Lofland L. Analyzing social settings. A guide to qualitative observation and analysis. 3. Belmont: Wadsworth; 1995. [ Google Scholar ]
  • Lofland J, Snow DA, Anderson L, Lofland LH. Analyzing social settings. A guide to qualitative observation and analysis. 4. Belmont: Wadsworth/Thomson Learning; 2006. [ Google Scholar ]
  • Long AF, Godfrey M. An evaluation tool to assess the quality of qualitative research studies. International Journal of Social Research Methodology. 2004; 7 (2):181–196. [ Google Scholar ]
  • Lundberg G. Social research: A study in methods of gathering data. New York: Longmans, Green and Co.; 1951. [ Google Scholar ]
  • Malinowski B. Argonauts of the Western Pacific: An account of native Enterprise and adventure in the archipelagoes of Melanesian New Guinea. London: Routledge; 1922. [ Google Scholar ]
  • Manicas P. A realist philosophy of science: Explanation and understanding. Cambridge: Cambridge University Press; 2006. [ Google Scholar ]
  • Marchel C, Owens S. Qualitative research in psychology. Could William James get a job? History of Psychology. 2007; 10 (4):301–324. [ PubMed ] [ Google Scholar ]
  • McIntyre LJ. Need to know. Social science research methods. Boston: McGraw-Hill; 2005. [ Google Scholar ]
  • Merton RK, Barber E. The travels and adventures of serendipity . A Study in Sociological Semantics and the Sociology of Science. Princeton: Princeton University Press; 2004. [ Google Scholar ]
  • Mannay D, Morgan M. Doing ethnography or applying a qualitative technique? Reflections from the ‘waiting field‘ Qualitative Research. 2015; 15 (2):166–182. [ Google Scholar ]
  • Neuman LW. Basics of social research. Qualitative and quantitative approaches. 2. Boston: Pearson Education; 2007. [ Google Scholar ]
  • Ragin CC. Constructing social research. The unity and diversity of method. Thousand Oaks: Pine Forge Press; 1994. [ Google Scholar ]
  • Ragin, Charles C. 2004. Introduction to session 1: Defining qualitative research. In Workshop on Scientific Foundations of Qualitative Research , 22, ed. Charles C. Ragin, Joane Nagel, Patricia White. http://www.nsf.gov/pubs/2004/nsf04219/nsf04219.pdf
  • Rawls, Anne. 2018. The Wartime narrative in US sociology, 1940–7: Stigmatizing qualitative sociology in the name of ‘science,’ European Journal of Social Theory (Online first).
  • Schütz A. Collected papers I: The problem of social reality. The Hague: Nijhoff; 1962. [ Google Scholar ]
  • Seiffert H. Einführung in die Hermeneutik. Tübingen: Franke; 1992. [ Google Scholar ]
  • Silverman D. Doing qualitative research. A practical handbook. 2. London: SAGE Publications; 2005. [ Google Scholar ]
  • Silverman D. A very short, fairly interesting and reasonably cheap book about qualitative research. London: SAGE Publications; 2009. [ Google Scholar ]
  • Silverman D. What counts as qualitative research? Some cautionary comments. Qualitative Sociology Review. 2013; 9 (2):48–55. [ Google Scholar ]
  • Small ML. “How many cases do I need?” on science and the logic of case selection in field-based research. Ethnography. 2009; 10 (1):5–38. [ Google Scholar ]
  • Small, Mario L. 2008. Lost in translation: How not to make qualitative research more scientific. In Workshop on interdisciplinary standards for systematic qualitative research, eds. Michelle Lamont and Patricia White, 165–171. Washington, DC: National Science Foundation.
  • Snow DA, Anderson L. Down on their luck: A study of homeless street people. Berkeley: University of California Press; 1993. [ Google Scholar ]
  • Snow DA, Morrill C. New ethnographies: Review symposium: A revolutionary handbook or a handbook for revolution? Journal of Contemporary Ethnography. 1995; 24 (3):341–349. [ Google Scholar ]
  • Strauss AL. Qualitative analysis for social scientists. 14. Chicago: Cambridge University Press; 2003. [ Google Scholar ]
  • Strauss AL, Corbin JM. Basics of qualitative research. Techniques and procedures for developing grounded theory. 2. Thousand Oaks: Sage Publications; 1998. [ Google Scholar ]
  • Swedberg, Richard. 2017. Theorizing in sociological research: A new perspective, a new departure? Annual Review of Sociology 43: 189–206.
  • Swedberg R. The new 'Battle of Methods'. Challenge January–February. 1990; 3 (1):33–38. [ Google Scholar ]
  • Timmermans S, Tavory I. Theory construction in qualitative research: From grounded theory to abductive analysis. Sociological Theory. 2012; 30 (3):167–186. [ Google Scholar ]
  • Trier-Bieniek A. Framing the telephone interview as a participant-centred tool for qualitative research. A methodological discussion. Qualitative Research. 2012; 12 (6):630–644. [ Google Scholar ]
  • Valsiner J. Data as representations. Contextualizing qualitative and quantitative research strategies. Social Science Information. 2000; 39 (1):99–113. [ Google Scholar ]
  • Weber, Max. [1904] 1949. 'Objectivity' in social science and social policy. Ed. Edward A. Shils and Henry A. Finch, 49–112. New York: The Free Press.

Identifying Empirical Research Papers


  • Open access
  • Published: 23 August 2024

Leveraging social media data for pandemic detection and prediction

  • Boyang Shi 1 ,
  • Weixiang Huang 1 ,
  • Yuanyuan Dang   ORCID: orcid.org/0000-0002-8791-0095 1 &
  • Wenhui Zhou 1  

Humanities and Social Sciences Communications, volume 11, Article number: 1075 (2024)


  • Health humanities
  • Information systems and information technology

Governments and healthcare institutions are increasingly recognizing the value of leveraging social media data to address disease outbreaks. This is due to the rapid dissemination and rich content of social media data, which include real-time reactions and calls for help from people. However, current research on which social media data can be utilized for information support, and on the underlying reasons why social media data can be utilized for information support, remains limited. This study aims to address this limitation by investigating, through empirical and prediction models, which social media information is more likely to reflect the severity of an outbreak, while also elucidating, through content analysis, why social media data has the ability to reflect a pandemic. The COVID-19 outbreak was utilized as a case example because it enhances the universality of the results and allows the model to be validated with multiple waves of data. The empirical model results indicate that social media activity from public users is more likely to reflect the ground truth during a pandemic. In particular, it was found that negative sentiment expressed in blog posts by public users during a pandemic aligns more closely with the severity of the disease outbreak. A prediction model was then proposed to further validate these findings of the empirical model. Finally, a content analysis was conducted based on the conclusions drawn from the empirical and prediction models. The content analysis revealed that the predictive capability of social media data for a pandemic originates from individuals' self-reporting of illness. This study provides contributions and insights into which types of information can be used for pandemic monitoring and forecasting. The findings have significant implications for governments and healthcare institutions in leveraging social media data for pandemic monitoring and forecasting.


Introduction

As of October 2023, approximately 4.95 billion individuals (representing 61.4% of the world’s population) were users of social media platforms (Petrosyan, 2023 ). The utilization of digital methodologies for surveillance and guidance has emerged as a dominant trend during the COVID-19 pandemic. Empirical evidence indicates that social media posts pertaining to COVID-19 can serve as predictors of disease prevalence (Li et al. 2020 ). In particular, posts on social media that are related to symptoms and diagnosis can be a more accurate predictor of illness than other types of posts (Shen et al. 2020 ). In addition, governments, and health organizations are increasingly leveraging social media to facilitate awareness campaigns, fundraising initiatives and the provision of assistance (Gao et al. 2022 ).

The surveillance and forecasting of newly diagnosed cases during public health emergencies are crucial for the mobilization of medical resources and the facilitation of policymaking (Huang et al. 2021 ). In the field of public health, national or local health authorities have historically relied on established systems for monitoring recognized infectious diseases. The effectiveness of these systems is contingent upon the routine reporting of sentinel events by physicians and laboratories (Velasco et al. 2014 ). However, traditional forms, such as indicator-based reporting by health organizations, are limited in their ability to identify potential health hazards. Specifically, the occurrence of reporting delays and the lack of appropriate equipment impede the ability to identify unexpected disease events (Velasco et al. 2014 ). However, with the advancement of the Internet, the utilization of online data for digital surveillance has become a trend. The practice of digital surveillance, or what is commonly referred to as ‘infodemiology’ and ‘infoveillance’, has become increasingly prevalent in public health practice with the objective of detecting and identifying new and re-emerging infectious diseases (Eysenbach, 2011 ).

During the COVID-19 pandemic, social media data has demonstrated predictive capabilities across various research domains. In the realm of finance, Wu et al. (2021) empirically validated that information from social media during the pandemic contributes to forecasting petroleum prices, production, and consumption. Cevik et al. (2022) concentrated on utilizing public sentiments in social media data to predict stock prices and market fluctuations during the COVID-19 period. Within the field of tourism management, Ampountolas and Legg (2021) incorporated social media data into a segmented learning framework, enhancing the prediction of hotel occupancy rates. Wu et al. (2023), combining diverse sources of social media data, predicted tourist arrivals during the 2019 coronavirus disease period. In the field of passenger flow prediction, S. Zhang et al. (2023), utilizing deep learning algorithms alongside social media data and multiple sources such as confirmed COVID-19 cases, forecasted the passenger volume of urban rail transit during the COVID-19 period.

Furthermore, prior studies have made significant strides in utilizing social media data for predicting COVID-19 cases. Comito ( 2021 ) emphasizes the high correlation between tweets and actual data of COVID-19, demonstrating that Twitter can be considered as a reliable indicator of epidemic spread. Lamsal et al. ( 2022 ) forecasted daily new COVID-19 cases using Twitter conversations, suggesting that potential social media variables offer additional capacity for predicting daily cases. Tran and Matsui ( 2023 ) proposed observing trends reflected in Twitter counts and the development of the COVID-19 pandemic, employing deep neural network models to capture the relationship between societal responses and the progression of COVID-19, thereby predicting future trends. Chatterjee et al. ( 2023 ) and Kellner et al. ( 2023 ) focused on forecasting COVID-19 outbreaks through the integration of multiple data sources. Additionally, some studies (Bae et al. 2021 ; Nie et al. 2021 ) approached predictive modeling from a mathematical perspective, integrating SEIR mathematical models with social media data to enhance the prediction of new COVID-19 cases.

However, current research on detecting pandemics using social media data mostly exhibits the following shortcomings. Firstly, many studies focus solely on a single wave of the pandemic, failing to validate and capture the underlying reasons for potential predictive capability across multiple waves, which can lead to non-robust results. Secondly, users on social media platforms encompass various types, including the public, opinion leaders, and organizations, each potentially possessing different predictive capabilities; however, there is currently no research exploring predictive ability from the perspective of user types. Thirdly, sentiments in social media are considered crucial variables in predicting pandemic trends (Lamsal et al. 2022), yet there is a lack of research investigating the predictive capabilities of different sentiments. Finally, existing research has predominantly concentrated on extracting variables from social media data through feature engineering in order to achieve more accurate predictions, resulting in a lack of exploration, from an interpretability perspective, of the potential reasons why social media data can predict disease. Consequently, these methods often function as black boxes and lack robustness in practice. Based on these discussions, we propose our research questions:

RQ1: Which social media users are more likely to reflect the ground truth during a pandemic?
RQ2: Which sentiments are more likely to reflect the ground truth during a pandemic?
RQ3: Why does social media data have the ability to reflect the ground truth during a pandemic?

Previous studies have successfully addressed the question of whether we can utilize social media data as a proxy for on-the-ground data, affirming its viability (Gour et al. 2022). However, the challenge that remains unaddressed is determining which information within social media data serves as a robust proxy, and why we can use this information as a proxy for the ground truth during a pandemic (Heffner et al. 2021). This study endeavors to address these challenges through an empirical model, a prediction model, and content analysis.

Our research contributes to the field of leveraging social media data for pandemic detection and prediction in three respects. Firstly, while previous studies have concentrated on specific time frames (Surian et al. 2016), our study delves into the entire duration of the pandemic in China, spanning from its onset to its conclusion, and analyzes the three waves that occurred during this period. Secondly, existing research has primarily focused on overall social media posts without exploring finer granularity (Qiu and Kumar, 2017). In contrast, our research divides these posts by user type and sentiment, exploring the efficacy of different user types and sentiments as proxies. To our knowledge, no prior research has explored the effectiveness of social media proxies from the standpoint of user types. Finally, while existing research often emphasizes prediction accuracy, it frequently lacks explanatory power. In contrast, our study provides explanatory insights through content analysis, elucidating why social media data can be considered an indicator of the ground truth during a pandemic.

We present the following findings from our study. Through the empirical and prediction models, we discovered that among different social media users, social media activity from public users better reflects the ground truth during a pandemic. Furthermore, we found that among blogs of different sentiments, negative sentiment blogs from public users serve as a critical indicator of a disease outbreak. Through content analysis, we explain why social media data, including public activity and negative sentiment blogs from the public, better reflect reality. Symptom-related blogs from the public effectively depict real infection situations, enhancing the public's predictive capacity. Additionally, as blogs with self-reported symptoms often manifest negative sentiment, negative sentiment blogs from the public better capture the ground truth.

Role of social media users during pandemic

Participatory sensing, as defined by Burke et al. ( 2006 ), is the concept of communities (or other groups of people) collectively contributing sensory information to form a body of knowledge. The proliferation of mobile devices, including smartphones, tablet computers, and activity trackers, which are equipped with a multitude of sensors, has rendered participatory sensing a viable option at a large scale. Participatory sensing has been employed to acquire information regarding a range of phenomena, including weather, environment, noise pollution (Aumond et al. 2017 ), urban mobility (Wu and Lim 2014 ), congestion, and any other sensory information that collectively forms knowledge. Social media can be regarded as an optimal platform for the automatic capture of real-time dynamics, as it offers a wealth of knowledge while treating social media users as ‘social sensors’ (Jiang and Li 2016 ) to detect real-world events. For instance, Twitter users can utilize the platform to reflect the air quality (Lam et al. 2021 ) and earthquakes (Sakaki et al. 2010 ) in the surrounding areas. Jiang et al. ( 2022 ) proposed the utilization of social media users as social sensors to predict pandemic trends simultaneously. Simon et al. ( 2015 ) posited that in crisis communications, the public assumes the role of the primary ‘first responder’, and social media offers unparalleled avenues for their engagement.

An opinion leader is an active media user who interprets the meaning of media messages or content for lower-end media users. According to the two-step flow of communication theory (Lazarsfeld et al. 1968), information initially flows from mass media to opinion leaders, who then share the information with their audiences. The role of opinion leaders in facilitating the flow of information is of significant importance, as individuals frequently seek counsel from others within their social environment. However, recent research has found that opinion leaders may not effectively predict the topics of public concern during a pandemic. Conversely, it has been observed that public concerns often dictate the topics discussed by opinion leaders (Chen et al. 2022). This indicates that during unprecedented health crises, particularly in the early stages when information may be scarce, the role of opinion leaders can be limited. The most crucial information often originates from individuals who have directly experienced symptoms. Consequently, it is reasonable to posit that opinion leaders were attuned to the sentiments expressed by the general public and employed their influence to amplify their voices. Hence, opinion leaders serve more as amplifiers of existing discourse than as sensitive sensors, while it is the public itself that fulfills the role of perceptive sensor.

Over the past few years, state media and government departments have established their own social media accounts, which are defined as profiles created and operated by governmental entities (Tang et al. 2021). During public health crises such as the COVID-19 pandemic, these entities bear the responsibility for disseminating vital information to the public on behalf of the government. Because erroneous information disseminated through these government accounts could spread rapidly and lead to a crisis of trust towards the government, authorities typically exercise caution (Feng and Umaier 2023) and sometimes regulate the release of information (Deng and Yang 2021) when making announcements. Other studies have found that during the COVID-19 pandemic, governments prioritized social and political strategies over transparency in informing the public about the outbreak (Cheung et al. 2023; Zhang et al. 2020), and they shared more fact-based information (Kaur et al. 2021). Furthermore, despite the government possessing infectious disease detection systems, reports from these systems often suffer from delays.

According to agenda-setting theory, the content that media chooses to disseminate is contingent upon their perspectives on matters pertaining to economics, politics, and culture (Molloy 2020 ). During COVID-19 pandemic, organizational actors maintained a focused agenda on discussing specific aspects of the pandemic, such as socioeconomic impact and policy (Chen et al. 2022 ). Given that these organizational actors included government agencies and media, all of which are subject to the influence of central government to varying degrees, it is expected that their social media posts prioritized broader socio-political and global aspects of the pandemic (Chen et al. 2022 ; Zhou et al. 2023 ). Recent research findings (Chen et al. 2022 ) suggest that organizations, encompassing news media entities, tended to amplify media voices and official narratives regarding COVID-19. Consequently, these entities were more inclined to rely on and propagate established discourses concerning the pandemic. Furthermore, Zhao et al. ( 2022 ) revealed that official text was unrelated to the increase in confirmed cases; official text merely served as an update and response to the epidemic, offering no predictive value. Thus, we hypothesize that:

H1: Compared to opinion leaders and organizations, the public is more likely to reflect the ground truth during a pandemic.

Role of social media sentiments during pandemic

Sentiments could elucidate the spread of disease through the increase in cases and facilitate our understanding of the relationship between disease transmission and Twitter activity during a crisis (Gour et al. 2022). The prevalence of disease, as indicated by the rising incidence of cases, exerts a considerable impact on the level of social media activity (Khatua et al. 2019). Therefore, the more blogs users write on the social media platform, the higher the level of Weibo activity. Some studies have discovered that the sentiments expressed by users on social media contribute to the prediction of confirmed COVID-19 cases. Alamoodi et al. (2021) proposed that sentiment analysis might aid in predicting the number of COVID-19 cases and deaths. Tran and Matsui (2023) utilized Twitter users' sentiments to simulate new cases, with results indicating that sentiments expressed on social media accurately predicted current and imminent COVID-19 cases. Lamsal et al. (2022) employed Granger causality tests to estimate the relationship between online sentiments, post reactions, and the daily number of confirmed cases. Their study revealed that online emotional features and post reactions contained valuable information for forecasting local confirmed cases in advance.

However, existing research has suggested that not all sentiments expressed by social media users possess predictive efficacy. Sanwald et al. ( 2022 ) observed that during the pandemic, positive sentiments served as predictive factors for happiness. Tweets conveying positive sentiments exhibited a significant positive correlation with recovery but did not significantly correlate with infections and deaths. Neutral emotional posts often involved objective discussions and news reporting on the pandemic. Zhou et al. ( 2023 ) proposed that the neutral sentiments expressed by public during the pandemic indicated a sense of relative calmness, reflecting a relatively stable situation. Luu and Follmann ( 2023 ) noted a negative correlation between public sentiments and daily confirmed cases during pandemic. As the number of confirmed cases gradually decreased, the emotional score showed an upward trend (people became relatively positive). However, with an increase in confirmed cases, the emotional score displayed a downward trend (relatively negative). Positive tweets about COVID-19 were associated with lower cases and deaths, while negative tweets were linked to higher cases and deaths. Zhao et al. ( 2022 ) discovered that negative sentiment from public could lead to changes in the growth rate of confirmed cases and death rates, indicating statistical significance in predicting the development of the pandemic. Negative sentiment expressed in personal posts showed Granger causality, fluctuating before pandemic indicators. Monitoring public negative sentiment could forecast changes in the pandemic, utilizing negative sentiment as an explanatory variable to predict pandemic indicators. Thus, H2 is proposed:

H2: Compared to neutral and positive sentiment, negative sentiment from the public is more likely to reflect the ground truth during a pandemic.

Predictive capability of social media data for pandemic

Previous studies suggest that symptom-related posts shared on social media demonstrate increased predictive capability. Shen et al. (2020) sampled Weibo posts containing COVID-19 symptom lists and found that online symptom reports exhibited significant predictive capabilities, often forecasting trends 14 days before official statistical data releases. Similarly, Yousefinaghani et al. (2021) observed that symptom-related tweets had the best predictive performance, predicting about 2–6 days earlier than other data streams. These symptom-related posts predominantly originate from personal accounts, namely those of the public and opinion leaders. In contrast, government and media accounts, serving as official sources, tend to disseminate more news reports grounded in objective facts (Kaur et al. 2021). For example, Zhao et al. (2022) revealed that official text merely served as an update and response to the epidemic. Moreover, symptom-related posts often convey predominantly negative sentiments. Blogs posted by individuals experiencing illness tend to be negative, whereas those from recovery periods tend to be positive and optimistic. As the severity of the pandemic increases, bloggers exhibit more negative sentiments (Shan et al. 2020). Hence, we contend that symptom self-reporting posts from personal accounts contribute to the predictive capacity of social media content for a pandemic. Since these symptom-reporting posts usually display negative sentiment, they also underpin the predictive capacity of negative posts for a pandemic. Thus, H3 is proposed:

H3: The predictive capability of social media data for a pandemic originates from individuals' self-reporting of illness.

We have three data sources: (i) Sina Weibo, (ii) Baidu Search Index, and (iii) external data, as explained below.

Sina Weibo

China is the world's largest social media market, with highly engaged and mobile-savvy users (Thomala 2023b). When it comes to microblogging, Chinese users have their local version of Twitter, Sina Weibo. In September 2023, Sina Weibo reported 260 million daily active users, up around seven million from the corresponding quarter in the previous year. Our primary data source is the microblogging site Sina Weibo, where users share situation-related updates or express their viewpoints through blogs. These blogs can be extracted by hashtags, which serve as indicators of conversation keywords on Sina Weibo. We therefore collected data on COVID-19-related blogs by utilizing hashtags such as #Epidemic#, #COVID-19#, #COVID-19 Pneumonia#, #Noval Coronavirus#, #Novel Coronavirus Infection Pneumonia#, #Coronavirus#, #Positive#, #Fever#, and #Cough#. By employing Weibo's advanced search API, we gathered a total of 545,814 original blogs from January 10, 2020 to January 8, 2023. As shown in Table 1, the fields of the acquired blogs encompass basic information (Content, Date), geographical information (IP), and user information (User Type).

Following data cleaning procedures, which involve the removal of hyperlinks, email addresses, retweet symbols, hashtag symbols, and extra white spaces, we obtain a list of 486,166 blogs. Figure 1A shows the daily post blogs and daily confirmed cases, revealing three peaks aligning with the three waves of the epidemic. As observed, there exists a lag between the blue line, symbolizing the daily post blogs, and the red line, indicating the new daily confirmed cases, across all three waves. To examine the correlation between daily blogs and daily cases, we adopt the cross-correlation analysis (Niu et al. 2022 ). In Fig. 1B , it’s evident that the local maximum cross-correlation coefficient is 0.5543 at a lag order of 12.

Fig. 1: Panel (A) displays the trend of daily blogs (blue line) and daily cases (red line). Panel (B) represents the cross-correlation of daily blogs and daily cases. The red dotted line indicates the lag order between daily blogs and daily cases at which the local maximum cross-correlation coefficient occurs.
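For illustration, the lagged cross-correlation reported above could be computed along the lines of the following minimal sketch. The paper does not publish its code, and the file and column names ("daily_series.csv", "daily_blogs", "daily_cases") are hypothetical.

```python
# Minimal sketch of a lagged cross-correlation between daily blog counts and
# daily confirmed cases. File and column names are hypothetical; the paper's
# actual implementation is not published.
import numpy as np
import pandas as pd


def lagged_cross_correlation(x: pd.Series, y: pd.Series, max_lag: int = 30):
    """Correlate x(t) with y(t + lag) for lag = 0..max_lag, so a positive lag
    means the blog series leads the case series by `lag` days."""
    pairs = []
    for lag in range(max_lag + 1):
        shifted = y.shift(-lag)                  # cases `lag` days later
        valid = x.notna() & shifted.notna()
        r = np.corrcoef(x[valid], shifted[valid])[0, 1]
        pairs.append((lag, r))
    return pairs


df = pd.read_csv("daily_series.csv", parse_dates=["date"])   # hypothetical file
pairs = lagged_cross_correlation(df["daily_blogs"], df["daily_cases"])
best_lag, best_r = max(pairs, key=lambda p: p[1])
print(f"maximum cross-correlation {best_r:.4f} at lag {best_lag}")
```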

Besides removing hyperlinks, email addresses, etc., we also checked other factors that may affect our analysis, such as bot accounts, spam content, and repeated posts. For bot accounts, we trained a bot-account classifier to recognize bot accounts in our dataset; the details of the process are shown in Supplementary Appendix C1. We then considered spam content, which was not related to the pandemic in our study and could also affect our analysis, and constructed a spam-content classifier to recognize spam text in our dataset; the training details and outcomes are presented in Supplementary Appendix C2. Finally, we tried two different text deduplication methods to deal with repeated posts. Using these two deduplication methods, we obtained two different datasets, and we repeated all the analyses in the following sections, including the empirical analysis, predictive analysis, and content analysis, on these datasets. The details and outcomes are shown in Supplementary Appendix C3. Our analysis showed that bot accounts, spam content, and repeated posts did not change our findings.
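As an illustration of this preprocessing, a basic cleaning and exact-duplicate removal step could look like the sketch below. The regular expressions and the hash-based deduplication are assumptions for illustration only; the authors' actual procedures (including the two deduplication methods and the bot-account and spam classifiers) are described in their supplementary appendices.

```python
# Minimal sketch of blog cleaning and exact-duplicate removal. The regex
# patterns and hash-based deduplication are illustrative assumptions, not the
# authors' published procedure.
import hashlib
import re

URL_RE = re.compile(r"https?://\S+")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")
REPOST_RE = re.compile(r"//@\S+?:")   # assumed Weibo repost marker
HASHTAG_RE = re.compile(r"#")         # drop hashtag symbols, keep the words


def clean_blog(text: str) -> str:
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    text = REPOST_RE.sub(" ", text)
    text = HASHTAG_RE.sub("", text)
    return re.sub(r"\s+", " ", text).strip()   # collapse extra whitespace


def deduplicate(blogs):
    """Keep the first occurrence of each exactly identical cleaned blog."""
    seen, kept = set(), []
    for raw in blogs:
        cleaned = clean_blog(raw)
        digest = hashlib.md5(cleaned.encode("utf-8")).hexdigest()
        if cleaned and digest not in seen:
            seen.add(digest)
            kept.append(cleaned)
    return kept
```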

Baidu search index

Baidu is the most popular search engine in China (Thomala, 2023a) and offers a functionality similar to Google Trends, known as the Baidu Search Index. In this study, the Baidu Search Index serves as a representative social media index. Nine keywords similar to the hashtags utilized in the 'Sina Weibo' section were selected: 'Epidemic', 'COVID-19', 'COVID-19 Pneumonia', 'Novel Coronavirus', 'Pneumonia', 'Coronavirus', 'Positive', 'Fever', and 'Cough'. By employing the Baidu Search Index API, we gathered the index for these keywords from January 10, 2020 to January 8, 2023.

Then, the cross-correlation method was utilized to analyze these keywords, and the analysis results are presented in Table 2 . The ‘Cross-correlation’ column indicates the local maximum cross-correlation coefficient between the trends of keywords and new confirmed cases, while the ‘Lag’ column specifies the lag order at which this local maximum cross-correlation occurs. The results revealed that the correlation between new confirmed cases and symptom-related keywords generally surpassed that between new confirmed cases and COVID-19 related keywords. The lag order for symptom related keywords consistently remained positive, whereas COVID-19 related keywords did not exhibit such consistency. These findings empirically demonstrate that keywords associated with disease symptoms provide a more accurate reflection of the actual disease reality.

Among COVID-19 related keywords, the keyword 'Coronavirus' exhibits the highest correlation, with a lag order of 11. Among symptom-related keywords, 'Cough' demonstrates the highest correlation, also with a lag order of 11. Furthermore, it is evident that the correlation between these two keywords and new confirmed cases surpasses that between new blog posts and new confirmed cases. This suggests that unfiltered social media blog content may contain noise, potentially influencing the impact of such blogs, consistent with previous research (Diaz-Garcia et al. 2022). Consequently, in subsequent analyses, we systematically identify the elements within social media blogs that genuinely reflect fluctuations in newly confirmed cases.

External data

We collected the number of new cases (\(Case_{tp}\)), deaths (\(Death_{tp}\)), and cures (\(Cure_{tp}\)) related to COVID-19 reported on day \(t\) in province \(p\) from the National Health Commission of the People's Republic of China. Additionally, we gathered the number of hospital visits (\(HV_{mp}\)) in month \(m\) in province \(p\) from the same site. From the National Bureau of Statistics of China, we obtained socioeconomic data including the registered urban unemployment (\(RUU_{yp}\)) and the per capita disposable income (\(PCDI_{yp}\)) of province \(p\) in year \(y\). Traffic data, specifically highway and waterway passenger traffic (\(HWPT_{mp}\)), was sourced from the Ministry of Transport of the People's Republic of China. Weather conditions, such as dew point (\(DEWP_{tp}\)) and temperature (\(TEMP_{tp}\)), were collected from the National Centers for Environmental Information. Government interventions were quantified through the Government Response Index (\(GRI_{tp}\)), which assesses the strictness of a country's government response to COVID-19, and the Economic Support Index (\(ESI_{tp}\)), which measures financial and economic policies implemented to aid individuals and businesses affected by the pandemic. These indices were obtained from the Oxford Covid-19 Government Response Tracker (OxCGRT). The explanation and details of these variables are provided in Table 3 below.

Our research framework is presented in Fig. 2 and comprises sentiment analysis, an empirical model, a prediction model, and content analysis. The purpose of the sentiment analysis is to obtain the key variables utilized in both the empirical and prediction models. The empirical and prediction models are then employed to validate Hypothesis 1 and Hypothesis 2. Finally, content analysis is utilized to explain the findings of the empirical model and the prediction model, which in turn validates Hypothesis 3.

Figure 2: The overall research methods include sentiment analysis, empirical model, prediction model, and content analysis.

Sentiment analysis

The sentiment analysis in this study encompasses two tasks: polarity analysis and emotion recognition. The result of polarity analysis was utilized to validate our Hypothesis 2, while the outcome of emotion recognition was employed to check the robustness of our Hypothesis 2.

Polarity analysis

In sentiment analysis, one common task is polarity analysis, which categorizes overall sentiment into three groups: nature, negative, or positive. For this task, BERT has been demonstrated to offer notable improvements in efficiency and accuracy compared to traditional machine learning algorithms such as support vector machines and random forests (Cai et al. 2023). Therefore, the BERT model was utilized for sentiment identification of Sina Weibo blogs in this study. Given that our textual corpus is in Chinese, we employ the Chinese version of BERT—the chinese-roberta-wwm-ext-large Footnote 7 pre-trained model (Zhang et al. 2023)—specifically designed for Chinese text processing. Figure 3 shows our fine-tuning and classification process. In the fine-tuning flow, only the parameters of the final classification layer are fine-tuned using our training texts and labels. The fine-tuned BERT is then used to classify our Weibo blogs in the classification flow.

Figure 3: In the fine-tuning flow, the training texts and labels are fed into the pre-trained model to obtain the fine-tuned BERT model; * indicates the layer in BERT whose parameters are fine-tuned. In the classification flow, the fine-tuned BERT model is utilized to classify new texts.

The dataset from CCIR 2020 Footnote 8 was utilized to fine-tune the BERT model for polarity analysis (Cai et al. 2023). CCIR 2020 was compiled based on 230 thematic keywords related to COVID-19, involving the scraping of one million Weibo posts between January 1, 2020, and February 20, 2020. Among these, 100,000 posts were manually annotated and categorized into three classes: −1 (negative), 0 (nature), and 1 (positive). After fine-tuning with the CCIR 2020 dataset, the BERT model achieved an accuracy of 0.7613 and an F1 score of 0.7301 in polarity analysis.
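A minimal sketch of this fine-tuning setup with the Hugging Face Transformers library is given below. It assumes the hfl/chinese-roberta-wwm-ext-large checkpoint, a three-class label set, and a re-mapping of the original {−1, 0, 1} labels to {0, 1, 2}; the batching and hyperparameters are placeholders rather than the authors' exact configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "hfl/chinese-roberta-wwm-ext-large"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)

# Freeze the encoder so that only the final classification layer is updated,
# mirroring the fine-tuning flow described above.
for param in model.base_model.parameters():
    param.requires_grad = False

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=2e-5
)

def training_step(texts, labels):
    """One gradient step on a batch of texts with integer labels in {0, 1, 2}
    (e.g., negative / nature / positive after re-mapping)."""
    batch = tokenizer(texts, padding=True, truncation=True, max_length=128,
                      return_tensors="pt")
    outputs = model(**batch, labels=torch.tensor(labels))
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()
```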

Emotion recognition

Another prominent sentiment analysis task is emotion recognition, the process of extracting finer-grained emotions such as neutral, happy, angry, sad, fear, and surprise from human language. The dataset from SMP 2020 Footnote 9 was utilized to fine-tune another BERT model for emotion recognition. SMP 2020 comprises two parts: the first is a general Weibo dataset containing randomly collected Weibo blogs on a broad range of topics, and the second is a COVID-19 Weibo dataset consisting of Weibo blogs related to the COVID-19 pandemic. Each blog was labeled with one of six categories: neutral, happy, angry, sad, fear, or surprise. The general Weibo training dataset consists of 27,768 blogs, with a validation set of 2,000 blogs and a testing set of 5,000 blogs. The COVID-19 Weibo training dataset includes 8,606 blogs, with a validation set of 2,000 blogs and a testing set of 3,000 blogs. Following fine-tuning with the SMP 2020 dataset, the BERT model achieved an accuracy of 0.7974 and an F1 score of 0.7464 in emotion recognition.

The correspondence between sentiment and emotion is elucidated in Table A1 (see Supplementary Appendix A). Within negative sentiment, the emotions anger, sadness, and fear predominate; within positive sentiment, ‘happy’ overwhelmingly prevails. Within nature sentiment, however, alongside the prevalence of ‘neutral’ emotions, there is a notable presence of both negative emotions (anger, sadness, and fear) and the positive emotion ‘happy’. Hence, the objective of our emotion recognition is to synthesize the finer-grained emotions—neutral, happy, angry, sad, fear, and surprise—into nature, negative, and positive sentiment, thereby assessing the robustness of Hypothesis 2. Our synthesis principle is as follows: ‘neutral’ and ‘surprise’ are merged into the ‘nature’ category; ‘angry’, ‘sad’, and ‘fear’ are combined into the ‘negative’ category; and ‘happy’ is categorized as ‘positive’ (Lu et al. 2024; Zhang et al. 2021).
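The synthesis principle just described can be written as a simple lookup (a sketch; the label strings are illustrative):

```python
# Collapse fine-grained emotion labels into the three coarse sentiment
# categories, following the synthesis principle described in the text.
EMOTION_TO_SENTIMENT = {
    "neutral": "nature",
    "surprise": "nature",
    "angry": "negative",
    "sad": "negative",
    "fear": "negative",
    "happy": "positive",
}

def synthesize(emotion_label: str) -> str:
    """Map an emotion-recognition label to nature/negative/positive."""
    return EMOTION_TO_SENTIMENT[emotion_label]
```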

Empirical model

We begin by examining our first research question: which social media users are more likely to reflect the ground truth during a pandemic? We utilize user type information to test Hypothesis 1. We then examine our second research question: which sentiments are more likely to reflect the ground truth during a pandemic? We utilize the sentiments derived from polarity analysis to evaluate Hypothesis 2. Each of these is elaborated below.

User type empirical model

We categorize users into four groups (public, opinion leader, government, and media) based on the User Type fields provided by the Weibo platform, as illustrated in Table 1. Among these actors, the majority of posts were from the public (263,698; 54.24%), followed by opinion leaders (118,494; 24.37%), media (57,825; 11.89%), and government (46,149; 9.49%).

Given our research question, it is essential to ensure that the dependent variable accurately reflects the situation during the pandemic. Previous research has demonstrated that the number of new cases during disease outbreaks is a pivotal variable (Alessa and Faezipour, 2019). Consequently, we regard this as our dependent variable, denoted \({{Case}}_{tp}\), the number of new cases reported on day \(t\) in province \(p\) during the outbreak. In addition, the prevalence of disease, as indicated by a rising incidence of cases, exerts a considerable influence on the level of social media activity (Khatua et al. 2019). We therefore quantify Weibo activity by the number of blogs posted by users, including the public, opinion leaders, government, and media. In accordance with a previous study (Gour et al. 2022), one-day lagged values are employed. Our model is as follows:
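The exact rendering of Eq. (1) is not reproduced in this text; a plausible reconstruction from the variable definitions below, assuming a linear specification with province fixed effects (where \({{Controls}}_{tp}\) collects the control variables listed below), is:

\[
{{Case}}_{tp}={\beta }_{0}+{\beta }_{1}\log ({{Public}}_{(t-1)p})+{\beta }_{2}\log ({{Opinion\; leader}}_{(t-1)p})+{\beta }_{3}\log ({{Government}}_{(t-1)p})+{\beta }_{4}\log ({{Media}}_{(t-1)p})+{\gamma }^{\top }{{Controls}}_{tp}+{{FE}}_{p}+{\varepsilon }_{tp}\qquad (1)
\]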

Here, \({{Case}}_{tp}\) represents the number of new cases on day \(t\) in province \(p\), \(\log ({{Public}}_{\left(t-1\right)p})\) denotes the logarithm of the number of blogs from the public on day \(t-1\) in province \(p\), \(\log ({{Opinion\; leader}}_{\left(t-1\right)p})\) the logarithm of the number of blogs from opinion leaders, \(\log ({{Government}}_{\left(t-1\right)p})\) the logarithm of the number of blogs from government, and \(\log ({{Media}}_{\left(t-1\right)p})\) the logarithm of the number of blogs from media, each on day \(t-1\) in province \(p\). Additionally, the control variables utilized to account for province-level heterogeneity are \(\log ({{Death}}_{tp})\), \(\log ({{Cure}}_{tp})\), \({{DEWP}}_{tp}\), \({{TEMP}}_{tp}\), \({{GRI}}_{tp}\), \({{ESI}}_{tp}\), \({{HV}}_{mp}\), \({{HWPT}}_{mp}\), \({{PCDI}}_{yp}\), and \({{RUU}}_{yp}\). Finally, \({{FE}}_{p}\) represents the fixed effect of the provinces in Eq. (1) and Eq. (2).

A least squares dummy variable (LSDV) model with cluster-robust standard errors was employed to control for province-level heterogeneity, given that the data for this research question are at the province level. Furthermore, because our dependent variable \({{Case}}_{tp}\) is non-negative, we also considered two count models, again with cluster-robust standard errors and fixed effects: Poisson regression (Poisson) and negative binomial regression (NB).
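A minimal sketch of the three estimators with province-clustered standard errors, using statsmodels formulas (the DataFrame `df` and its column names are assumptions for illustration, not the authors' code):

```python
import statsmodels.formula.api as smf

# df: an assumed province-day panel DataFrame containing the columns below
formula = ("Case ~ log_public_lag1 + log_opinion_leader_lag1 + "
           "log_government_lag1 + log_media_lag1 + log_death + log_cure + "
           "DEWP + TEMP + GRI + ESI + HV + HWPT + PCDI + RUU + C(province)")
cluster = {"groups": df["province"]}

# LSDV: OLS with province dummies and cluster-robust standard errors
lsdv = smf.ols(formula, data=df).fit(cov_type="cluster", cov_kwds=cluster)

# Count models with the same dummies and clustering
poisson = smf.poisson(formula, data=df).fit(cov_type="cluster", cov_kwds=cluster)
negbin = smf.negativebinomial(formula, data=df).fit(cov_type="cluster",
                                                    cov_kwds=cluster)

print(lsdv.params["log_public_lag1"], poisson.params["log_public_lag1"])
```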

Sentiment type empirical model

In the context of an epidemic outbreak, an increase in the number of new cases tends to increase the volume of negative sentiment expressed. Nevertheless, positive sentiment can also spread when a news feed includes information about government interventions or relief measures, or when it features support from humanitarian organizations (Alamoodi et al. 2021). Hence, based on the results of the user analysis, the independent variables considered are the numbers of nature, negative, and positive blogs from public users. As in Eq. (1), we employ one-day lagged values. Our model is as follows:
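Eq. (2) itself is not reproduced in this text; a plausible reconstruction from the variable definitions below, assuming the same linear specification with province fixed effects as Eq. (1) (with \({{Controls}}_{tp}\) again collecting the control variables), is:

\[
{{Case}}_{tp}={\beta }_{0}+{\beta }_{1}\log ({{Nature}}_{(t-1)p})+{\beta }_{2}\log ({{Negative}}_{(t-1)p})+{\beta }_{3}\log ({{Positive}}_{(t-1)p})+{\gamma }^{\top }{{Controls}}_{tp}+{{FE}}_{p}+{\varepsilon }_{tp}\qquad (2)
\]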

Here, \({{Case}}_{tp}\) is the same variable as defined in Eq. (1). In addition, \(\log ({{Nature}}_{\left(t-1\right)p})\) represents the logarithm of the number of nature blogs from public users on day \(t-1\) in province \(p\), \(\log ({{Negative}}_{\left(t-1\right)p})\) denotes the logarithm of the number of negative blogs from public users on day \(t-1\) in province \(p\), and \(\log ({{Positive}}_{\left(t-1\right)p})\) signifies the logarithm of the number of positive blogs from public users on day \(t-1\) in province \(p\).

Prediction model

The prediction process of our prediction model, as outlined in Table 4, was employed to further validate Hypotheses 1 and 2. The first and second lines of the pseudocode in Table 4 define the predictor and dependent variables, detailed in the “Predictor and dependent” section below. The third to fifth lines represent the feature selection strategy employed to address potential overfitting and underfitting, discussed in the “Feature selection” section. The sixth line represents the machine learning model, detailed in the “Statistical and machine learning model” section. The seventh to thirteenth lines outline the training strategy employed to ensure the robustness and generalizability of the model, detailed in the “Training strategy” section. The fourteenth line outlines the evaluation method, detailed in the “Evaluation metrics” section.

Predictor and dependent

Previous research has demonstrated that the number of new cases is a pivotal variable during disease outbreaks (Alessa and Faezipour, 2019). Consequently, we regard this as our dependent variable, denoted \({{Case}}_{t}\), the number of newly reported cases on day \(t\) during the outbreak. Additionally, \({{Case}}_{t}\) is also included as both a predictor and an external variable. This inclusion serves two purposes: first, the model constructed using \({{Case}}_{t}\) provides a baseline for validating the efficacy of other predictor variables, and second, it can be leveraged externally to enhance the model’s performance. To further validate Hypothesis 1 and Hypothesis 2, we also consider the Weibo data’s user type, sentiment, and the combination of user type and sentiment as predictor variables, as illustrated in Table A2 (see Supplementary Appendix A).

Furthermore, to compare the disparities between Weibo data and the representative social media search index — the Baidu Search Index — we consider two keywords, ‘Coronavirus’ ( \({{Coronavirus}}_{t}\) ) and ‘Cough’ ( \({{Cough}}_{t}\) ), which exhibited the best performance in the “Baidu Search Index” section. The control variables in Table 3 are aggregated from the province to the country level and utilized as external variables to further enhance the model’s performance in the “Forecast outcome” section. For detailed information regarding the variables, please refer to Table A3 (see Supplementary Appendix A). The \({{Predictor\; Variable}}_{t}\) in the first line of our pseudocode in Table 4 is a specific predictor from Table A2 or Table A3 (see Supplementary Appendix A). The \({{Dependent\; Variable}}_{t}\) in the second line of our pseudocode denotes a particular dependent variable, which in our study is \({{Case}}_{t}\).

Feature selection

Overfitting occurs when a machine learning model becomes excessively sensitive to the training data, inadvertently capturing noise and random variations. Conversely, underfitting occurs when a model is insufficiently complex to capture the inherent patterns within the training data. The third to fifth lines of the pseudocode in Table 4 address the risk of overfitting and underfitting. The rationale is that the parameter \(d\) is employed to control the model’s complexity: the smaller the value of \(d\), the more complex the model and the more prone it is to overfitting; conversely, the larger the value of \(d\), the simpler the model and the more prone it is to underfitting. A moderate value of \(d\) helps avoid both. Accordingly, the parameter \(d\) is constrained to a specific range: establishing a lower bound ( \({MIN\_d}\) ) and an upper bound ( \({MAX\_d}\) ) ensures that \(d\) takes a moderate value, preventing overfitting or underfitting of the model. The parameters \({MIN\_d}\) and \({MAX\_d}\) are empirically determined in the section entitled “Determining the \({MIN\_d}\) and \({MAX\_d}\)”.
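A sketch of the lag-based feature construction implied by this strategy (the names and bounds follow the text; pairing the predictor at day t with the dependent at day t + d is an interpretation of the pseudocode rather than a verbatim reproduction):

```python
import pandas as pd

MIN_D, MAX_D = 7, 15  # bounds on the lag d, as determined later in the text

def make_lagged_pairs(predictor: pd.Series, dependent: pd.Series, d: int):
    """Align the predictor at day t with the dependent at day t + d,
    so that the fitted model forecasts d days ahead."""
    x = predictor.iloc[:-d].to_numpy().reshape(-1, 1)
    y = dependent.iloc[d:].to_numpy()
    return x, y

# candidate lags scanned during feature selection
candidate_lags = range(MIN_D, MAX_D + 1)
```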

Statistical and machine learning model

We utilized one statistical model, Linear Regression (LR), and three machine learning models, K-Neighbors Regressor (KNR), Random Forest Regressor (RFR), and Extra Trees Regressor (ETR), to construct the prediction model. The rationale behind employing diverse models is to guarantee the reliability of our findings; that is, our conclusions should remain consistent across models.
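For concreteness, the four model families can be instantiated with scikit-learn as below (the hyperparameters shown are illustrative defaults, not the authors' settings):

```python
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor, ExtraTreesRegressor

# The four model families compared in this study.
MODELS = {
    "LR": LinearRegression(),
    "KNR": KNeighborsRegressor(n_neighbors=5),
    "RFR": RandomForestRegressor(n_estimators=200, random_state=0),
    "ETR": ExtraTreesRegressor(n_estimators=200, random_state=0),
}
```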

Training strategy

In order to ensure that our model is capable of adapting to new information, as well as demonstrating robustness and generalizability, we employed the expanding window time series cross-validation technique (Tashman 2000 ). This approach mirrors the real-world scenario where healthcare professionals gather all available historical confirmed case data daily to train a prediction model. For instance, consider a sequence of 17 observations arranged in temporal order. The earlier observations represent data that is more distant in time, while the later observations represent data that is more recent (or even future) in nature. To illustrate, consider a scenario where observations 1–16 are available. In this case, a model is constructed based on observations 1–15 (i.e., historical data) and evaluated using observation 16 (i.e., current or future data). On the following day, a different model is constructed using observations 1–16 (with the addition of new observation 16) and tested on observation 17. In this illustrative example, there is an expanding window of 15 observations, which is used to train the model. As new information emerges, we expand the time window to the right by one observation. This training strategy ensures that the selected model demonstrates optimal performance across the entire time series, rather than merely at a specific point in time. This approach, in conjunction with our feature selection strategy, serves to further mitigate the potential for overfitting and underfitting, thereby enhancing the model’s robustness and generalizability. The parameter \({EXPANDING\_WINDOW}\) will be empirically determined in the section entitled “Determining the \({EXPANDING\_WINDOW}\) ”.
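A minimal sketch of this expanding-window procedure (a simplified stand-in for the pseudocode in Table 4; the Extra Trees model and its settings are illustrative, and `x`, `y` are assumed to come from the lagged feature construction above):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def expanding_window_mape(x: np.ndarray, y: np.ndarray, window: int) -> float:
    """Train on an expanding window of past observations, predict the next
    single observation, then grow the window by one; return the overall MAPE."""
    errors = []
    for end in range(window, len(y)):
        model = ExtraTreesRegressor(n_estimators=200, random_state=0)
        model.fit(x[:end], y[:end])              # all history up to `end`
        pred = model.predict(x[end:end + 1])[0]  # the next observation
        errors.append(abs((y[end] - pred) / y[end]) * 100)
    return float(np.mean(errors))

# hypothetical usage with EXPANDING_WINDOW = 56, as determined later:
# mape = expanding_window_mape(x, y, window=56)
```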

Evaluation metrics

We evaluated the performance of the prediction models using the mean absolute percentage error (MAPE). Specifically, MAPE = \(\frac{100}{n}\mathop{\sum }\nolimits_{t=1}^{n}\left|\frac{{A}_{t}-{F}_{t}}{{A}_{t}}\right|\), where \({A}_{t}\) denotes the actual value and \({F}_{t}\) the predicted value at time \(t\).
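Expressed in code (a direct transcription of the formula above):

```python
import numpy as np

def mape(actual, forecast) -> float:
    """Mean absolute percentage error, as defined in the text."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)
```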

Content analysis

In this section, we attempt to explain the findings of the empirical model and the prediction model and to validate Hypothesis 3 from the perspective of text. Prior studies have revealed the immense potential of the content individuals share on social media. For example, social media platforms can furnish substantial and beneficial data to anticipate and elucidate the characteristics and status of disease outbreaks (Ginsberg et al. 2009). Furthermore, text mining of social media data has been employed to monitor the occurrence of diseases and to assess public awareness concerning health issues, enabling the forecasting of disease outbreaks (Boon-Itt and Skunkan, 2020). However, raw social media data are voluminous and unstructured, and an undifferentiated array of blogs is of little direct use. Therefore, we utilize advanced text analysis techniques to analyze this content and address our RQ3.

Topic model

Early methods for generating topics were qualitative, such as survey questionnaires and personal interviews. However, these methodologies are not feasible when the quantity of data is considerable, as is the case with social media data collected over an extended period. Consequently, an efficacious mathematical methodology, topic modeling, is extensively employed to extract meaningful topics from voluminous textual data, particularly social media data (Aggarwal and Gour, 2020; Chakraborty et al. 2020). The most classic topic modeling approach, Latent Dirichlet Allocation (LDA) (Blei et al. 2003), was employed to generate topics. The details of topic generation are presented below.

Firstly, we create a ‘Term Document Matrix’ from public users’ blogs as the input to the LDA topic model. The Gensim package was then applied to generate topics. To select the optimal topic model, two complementary approaches were employed: first, topic coherence was estimated; second, the model was manually inspected to ensure its interpretability. Topic coherence, defined as the average or median of the pairwise word similarities formed by the top words of a given topic (Rosner et al. 2014), has been proposed as an intrinsic evaluation method for topic models (Newman et al. 2010). However, evidence from previous studies indicates that automated detection alone is insufficient for optimal decision-making regarding cutoff values, so expert manual inspection and evaluation are necessary in all cases (Bickel, 2019). Therefore, we also inspected the topics generated by the model to ascertain their interpretability.
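A minimal sketch of this step with Gensim (the tokenized input and the range of candidate topic numbers are placeholders; the `c_v` coherence type is one common choice alongside UMass):

```python
from gensim import corpora
from gensim.models import CoherenceModel, LdaModel

def fit_lda(tokenized_blogs, num_topics):
    """Fit an LDA topic model on tokenized blogs and return it with its coherence."""
    dictionary = corpora.Dictionary(tokenized_blogs)
    corpus = [dictionary.doc2bow(doc) for doc in tokenized_blogs]
    lda = LdaModel(corpus=corpus, id2word=dictionary,
                   num_topics=num_topics, random_state=0, passes=10)
    coherence = CoherenceModel(model=lda, texts=tokenized_blogs,
                               dictionary=dictionary,
                               coherence="c_v").get_coherence()
    return lda, coherence

# scan candidate topic numbers, then inspect topics manually as described above
# scores = {k: fit_lda(docs, k)[1] for k in range(2, 11)}
```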

Semantic analysis

In this section, we focus on the semantic patterns extracted from public users’ sentiment, which help answer the question: why do negative sentiment blogs from public users reflect the ground truth during the pandemic? Saliency methods (Li et al. 2015) are a widely used technique for interpreting NLP models. In contrast to sentiment analysis, which infers the sentiment of a text, a saliency method identifies the words within a text that indicate its given sentiment. We utilize the integrated gradients (IG) method (Sundararajan et al. 2017), a widely used saliency method, to conduct our semantic analysis.

The entire process of our semantic analysis is illustrated in Fig. 4 and can be roughly divided into two parts. First, blogs from public users are fed into the BERT model fine-tuned on the polarity analysis task. Subsequently, the saliency scores for each token are derived from the gradients relative to the input embedding, and the distribution over all tokens is obtained through the softmax function. Finally, the salient words are extracted by detokenizing the tokens into their original words and summing the scores of all composing tokens. To formalize the gradient computation, we represent the embedding matrix of an input sentence as \({x}_{1:n}\), where \({x}_{i}\) denotes the embedding vector of the \(i\) th token. Then, the IG is
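The IG expression itself is not reproduced in this text; the standard Riemann-sum form of Integrated Gradients (Sundararajan et al. 2017), consistent with the baseline \({b}_{1:n}\) and the \(m\) interpolation steps described below, is:

\[
{{IG}}_{i}(x)=\left({x}_{i}-{b}_{i}\right)\cdot \frac{1}{m}\sum _{k=1}^{m}\frac{\partial {f}_{c}\left({b}_{1:n}+\frac{k}{m}\left({x}_{1:n}-{b}_{1:n}\right)\right)}{\partial {x}_{i}}
\]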

Here, \({f}_{c}(\cdot )\) represents the output of the BERT model for class \(c\), while \({b}_{1:n}\) are the baselines. By linearly interpolating the inputs of \({f}_{\!c}(\cdot )\) from \({b}_{1:n}\) to \({x}_{1:n}\), the IG method mitigates the risk of gradients approaching zero. Finally, the dot product between the average of the \(m\) gradients and the input embedding \({x}_{i}\) minus its baseline is taken to obtain the final saliency value.

Figure 4: The texts along with their corresponding sentiment labels are fed into the fine-tuned BERT model; the saliency method, specifically Integrated Gradients (IG), is then utilized to extract sentiment-related words.

Empirical result

This section shows the empirical validation results for Hypothesis 1 and Hypothesis 2.

User type empirical result

The results of Eq. (1) are presented in Table 5. \({\beta }_{1}\), \({\beta }_{2}\), \({\beta }_{3}\) and \({\beta }_{4}\) represent the relationships of \(\log ({{Public}}_{\left(t-1\right)p})\), \(\log ({{Opinion\; leader}}_{\left(t-1\right)p})\), \(\log ({{Government}}_{\left(t-1\right)p})\) and \(\log ({{Media}}_{\left(t-1\right)p})\) with \({{Case}}_{tp}\). We find that only the coefficient of \(\log ({{Public}}_{\left(t-1\right)p})\) is consistently positive and significant ( p < 0.01), indicating that past blogs from public users robustly reflect ground truth cases. Although the coefficient of \(\log ({{Opinion\; leader}}_{\left(t-1\right)p})\) is positive and significant ( p < 0.05) in OLS, it is negative or non-significant in the count models (Poisson and NB), suggesting that past blogs from opinion leaders are not a robust indicator of ground truth cases. Finally, the coefficients of \(\log ({{Government}}_{\left(t-1\right)p})\) and \(\log ({{Media}}_{\left(t-1\right)p})\) are negative or non-significant, indicating that past blogs from government or media do not reliably reflect ground truth cases.

Then, the cross-correlation method is utilized to analyze blogs from the public, opinion leaders, government, and media. The local maximum cross-correlation coefficient for blogs from the public is 0.6879, with a lag order of 12. This coefficient is higher than that of unfiltered social media blogs and the Baidu Search Index, suggesting that unfiltered social media blog content may contain noise, potentially diluting its predictive signal. The local maximum cross-correlation coefficient for blogs from opinion leaders is 0.3728, with a lag order of 6. For blogs from government, the coefficient is 0.3935, with a lag order of 0, and for blogs from media, it is 0.2883, also with a lag order of 0. Hence, based on the above analysis, past blogs from public users are more effective in reflecting fluctuations in the epidemic situation than those from opinion leaders, government, and media. This supports our Hypothesis 1.

Sentiment type empirical result

The results of Eq. (2) are presented in Table 6. \({\beta }_{1}\), \({\beta }_{2}\), and \({\beta }_{3}\) represent the relationships of \(\log ({{Nature}}_{\left(t-1\right)p})\), \(\log ({{Negative}}_{\left(t-1\right)p})\) and \(\log ({{Positive}}_{\left(t-1\right)p})\) blogs with \({{Case}}_{tp}\). We find that only the coefficient of \(\log ({{Negative}}_{\left(t-1\right)p})\) is consistently positive and significant ( p < 0.01), indicating that past negative blogs from public users robustly reflect ground truth cases. Although the coefficients of \(\log ({{Nature}}_{\left(t-1\right)p})\) and \(\log ({{Positive}}_{\left(t-1\right)p})\) are positive and significant in Poisson or NB, they are negative and non-significant in OLS. This suggests that past nature or positive blogs from public users are not robust indicators of ground truth cases.

Then, the cross-correlation method is used to analyze nature, negative, and positive blogs from public users. The local maximum cross-correlation coefficient for negative blogs from public users is 0.7388, with a lag order of 15. This coefficient is higher than that of all blogs from public users, further supporting the notion that unfiltered social media blog content contains noise, potentially diluting its predictive signal. The local maximum cross-correlation coefficient for nature blogs from public users is 0.6450, with a lag order of 12. For positive blogs from public users, the coefficient is 0.4770, with a lag order of 7. Hence, compared with nature or positive sentiment from public users, negative sentiment from public users more effectively reflects changes in the epidemic situation. This supports our Hypothesis 2.

Robustness checks

We conduct a series of robustness checks to validate the empirical findings of our models.

The synthetic sentiment derived from the emotion recognition results was employed to check the robustness of Hypothesis 2. The robustness check results for Eq. (2) are presented in Table A4 (see Supplementary Appendix A). Our main conclusions remained unchanged. The coefficient of \(\log ({{Negative}}_{\left(t-1\right)p})\) is consistently positive and significant ( p < 0.01), while the coefficients of \(\log ({{Nature}}_{\left(t-1\right)p})\) and \(\log ({{Positive}}_{\left(t-1\right)p})\) are non-significant in OLS and Poisson. The local maximum cross-correlation coefficient for synthetic negative blogs from the public is 0.7437, with a lag of 15. This coefficient is higher than that of all blogs from public users, further supporting the notion that unfiltered social media blog content contains noise, potentially diluting its predictive signal. The local maximum cross-correlation coefficient for synthetic nature blogs from public users is 0.5642, with a lag order of 12. For synthetic positive blogs from public users, the coefficient is 0.5361, also with a lag order of 12. Hence, the outcomes are consistent with the sentiment type empirical results above.

Additional checks

It should be noted that one-day lagged values of the logarithm of the number of blogs from the public, opinion leaders, government, and media are employed in Eq. (1). In this section, we explore the robustness of our results by utilizing different lagged values. Three alternatives are considered. First, we employ additional lagged values of blogs from opinion leaders, government, and media. Second, we employ additional lagged values of blogs from the public. Third, we employ additional lagged values for all users. The results of these alternatives are compiled in Table A5 (see Supplementary Appendix A). Our key insights remain consistent across all three alternatives.

For Eq. (2), three alternative results are shown in Table A6 (see Supplementary Appendix A). First, we employ an additional day lag for nature and positive sentiment blogs from public users. Second, we employ an additional day lag for negative sentiment blogs from public users. Third, we employ additional day lags for all sentiments. Despite these alterations, our primary insights remain unchanged, indicating robustness in our findings.

Similarly, for synthetic sentiment, three alternatives were employed, and the results are presented in Table A7 (see Supplementary Appendix A). First, we employ an additional day lag for synthetic nature and synthetic positive sentiment blogs from public users. Second, we employ an additional day lag for synthetic negative sentiment blogs from public users. Third, we employ an additional day lag for all synthetic sentiments. Our fundamental insights remain unchanged, indicating robustness in our findings.

Three waves

The COVID-19 situation in China witnessed three major outbreaks: the first large-scale outbreak in the first half of 2020, the second substantial outbreak from March to November 2022 due to widespread transmission of the Omicron variant, and the third significant outbreak from November 2022 to January 2023. During the first large-scale outbreak phase, China reported a record daily increase of 15,152 newly confirmed cases on February 12, 2020, reaching the peak number of confirmed cases. As the second major outbreak phase commenced in March 2022, with the widespread transmission of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Omicron variant, the number of asymptomatic carriers of COVID-19 reported across various regions in China gradually surpassed the number of confirmed cases. On March 12, a report indicated 1807 new confirmed cases and 1315 asymptomatic carriers, both marking the highest numbers since the onset of the pandemic. In the third major outbreak phase, as several provinces successively announced the relaxation of epidemic control measures, on November 27, the number of new COVID-19 infections in mainland China reached its highest level since the beginning of the pandemic, surpassing the previous single-day record set in mid-April 2022. Hence, we utilize the critical time points of these three stages—February 12, 2020, March 12, 2022, and November 27, 2022—as the basis for dividing the periods. We will consider the 28 days (4 weeks) preceding each time point as a cycle and conduct robustness checks on our previous findings within these cycles. The specific division is depicted in Fig. 5 .

Figure 5: The first period (green rectangle) spans from January 15, 2020, to February 12, 2020. The second period (blue rectangle) spans from February 12, 2022, to March 12, 2022. The third period (yellow rectangle) spans from October 30, 2022, to November 27, 2022.

We present the results of Eq. (1) for the three waves in Table A8 (see Supplementary Appendix A). The results are similar to the findings for the entire outbreak. The coefficient of \(\log ({{Public}}_{(t-1)p})\) is consistently positive and significant, while the coefficients of \(\log ({{Opinion\; leader}}_{(t-1)p})\), \(\log ({{Government}}_{(t-1)p})\) and \(\log ({{Media}}_{(t-1)p})\) are either non-significant or negative. Therefore, in comparison to opinion leaders, government, and media, the public proves more effective in depicting fluctuations in the epidemic situation. Our primary insights remain unchanged, indicating their robustness.

The results for sentiment and synthetic sentiment over the three waves are presented in Table A9 and Table A10 (see Supplementary Appendix A). The coefficient of \(\log ({{Negative}}_{(t-1)p})\) is consistently positive and significant, while the coefficients of \(\log ({{Nature}}_{(t-1)p})\) and \(\log ({{Positive}}_{(t-1)p})\) are not consistently significant. These outcomes align with our observations from the entire outbreak. Therefore, we deduce that, in contrast to public users’ blogs expressing positive or nature sentiments, their blogs conveying negative sentiments better capture changes in the epidemic situation. Thus, our results maintain robustness.

Prediction result

This section shows empirically determined parameters such as \({MIN\_d}\) , \({MAX\_d}\) , and \({EXPANDING\_WINDOW}\) , and the predictive results that can further validate Hypothesis 1 and Hypothesis 2.

Determining the \({MIN\_d}\) and \({MAX\_d}\)

For a prediction model to be truly actionable, accurate predictions must be made at an early stage. If the forecasting model can only predict one or two days ahead, it may not be useful for governments and health organizations, as this does not allow timely decision-making and planning. Forecasting the dependent variable early therefore holds particular significance in the context of epidemics. Thus, the larger the lag \(d\) between \({{Predictor\; Variable}}_{t}\) and \({{Dependent\; Variable}}_{t+d}\), the earlier (or, rather, the further into the future) we can forecast. As we aim to forecast future trends in the epidemic situation with a lead time of at least 7 days (1 week), allowing ample time for governmental and health institutions to make decisions, we set \({MIN\_d}\) to 7. Furthermore, since the largest lag at which a local maximum cross-correlation occurs is 15 (achieved between \({{Case}}_{t}\) and daily negative blogs from public users, as detailed in the ‘Sentiment type empirical result’ section), we set \({MAX\_d}\) to 15 days. Setting \({MAX\_d}\) to 15 days also makes practical sense; for example, for COVID-19, the 95th percentile of the incubation period is estimated to be approximately 10–14 days (Linton et al. 2020). Finally, the optimum lag (from \({MIN\_d}\) to \({MAX\_d}\) ) between \({{Predictor\; Variable}}_{t}\) and \({{Dependent\; Variable}}_{t+d}\) is determined empirically.

Determining the \({EXPANDING}{\rm{\_}}{WINDOW}\)

As our objective is to construct a prediction model capable of early warning for epidemic outbreaks, our training data needs to encompass data preceding the outbreaks as much as possible. Similar to the temporal setup in the “Three Waves” section, we initially use data from 28 days preceding February 12, 2020, and 28 days preceding March 12, 2022. Hence, \({EXPANDING\_WINDOW}\) is set to 56. Differing from the “Three Waves” section, data from October 31, 2022, to January 8, 2023, encompassing the entirety of the third wave of the epidemic, is utilized for prediction.

Forecast outcome

Table 7 presents the outcomes of the Extra Trees Regressor utilizing the various Weibo predictors shown in Table A2 (see Supplementary Appendix A), aimed at further validating Hypothesis 1 and Hypothesis 2. The values in parentheses in Table 7 represent the optimum lag. We observe that the public consistently attains the lowest MAPE values across user types. This further substantiates Hypothesis 1, indicating that, compared to opinion leaders, government, and media, the public exhibits greater predictiveness. Additionally, in the ‘public’ row of Table 7, negative sentiment consistently achieves the lowest MAPE values. This outcome aligns with Hypothesis 2, indicating that, compared to nature and positive sentiments, negative sentiment from the public exhibits greater predictiveness. Furthermore, the outcomes of Linear Regression, K-Neighbors Regressor, and Random Forest Regressor are presented in Table A11, Table A12, and Table A13 (see Supplementary Appendix A), respectively. These results corroborate the analytical findings discussed above.

Table 8 presents the prediction results from various data sources, including the baseline \({{Case}}_{t}\), the Baidu Search Index results \({{Coronavirus}}_{t}\) and \({{Cough}}_{t}\), and the best-performing Weibo predictor ( \({{Negative}}_{t}\), representing daily negative blogs from the public). The lowest MAPE (24.80257) is attained by \({{Negative}}_{t}\) under ETR. For Rows (1) to (3), the MAPE increase relative to \({{Negative}}_{t}\) is computed as \(\frac{1}{n}\mathop{\sum }\nolimits_{t=1}^{n}100\left(\left|\frac{{A}_{t}-{F}_{{tj}}}{{A}_{t}}\right|-\left|\frac{{A}_{t}-{F}_{t0}}{{A}_{t}}\right|\right)\), where \({F}_{t0}\) represents the forecast value from \({{Negative}}_{t}\) and \({F}_{{tj}}\) that from the variable in Row ( \(j\) ). A positive MAPE increase in Row ( \(j\) ) indicates that using \({{Negative}}_{t}\) yields a lower MAPE than using the variable in Row ( \(j\) ). The standard deviation of the MAPE increase (shown in parentheses in the table) is the standard deviation of \(100\left(\left|\frac{{A}_{t}-{F}_{{tj}}}{{A}_{t}}\right|-\left|\frac{{A}_{t}-{F}_{t0}}{{A}_{t}}\right|\right)\) for \(t\in \{1,2,\,...\,,n\}\) divided by \(\sqrt{n}\). A one-sided t-test was then employed to determine the significance level of the MAPE increase. From Table 8, it is evident that, across models, the MAPE values for \({{Negative}}_{t}\) are significantly lower than those for the baseline \({{Case}}_{t}\) and the Baidu Search Index keywords \({{Coronavirus}}_{t}\) and \({{Cough}}_{t}\). This signifies the predictive value of Weibo data and indicates that preprocessed, filtered Weibo data can further enhance predictive performance.
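A sketch of the MAPE-increase comparison and its one-sided t-test (assuming SciPy ≥ 1.6 for the `alternative` keyword; the array names are placeholders):

```python
import numpy as np
from scipy import stats

def mape_increase(actual, forecast_j, forecast_negative):
    """Per-day MAPE increase of predictor j relative to the Negative_t model,
    its standard error, and a one-sided t-test p-value (H1: increase > 0)."""
    actual = np.asarray(actual, float)
    forecast_j = np.asarray(forecast_j, float)
    forecast_negative = np.asarray(forecast_negative, float)
    diff = 100 * (np.abs((actual - forecast_j) / actual)
                  - np.abs((actual - forecast_negative) / actual))
    se = diff.std(ddof=1) / np.sqrt(len(diff))
    _, p_value = stats.ttest_1samp(diff, 0.0, alternative="greater")
    return diff.mean(), se, p_value
```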

Based on the aforementioned results, we employed a stepwise forward method to sequentially introduce external variables from Table A3 into the ETR model, building on the \({{Negative}}_{t}\) variable. In each iteration, we selected the variable that resulted in the greatest reduction in MAPE compared to the previous iteration and included it in the model, until either the MAPE ceased to decrease or all variables were included. The best-performing model obtained after iteration achieved a MAPE of 10.735; the variables included in this model are \({{Negative}}_{t},{{HWPT}}_{m},{{PCDI}}_{q},{{GRI}}_{t},{{ESI}}_{t}\). To further validate the efficacy of this result, we used only the previous-day \({{Case}}_{t-1}\) variable as the predictor and found a MAPE of 14.512, which is higher than that of our best-performing model (14.512 vs. 10.735, paired t-test p-value < 0.05). The actual data, the predictions based on the \({{Negative}}_{t}\) variable, and the forecasts from our best-performing model are depicted in Fig. 6.
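The stepwise forward procedure can be sketched as a greedy loop (the `evaluate_mape` callback, which would retrain the ETR model with a given feature list and return its MAPE, is an assumed helper for illustration):

```python
def stepwise_forward(base_features, candidates, evaluate_mape):
    """Greedy forward selection: repeatedly add the external variable that
    lowers MAPE the most, stopping when no candidate improves it."""
    selected = list(base_features)
    best_mape = evaluate_mape(selected)
    remaining = list(candidates)
    while remaining:
        scores = {c: evaluate_mape(selected + [c]) for c in remaining}
        best_candidate = min(scores, key=scores.get)
        if scores[best_candidate] >= best_mape:
            break  # no further reduction in MAPE
        selected.append(best_candidate)
        remaining.remove(best_candidate)
        best_mape = scores[best_candidate]
    return selected, best_mape
```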

Figure 6: The red solid line denotes the actual data. The green dotted line represents the predictions based on daily negative sentiment blogs from the public. The blue dotted line indicates the forecasted outcomes from the best-performing prediction model.

Content analysis result

To further explain Hypothesis 1 and Hypothesis 2 and to answer RQ3, we conduct two content analyses. First, we explore why blogs from public users are able to reflect the ground truth during the pandemic; the topic model results presented below address this question. Then, we investigate the semantics of the blogs to understand why negative sentiment blogs from the public are able to reflect the ground truth during the pandemic; this question is answered by the semantic analysis results provided below.

Topic model result

Figure 7 shows the UMass and CV topic coherence curves (Röder et al. 2015) used to determine the optimal number of topics. The combined results of topic coherence and manual inspection suggest that the number of topics is most suitable when set to 3.

Figure 7: The dashed lines represent the number of topics achieved at the peak of the topic coherence.

Additionally, the Biterm Topic Model (BTM) is a topic modeling algorithm that classifies words and documents from a text corpus into a smaller number of latent topics and is particularly suitable for short texts such as blogs (Murshed et al. 2023). Hence, considering the concise nature of social media data, we employed the BTM for another round of topic modeling, setting the number of topics to 3. The results from LDA and BTM exhibit a striking similarity. A visualization of the topic groupings by Jensen-Shannon divergence can be found in Fig. B1 and Fig. B2 (see Supplementary Appendix B).

Table 9 compiles the topic modeling (LDA and BTM) results into three categories of public discussion related to the outbreak: COVID-19 Infection and Management, Epidemiological Trends and Diagnosis in COVID-19, and Physical Symptoms of Illness.

The topic ‘COVID-19 Infection and Management’ centers on the COVID-19 pandemic, encompassing terms related to the virus, its spread, diagnosis, symptoms, management strategies (such as isolation and mask-wearing), hospitalization, and biological aspects such as nucleic acid testing for detection. The local maximum cross-correlation coefficient between this topic and new confirmed cases is 0.6248, with a lag order of 12.

The topic ‘Epidemiological Trends and Diagnosis in COVID-19’ primarily revolves around discussions of the numbers and cases associated with the progression of the pandemic. Therefore, this topic may not serve as a reliable indicator of the future course of the epidemic. The local maximum cross-correlation coefficient between this topic and new confirmed cases is 0.3405, with a lag order of 7.

The topic ‘Physical Symptoms of Illness’ mainly encompasses discussions of symptoms such as cough, fever, and generalized discomfort (from head to toe, whole-body symptoms). This topic may provide insights into the extent of people’s infection, the current state of the epidemic, and its potential future severity. The local maximum cross-correlation coefficient between this topic and new confirmed cases is 0.7469, with a lag order of 15.

The topic ‘Physical Symptoms of Illness’, which relates to individuals’ self-reporting of illness, achieved the highest local maximum cross-correlation coefficient among the three topics, indicating that blogs related to this topic have the strongest association with new confirmed cases. Therefore, the ability of blogs from the public to reflect the ground truth during the pandemic originates mainly from individuals’ self-reporting of illness: symptom-related blogs shared on social media effectively reflect real infection situations. Furthermore, previous research also indicates that social media blogs related to symptom reporting tend to have greater predictive value.

Semantic analysis result

Through semantic analysis, we have identified keywords that express nature, negative, and positive sentiment from the public. The Spearman rank correlation coefficient between nature sentiment keywords and negative sentiment keywords is −0.08 ( p < 0.01), between nature sentiment keywords and positive sentiment keywords is 0.17 ( p < 0.01), and between negative sentiment keywords and positive sentiment keywords is 0.14 ( p < 0.01). These results demonstrate the effectiveness of our method and the notable separation among sentiment keywords.

The top 20 words with the highest frequency are listed in Table 10 below. Keywords reflecting negative sentiment from the public are predominantly associated with symptoms of the disease. Therefore, the ability of negative sentiment blogs from the public to reflect the ground truth during the pandemic originates mainly from individuals’ self-reporting of illness. One potential explanation for this phenomenon is that blogs expressing negative sentiment effectively capture the sadness, fear, and panic that arise from the increased spread of disease (Gour et al. 2022). Furthermore, blogs from infected members of the public typically present self-reported symptoms following infection, and these reports often convey negative sentiment, whereas people tend to publish more positive blogs when there are fewer crises related to illness and disease. Overall, the results from both the “Topic model” and “Semantic analysis” sections collectively support our Hypothesis 3.

The key to pandemic prevention and control lies in timely monitoring and forecasting. Despite the existence of epidemic surveillance sentinels at the national level and within various healthcare institutions, the lack of timely information updates has significantly impeded the effective use of these systems. The real-time, dynamic nature of data updates on social media platforms provides an opportunity to address this issue. Nevertheless, the vast and unstructured nature of social media information poses a significant challenge for governments and healthcare institutions seeking to leverage social media for pandemic monitoring and forecasting. Consequently, this study aims to identify key information reflecting pandemic trends within the immense volume of data and to extract structured information from it, thereby addressing the inconvenience caused by massive, unstructured social media data and enabling more effective pandemic monitoring and prediction by governments. This study proposes a methodology for harnessing social media data to generate essential information for managing disease outbreaks. The methodology employs empirical models, prediction models, and content analysis. In the empirical model, we first examine the relationship between user activity and the ground truth during the pandemic, followed by the relationship between user sentiment and the ground truth. In the prediction model, we further validate the conclusions drawn from the empirical model and utilize them to forecast the ground truth. In the content analysis, we use topic modeling and semantic analysis to explain why social media data can reflect the ground truth during the pandemic. This study provides insights into leveraging social media data for pandemic detection and prediction, thereby contributing significant insights into which types of information can be used for pandemic monitoring and forecasting.

Conclusion and implication

The present study addresses the following issues: (1) We demonstrate that social media data generated by public users, and in particular negative sentiment blogs generated by public users during the pandemic, can be utilized to gain insight into the ground truth. A review of the existing literature did not identify any studies that have established this relationship. The findings of this study concerning the characteristics of social media, such as user type and sentiment type, could prove beneficial to health organizations in filtering pertinent posts. (2) We demonstrate how social media data can be combined with our findings to construct a prediction model forecasting future trends in epidemic outbreaks. The availability of real-time social media data enables analysts to derive meaningful insights and updates through advanced analytical and machine learning models. (3) We examine the potential of social media data for providing real-time updates in pandemic management. We use topic modeling to organize useful information into topics and semantic analysis to generate sentiment-related keywords from the social media data. These topics and keywords can assist health organizations in further filtering information. Moreover, stakeholders could utilize them to develop customized analytical models for gathering necessary information during a pandemic.

The findings of our research have significant implications for governments and healthcare institutions in leveraging social media data for pandemic monitoring and forecasting. Firstly, the study assists these entities in focusing on key information. When addressing various types of user discussions about the pandemic on social media platforms, governments and healthcare institutions should prioritize social media activity from public users. Discussions from public users about the pandemic exhibit different types of sentiment, and special attention should be given to their negative sentiment posts. Additionally, negative sentiment posts from public users often involve various topics, and particular focus should be placed on discussions related to individuals’ self-reporting of illness. Secondly, the research aids governments and healthcare institutions in real-time pandemic monitoring and prediction, as well as in the optimal allocation of resources. The crucial social media information uncovered by this study enables real-time dynamic updates, facilitating the timely collection of data for real-time pandemic monitoring and forecasting. This approach allows for more prompt surveillance than traditional methods, aiding the swift implementation of preventive measures by governments and healthcare institutions. Moreover, the vital social media information identified can be detailed down to regional granularity. By collaborating with social media platforms, governments can obtain the specific regional source of each piece of information, thereby achieving localized pandemic monitoring and forecasting and enabling more reasonable and targeted resource allocation.

Limitation and future direction

While the findings of this study offer valuable insights into leveraging social media for pandemic detection and prediction, we acknowledge that the study has certain limitations. Firstly, the scope of this study is limited in that it examines a single social media platform. Social media platforms exhibit distinctive characteristics with regard to the data they capture and the interfaces they provide, so further research on other platforms, including Instagram, Twitter, and Facebook, would help validate and expand upon the existing findings. Secondly, our study employs a single type of data; future studies may build upon our research by incorporating additional dimensions of social media, including images, audio, and video, alongside the textual data captured here. Thirdly, our study only carried out the identification of bot accounts; future work may explore the role of bot accounts during pandemics. Finally, the accuracy of our sentiment classification model requires further improvement, and future research will involve sentiment classification models with higher classification accuracy.

Data availability

The datasets generated during and/or analysed during the current study can be accessed here: https://doi.org/10.7910/DVN/JVYJRK .

https://www.weibo.com/ .

http://www.nhc.gov.cn/ .

http://www.stats.gov.cn/ .

https://www.mot.gov.cn/ .

https://www.ncei.noaa.gov/ .

https://github.com/OxCGRT/covid-policy-dataset .

https://huggingface.co/hfl/chinese-roberta-wwm-ext-large .

https://www.datafountain.cn/competitions/423 .

https://smp2020ewect.github.io/ .

Aggarwal S, Gour A (2020) Peeking inside the minds of tourists using a novel web analytics approach. J Hosp Tour Manag 45:580–591. https://doi.org/10.1016/j.jhtm.2020.10.009

Alamoodi AH, Zaidan BB, Zaidan AA, Albahri OS, Mohammed KI, Malik RQ, Almahdi EM, Chyad MA, Tareq Z, Albahri AS (2021) Sentiment analysis and its applications in fighting COVID-19 and infectious diseases: A systematic review. Expert Syst Appl 167:114155. https://doi.org/10.1016/j.eswa.2020.114155

Alessa A, Faezipour M (2019) Preliminary flu outbreak prediction using twitter posts classification and linear regression with historical centers for disease control and prevention reports: Prediction framework study. JMIR Public Health Surveill 5(2):e12383. https://doi.org/10.2196/12383

Ampountolas A, Legg MP (2021) A segmented machine learning modeling approach of social media for predicting occupancy. Int J Contemporary Hosp Manag 33(6):2001–2021. https://doi.org/10.1108/IJCHM-06-2020-0611

Aumond P, Lavandier C, Ribeiro C, Boix EG, Kambona K, D’Hondt E, Delaitre P (2017) A study of the accuracy of mobile technology for measuring urban noise pollution in large scale participatory sensing campaigns. Appl Acoust 117:219–226. https://doi.org/10.1016/j.apacoust.2016.07.011

Bae S, Sung E, Kwon O (2021) Accounting for social media effects to improve the accuracy of infection models: combatting the COVID-19 pandemic and infodemic. Eur J Inf Syst 30(3):342–355. https://doi.org/10.1080/0960085X.2021.1890530

Bickel MW (2019) Reflecting trends in the academic landscape of sustainable energy using probabilistic topic modeling. Energy, Sustainability Soc 9(1):1–23. https://doi.org/10.1186/s13705-019-0226-z

Blei DM, Ng AY, Jordan MI (2003) Latent dirichlet allocation. J Mach Learn Res 3(Jan):993–1022. https://doi.org/10.1162/jmlr.2003.3.4-5.993

Boon-Itt S, Skunkan Y (2020) Public perception of the COVID-19 pandemic on Twitter: sentiment analysis and topic modeling study. JMIR Public Health Surveill 6(4):e21978. https://doi.org/10.2196/21978

Burke JA, Estrin D, Hansen M, Parker A, Ramanathan N, Reddy S, Srivastava MB (2006) Participatory sensing. UCLA: Center for Embedded Network Sensing. https://escholarship.org/uc/item/19h777qd

Cai M, Luo H, Meng X, Cui Y, Wang W (2023) Network distribution and sentiment interaction: Information diffusion mechanisms between social bots and human users on social media. Inf Process Manag 60(2):103197. https://doi.org/10.1016/j.ipm.2022.103197

Cevik E, Kirci Altinkeski B, Cevik EI, Dibooglu S (2022) Investor sentiments and stock markets during the COVID-19 pandemic. Financial Innov 8(1):69. https://doi.org/10.1186/s40854-022-00375-0

Chakraborty K, Bhatia S, Bhattacharyya S, Platos J, Bag R, Hassanien AE (2020) Sentiment Analysis of COVID-19 tweets by Deep Learning Classifiers—A study to show how popularity is affecting accuracy in social media. Appl Soft Comput 97:106754. https://doi.org/10.1016/j.asoc.2020.106754

Chatterjee S, Ghosh K, Banerjee A, Banerjee S (2023) Forecasting COVID-19 outbreak through fusion of internet search, social media, and air quality data: a retrospective study in indian context. IEEE Trans Computational Soc Syst 10(3):1017–1028. https://doi.org/10.1109/TCSS.2022.3140320

Chen A, Zhang J, Liao W, Luo C, Shen C, Feng B (2022) Multiplicity and dynamics of social representations of the COVID-19 pandemic on Chinese social media from 2019 to 2020. Inf Process Manag 59(4):102990. https://doi.org/10.1016/j.ipm.2022.102990

Cheung KKC, Chan H-Y, Erduran S (2023) Communicating science in the COVID-19 news in the UK during Omicron waves: exploring representations of nature of science with epistemic network analysis. Humanities Soc Sci Commun 10(1):1–14

Comito C (2021) How COVID-19 information spread in US? The role of Twitter as early indicator of epidemics. IEEE Trans Serv Comput 15(3):1193–1205. https://doi.org/10.1109/TSC.2021.3091281

Deng W, Yang Y (2021) Cross-platform comparative study of public concern on social media during the COVID-19 Pandemic: An empirical study based on Twitter and Weibo. Int J Environ Res Public Health 18(12):6487. https://doi.org/10.3390/ijerph18126487

Diaz-Garcia JA, Ruiz MD, Martin-Bautista MJ (2022) NOFACE: A new framework for irrelevant content filtering in social media according to credibility and expertise. Expert Syst Appl 208:118063. https://doi.org/10.1016/j.eswa.2022.118063

Eysenbach G (2011) Infodemiology and infoveillance: tracking online health information and cyberbehavior for public health. Am J Prevent Med 40(5):154–158. https://doi.org/10.1016/j.amepre.2011.02.006

Feng C, Umaier K (2023) Risk communication during the COVID-19 Pandemic in the era of social media. J Disaster Res 18(1):34–39. https://doi.org/10.20965/jdr.2023.p0034

Gao H, Kumar S, Tan Y, Zhao H (2022) Socialize more, pay less: Randomized field experiments on social pricing. Inf Syst Res 33(3):935–953. https://doi.org/10.1287/isre.2021.1089

Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L (2009) Detecting influenza epidemics using search engine query data. Nature 457(7232):1012–1014. https://doi.org/10.1038/nature07634

Gour A, Aggarwal S, Kumar S (2022) Lending ears to unheard voices: An empirical analysis of user‐generated content on social media. Prod Oper Manag 31(6):2457–2476. https://doi.org/10.1111/poms.13732

Heffner J, Vives M-L, FeldmanHall O (2021) Anxiety, gender, and social media consumption predict COVID-19 emotional distress. Humanities Soc Sci Commun 8:1

Huang W, Cao B, Yang G, Luo N, Chao N (2021) Turn to the internet first? Using online medical behavioral data to forecast COVID-19 epidemic trend. Inf Process Manag 58(3):102486. https://doi.org/10.1016/j.ipm.2020.102486

Jiang J-Y, Li C-T (2016) Forecasting Geo-sensor Data with Participatory Sensing Based on Dropout Neural Network. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, Indianapolis, Indiana, USA, https://doi.org/10.1145/2983323.2983902

Jiang J-Y, Zhou Y, Chen X, Jhou Y-R, Zhao L, Liu S, Yang P-C, Ahmar J, Wang W (2022) COVID-19 Surveiller: toward a robust and effective pandemic surveillance system based on social media mining. Philos Trans R Soc A 380(2214):20210125. https://doi.org/10.1098/rsta.2021.0125

Kaur M, Verma R, Ranjan S (2021) Political leaders communication: A Twitter sentiment analysis during Covid-19 Pandemic. J Messenger 13(1):45–62. https://doi.org/10.26623/themessenger.v13i1.2585

Kellner D, Lowin M, Hinz O (2023) Improved healthcare disaster decision-making utilizing information extraction from complementary social media data during the COVID-19 pandemic. Decision Support Syst 113983. https://doi.org/10.1016/j.dss.2023.113983

Khatua A, Khatua A, Cambria E (2019) A tale of two epidemics: Contextual Word2Vec for classifying twitter streams during outbreaks. Inf Process Manag 56(1):247–257. https://doi.org/10.1016/j.ipm.2018.10.010

Lam JC, Li VO, Han Y, Zhang Q, Lu Z, Gilani Z (2021) In search of bluer skies: Would people move to places of better air qualities? Environ Sci Policy 117:8–15. https://doi.org/10.1016/j.envsci.2020.12.012

Article   CAS   Google Scholar  

Lamsal R, Harwood A, Read MR (2022) Twitter conversations predict the daily confirmed COVID-19 cases. Appl Soft Comput 129:109603. https://doi.org/10.1016/j.asoc.2022.109603

Lazarsfeld PF, Berelson B, Gaudet H (1968) Columbia University Press. https://doi.org/10.7312/laza93930

Li J, Chen X, Hovy E, Jurafsky D (2016) Visualizing and Understanding Neural Models in NLP Association for Computational Linguistics. https://aclanthology.org/N16-1082

Li J, Xu Q, Cuomo R, Purushothaman V, Mackey T (2020) Data mining and content analysis of the Chinese social media platform Weibo during the early COVID-19 outbreak: retrospective observational infoveillance study. JMIR Public Health Surveill 6(2):e18700. https://doi.org/10.2196/18700

Linton NM, Kobayashi T, Yang Y, Hayashi K, Akhmetzhanov AR, Jung S-M, Yuan B, Kinoshita R, Nishiura H (2020) Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data. J Clin Med 9(2):538. https://doi.org/10.3390/jcm9020538

Lu L, Xu J, Wei J, Shults FL, Feng XL (2024) The role of emotion and social connection during the COVID-19 pandemic phase transitions: a cross-cultural comparison of China and the United States. Humanities Soc Sci Commun 11(1):1–16. https://doi.org/10.1057/s41599-024-02744-9

Luu TP, Follmann R (2023) The relationship between sentiment score and COVID-19 cases in the United States. J Inf Sci 49(6):1615–1630. https://doi.org/10.1177/01655515211068167

Molloy P (2020) The press is making the same mistakes as 2016 . Media Matters for America. https://www.mediamatters.org/donald-trump/press-making-same-mistakes-2016-and-time-running-out-fix-problem

Murshed BAH, Mallappa S, Abawajy J, Saif MAN, Al-Ariki HDE, Abdulwahab HM (2023) Short text topic modelling approaches in the context of big data: taxonomy, survey, and analysis. Artif Intell Rev 56(6):5133–5260. https://doi.org/10.1007/s10462-022-10254-w

Newman D, Noh Y, Talley E, Karimi S, Baldwin T (2010) Evaluating topic models for digital libraries. Proceedings of the 10th annual joint conference on Digital libraries, Gold Coast, Queensland, Australia, https://doi.org/10.1145/1816123.1816156

Nie Q, Liu Y, Zhang D, Jiang H (2021) Dynamical SEIR model with information entropy using COVID-19 as a case study. IEEE Trans Computational Soc Syst 8(4):946–954. https://doi.org/10.1109/TCSS.2020.3046712

Niu Q, Liu J, Kato M, Nagai-Tanima M, Aoyama T (2022) The effect of fear of infection and sufficient vaccine reservation information on rapid COVID-19 vaccination in Japan: Evidence from a retrospective Twitter analysis. J Med Internet Res 24(6):e37466. https://doi.org/10.2196/37466

Petrosyan A (2023) Internet and social media users in the world 2023 . Statista. https://www.statista.com/statistics/617136/digital-population-worldwide/

Qiu L, Kumar S (2017) Understanding voluntary knowledge provision and content contribution through a social-media-based prediction market: A field experiment. Inf Syst Res 28(3):529–546. https://doi.org/10.1287/isre.2016.0679

Röder M, Both A, Hinneburg A (2015) Exploring the Space of Topic Coherence Measures. Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, Shanghai, China, https://doi.org/10.1145/2684822.2685324

Rosner F, Hinneburg A, Röder M, Nettling M, Both A (2014) Evaluating topic coherence measures. arXiv. https://doi.org/10.48550/arXiv.1403.6397

Sakaki T, Okazaki M, Matsuo Y (2010) Earthquake shakes Twitter users: real-time event detection by social sensors. Proceedings of the 19th international conference on World wide web, Raleigh, North Carolina, USA, https://doi.org/10.1145/1772690.1772777

Sanwald S, Widenhorn-Müller K, Carlos GRGGMKTS-L, Montag C, Kiefer M (2022) Primary emotions as predictors for fear of COVID-19 in former inpatients with Major Depressive Disorder and healthy control participants. BMC psychiatry 22(1):94. https://doi.org/10.1186/s12888-021-03677-2

Shan S, Yan Q, Wei Y (2020) Infectious or recovered? Optimizing the infectious disease detection process for epidemic control and prevention based on social media. Int J Environ Res Public Health 17(18):6853. https://doi.org/10.3390/ijerph17186853

Shen C, Chen A, Luo C, Zhang J, Feng B, Liao W (2020) Using reports of symptoms and diagnoses on social media to predict COVID-19 case counts in mainland China: Observational infoveillance study. J Med Internet Res 22(5):e19421. https://doi.org/10.2196/19421

Simon T, Goldberg A, Adini B (2015) Socializing in emergencies—A review of the use of social media in emergency situations. Int J Inf Manag 35(5):609–619. https://doi.org/10.1016/j.ijinfomgt.2015.07.001

Sundararajan M, Taly A, Yan Q (2017) Axiomatic Attribution for Deep Networks Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research. https://proceedings.mlr.press/v70/sundararajan17a.html

Surian D, Nguyen DQ, Kennedy G, Johnson M, Coiera E, Dunn AG (2016) Characterizing Twitter discussions about HPV vaccines using topic modeling and community detection. J Med Internet Res 18(8):e232. https://doi.org/10.2196/jmir.6045

Tang Z, Miller AS, Zhou Z, Warkentin M (2021) Does government social media promote users’ information security behavior towards COVID-19 scams? Cultivation effects and protective motivations. Gov Inf Q 38(2):101572. https://doi.org/10.1016/j.giq.2021.101572

Tashman LJ (2000) Out-of-sample tests of forecasting accuracy: an analysis and review. Int J Forecast 16(4):437–450. https://doi.org/10.1016/S0169-2070(00)00065-0

Thomala LL (2023a) Search engines in China - statistics & facts . Statista. https://www.statista.com/topics/1337/search-engines-in-china/

Thomala LL (2023b) Social media in China - statistics & facts . Statista. https://www.statista.com/topics/1170/social-networks-in-china/

Tran V, Matsui T (2023) COVID-19 case prediction using emotion trends via Twitter emoji analysis: A case study in Japan. Front public health 11:1079315. https://doi.org/10.3389/fpubh.2023.1079315

Velasco E, Agheneza T, Denecke K, Kirchner G, Eckmanns T (2014) Social media and internet‐based data in global systems for public health surveillance: a systematic review. Milbank Q 92(1):7–33. https://doi.org/10.1111/1468-0009.12038

Wu B, Wang L, Wang S, Zeng Y-R (2021) Forecasting the US oil markets based on social media information during the COVID-19 pandemic. Energy 226:120403. https://doi.org/10.1016/j.energy.2021.120403

Wu F-J, Lim HB (2014) UrbanMobilitySense: A user-centric participatory sensing system for transportation activity surveys. IEEE Sens J 14(12):4165–4174. https://doi.org/10.1109/JSEN.2014.2359876

Article   ADS   Google Scholar  

Wu J, Li M, Zhao E, Sun S, Wang S (2023) Can multi-source heterogeneous data improve the forecasting performance of tourist arrivals amid COVID-19? Mixed-data sampling approach. Tour Manag 98:104759. https://doi.org/10.1016/j.tourman.2023.104759

Yousefinaghani S, Dara R, Mubareka S, Sharif S (2021) Prediction of COVID-19 waves using social media and Google search: a case study of the US and Canada. Front public health 9:656635. https://doi.org/10.3389/Fpubh.2021.656635

Zhang L, Li H, Chen K (2020) Effective risk communication for public health emergency: Reflection on the COVID-19 (2019-nCoV) Outbreak in Wuhan, China. Healthcare, 8. https://doi.org/10.3390/healthcare8010064

Zhang S, Zhang J, Yang L, Wang C, Gao Z (2023) COV-STFormer for short-term passenger flow prediction during COVID-19 in urban rail transit systems. IEEE Transactions on Intelligent Transportation Systems. https://doi.org/10.1109/TITS.2023.3323379

Zhang X, Yang Q, Albaradei S, Lyu X, Alamro H, Salhi A, Ma C, Alshehri M, Jaber II, Tifratene F (2021) Rise and fall of the global conversation and shifting sentiments during the COVID-19 pandemic. Humanities Soc Sci Commun 8(1):1–10. https://doi.org/10.1057/s41599-021-00798-7

Zhang Y, Lin H, Wang Y, Fan X (2023) Sinophobia was popular in Chinese language communities on Twitter during the early COVID-19 pandemic. Humanities Soc Sci Commun 10(1):1–12. https://doi.org/10.1057/s41599-023-01959-6

Zhao S, Chen L, Liu Y, Yu M, Han H (2022) Deriving anti-epidemic policy from public sentiment: A framework based on text analysis with microblog data. PLoS One 17(8):e0270953. https://doi.org/10.1371/journal.pone.0270953

Zhou S, Yang X, Wang Y, Zheng X, Zhang Z (2023) Affective agenda dynamics on social media: interactions of emotional content posted by the public, government, and media during the COVID-19 pandemic. Humanities Soc Sci Commun 10(1):1–10. https://doi.org/10.1057/s41599-023-02265-x

Download references

Acknowledgements

This research was funded by the National Natural Science Foundation of China (72101090, 72321001, 72342018, 71925002, 72271098), the Special Fund Project for Scientific and Technological Innovation (Soft Science) of Guangdong Province (2022A1515011620, 2024A1515011518), Guangdong Philosophy and Social Sciences (GD21YGL09, 2023GZYB22), Innovative Research Team of Shanghai International Studies University (2023KFKT003).

Author information

Authors and Affiliations

School of Business Administration, South China University of Technology, Guangdong, China

Boyang Shi, Weixiang Huang, Yuanyuan Dang & Wenhui Zhou


Contributions

BS: Data Curation, Formal Analysis, Writing - Original Draft. WH: Providing the essential materials, reagents, equipment, or analytical tools. YD: Conceptualization, Writing - Original Draft and editing. WZ: Overseeing the research, guiding the project intellectually.

Corresponding author

Correspondence to Yuanyuan Dang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

Ethical approval was not required as the study did not involve human participants.

Informed consent

Informed consent was not required as the study did not involve human participants.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article

Shi, B., Huang, W., Dang, Y. et al. Leveraging social media data for pandemic detection and prediction. Humanit Soc Sci Commun 11, 1075 (2024). https://doi.org/10.1057/s41599-024-03589-y


Received: 16 April 2024

Accepted: 12 August 2024

Published: 23 August 2024

DOI: https://doi.org/10.1057/s41599-024-03589-y


  • Open access
  • Published: 20 August 2024

“Because people don’t know what it is, they don’t really know it exists”: a qualitative study of postgraduate medical educators’ perceptions of dyscalculia

  • Laura Josephine Cheetham

BMC Medical Education volume 24, Article number: 896 (2024)


Dyscalculia is defined as a specific learning difference or neurodiversity. Despite a move within postgraduate medical education (PGME) towards promoting inclusivity and addressing differential attainment, dyscalculia remains an unexplored area.

Using an interpretivist, constructivist, qualitative methodology, this scoping study explores PGME educators’ attitudes, understanding and perceived challenges of supporting doctors in training (DiT) with dyscalculia. Through purposive sampling, semi-structured interviews and reflexive thematic analysis, the stories of ten Wales-based PGME educators were explored.

Multiple themes emerged relating to lack of educator knowledge, experience and identification of learners with dyscalculia. Participants’ roles as educators and clinicians were inextricably linked, with PGME seen as deeply embedded in social interactions. Overall, a positive attitude towards doctors with dyscalculia underpinned the strongly DiT-centred approach to supporting learning, tempered by uncertainty over potential patient safety-related risks. Perceiving themselves as learners, educators saw the educator-learner relationship as a major learning route given the lack of dyscalculia training available, with experience leading to confidence.

Conclusions

Overall, educators perceived a need for greater dyscalculia awareness, understanding and knowledge, pre-emptive training and evidence-based, feasible guidance introduction. Although methodological limitations are inherent, this study constructs novel, detailed understanding from educators relating to dyscalculia in PGME, providing a basis for future research.


Dyscalculia is categorised as a specific learning difference or part of neurodiversity in the UK and a learning disability in North America. Learners with dyscalculia are said to have significant difficulties in numerical processing [ 1 ]. It is increasingly acknowledged that these relate to arithmetic, statistics, ordinality, number and code memorisation and recall, with other individual variance [ 2 , 3 ]. Here, I chose to use “specific learning difference” (SpLD) to acknowledge that some feel SpLDs relate to a difference in learning needs but may not always result in learners identifying as disabled [ 4 , 5 ]. Most contemporary definitions state that these challenges are out of keeping with learner age, intelligence level and educational background [ 1 ], evolve over time but persist into adulthood.

Dyscalculia is a comparatively recently recognised SpLD with a relatively low ‘diagnosed’ population prevalence, with estimates ranging between 3% and 7% [ 2 ]. Awareness of dyscalculia is lower than for more frequently ‘diagnosed’ SpLDs such as dyslexia, dyspraxia and Attention Deficit and Hyperactivity Disorder (ADHD) [ 3 ], and there is a paucity of research-based evidence, especially relating to adult learners [ 2 ]. Of the two studies exploring dyscalculia in Higher Education Institutions (HEI) from the perspective of learners, both Drew [ 3 ] and Lynn [ 6 , 7 ] outlined poor understanding within adult learning environments and a lack of recognition of dyscalculia and of HEI learning support provision. Additionally, learner challenges were different to those described in dyslexia and dyspraxia studies, with understanding and perception of time, distance, finances, non-integer numbers, and memorisation and recall of numerical codes and values being frequent issues. Potential complexity arose through the possible coexistence of dyslexia or mathematical anxiety, the varying effectiveness of learner-developed coping strategies, and coping mechanisms becoming ineffective during undergraduate or postgraduate education [ 3 ]. Drew’s [ 3 ] three healthcare learner participants had also experienced potential fitness to practise concerns either from themselves or educators.

Context for medical education

The number of DiT in postgraduate medical education (PGME) with dyscalculia remains unknown. Similarly, awareness levels of PGME educators, or what their experiences might be, of facilitating the learning of DiT with dyscalculia is unexplored. Indeed, there has been no published research to date relating to dyscalculia in PGME or undergraduate medical education.

This paucity of knowledge is set in the context of a presumed increasing proportion of UK PGME DiT learners with a disability resulting from increasing numbers of medical students in the UK reporting a disability [ 8 , 9 ] and in other countries such as Australia [ 10 ]. Data collection via the statutory education bodies, and the medical regulator, the General Medical Council (GMC), is challenging given the voluntary nature of SpLD declaration and persisting concerns regarding discrimination and stigma [ 11 ]. My Freedom of Information request to the GMC in February 2022 revealed that 1.25% of registered doctors have declared a ‘learning disability’ (including SpLDs) such as dyslexia.

The impact of dyscalculia on DiT and their educators is unknown. The GMC defines differential attainment as the gap in assessment outcomes between learners grouped by protected characteristic [ 12 ]. It recently commissioned research recommending that education providers create more inclusive learning environments for disabled learners [ 13 ]. Other recent research indicates that differential attainment may persist from school-based examinations through to medical school exit ranking scores and onto PGME examinations [ 14 ].

Currently, there is no publicly available information addressing the support of PGME DiT with dyscalculia within the UK, and no known prospective screening in place. Support, including reasonable adjustments, for PGME DiT with additional learning needs is accessed through, and coordinated by, education bodies’ Professional Support Units (PSU), including Health Education and Improvement Wales’ (HEIW) PSU in Wales. More widely, HEIW, the education body in Wales, is responsible for delivery and quality management of PGME in accordance with UK-level standards set by the GMC and medical speciality Royal Colleges and Faculties. Reasonable adjustments are changes, additions, or the removal of learning environment elements to provide learners with additional support and remediate disadvantage [ 15 ]. They are frequently purported to enable learners with SpLDs to learn and perform to their potential, although evidence for this is variable [ 16 , 17 ], with a marked lack of research relating to adult learners with dyscalculia.

Despite recent shifts from more teacher-centred to more student-centred learning approaches, with a range of andragogical learning theories emphasising the learner being at the centre of learning [ 18 ], the educationalist remains a key element of many learning theories and PGME. Many PGME educators are practising doctors and, alongside this, must maintain a contemporaneous understanding of learning theory, training delivery, teaching, supervision and wider educational policies. However, how they approach, or would plan to approach, supporting learning for DiT with dyscalculia is unknown. Therefore, exploring the attitudes and perspectives of PGME DiT or educators regarding dyscalculia, both unresearched previously, through this paradigm could be valuable [ 19 ].

Educational challenges, learning needs and local context

For educators, a pivotal part of facilitating learning is understanding the learning needs of learners, felt to be a cornerstone of adult pedagogy [ 19 , 20 ]. Davis et al. [ 20 ] define learning needs as “any gap between what is and what should be”. These can be established subjectively, objectively or through a combination of approaches. However, Grant [ 19 ] cautions against conducting limiting, formulaic learning need assessments.

Identifying attitudes and understanding

Furthermore, attitudes are said to frame educator approaches and thus the learning experiences learners will have [ 21 ]. Attitudes are defined as “a feeling or opinion about something or someone, or a way of behaving that is caused by this” [ 22 ]. Interpretivism offers a route to exploring such attitudes by outlining that there is no one universal truth or fact, but instead many equally valid realities constructed by different individuals, their meaning-making and their experiences.

Again, research is absent within medical education relating to educators’ attitudes and understanding of learners with dyscalculia and how these might influence their approach. Current research indicates attitudes of HEI educators are often formed through their past - or absent - experiences, lack of knowledge of legal obligations and, for healthcare educators, the patient-centred role of clinical learners [ 23 ]. These appeared to help form their approach to facilitating teaching [ 23 , 24 , 25 , 26 , 27 , 28 , 29 ]. Therefore, understanding PGME educationalist attitudes towards DiT with dyscalculia would be important in helping understand how learning is facilitated.

Thus, there exists a clear lack of published knowledge and understanding regarding dyscalculia set in a context of increasing awareness of the importance of inclusivity and addressing differential attainment within medical education. The importance of educators in facilitating learning of such PGME DiT suggests that exploring their perspectives and understanding could provide valuable insights into this understudied area. Such knowledge could provide benefit to learners and those designing and delivering programmes of learning for DiT and programmes of support for educators. This includes potentially exploring the attitudes and understanding of educators who have no direct experience of dyscalculia, given that this could be the context in which a DiT with dyscalculia finds themselves in a postgraduate learning environment. Assumptions, or perceptions generated without experience or knowledge of dyscalculia, are equally important to understand in a learning context when the awareness level and prevalence of dyscalculia within DiT is unknown. This allows understanding of how learning for DiT with dyscalculia may be facilitated in a knowledge and understanding-poor context, and furthermore, what educator needs exist and what further research is needed.

Consequently, the research question and aims below were constructed.

Research question:

What are the attitudes towards, understanding and perceived challenges of dyscalculia within postgraduate medical training by postgraduate medical educators?

Research aims:

To explore the awareness and understanding of dyscalculia that postgraduate medical educators may or may not have.

To determine the attitudes that postgraduate educators have towards dyscalculia and DiT with dyscalculia and how these might be formed.

To establish the challenges that postgraduate educators perceive they encounter or might encounter when facilitating the learning of a DiT who has dyscalculia.

To provide the basis for future research studies exploring how to facilitate the learning of DiT with dyscalculia during postgraduate training.

This scoping study was designed using an interpretivist, constructivist qualitative methodology to understand the phenomenon in detail [ 30 ], as part of a Masters in Medical Education programme.

A literature review was undertaken to enable construction of the research question and aims. Firstly, a focused literature search, conducted between October 2021 and May 2022, ascertained the level, and lack, of existing evidence for the study phenomenon; this was followed by four progressively broader searches to understand the wider context, which revealed that relevant literature was limited or absent.

The literature search was then performed by me using guidance [ 31 , 32 ] and twenty-seven research search engines. Additionally, a spectrum of journals was searched directly. Literature was also identified through snowballing.

Keyword search terms were developed and refined during the literature search, with limits on further broadening the search based on relevance to the areas of interest: postgraduate learners, educators and SpLDs using different term combinations exploring dyscalculia and postgraduate education, SpLDs and postgraduate healthcare learners, postgraduate educators and attitudes or knowledge or experiences of facilitating learning (appendix 1, supplementary material). Broadening of search terms allowed for exploration of analogous phenomena (other SpLDs), in other postgraduate healthcare and learning contexts, and for further research question development, returning 2,638 items. Papers were initially screened using their titles and the inclusion/exclusion criteria (below) generating 182 articles, papers and theses, with abstracts and reference lists reviewed. 174 papers and eight PhD theses were appraised using guidance [ 32 , 33 , 34 ].

Inclusion criteria were:

Primary research or review.

International or UK-based research reported in English.

Postgraduate higher education (university-level, post Bachelor or equivalent degree) setting.

Relating to postgraduate or higher educationalists’ views from any discipline and knowledge of SpLDs.

Exclusion criteria were:

Literature published in non-English languages.

Opinion and commentary articles.

Undergraduate setting, unless mixed cohort/study with postgraduate learners.

Ultimately, 17 papers and one doctoral thesis were included. Whilst grey literature, this thesis [ 3 ] was included due to the dyscalculia-focused insights provided and limited adult-based dyscalculia research elsewhere. After literature appraisal, research aims and a research question were formed.

Semi-structured interviews were chosen to enable data collection and interpretation through a constructivist lens, via open enquiry rather than hypothesis testing [ 30 , 35 , 36 ]. Study participants were PGME educators, actively involved in DiT learning within any PGME programme within Wales whilst holding a Medical Trainer agreement with HEIW. Participants held a range of educationalist roles, from education supervisor to local speciality-specific Royal College tutor (local speciality training lead) to training programme director (responsible for delivery of speciality-specific training across a region).

Interview question and guide design (appendix 2, supplementary material) drew on the six qualitative and six quantitative research-based, validated published tools used to explore similar phenomena, particularly those of O’Hara [ 37 ], Ryder [ 38 ], L’Ecuyer [ 23 ] and Schabmann et al. [ 39 ]. Design also drew upon Cohen et al’s [ 40 ] recommendations of composing open, neutral questioning.

Interview format was piloted using a PGME educator from England (thus ineligible for study recruitment) with modifications resulting from participant feedback and through adopting reflexivity, as per Cohen et al. [ 41 ] and Malmqvist et al. [ 42 ]. Participant interviews took place between May and June 2022 and were recorded via the University-hosted Microsoft Teams platform, due to the pandemic-based situation and large geographical area involved, whilst maintaining interviewer-interviewee visibility during the dialogue [ 35 ]. Recruitment occurred via purposive sampling, through two HEIW gatekeepers, the national Directors of Postgraduate Secondary (hospital-based) and Primary (General Practice-based) Medical Training in Wales. An email-based invitation with project information was distributed to all postgraduate medical educators with a current HEIW Medical Trainer agreement, regularly engaging in the support of learners within PGME training, in Wales. In this case, the gatekeepers in HEIW were individuals who could grant permission and make contact with all potentially eligible participants on my behalf, through their email databases, whilst adhering to UK data protection regulations [ 43 , 44 ].

Ethical considerations

Formal ethics approval was gained from the Cardiff University School of Medicine Research Ethics Committee. Health Research Authority ethics approval was considered but deemed unnecessary. Informed written and verbal participant consent was obtained prior to, and at the point of, interview respectively. Additionally, verbal consent for video recording was sought, offering audio recording or notetaking alternatives; however, participant discomfort was not reported. Mitigation options to avoid selection bias included selecting alternative volunteers if significant relationships between the researcher and participant had existed.

Invitations to participate were circulated to approximately 2,400 to 2,500 postgraduate secondary care trainers and 600 primary care trainers. Eighteen individuals indicated interest in participating; one cancelled and seven did not respond to follow-up within the two-month timeframe the MSc project schedule allowed for. Reasons subsequently given by two of these seven, who responded outside the timeframe, included clinical demands and unexpected personal matters. Ten postgraduate educators were interviewed and all allowed video-based interview recording. Interviews lasted between 40 and 60 min. Interviews were transcribed verbatim by me and checked twice for accuracy, with participants assigned pseudonyms. Data analysis was conducted using reflexive thematic analysis (RTA) and undertaken by me, the author, as the single coder and Masters student, with transcripts analysed three times.

RTA followed the six-step approach of Braun et al. [ 45 ], Braun and Clarke [ 46 ] and Braun and Clarke [ 47 ], with a primarily inductive approach [ 47 , 48 ] through an iterative process. Both latent and semantic coding approaches were used, guided by meaning interpretation [ 49 ].

RTA allowed exploration through an interpretivist lens. Discussions persist regarding how RTA sample size sufficiency and ‘data saturation’ are determined, with RTA placing more emphasis on the analyst-based individualism of meaning-making. Therefore, mechanisms for determining thematic saturation are purportedly inconsistent and unreliable [ 50 ]. Consequently, sample size was based on the maximum number of participants recruited within the set project time limits.

Reflexivity

I strove to adopt reflexivity throughout, using a research diary and personal reflections, referring to Finlay [ 51 ] who stated that such subjectivity can evolve into an opportunity. My interest in the studied phenomenon resulted partially from my experiences as a DiT with SpLDs and from being a DiT representative. Acknowledging this was important given my perspective, as an intrinsic part of this research, could have affected data gathering, interpretation, and, ultimately, study findings through introducing insider status.

Additionally, holding an influential role within the research, with potential for ‘interviewer bias’ [ 52 ], I adopted Cohen et al.’s [ 53 ] recommendations, committing to conscious neutrality during interviews and use of an interview prompt list, whilst striving to maintain a reflexive approach. Alongside this, the impact on credibility of this study being part of a Masters project, limiting scale and timeframes were considered and mitigated by exploring these within the discussion and referring to this research as a scoping study.

Results

Educators with limited to no direct experience of learners with dyscalculia knew little to nothing about dyscalculia (Fig. 1).

Fig. 1: Summary of themes and subthemes generated

Of the participants who did have such experience, these educators cited close second-hand experiences with family members or past learners with dyscalculia, which helped shape their understanding of dyscalculia. Those who had no direct experience drew on empathy and generalisation, extrapolating from the greater knowledge and confidence they had in their understanding of dyslexia or other SpLDs, or even from analysis of the term ‘dyscalculia’, to form definitions and perceptions.

“Absolutely nothing… I saw it, [dyscalculia in the study invitation] didn’t know what it was and Googled it so very, very little really. I suppose in my simplistic surgical sieve head, I would just sort of apply the bits and pieces I know around dyslexia.” P10.

All suggested dyscalculia represented a specific set of challenges and associated learning needs relating to numbers, numeracy or quantity where overall intelligence was preserved. Educators saw each learner as being an individual, therefore felt dyscalculia would present as a spectrum, with varying challenges and needs existing. Dyscalculia was seen as persisting lifelong, with the challenges and needs evolving with age and experiences. Common challenges suggested related to calculations, statistics, critical appraisal, awareness of time, organisation and recall of number-based information (such as job lists, blood results), spatial dimension quantification, prescribing, fast-paced tasks and emergencies, exams and learning-based fatigue or high cognitive load. Wellbeing issues relating to dyscalculia were also frequently perceived, with this potentially negatively affecting self-confidence and anxiety levels. All educators saw a key aspect of their role to be provision of pastoral support, in enabling effective learning.

Past educator experiences of dyscalculia were linked to perceived confidence in ability to support future DiT with dyscalculia. Educators felt their limited knowledge, with the primary source of information regarding dyscalculia being DiT with dyscalculia themselves, to be reflective of low levels of awareness, knowledge and identification within PGME, education systems and wider society. Some felt the proportion of PGME DiT with dyscalculia would be lower than for the general population, following challenging assessments during secondary school and undergraduate studies, but might be changing given widening participation initiatives within medicine. Others saw a potential hidden iceberg of later career stage doctors with unidentified dyscalculia who had completed training when speciality assessments relied less on numeracy.

“[It] was only because of my own experiences and my [relative] that I was able to kind of wheedle around and, you know, make them recognise that there was an issue and that, you know. But I - I think had I not had an awareness of it, I probably wouldn’t have recognised it, I think.” P7.

Educators frequently used empathy when attempting to understand dyscalculia. Educators had mixed feelings about ‘labelling’ DiT as having dyscalculia although all felt identification of additional learning needs was key. Some felt labels were necessary to enable and better support DiT with dyscalculia in the absence of effective, feasible, inclusive education approaches, others noted the potential for stigma or generalisations.

None of the participants had received dyscalculia training. Some felt widespread societal normalisation of mathematics challenges adversely impacted upon if, and at what educational stage, dyscalculia identification occurred and needs were recognised. Many felt assumptions might occur regarding dyscalculia through others making generalisations from better known SpLDs, including dyslexia and dyspraxia, in the absence of other knowledge sources but that these extrapolations could be inaccurate and unhelpful.

“And I think there’s a lot of ‘oh you’re just bad with numbers’ or ‘ohh, you just can’t do’, you know people are just, I, I suspect there’s a lot of people who have just been told they’re not very good at maths, aren’t there? And it’s just, you know they can’t, can’t do it, which you know is not really very fair, is it?” P7.

Many felt PGME might represent a critical juncture for DiT with dyscalculia, where effective coping mechanisms developed in the past become ineffective. A variety of such coping mechanisms were suggested or hypothesised, often outlined as depending on the dyscalculia-based experience level of the educator, including checking work with others, calculator use and avoidance of numeracy-dense work or specialities.

Mechanisms were generally viewed positively except where perceived to reduce the likelihood of a DiT recognising dyscalculia themselves and seeking support.

Most felt positively towards learners with dyscalculia and their learning facilitation, especially those with greater experience of dyscalculia. Many balanced this positivity with potential concerns regarding patient safety. Concerns focused especially on heavily numeracy-based tasks, fast-paced situations, or when working independently in surgical or emergency prescription-based situations. Overall, concerns were heightened due to the clinical patient-based context to PGME learning. Two participants felt that not all DiT with dyscalculia should be supported to continue training in particular specialities where numeracy skills were seen as critical, such as ophthalmology.

“I am, and it just seemed really unfair that this one small thing could potentially have such a big impact and could potentially prevent [them] progressing and succeeding in the way that I think you know, [they, they] had the potential to.” P6.

Educators outlined a dependence on the bidirectionality of learner-educator relationships to best facilitate DiT learning per se, and it was felt all DiT had a responsibility to be honest with educators. Some cited potential barriers to this collaboration, including past negative learner experiences, felt stigma, limited educator time and frequent DiT rotations.

“It’s a wonderful opportunity for learning which I really enjoy, because I think that this is a two-way process. You know, I think the DiT gives you things that you reflect on and you should be giving the DiT things that they reflect on” P5.

Most felt they would take a one-to-one learning approach for DiT with dyscalculia. Group-based, fast-paced or numeracy-rich, higher risk clinical activity-based teaching would be more challenging to cater for.

For some, patient safety uncertainties abutted with the duality of being a clinician and educator, with perceived difficulty in quantifying clinical risks associated with learning and educators’ clinical workload demands limiting available time and resources. Thus, many felt that their educator roles always needed to be tempered with their duties as a doctor, prioritising patient safety and quality of care above all else.

“So, it’s not so much the learning, uh, issue that worries me. I think even if someone had dyscalculia the, uh, concepts of medicine could be understood and the basic outline of what we’re doing, but actually you’ve got to be quite precise in the vocational aspect of, of, of the training, and if you get it wrong, it’s a potential major clinical risk and obviously patient safety has to come first in everything that, that we do.” P4.

Educators wished strongly for pre-emptive support in facilitating the learning of DiT with dyscalculia, feeling great responsibility both for DiT learning but also for upholding clinical standards and safety. Many felt they would approach HEIW’s PSU for reactive support, including seeking learner ‘diagnosis’, although some predicted this support, and their knowledge, might be limited. However, two participants outlined positive experiences after seeking PSU support.

Most educator participants supported reasonable adjustment use if patient safety and quality of care remained prioritised and preserved. Other conditions for supporting reasonable adjustments included if they enabled without giving undue advantage and if educator-related workload was not overly burdensome. Those with experience of dyscalculia more confidently volunteered reasonable adjustment suggestions, ranging from calculation-table or app access to additional time for numeracy-rich activities. Some perceived a challenging divide between clinical educators and SpLD education experts who could make potentially unfeasible reasonable adjustment recommendations, with participants suggesting the importance of greater involvement of clinical educators in developing support processes.

“If I’m honest, I don’t think we do it very well…They’re [reasonable adjustments offered] very simplistic, … you know, they’re very much based on a sort of global ability rather than realising that processing and other things might be impacted… We’re, we’re probably behind the curve and not really doing what could be done” P8.

Further example quotes for each theme and subtheme can be found within appendix 3, supplementary material.

Experience shapes educator knowledge, understanding and attitudes

This study reveals novel findings regarding dyscalculia in PGME within a vacuum of prior research. Notably, participants’ views towards PGME learners with dyscalculia, including DiT potential to learn, practise and develop effective coping strategies, were substantially more positive and empathetic than in the closest comparable healthcare studies of other SpLDs [ 23 , 24 , 27 , 29 , 54 ]. Furthermore, the potential impact of societal normalisation of numeracy challenges on awareness of, and attitudes towards, dyscalculia explored by some participants has only previously been noted by Drew [ 3 ].

Educators’ expressions of a sense of personal or healthcare-wide lack of awareness and understanding of dyscalculia align with the current UK position [ 2 ]. But they also built on this, outlining how generalisation from other SpLDs or disabilities was frequently used to bridge the dyscalculia knowledge gap, with some not recognising this as potentially problematic. This suggests a need for enhanced awareness and understanding within the healthcare education community of the potential fallibility of using generalisation to support learners with poorly understood additional needs.

Moreover, no other studies have revealed that healthcare educators with personal experience of a learner relative with a SpLD displayed universally positive attitudes towards DiT with the same SpLD. Whilst this could reflect inter-study methodological differences, inter-professional differences or the increasing emphasis on compassionate clinical practice [ 55 ], it also suggests influence of educator experience in attitude formation.

In addition to their attitudes, the impact of prior experience of learners with dyscalculia on educators’ knowledge, understanding and confidence was often acknowledged as important by participants. This was seen to an extent in the closest comparable SpLD studies, [ 24 , 54 ] and further shows the diverse influence of past educationalist experiences, particularly the establishment of deep, longitudinal relative-based relationships, aligning with social constructivism [ 56 ].

Unlike HEI lecturers in dyslexia studies [ 24 , 54 ], who frequently questioned the needs of learners, educators saw DiT with dyscalculia as intelligent and high-functioning, having credible additional learning needs. Needs were seen as variable unlike elsewhere. Additionally, the level of detail constructed regarding educators’ perceptions of the needs, strengths and challenges of each DiT with dyscalculia, evolving over time and experience, is not seen in non-dyscalculia SpLD studies and only alluded to for dyscalculia [ 3 ]. These differences, which may be partially explained by varying methodologies or cultural norms regarding how different SpLDs are regarded, are important to better understand.

Furthermore, the preferred educator approach of individualising learning for DiT with dyscalculia is not seen elsewhere in the literature, although this aligns with supporting learning within their zone of proximal development (ZPD). Rather, Ryder and Norwich found HEI educators actually expressed negative attitudes towards individualising learning [ 24 ]. Methodological and SpLD-specific factors may contribute to these differences, with this study’s findings aligning more closely with Swanwick’s proposal that PGME often emulates apprenticeship-type learning [ 57 ]. It would be valuable to establish the efficacy of individualised PGME-based approaches to facilitating learning with dyscalculia from DiT and educator perspectives.

Greater educator support and training regarding dyscalculia is needed

Educators’ perceived need for wider awareness of dyscalculia, alongside greater pre-emptive training and guidance tailored towards dyscalculia within PGME learning environments has also been described for other SpLDs [ 23 , 58 , 59 ]. Greater research is needed to develop such awareness and evidence-based training, with similar needs identified more widely in HEI for dyscalculia [ 3 ] and for other SpLDs [ 23 , 24 , 27 ]. Akin to some participants, Swanwick and Morris [ 60 ] discuss the increasing expectations on clinical educationalists to deliver professional-level education and Sandhu [ 61 ] explores participants’ expressed need for greater faculty development whilst rectifying the deficit of evidence-base for PGME educators to use.

The crucial importance of the bidirectionality of the educator-learner relationship, with educators perceiving themselves as learners too, is only subtly alluded to elsewhere [ 3 ]. Given the bidirectional learning relationship was reportedly undermined by frequent DiT placement rotations, fast-paced clinical environments and shift-based training patterns, further exploration of the appropriateness of current UK PGME training design for DiT with dyscalculia could be important.

Coping strategies are important to better understand

As with this study, Drew’s research suggested coping strategies for learners with dyscalculia to be potentially important, effective and helpful but could have limitations [ 3 ]. However, this study provides the first examples of coping strategies, potential or already used, by DiT with dyscalculia. It is crucial that research to develop better understanding of both positive and negative dyscalculia-based coping mechanisms occurs in the future given the broad participant concerns.

Identification is key but not fully enabling

Educators perceived early identification of dyscalculia to be key, showing commonality with dyscalculia, dyslexia and dyspraxia-based studies [ 3 , 25 , 28 ]. That identification was not seen as an absolute solution reinforces the need for further research exploring other disabling factors. However, the witnessed or potential negatives of being ‘labelled’ following dyscalculia ‘diagnosis/identification’, outlined by some participants, have been found only minimally elsewhere within learner-based dyslexia and dyscalculia HEI studies [ 3 , 25 , 28 ]. Negative consequences to labelling included the attitudes learners encountered within the clinical community, suggesting a need to understand cultural norm-related impacts. In contrast, the far greater positives to identification, and the necessity of labelling perceived by educators, were also seen in other SpLD studies [ 3 , 25 , 28 ], enabling self-understanding and access to support. Certainly, the need for improved dyscalculia identification approaches and training is highlighted by the lack of educator confidence in identifying dyscalculia where they had no relative-based experience.

Within the UK, voluntary dyslexia ‘screening’ processes are now offered to some medical students and DiT and similar opportunities could be offered for dyscalculia in the future. Moreover, accumulating evidence indicates an ever-greater importance of establishing equity of learning opportunity and that identification has a positive performance effect for DiT with dyslexia [ 16 , 62 , 63 ].

The PGME clinical context may limit support

Whilst educators clearly adopted a strongly student-centred approach to supporting learning with dyscalculia, addressing the influence of the duality of clinical educator roles on this approach is important. Educator supportive intent was twinned with the tension of balancing effective DiT learning with guaranteeing patient safety within diverse, predominantly clinical PGME learning environments, sharing commonality with L’Ecuyer’s nursing study [ 23 ]. Swanwick and Morris [ 60 ] note this influence on delivering training, with Sandhu [ 61 ] exploring general concerns regarding risk and clinical learning.

Even more pronounced perceived patient safety concerns were expressed in other nursing SpLD studies [ 23 , 29 , 54 , 64 ], and further post-qualification independent working concerns emerged [ 23 , 65 , 66 ], which limited educators’ willingness to support learning. Together, these tensions appear to set learning facilitation for those with dyscalculia within healthcare apart from non-healthcare settings. Therefore, healthcare-specific education research and training is needed to address this, especially given that, thus far, analogous concerns regarding dyslexia and clinical risk remain unproven.

The influence of educator-reported increasing clinical workload and resource limitations on approach towards supporting DiT with dyscalculia was similarly seen within nursing studies [ 23 , 29 ]. Whilst the impact of clinical demands on UK-based educators is broadly known [ 67 ], greater recognition of the potentially disproportionately negative impact on DiT with dyscalculia needs to be made by those overseeing training delivery.

Uncertainty regarding reasonable adjustments needs addressing

Additionally, whilst educators were generally supportive of reasonable adjustments (RAs) for DiT with dyscalculia, most intending these to be enabling, caveats to RA introduction were substantial for some. Concerns regarding RA implementation for DiT with dyscalculia were similar to those in nursing and wider HEI SpLD studies [ 24 , 66 ], but less common or absolute, most relating to feasibility, fairness and adverse impact on educators. These are important to explore if inclusivity in PGME is to be further embraced. Furthermore, and similarly to HEI findings [ 24 ], participant concerns about externally-mandated RAs derived from distant SpLD experts suggest that harnessing coproduction, with greater involvement of clinical educators in RA design, could be important for future endorsement. Additionally, whilst the scale of potential RA suggestions for dyscalculia made in this study is novel, it is important that the experiences of DiT with dyscalculia themselves are captured and used to ensure adjustments are truly enabling.

Therefore, whilst this study reveals important and novel discoveries relating to educators, PGME and dyscalculia, establishing DiT experiences of dyscalculia and PGME is the most crucial avenue of future research to next undertake to better understand and enable both DiT and educators to fulfil their roles effectively and inclusively.

Limitations

As this was a small, qualitative scoping study undertaken in Wales, its findings cannot and should not be generalised. As it is seemingly the first study in this area, transferability should also be considered carefully. Due to purposive sampling, those volunteering may have been more interested in this topic; therefore, findings may not reflect the range of knowledge, attitudes, and experiences of all PGME educators.

Furthermore, use of interviews for data collection and the resultant lack of anonymity may have altered participant contributions. Moreover, despite adopting reflexivity, as a relatively inexperienced, sole researcher, I will have engaged in interviews and analysed data with intrinsic unconscious biases, introducing variability and affecting finding credibility. Despite methodological limitations within this small scoping study, my intention was to construct detailed understanding, providing a basis for future research.

Conclusions

This study reveals, seemingly for the first time, the attitudes, understanding and perceptions of PGME educators relating to DiT with dyscalculia. It highlights that lack of awareness and understanding of dyscalculia exists within the PGME educator community, especially in the absence of relatives with dyscalculia, and that widely accessible, evidence-based approaches to identification, support, teaching approaches and RA provisions are needed and wanted by PGME educators.

The rich stories of participants illuminate the emphasis educators place on experiential learning in informing their perceptions and training approaches, especially in the absence of prospective dyscalculia training or evidence base to draw upon. Given this, including the impact of limited or complete lack of dyscalculia experience and the substitution of generalisation to fill knowledge gaps found in this study, there is a real need for greater PGME-focused research to pre-emptively inform and support all educators.

Furthermore, greater acknowledgement and understanding of the seminal influence that clinical context has on educators, their attitudes towards supporting DiT with dyscalculia and the highly prized bidirectional learning relationships, as revealed in this study, are needed. It highlights the need for greater research to better understand the impact that specific nuances of PGME might have on educators’ support of DiT with dyscalculia and further characterise unmet needs. Future research must begin to address the educator uncertainties revealed in this study, including concerns relating to patient safety and care, differential approaches for dyscalculia, and perceived unfairness to other learners, in order to move PGME forward in an effective, inclusive and enabling way.

Notable in this study is the lack of the learner voice, and future research needs to begin to better understand the perceptions and experiences of DiT with dyscalculia of PGME across a wide range of aspects. These could involve those suggested by participants, including DiT PGME learning and assessment experiences, coping strategies, reasonable adjustments and cultural norm impact. Furthermore, clarifying the wider awareness and knowledge levels of PGME educators regarding dyscalculia via more quantitative approaches could help build breadth to the understanding of this poorly understood phenomenon alongside the depth provided by this study.

Data availability

No datasets were generated or analysed during the current study.

Abbreviations

ADHD: Attention Deficit and Hyperactivity Disorder

DiT: Doctors in Training

GMC: General Medical Council

HEI: Higher Education Institution

HEIW: Health Education and Improvement Wales

PGME: Postgraduate Medical Education

PSU: Professional Support Unit

RA: Reasonable Adjustment

RTA: Reflexive Thematic Analysis

SpLD: Specific Learning Difference

UK: United Kingdom

ZPD: Zone of Proximal Development

Laurillard D, Butterworth B. Review 4: The role of science and technology in improving outcomes for the case of dyscalculia. In: Current Understanding, Support Systems, and Technology-led Interventions for Specific Learning Difficulties: evidence reviews commissioned for work by the Council for Science and Technology. Council for Science and Technology, Government Office for Science; 2020. https://assets.publishing.service.gov.uk/media/5f849afa8fa8f504594d4b84/specific-learning-difficulties-spld-cst-report.pdf . Accessed 24th November 2023.

Parliamentary Office for Science and Technology (POST). Postnote: Dyslexia and dyscalculia. London: Parliamentary Office for Science and Technology. 2014. https://www.parliament.uk/globalassets/documents/post/postpn226.pdf . Accessed 9th October 2023.

Drew S. Dyscalculia in higher education. PhD Thesis, Loughborough University, UK; 2016.

Walker E, Shaw S. Specific learning difficulties in healthcare education: the meaning in the nomenclature. Nurse Educ Pract. 2018;32:97–8.


Shaw S. The impacts of dyslexia and dyspraxia on medical education. PhD Thesis, University of Brighton and the University of Sussex; 2021. p. 16.

Lewis K, Lynn D. Against the odds: insights from a statistician with dyscalculia. Educ Sci. 2017;8:63. https://doi.org/10.3390/educsci8020063 .

Lewis K, Lynn D. An insider’s view of a mathematics learning disability: compensating to gain access to fractions. Investig Math Learn. 2018;10(3):159–72. https://doi.org/10.1080/19477503.2018.1444927 .

Shrewsbury D. State of play: supporting students with specific learning difficulties. Med Teach. 2011;33(3):254–5.

Google Scholar  

Murphy M, Dowell J, Smith D. Factors associated with declaration of disability in medical students and junior doctors, and the association of declared disability with academic performance: observational study using data from the UK Medical Education Database, 2002–2018 (UKMED54). BMJ Open. 2022;12:e059179. https://doi.org/10.1136/bmjopen-2021-059179 .

Mogensen L, Hu W. ‘A doctor who really knows...’: a survey of community perspectives on medical students and practitioners with disability. BMC Med Educ. 2019;19:288. doi: 10.1186/s12909-019-1715-7

British Medical Association. Disability in the Medical Profession: Survey Findings 2020. 2021. https://www.bma.org.uk/media/2923/bma-disability-in-the-medical-profession.pdf . Accessed 9th October 2023.

General Medical Council. What is differential attainment? 2021. Available from: https:// www.gmc-uk.org/education/standards-guidance-and-curricula/projects/differential-attainment/what-is-differential-attainment . Accessed 9th October 2023.

General Medial Council. Welcomed and valued: Supporting disabled learners in medical education and training. 2019. https://www.gmc-uk.org/-/media/documents/welcomed-and-valued-2021-english_pdf-86053468.pdf . Accessed 9th October 2023.

Ellis R, Cleland J, Scrimgeour D, Lee A, Brennan P. The impact of disability on performance in a high-stakes postgraduate surgical examination: a retrospective cohort study. J Royal Soc Med. 2022;115(2):58–68.

Equality Act. 2010. c. 15. [Internet.] 2010. https://www.legislation.gov.uk/ukpga/2010/15 . Accessed 9th October 2023.

Asghar Z, et al. Performance of candidates disclosing dyslexia with other candidates in a UK medical licensing examination: cross-sectional study. Postgrad Med J. 2018;94(1110):198–203.

Botan V, Williams N, Law G, Siriwardena A. How is performance at selection to general practice related to performance at the endpoint of GP training? Report to Health Education England. 2022. https://eprints.lincoln.ac.uk/id/eprint/48920/ . Accessed 9th October 2023.

Taylor D, Hamdy H. Adult learning theories: implications for learning and teaching in medical education: AMEE Guide 83. Med Teach. 2013. https://doi.org/10.3109/0142159X.2013.828153 .

Grant J. Learning needs assessment: assessing the need. BMJ. 2002;324:156–9. https://doi.org/10.1136/bmj.324.7330.156 .

Davis N, Davis D, Bloch R. Continuing medical education: AMEE Education Guide 35. Med Teach. 2008;30(7):652–66.

Pit-Ten Cate I, Glock S. Teachers’ implicit attitudes toward students from different Social groups: a Meta-analysis. Front Psychol. 2019. https://doi.org/10.3389/fpsyg.2019.02832 .

Cambridge Dictionary. Meaning of attitude in English. [Internet.] 2022. https://dictionary.cambridge.org/dictionary/english/attitude . Accessed 9th October 2023.

L’Ecuyer K. Perceptions of nurse preceptors of students and new graduates with learning difficulties and their willingness to precept them in clinical practice (part 2). Nurse Educ Pract. 2019;34:210–7. https://doi.org/10.1016/j.nepr.2018.12.004 .

Ryder D, Norwich B. UK higher education lecturers’ perspectives of dyslexia, dyslexic students and related disability provision. J Res Spec Educ Needs. 2019;19:161–72.

Newlands F, Shrewsbury D, Robson J. Foundation doctors and dyslexia: a qualitative study of their experiences and coping strategies. Postgrad Med J. 2015;91(1073):121–6. https://doi.org/10.1136/postgradmedj-2014-132573 .

Shaw S, Anderson J. The experiences of medical students with dyslexia: an interpretive phenomenological study. Dyslexia. 2018;24(3):220–33.

L’Ecuyer K. Clinical education of nursing students with learning difficulties: an integrative review (part 1). Nurse Educ Pract. 2019;234:173–84. https://doi.org/10.1016/j.nepr.2018.11.015 .

Walker E, Shaw S, Reed M, Anderson J. The experiences of foundation doctors with dyspraxia: a phenomenological study. Adv Health Sci Educ Theory Pract. 2021;26(3):959–74.

Evans W. If they can’t tell the difference between duphalac and digoxin you’ve got patient safety issues. Nurse lecturers’ constructions of students’ dyslexic identities in nurse education. Nurse Educ Today. 2014;34(6):41–6. https://doi.org/10.1016/j.nedt.2013.11.004 .

Illing J, Carter M. Chapter 27: philosophical research perspectives and planning your research. In: Swanwick T, Forrest K, O’Brien B, editors. Understanding medical education: evidence, theory and practice. 3rd ed. Oxford: Wiley; 2019. pp. 393–6.

Atkinson K, Koenka A, Sanchez C, Moshontz H, Cooper H. Reporting standards for literature searches and report inclusion criteria: making research syntheses more transparent and easy to replicate. Res Synth Methods. 2015;6(1):87–95.

Cohen L, Manion L, Morrison K. Research methods in education. 8th ed. London: Routledge; 2017. pp. 171–86.

Book   Google Scholar  

Critical Skills Appraisal Programme (CASP): Qualitative checklist. In: Critical Appraisal Checklist. Critical appraisal skills programme. [Internet]. 2018. https://casp-uk.net/casp-tools-checklists/ . Accessed 9th October 2023.

O’Brien BC, Harris IB, Beckman TJ, Reed DA, Cook DA. Standards for reporting qualitative research: a synthesis of recommendations. Acad Med. 2014;89(9):1245–51.

Cohen L, Manion L, Morrison K. Research methods in education. 8th ed. London: Routledge; 2017. pp. 334–5.

DeJonckheere M, Vaughn L. Semistructured interviewing in primary care research: a balance of relationship and rigour. Fam Med Com Health. 2019. https://doi.org/10.1136/fmch-2018-000057 .

O’Hara C. To Teach or Not to Teach? A study of Dyslexia in Teacher Education. Cardiff Metropolitan University, UK;2013 p. 240.

Ryder D. Dyslexia assessment practice within the UK higher education sector: Assessor, lecturer and student perspectives. University of Exeter; 2016.

Schabmann A, Eichert H-C, Schmidt B, Hennes A-K, Ramacher-Faasen N. Knowledge, awareness of problems, and support: university instructors’ perspectives on dyslexia in higher education. Eur J Spec Needs Educ. 2020;35(2):273–82.

Cohen L, Manion L, Morrison K. Research methods in education. 8th ed. London: Routledge; 2017. pp. 507–24.

Cohen L, Manion L, Morrison K. Research methods in education. 8th ed. London: Routledge; 2017. ;523.

Malmqvist J, Hellberg K, Möllås G, Rose R, Shevlin M. Conducting the pilot study: a neglected part of the research process? Methodological findings supporting the importance of piloting in qualitative Research studies. Int J Qual Methods. 2019;18. https://doi.org/10.1177/1609406919878341 .

Miller T, Bell L. Consenting to what? Issues of access, gate-keeping and ‘informed’ consent. In: Mauthner M, Birch M, Jessop J, Miller T, editors. Ethics in qualitative research. London: Sage; 2002. pp. 53–5.

Cohen L, Manion L, Morrison K. Research methods in education. 8th ed. London: Routledge. 2017.;523.

Braun V, Clarke V. Using thematic analysis in psychology. Qual Res Psychol. 2006;3(2):77–101.

Braun V, Clarke V. Can I use TA? Should I use TA? Should I not use TA? Comparing reflexive thematic analysis and other pattern-based qualitative analytic approaches. Couns Psychother Res. 2020;21(2):37–47.

Braun V, Clarke V. Thematic analysis. In: Cooper H, Camic P, Long D, Panter A, Rindskopf D, Sher K, editors. APA handbook of research methods in psychology, vol. 2: Research designs: quantitative, qualitative, neuropsychological, and biological. Washington, DC: American Psychological Association; 2012. pp. 57–71.

Chapter   Google Scholar  

Braun V, Clarke V. Reflecting on reflexive thematic analysis. Qual Res Sport Exerc Health. 2019;11(4):589–97.

Byrne DA. Worked example of Braun and Clarke’s approach to reflexive thematic analysis. Qual Quant. 2021;56:1391–412.

Braun V, Clarke V. One size fits all? What counts as quality practice in (reflexive) thematic analysis? Qual. Res Psychol. 2021;18(3):328–52.

Finlay L. Outing the researcher: the provenance, process, and practice of reflexivity. Qual Health Res. 2002;12(4):531–45. https://doi.org/10.1177/104973202129120052 .

Beer O. There’s a certain slant of light’: the experience of discovery in qualitative interviewing. OTJR. 1997;17(2):127.

Cohen L, Manion L, Morrison K. Research methods in education. 8th ed. London: Routledge; 2017. ;112.

Cameron H, Nunkoosing K. Lecturer perspectives on dyslexia and dyslexic students within one faculty at one university in England. Teach High Educ. 2012;17(3):341–52.

West M, Coia D. Caring for doctors, caring for patients. London:General Medical Council. 2019. https://www.gmc-uk.org/-/media/documents/caring-for-doctors-caring-for-patients_pdf-80706341.pdf . Accessed 8th October 2023.

Kaufman DM. Teaching and learning in Medical Education: how theory can inform practice. In: Swanwick T, Forrest K, O’Brien BC, editors. Understanding Medical Education evidence theory and practice. New Jersey: Wiley Blackwell; 2019. pp. 58–9.

Swanwick T. Postgraduate medical education: the same, but different. Postgrad Med J. 2015;91:179–81.

Farmer M, Riddick B, Sterling C. Dyslexia and inclusion: assessment and support in higher education. London: Whurr; 2002. pp. 175–81.

Mortimore T. Dyslexia in higher education: creating a fully inclusive institution. J Res Spec Educ Needs. 2013;13:38–47. https://doi.org/10.1111/j.1471-3802.2012.01231.x .

Morris C. Chapter 12: Work-based learning. In: Swanwick T, Forrest K, O’Brien B, editors. Understanding medical education: Evidence, theory and practice. 3rd ed. Oxford: Wiley; 2019. p.168.

Sandhu D. Postgraduate medical education – challenges and innovative solutions. Med Teach. 2018;40(6):607–9.

Ricketts C, Brice J, Coombes L. Are multiple choice tests fair to medical students with specific learning disabilities? Adv Health Sci Educ Theory Pract. 2010;15:265–75. https://doi.org/10.1007/s10459-009-9197-8 .

Asghar Z, Williams N, Denney M, Siriwardena A. Performance in candidates declaring versus those not declaring dyslexia in a licensing clinical examination. Med Educ. 2019;53(12):1243–52.

Riddell S, Weedon E. What counts as a reasonable adjustment? Dyslexic students and the concept of fair assessment. Int Stud Sociol Educ. 2006;16(1):57–73. https://doi.org/10.1080/19620210600804301 .

Riddick R, English E. Meeting the standards? Dyslexic students and the selection process for initial teacher training. Eur J Teach Educ. 2006;29(2):203–22. https://doi.org/10.1080/02619760600617383 .

Morris D, Turnbull P. Clinical experiences of students with dyslexia. J Adv Nurs. 2006;54(2):238–47. https://doi.org/10.1111/j.1365-2648.2006.03806.x .

General Medical Council. National Training Survey 2024 results. [Internet]. 2024 p. 4–5, 24–25, 28–32. https://www.gmc-uk.org/-/media/documents/national-training-survey-summary-report-2024_pdf-107834344.pdf . Accessed 26/7/2024.


Acknowledgements

LJC would like to thank her academic supervisor Ms Helen Pugsley, Centre for Medical Education at Cardiff University, for her guidance and encouragement during LJC's Master's project. LJC would also like to thank all the interview participants who took an active part in shaping this project; she is extremely grateful for their time, honesty and for providing such vivid and illuminating windows into their roles as educators. LJC would also like to thank Dr Colette McNulty, Dr Helen Baker and wider staff members at HEIW for their support in circulating her study invitation to trainers across Wales.

LJC did not receive any funding for, or as part of, the research project described in this paper.

Author information

Authors and Affiliations

Aneurin Bevan University Health Board, Newport, UK

Laura Josephine Cheetham


Contributions

LJC designed and undertook the entirety of the research project described in this paper and wrote this paper in its entirety.

Corresponding author

Correspondence to Laura Josephine Cheetham.

Ethics declarations

Ethics approval and consent to participate

This study received ethical approval from Cardiff University's Medical Ethics Committee; following discussion, NHS Research Ethics Committee approval was deemed unnecessary. Written and verbal informed consent to participate was obtained, and prospective participants were given information about the study and their rights at least three weeks before interviews took place.

Consent for publication

Research participants gave written and verbal consent for the contents of their interviews to be analysed and reported as part of this study.

Competing interests

The authors declare no competing interests.

Author’s information

LJC is currently a final year GP registrar working in Wales with keen interests in differential attainment, inclusivity within education and civil learning environments. This paper arose from a project she designed and undertook as part of her Master's in Medical Education at Cardiff University.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below are the links to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Cite this article

Cheetham, L.J. "Because people don't know what it is, they don't really know it exists": a qualitative study of postgraduate medical educators' perceptions of dyscalculia. BMC Med Educ 24, 896 (2024). https://doi.org/10.1186/s12909-024-05912-2


Received: 27 November 2023

Accepted: 14 August 2024

Published: 20 August 2024

DOI: https://doi.org/10.1186/s12909-024-05912-2


Keywords

  • Dyscalculia
  • Postgraduate
  • Neurodiversity

