A Step-by-Step Guide to the Data Analysis Process

Like any scientific discipline, data analysis follows a rigorous step-by-step process. Each stage requires different skills and know-how. To get meaningful insights, though, it’s important to understand the process as a whole. An underlying framework is invaluable for producing results that stand up to scrutiny.

In this post, we’ll explore the main steps in the data analysis process. This will cover how to define your goal, collect data, and carry out an analysis. Where applicable, we’ll also use examples and highlight a few tools to make the journey easier. When you’re done, you’ll have a much better understanding of the basics. This will help you tweak the process to fit your own needs.

Here are the steps we’ll take you through:

  • Defining the question
  • Collecting the data
  • Cleaning the data
  • Analyzing the data
  • Sharing your results
  • Embracing failure

By popular request, we’ve also developed a video based on this article. You’ll find it further down the page.

Ready? Let’s get started with step one.

1. Step one: Defining the question

The first step in any data analysis process is to define your objective. In data analytics jargon, this is sometimes called the ‘problem statement’.

Defining your objective means coming up with a hypothesis and figuring out how to test it. Start by asking: What business problem am I trying to solve? While this might sound straightforward, it can be trickier than it seems. For instance, your organization’s senior management might pose an issue, such as: “Why are we losing customers?” It’s possible, though, that this doesn’t get to the core of the problem. A data analyst’s job is to understand the business and its goals in enough depth that they can frame the problem the right way.

Let’s say you work for a fictional company called TopNotch Learning. TopNotch creates custom training software for its clients. While it is excellent at securing new clients, it has much lower repeat business. As such, your question might not be, “Why are we losing customers?” but, “Which factors are negatively impacting the customer experience?” or better yet: “How can we boost customer retention while minimizing costs?”

Now you’ve defined a problem, you need to determine which sources of data will best help you solve it. This is where your business acumen comes in again. For instance, perhaps you’ve noticed that the sales process for new clients is very slick, but that the production team is inefficient. Knowing this, you could hypothesize that the sales process wins lots of new clients, but the subsequent customer experience is lacking. Could this be why customers don’t come back? Which sources of data will help you answer this question?

Tools to help define your objective

Defining your objective is mostly about soft skills, business knowledge, and lateral thinking. But you’ll also need to keep track of business metrics and key performance indicators (KPIs). Monthly reports can allow you to track problem points in the business. Some KPI dashboards come with a fee, like Databox and DashThis. However, you’ll also find open-source software like Grafana, Freeboard, and Dashbuilder. These are great for producing simple dashboards, both at the beginning and the end of the data analysis process.

2. Step two: Collecting the data

Once you’ve established your objective, you’ll need to create a strategy for collecting and aggregating the appropriate data. A key part of this is determining which data you need. This might be quantitative (numeric) data, e.g. sales figures, or qualitative (descriptive) data, such as customer reviews. All data fit into one of three categories: first-party, second-party, and third-party data. Let’s explore each one.

What is first-party data?

First-party data is data that you, or your company, have collected directly from customers. It might come in the form of transactional tracking data or information from your company’s customer relationship management (CRM) system. Whatever its source, first-party data is usually structured and organized in a clear, defined way. Other sources of first-party data might include customer satisfaction surveys, focus groups, interviews, or direct observation.

What is second-party data?

To enrich your analysis, you might want to secure a secondary data source. Second-party data is the first-party data of other organizations. This might be available directly from the company or through a private marketplace. The main benefit of second-party data is that it is usually structured, and although it will be less relevant than first-party data, it also tends to be quite reliable. Examples of second-party data include website, app, or social media activity, such as online purchase histories or shipping data.

What is third-party data?

Third-party data is data that has been collected and aggregated from numerous sources by a third-party organization. Often (though not always) third-party data contains a vast amount of unstructured data points (big data). Many organizations collect big data to create industry reports or to conduct market research. The research and advisory firm Gartner is a good real-world example of an organization that collects big data and sells it on to other companies. Open data repositories and government portals are also sources of third-party data.

Tools to help you collect data

Once you’ve devised a data strategy (i.e. you’ve identified which data you need, and how best to go about collecting them) there are many tools you can use to help you. One thing you’ll need, regardless of industry or area of expertise, is a data management platform (DMP). A DMP is a piece of software that allows you to identify and aggregate data from numerous sources, before manipulating them, segmenting them, and so on. There are many DMPs available. Some well-known enterprise DMPs include Salesforce DMP, SAS, and the data integration platform Xplenty. If you want to play around, you can also try some open-source platforms like Pimcore or D:Swarm.
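
If you’re comfortable with a little Python, here’s a minimal sketch of the aggregation idea a DMP automates. The data and column names below are invented purely for illustration; it’s not a substitute for a real DMP:

```python
import pandas as pd

# Two invented first-party sources: CRM records and survey responses
crm = pd.DataFrame({
    "client_id": [1, 2, 3],
    "sector": ["retail", "finance", "retail"],
    "contract_value": [12000, 45000, 8000],
})
survey = pd.DataFrame({
    "client_id": [1, 2],
    "satisfaction_score": [7, 4],
})

# Aggregate the sources on a shared key so they can be segmented and analyzed together
clients = crm.merge(survey, on="client_id", how="left")

# Simple segmentation: flag clients with low reported satisfaction
clients["low_satisfaction"] = clients["satisfaction_score"] < 5
print(clients)
```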

Want to learn more about what data analytics is and the process a data analyst follows? We cover this topic (and more) in our free introductory short course for beginners. Check out tutorial one: An introduction to data analytics.

3. Step three: Cleaning the data

Once you’ve collected your data, the next step is to get it ready for analysis. This means cleaning, or ‘scrubbing’ it, which is crucial in making sure that you’re working with high-quality data. Key data cleaning tasks include:

  • Removing major errors, duplicates, and outliers—all of which are inevitable problems when aggregating data from numerous sources.
  • Removing unwanted data points—dropping irrelevant observations that have no bearing on your intended analysis.
  • Bringing structure to your data—general ‘housekeeping’, i.e. fixing typos or layout issues, which will help you map and manipulate your data more easily.
  • Filling in major gaps—as you’re tidying up, you might notice that important data are missing. Once you’ve identified gaps, you can go about filling them.

A good data analyst will spend around 70-90% of their time cleaning their data. This might sound excessive. But focusing on the wrong data points (or analyzing erroneous data) will severely impact your results. It might even send you back to square one…so don’t rush it! You’ll find a step-by-step guide to data cleaning here. You may be interested in this introductory tutorial to data cleaning, hosted by Dr. Humera Noor Minhas.
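
If you’re working in Python, here’s a rough pandas sketch of the four cleaning tasks listed above. The dataset and thresholds are invented, so treat it as a starting point rather than a recipe:

```python
import pandas as pd

# An invented raw export with typical quality problems
raw = pd.DataFrame({
    "client": ["Acme", "Acme", "Beta Co", "Gamma", "Delta"],
    "sector": ["Retail", "Retail", "Finance", "retail ", None],
    "project_cost": [12000, 12000, 45000, 950000, 8000],   # 950000 looks like an outlier
    "delivery_days": [30, 30, None, 45, 28],
})

df = raw.drop_duplicates()                                        # remove duplicate rows
df = df[df["project_cost"] < df["project_cost"].quantile(0.95)]   # drop extreme cost outliers (a crude rule)
df = df.drop(columns=["internal_notes"], errors="ignore")         # drop unwanted data points, if present
df["sector"] = df["sector"].str.strip().str.lower()               # bring structure: tidy inconsistent text
df["delivery_days"] = df["delivery_days"].fillna(df["delivery_days"].median())  # fill major gaps
print(df)
```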

Carrying out an exploratory analysis

Another thing many data analysts do (alongside cleaning data) is to carry out an exploratory analysis. This helps identify initial trends and characteristics, and can even refine your hypothesis. Let’s use our fictional learning company as an example again. Carrying out an exploratory analysis, perhaps you notice a correlation between how much TopNotch Learning’s clients pay and how quickly they move on to new suppliers. This might suggest that a low-quality customer experience (the assumption in your initial hypothesis) is actually less of an issue than cost. You might, therefore, take this into account.
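
To make that concrete, here’s a tiny, hypothetical check in pandas. The figures are made up; the point is simply that a one-line correlation can steer your hypothesis:

```python
import pandas as pd

# Invented figures: what each client pays per year and how long they stayed
clients = pd.DataFrame({
    "annual_fee": [5000, 8000, 12000, 20000, 26000, 31000],
    "months_before_leaving": [26, 24, 18, 14, 9, 7],
})

# Do higher-paying clients leave sooner? A value near -1 suggests a strong negative relationship.
print(clients["annual_fee"].corr(clients["months_before_leaving"]))
```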

Tools to help you clean your data

Cleaning datasets manually—especially large ones—can be daunting. Luckily, there are many tools available to streamline the process. Open-source tools, such as OpenRefine, are excellent for basic data cleaning, as well as high-level exploration. However, free tools offer limited functionality for very large datasets. Python libraries (e.g. Pandas) and some R packages are better suited for heavy data scrubbing. You will, of course, need to be familiar with the languages. Alternatively, enterprise tools are also available. One example is Data Ladder, one of the highest-rated data-matching tools in the industry. There are many more. Why not see which free data cleaning tools you can find to play around with?

4. Step four: Analyzing the data

Finally, you’ve cleaned your data. Now comes the fun bit—analyzing it! The type of data analysis you carry out largely depends on what your goal is. But there are many techniques available. Univariate or bivariate analysis, time-series analysis, and regression analysis are just a few you might have heard of. More important than the different types, though, is how you apply them. This depends on what insights you’re hoping to gain. Broadly speaking, all types of data analysis fit into one of the following four categories.

Descriptive analysis

Descriptive analysis identifies what has already happened. It is a common first step that companies carry out before proceeding with deeper explorations. As an example, let’s refer back to our fictional learning provider once more. TopNotch Learning might use descriptive analytics to analyze course completion rates for their customers. Or they might identify how many users access their products during a particular period. Perhaps they’ll use it to measure sales figures over the last five years. While the company might not draw firm conclusions from any of these insights, summarizing and describing the data will help them to determine how to proceed.
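
In practice, a descriptive summary can be as simple as a few aggregations. Here’s an illustrative pandas sketch with invented course data:

```python
import pandas as pd

# Invented usage records: one row per client course enrollment
records = pd.DataFrame({
    "client": ["Acme", "Acme", "Beta Co", "Beta Co", "Gamma"],
    "year": [2022, 2023, 2022, 2023, 2023],
    "completed": [1, 0, 1, 1, 0],
})

# What has already happened: completion rate per client, and enrollments per year
print(records.groupby("client")["completed"].mean())
print(records.groupby("year").size())
```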

Learn more: What is descriptive analytics?

Diagnostic analysis

Diagnostic analytics focuses on understanding why something has happened. It is literally the diagnosis of a problem, just as a doctor uses a patient’s symptoms to diagnose a disease. Remember TopNotch Learning’s business problem? ‘Which factors are negatively impacting the customer experience?’ A diagnostic analysis would help answer this. For instance, it could help the company draw correlations between the issue (struggling to gain repeat business) and factors that might be causing it (e.g. project costs, speed of delivery, customer sector, etc.). Let’s imagine that, using diagnostic analytics, TopNotch realizes its clients in the retail sector are departing at a faster rate than other clients. This might suggest that they’re losing customers because they lack expertise in this sector. And that’s a useful insight!

Predictive analysis

Predictive analysis allows you to identify future trends based on historical data. In business, predictive analysis is commonly used to forecast future growth, for example. But it doesn’t stop there. Predictive analysis has grown increasingly sophisticated in recent years. The speedy evolution of machine learning allows organizations to make surprisingly accurate forecasts. Take the insurance industry. Insurance providers commonly use past data to predict which customer groups are more likely to get into accidents. As a result, they’ll hike up customer insurance premiums for those groups. Likewise, the retail industry often uses transaction data to predict where future trends lie, or to determine seasonal buying habits to inform their strategies. These are just a few simple examples, but the untapped potential of predictive analysis is pretty compelling.
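
If you’d like to experiment, here’s a hedged scikit-learn sketch of the idea. The data is synthetic and the “churn rule” is invented, so it only illustrates the workflow of training on historical records and testing on held-out ones:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic historical data: one row per client, label 1 means the client churned
rng = np.random.default_rng(42)
n = 500
project_cost = rng.uniform(5_000, 50_000, n)
delivery_days = rng.uniform(10, 90, n)
churned = ((0.00004 * project_cost + 0.02 * delivery_days + rng.normal(0, 0.5, n)) > 2.5).astype(int)

X = np.column_stack([project_cost, delivery_days])
X_train, X_test, y_train, y_test = train_test_split(X, churned, test_size=0.2, random_state=0)

# Fit on the past, evaluate on unseen clients
model = make_pipeline(StandardScaler(), LogisticRegression()).fit(X_train, y_train)
print("Holdout accuracy:", accuracy_score(y_test, model.predict(X_test)))
```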

Prescriptive analysis

Prescriptive analysis allows you to make recommendations for the future. This is the final step in the analytics part of the process. It’s also the most complex. This is because it incorporates aspects of all the other analyses we’ve described. A great example of prescriptive analytics is the algorithms that guide Google’s self-driving cars. Every second, these algorithms make countless decisions based on past and present data, ensuring a smooth, safe ride. Prescriptive analytics also helps companies decide on new products or areas of business to invest in.

Learn more: What are the different types of data analysis?

5. Step five: Sharing your results

You’ve finished carrying out your analyses. You have your insights. The final step of the data analytics process is to share these insights with the wider world (or at least with your organization’s stakeholders!). This is more complex than simply sharing the raw results of your work—it involves interpreting the outcomes, and presenting them in a manner that’s digestible for all types of audiences. Since you’ll often present information to decision-makers, it’s very important that the insights you present are 100% clear and unambiguous. For this reason, data analysts commonly use reports, dashboards, and interactive visualizations to support their findings.

How you interpret and present results will often influence the direction of a business. Depending on what you share, your organization might decide to restructure, to launch a high-risk product, or even to close an entire division. That’s why it’s very important to provide all the evidence that you’ve gathered, and not to cherry-pick data. Ensuring that you cover everything in a clear, concise way will prove that your conclusions are scientifically sound and based on the facts. On the flip side, it’s important to highlight any gaps in the data or to flag any insights that might be open to interpretation. Honest communication is the most important part of the process. It will help the business, while also helping you to excel at your job!

Tools for interpreting and sharing your findings

There are tons of data visualization tools available, suited to different experience levels. Popular tools requiring little or no coding skills include Google Charts, Tableau, Datawrapper, and Infogram. If you’re familiar with Python and R, there are also many data visualization libraries and packages available. For instance, check out the Python libraries Plotly, Seaborn, and Matplotlib. Whichever data visualization tools you use, make sure you polish up your presentation skills, too. Remember: Visualization is great, but communication is key!
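
As a small illustration (the retention figures below are invented), a stakeholder-ready chart in Matplotlib can be just a few lines:

```python
import matplotlib.pyplot as plt

# Invented example data: customer retention rate by quarter
quarters = ["Q1", "Q2", "Q3", "Q4"]
retention = [0.72, 0.68, 0.74, 0.79]

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(quarters, retention, color="steelblue")
ax.set_ylim(0, 1)
ax.set_ylabel("Retention rate")
ax.set_title("Customer retention by quarter")
fig.tight_layout()
fig.savefig("retention_by_quarter.png")  # export to embed in a report or slide deck
```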

You can learn more about storytelling with data in this free, hands-on tutorial. We show you how to craft a compelling narrative for a real dataset, resulting in a presentation to share with key stakeholders. This is an excellent insight into what it’s really like to work as a data analyst!

6. Step six: Embrace your failures

The last ‘step’ in the data analytics process is to embrace your failures. The path we’ve described above is more of an iterative process than a one-way street. Data analytics is inherently messy, and the process you follow will be different for every project. For instance, while cleaning data, you might spot patterns that spark a whole new set of questions. This could send you back to step one (to redefine your objective). Equally, an exploratory analysis might highlight a set of data points you’d never considered using before. Or maybe you find that the results of your core analyses are misleading or erroneous. This might be caused by mistakes in the data, or human error earlier in the process.

While these pitfalls can feel like failures, don’t be disheartened if they happen. Data analysis is inherently chaotic, and mistakes occur. What’s important is to hone your ability to spot and rectify errors. If data analytics were straightforward, it might be easier, but it certainly wouldn’t be as interesting. Use the steps we’ve outlined as a framework, stay open-minded, and be creative. If you lose your way, you can refer back to the process to keep yourself on track.

In this post, we’ve covered the main steps of the data analytics process. These core steps can be amended, re-ordered and re-used as you deem fit, but they underpin every data analyst’s work:

  • Define the question —What business problem are you trying to solve? Frame it as a question to help you focus on finding a clear answer.
  • Collect data —Create a strategy for collecting data. Which data sources are most likely to help you solve your business problem?
  • Clean the data —Explore, scrub, tidy, de-dupe, and structure your data as needed. Do whatever you have to! But don’t rush…take your time!
  • Analyze the data —Carry out various analyses to obtain insights. Focus on the four types of data analysis: descriptive, diagnostic, predictive, and prescriptive.
  • Share your results —How best can you share your insights and recommendations? A combination of visualization tools and communication is key.
  • Embrace your mistakes —Mistakes happen. Learn from them. This is what transforms a good data analyst into a great one.

What next? From here, we strongly encourage you to explore the topic on your own. Get creative with the steps in the data analysis process, and see what tools you can find. As long as you stick to the core principles we’ve described, you can create a tailored technique that works for you.

To learn more, check out our free, 5-day data analytics short course. You might also be interested in the following:

  • These are the top 9 data analytics tools
  • 10 great places to find free datasets for your next project
  • How to build a data analytics portfolio


Data Science and Analytics: An Overview from Data-Driven Smart Computing, Decision-Making and Applications Perspective

Iqbal H. Sarker

1 Swinburne University of Technology, Melbourne, VIC 3122 Australia

2 Department of Computer Science and Engineering, Chittagong University of Engineering & Technology, Chittagong, 4349 Bangladesh

The digital world has a wealth of data, such as internet of things (IoT) data, business data, health data, mobile data, urban data, security data, and many more, in the current age of the Fourth Industrial Revolution (Industry 4.0 or 4IR). Extracting knowledge or useful insights from these data can be used for smart decision-making in various application domains. In the area of data science, advanced analytics methods including machine learning modeling can provide actionable insights or deeper knowledge about data, which makes the computing process automatic and smart. In this paper, we present a comprehensive view on “Data Science” including various types of advanced analytics methods that can be applied to enhance the intelligence and capabilities of an application through smart decision-making in different scenarios. We also discuss and summarize ten potential real-world application domains including business, healthcare, cybersecurity, urban and rural data science, and so on by taking into account data-driven smart computing and decision making. Based on this, we finally highlight the challenges and potential research directions within the scope of our study. Overall, this paper aims to serve as a reference point on data science and advanced analytics to the researchers and decision-makers as well as application developers, particularly from the data-driven solution point of view for real-world problems.

Introduction

We are living in the age of “data science and advanced analytics”, where almost everything in our daily lives is digitally recorded as data [ 17 ]. Thus the current electronic world is a wealth of various kinds of data, such as business data, financial data, healthcare data, multimedia data, internet of things (IoT) data, cybersecurity data, social media data, etc. [ 112 ]. The data can be structured, semi-structured, or unstructured, which increases day by day [ 105 ]. Data science is typically a “concept to unify statistics, data analysis, and their related methods” to understand and analyze the actual phenomena with data. According to Cao et al. [ 17 ] “data science is the science of data” or “data science is the study of data”, where a data product is a data deliverable, or data-enabled or guided, which can be a discovery, prediction, service, suggestion, insight into decision-making, thought, model, paradigm, tool, or system. The popularity of “Data science” is increasing day-by-day, which is shown in Fig. 1 according to Google Trends data over the last 5 years [ 36 ]. In addition to data science, we have also shown the popularity trends of the relevant areas such as “Data analytics”, “Data mining”, “Big data”, “Machine learning” in the figure. According to Fig. 1, the popularity indication values for these data-driven domains, particularly “Data science”, and “Machine learning” are increasing day-by-day. This statistical information and the applicability of the data-driven smart decision-making in various real-world application areas, motivate us to study briefly on “Data science” and machine-learning-based “Advanced analytics” in this paper.

Fig. 1  The worldwide popularity score of data science compared with relevant areas, in a range of 0 (min) to 100 (max) over time, where the x-axis represents the timestamp information and the y-axis represents the corresponding score

Usually, data science is the field of applying advanced analytics methods and scientific concepts to derive useful business information from data. The emphasis of advanced analytics is more on anticipating the use of data to detect patterns to determine what is likely to occur in the future. Basic analytics offer a description of data in general, while advanced analytics is a step forward in offering a deeper understanding of data and helping to analyze granular data, which we are interested in. In the field of data science, several types of analytics are popular, such as "Descriptive analytics" which answers the question of what happened; "Diagnostic analytics" which answers the question of why did it happen; "Predictive analytics" which predicts what will happen in the future; and "Prescriptive analytics" which prescribes what action should be taken, discussed briefly in “ Advanced analytics methods and smart computing ”. Such advanced analytics and decision-making based on machine learning techniques [ 105 ], a major part of artificial intelligence (AI) [ 102 ] can also play a significant role in the Fourth Industrial Revolution (Industry 4.0) due to its learning capability for smart computing as well as automation [ 121 ].

Although the area of “data science” is huge, we mainly focus on deriving useful insights through advanced analytics, where the results are used to make smart decisions in various real-world application areas. For this, various advanced analytics methods such as machine learning modeling, natural language processing, sentiment analysis, neural network, or deep learning analysis can provide deeper knowledge about data, and thus can be used to develop data-driven intelligent applications. More specifically, regression analysis, classification, clustering analysis, association rules, time-series analysis, sentiment analysis, behavioral patterns, anomaly detection, factor analysis, log analysis, and deep learning, which originated from artificial neural networks, are taken into account in our study. These machine learning-based advanced analytics methods are discussed briefly in “Advanced analytics methods and smart computing”. Thus, it’s important to understand the principles of the various advanced analytics methods mentioned above and their applicability in various real-world application areas. For instance, in our earlier paper Sarker et al. [ 114 ], we have discussed how data science and machine learning modeling can play a significant role in the domain of cybersecurity for making smart decisions and providing data-driven intelligent security services. In this paper, we broadly take into account the data science application areas and real-world problems in ten potential domains including the area of business data science, health data science, IoT data science, behavioral data science, urban data science, and so on, discussed briefly in “Real-world application domains”.

Based on the importance of machine learning modeling to extract the useful insights from the data mentioned above and data-driven smart decision-making, in this paper, we present a comprehensive view on “Data Science” including various types of advanced analytics methods that can be applied to enhance the intelligence and the capabilities of an application. The key contribution of this study is thus understanding data science modeling, explaining different analytic methods for solution perspective and their applicability in various real-world data-driven applications areas mentioned earlier. Overall, the purpose of this paper is, therefore, to provide a basic guide or reference for those academia and industry people who want to study, research, and develop automated and intelligent applications or systems based on smart computing and decision making within the area of data science.

The main contributions of this paper are summarized as follows:

  • To define the scope of our study towards data-driven smart computing and decision-making in our real-world life. We also make a brief discussion on the concept of data science modeling from business problems to data product and automation, to understand its applicability and provide intelligent services in real-world scenarios.
  • To provide a comprehensive view on data science including advanced analytics methods that can be applied to enhance the intelligence and the capabilities of an application.
  • To discuss the applicability and significance of machine learning-based analytics methods in various real-world application areas. We also summarize ten potential real-world application areas, from business to personalized applications in our daily life, where advanced analytics with machine learning modeling can be used to achieve the expected outcome.
  • To highlight and summarize the challenges and potential research directions within the scope of our study.

The rest of the paper is organized as follows. The next section provides the background and related work and defines the scope of our study. The following section presents the concepts of data science modeling for building a data-driven application. After that, we briefly discuss and explain different advanced analytics methods and smart computing. Various real-world application areas are discussed and summarized in the next section. We then highlight and summarize several research issues and potential future directions, and finally, the last section concludes this paper.

Background and Related Work

In this section, we first discuss various data terms and works related to data science and highlight the scope of our study.

Data Terms and Definitions

There is a range of key terms in the field, such as data analysis, data mining, data analytics, big data, data science, advanced analytics, machine learning, and deep learning, which are highly related and easily confusing. In the following, we define these terms and differentiate them with the term “Data Science” according to our goal.

The term “Data analysis” refers to the processing of data by conventional (e.g., classic statistical, empirical, or logical) theories, technologies, and tools for extracting useful information and for practical purposes [ 17 ]. The term “Data analytics”, on the other hand, refers to the theories, technologies, instruments, and processes that allow for an in-depth understanding and exploration of actionable data insight [ 17 ]. Statistical and mathematical analysis of the data is the major concern in this process. “Data mining” is another popular term over the last decade, which has a similar meaning with several other terms such as knowledge mining from data, knowledge extraction, knowledge discovery from data (KDD), data/pattern analysis, data archaeology, and data dredging. According to Han et al. [ 38 ], it should have been more appropriately named “knowledge mining from data”. Overall, data mining is defined as the process of discovering interesting patterns and knowledge from large amounts of data [ 38 ]. Data sources may include databases, data centers, the Internet or Web, other repositories of data, or data dynamically streamed through the system. “Big data” is another popular term nowadays, which may change the statistical and data analysis approaches as it has the unique features of “massive, high dimensional, heterogeneous, complex, unstructured, incomplete, noisy, and erroneous” [ 74 ]. Big data can be generated by mobile devices, social networks, the Internet of Things, multimedia, and many other new applications [ 129 ]. Several unique features including volume, velocity, variety, veracity, value (5Vs), and complexity are used to understand and describe big data [ 69 ].

In terms of analytics, basic analytics provides a summary of data whereas the term “Advanced Analytics” takes a step forward in offering a deeper understanding of data and helps to analyze granular data. Advanced analytics is characterized or defined as autonomous or semi-autonomous data or content analysis using advanced techniques and methods to discover deeper insights, predict or generate recommendations, typically beyond traditional business intelligence or analytics. “Machine learning”, a branch of artificial intelligence (AI), is one of the major techniques used in advanced analytics which can automate analytical model building [ 112 ]. This is focused on the premise that systems can learn from data, recognize trends, and make decisions, with minimal human involvement [ 38 , 115 ]. “Deep Learning” is a subfield of machine learning that discusses algorithms inspired by the human brain’s structure and the function called artificial neural networks [ 38 , 139 ].

Unlike the above data-related terms, “Data science” is an umbrella term that encompasses advanced data analytics, data mining, machine, and deep learning modeling, and several other related disciplines like statistics, to extract insights or useful knowledge from the datasets and transform them into actionable business strategies. In [ 17 ], Cao et al. defined data science from the disciplinary perspective as “data science is a new interdisciplinary field that synthesizes and builds on statistics, informatics, computing, communication, management, and sociology to study data and its environments (including domains and other contextual aspects, such as organizational and social aspects) to transform data to insights and decisions by following a data-to-knowledge-to-wisdom thinking and methodology”. In “ Understanding data science modeling ”, we briefly discuss the data science modeling from a practical perspective starting from business problems to data products that can assist the data scientists to think and work in a particular real-world problem domain within the area of data science and analytics.

Related Work

In the area, several papers have been reviewed by the researchers based on data science and its significance. For example, the authors in [ 19 ] identify the evolving field of data science and its importance in the broader knowledge environment and some issues that differentiate data science and informatics issues from conventional approaches in information sciences. Donoho et al. [ 27 ] present 50 years of data science including recent commentary on data science in mass media, and on how/whether data science varies from statistics. The authors formally conceptualize the theory-guided data science (TGDS) model in [ 53 ] and present a taxonomy of research themes in TGDS. Cao et al. include a detailed survey and tutorial on the fundamental aspects of data science in [ 17 ], which considers the transition from data analysis to data science, the principles of data science, as well as the discipline and competence of data education.

Besides, the authors include a data science analysis in [ 20 ], which aims to provide a realistic overview of the use of statistical features and related data science methods in bioimage informatics. The authors in [ 61 ] study the key streams of data science algorithm use at central banks and show how their popularity has risen over time. This research contributes to the creation of a research vector on the role of data science in central banking. In [ 62 ], the authors provide an overview and tutorial on the data-driven design of intelligent wireless networks. The authors in [ 87 ] provide a thorough understanding of computational optimal transport with application to data science. In [ 97 ], the authors present data science as theoretical contributions in information systems via text analytics.

Unlike the above recent studies, in this paper, we concentrate on the knowledge of data science including advanced analytics methods, machine learning modeling, real-world application domains, and potential research directions within the scope of our study. The advanced analytics methods based on machine learning techniques discussed in this paper can be applied to enhance the capabilities of an application in terms of data-driven intelligent decision making and automation in the final data product or systems.

Understanding Data Science Modeling

In this section, we briefly discuss how data science can play a significant role in the real-world business process. For this, we first categorize various types of data and then discuss the major steps of data science modeling starting from business problems to data product and automation.

Types of Real-World Data

Typically, to build a data-driven real-world system in a particular domain, the availability of data is the key [ 17 , 112 , 114 ]. The data can be in different types such as (i) Structured—that has a well-defined data structure and follows a standard order, examples are names, dates, addresses, credit card numbers, stock information, geolocation, etc.; (ii) Unstructured—has no pre-defined format or organization, examples are sensor data, emails, blog entries, wikis, and word processing documents, PDF files, audio files, videos, images, presentations, web pages, etc.; (iii) Semi-structured—has elements of both the structured and unstructured data containing certain organizational properties, examples are HTML, XML, JSON documents, NoSQL databases, etc.; and (iv) Metadata—that represents data about the data, examples are author, file type, file size, creation date and time, last modification date and time, etc. [ 38 , 105 ].

In the area of data science, researchers use various widely-used datasets for different purposes. These are, for example, cybersecurity datasets such as NSL-KDD [ 127 ], UNSW-NB15 [ 79 ], Bot-IoT [ 59 ], ISCX’12 [ 15 ], CIC-DDoS2019 [ 22 ], etc., smartphone datasets such as phone call logs [ 88 , 110 ], mobile application usages logs [ 124 , 149 ], SMS Log [ 28 ], mobile phone notification logs [ 77 ] etc., IoT data [ 56 , 11 , 64 ], health data such as heart disease [ 99 ], diabetes mellitus [ 86 , 147 ], COVID-19 [ 41 , 78 ], etc., agriculture and e-commerce data [ 128 , 150 ], and many more in various application domains. In “ Real-world application domains ”, we discuss ten potential real-world application domains of data science and analytics by taking into account data-driven smart computing and decision making, which can help the data scientists and application developers to explore more in various real-world issues.

Overall, the data used in data-driven applications can be any of the types mentioned above, and they can differ from one application to another in the real world. Data science modeling, which is briefly discussed below, can be used to analyze such data in a specific problem domain and derive insights or useful information from the data to build a data-driven model or data product.

Steps of Data Science Modeling

Data science is typically an umbrella term that encompasses advanced data analytics, data mining, machine, and deep learning modeling, and several other related disciplines like statistics, to extract insights or useful knowledge from the datasets and transform them into actionable business strategies, mentioned earlier in “Background and related work”. In this section, we briefly discuss how data science can play a significant role in the real-world business process. Fig. 2 shows an example of data science modeling starting from real-world data to data-driven product and automation. In the following, we briefly discuss each module of the data science process.

  • Understanding business problems: This involves getting a clear understanding of the problem that needs to be solved, how it impacts the relevant organization or individuals, the ultimate goals for addressing it, and the relevant project plan. Thus, to understand and identify the business problems, the data scientists formulate relevant questions while working with the end-users and other stakeholders. For instance, how much/many, which category/group, is the behavior unrealistic/abnormal, which option should be taken, what action, etc. could be relevant questions depending on the nature of the problems. This helps to get a better idea of what the business needs and what should be extracted from the data. Such business knowledge enables organizations to enhance their decision-making process, which is known as “Business Intelligence” [ 65 ]. Identifying the relevant data sources that can help to answer the formulated questions, and what kinds of actions should be taken from the trends that the data shows, is another important task associated with this stage. Once the business problem has been clearly stated, the data scientist can define the analytic approach to solve the problem.
  • Understanding data: Data science is largely driven by the availability of data [ 114 ]. Thus, a sound understanding of the data is needed to build a data-driven model or system. The reason is that real-world data sets are often noisy, contain missing values and inconsistencies, or have other data issues, which need to be handled effectively [ 101 ]. To gain actionable insights, the appropriate data must be sourced and cleansed, which is fundamental to any data science engagement. For this, a data assessment that evaluates what data is available and how it aligns with the business problem could be the first step in data understanding. Several aspects such as data type/format, whether the quantity of data is sufficient to extract useful knowledge, data relevance, authorized access to data, feature or attribute importance, combining multiple data sources, important metrics to report the data, etc. need to be taken into account to clearly understand the data for a particular business problem. Overall, the data understanding module involves figuring out what data would best be needed and the best ways to acquire it.
  • Data pre-processing and exploration: Exploratory data analysis is defined in data science as an approach to analyzing datasets to summarize their key characteristics, often with visual methods [ 135 ]. This examines a broad data collection to discover initial trends, attributes, points of interest, etc. in an unstructured manner to construct meaningful summaries of the data. Thus data exploration is typically used to figure out the gist of data and to develop a first step assessment of its quality, quantity, and characteristics. A statistical model can be used or not, but primarily it offers tools for creating hypotheses by generally visualizing and interpreting the data through graphical representation such as a chart, plot, histogram, etc [ 72 , 91 ]. Before the data is ready for modeling, it’s necessary to use data summarization and visualization to audit the quality of the data and provide the information needed to process it. To ensure the quality of the data, the data  pre-processing technique, which is typically the process of cleaning and transforming raw data [ 107 ] before processing and analysis is important. It also involves reformatting information, making data corrections, and merging data sets to enrich data. Thus, several aspects such as expected data, data cleaning, formatting or transforming data, dealing with missing values, handling data imbalance and bias issues, data distribution, search for outliers or anomalies in data and dealing with them, ensuring data quality, etc. could be the key considerations in this step.
  • Machine learning modeling and evaluation: Once the data is prepared for building the model, data scientists design a model, algorithm, or set of models, to address the business problem. Model building depends on what type of analytics, e.g., predictive analytics, is needed to solve the particular problem, which is discussed briefly in “Advanced analytics methods and smart computing”. To best fit the data according to the type of analytics, different types of data-driven or machine learning models, summarized in our earlier paper Sarker et al. [ 105 ], can be built to achieve the goal. Data scientists typically separate training and test subsets of the given dataset, usually dividing it in the ratio of 80:20, or split the data using the popular k-folds method [ 38 ]. This is to observe whether the model performs well or not on the data, to maximize the model performance. Various model validation and assessment metrics, such as error rate, accuracy, true positive, false positive, true negative, false negative, precision, recall, f-score, ROC (receiver operating characteristic curve) analysis, applicability analysis, etc. [ 38 , 115 ] are used to measure the model performance, which can guide the data scientists to choose or design the learning method or model. Besides, machine learning experts or data scientists can take into account several advanced analytics techniques such as feature engineering, feature selection or extraction methods, algorithm tuning, ensemble methods, modifying existing algorithms, or designing new algorithms, etc. to improve the ultimate data-driven model to solve a particular business problem through smart decision making. (A brief illustrative sketch in Python follows this list.)
  • Data product and automation: A data product is typically the output of any data science activity [ 17 ]. A data product, in general terms, is a data deliverable, or data-enabled or guided product, which can be a discovery, prediction, service, suggestion, insight into decision-making, thought, model, paradigm, tool, application, or system that processes data and generates results. Businesses can use the results of such data analysis to obtain useful information like churn (a measure of how many customers stop using a product) prediction and customer segmentation, and use these results to make smarter business decisions and drive automation. Thus, to make better decisions on various business problems, various machine learning pipelines and data products can be developed. To highlight this, we summarize several potential real-world data science application areas in “Real-world application domains”, where various data products can play a significant role in relevant business problems to make them smart and automated.
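
As a minimal illustrative sketch of the modeling and evaluation step described above (not part of the original study; a public benchmark dataset stands in for the business data), the 80:20 split and assessment metrics can be realized in Python with scikit-learn as follows:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Public benchmark data used as a stand-in for a real business dataset
X, y = load_breast_cancer(return_X_y=True)

# 80:20 train/test split, as mentioned in the text
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)

# A few of the model assessment metrics listed above
print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("f-score  :", f1_score(y_test, y_pred))
```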

Overall, we can conclude that data science modeling can be used to help drive changes and improvements in business practices. The most important part of the data science process is having a deeper understanding of the business problem to solve. Without that, it would be much harder to gather the right data and extract the most useful information from the data for making decisions to solve the problem. In terms of role, “Data Scientists” typically interpret and manage data to uncover the answers to major questions that help organizations to make objective decisions and solve complex problems. In summary, a data scientist proactively gathers and analyzes information from multiple sources to better understand how the business performs, and designs machine learning or data-driven tools, methods, or algorithms, focused on advanced analytics, which can make today’s computing process smarter and more intelligent, as discussed briefly in the following section.

Fig. 2  An example of data science modeling from real-world data to data-driven system and decision making

Advanced Analytics Methods and Smart Computing

As mentioned earlier in “Background and related work”, basic analytics provides a summary of data whereas advanced analytics takes a step forward in offering a deeper understanding of data and helps in granular data analysis. For instance, the predictive capabilities of advanced analytics can be used to forecast trends, events, and behaviors. Thus, “advanced analytics” can be defined as the autonomous or semi-autonomous analysis of data or content using advanced techniques and methods to discover deeper insights, make predictions, or produce recommendations, where machine learning-based analytical modeling is considered the key technology in the area. In the following section, we first summarize the various types of analytics and outcomes that are needed to solve the associated business problems, and then we briefly discuss machine learning-based analytical modeling.

Types of Analytics and Outcome

In the real-world business process, several key questions such as “What happened?”, “Why did it happen?”, “What will happen in the future?”, “What action should be taken?” are common and important. Based on these questions, in this paper, we categorize and highlight the analytics into four types such as descriptive, diagnostic, predictive, and prescriptive, which are discussed below.

  • Descriptive analytics: It is the interpretation of historical data to better understand the changes that have occurred in a business. Thus descriptive analytics answers the question, “what happened in the past?” by summarizing past data such as statistics on sales and operations or marketing strategies, use of social media, and engagement with Twitter, Linkedin or Facebook, etc. For instance, using descriptive analytics through analyzing trends, patterns, and anomalies, etc., customers’ historical shopping data can be used to predict the probability of a customer purchasing a product. Thus, descriptive analytics can play a significant role to provide an accurate picture of what has occurred in a business and how it relates to previous times utilizing a broad range of relevant business data. As a result, managers and decision-makers can pinpoint areas of strength and weakness in their business, and eventually can take more effective management strategies and business decisions.
  • Diagnostic analytics: It is a form of advanced analytics that examines data or content to answer the question, “why did it happen?” The goal of diagnostic analytics is to help to find the root cause of the problem. For example, the human resource management department of a business organization may use these diagnostic analytics to find the best applicant for a position, select them, and compare them to other similar positions to see how well they perform. In a healthcare example, it might help to figure out whether the patients’ symptoms such as high fever, dry cough, headache, fatigue, etc. are all caused by the same infectious agent. Overall, diagnostic analytics enables one to extract value from the data by posing the right questions and conducting in-depth investigations into the answers. It is characterized by techniques such as drill-down, data discovery, data mining, and correlations.
  • Predictive analytics: Predictive analytics is an important analytical technique used by many organizations for various purposes such as to assess business risks, anticipate potential market patterns, and decide when maintenance is needed, to enhance their business. It is a form of advanced analytics that examines data or content to answer the question, “what will happen in the future?” Thus, the primary goal of predictive analytics is to identify and typically answer this question with a high degree of probability. Data scientists can use historical data as a source to extract insights for building predictive models using various regression analyses and machine learning techniques, which can be used in various application domains for a better outcome. Companies, for example, can use predictive analytics to minimize costs by better anticipating future demand and changing output and inventory, banks and other financial institutions to reduce fraud and risks by predicting suspicious activity, medical specialists to make effective decisions through predicting patients who are at risk of diseases, retailers to increase sales and customer satisfaction through understanding and predicting customer preferences, manufacturers to optimize production capacity through predicting maintenance requirements, and many more. Thus predictive analytics can be considered as the core analytical method within the area of data science.
  • Prescriptive analytics: Prescriptive analytics focuses on recommending the best way forward with actionable information to maximize overall returns and profitability, which typically answer the question, “what action should be taken?” In business analytics, prescriptive analytics is considered the final step. For its models, prescriptive analytics collects data from several descriptive and predictive sources and applies it to the decision-making process. Thus, we can say that it is related to both descriptive analytics and predictive analytics, but it emphasizes actionable insights instead of data monitoring. In other words, it can be considered as the opposite of descriptive analytics, which examines decisions and outcomes after the fact. By integrating big data, machine learning, and business rules, prescriptive analytics helps organizations to make more informed decisions to produce results that drive the most successful business decisions.

In summary, to clarify what happened and why it happened, both descriptive analytics and diagnostic analytics look at the past. Historical data is used by predictive analytics and prescriptive analytics to forecast what will happen in the future and what steps should be taken to impact those effects. In Table 1, we have summarized these analytics methods with examples. Forward-thinking organizations in the real world can jointly use these analytical methods to make smart decisions that help drive changes in business processes and improvements. In the following, we discuss how machine learning techniques can play a big role in these analytical methods through their learning capabilities from the data.

Table 1  Various types of analytical methods with examples

Machine Learning Based Analytical Modeling

In this section, we briefly discuss various advanced analytics methods based on machine learning modeling, which can make the computing process smart through intelligent decision-making in a business process. Fig. 3 shows a general structure of a machine learning-based predictive model considering both the training and testing phase. In the following, we discuss a wide range of methods such as regression and classification analysis, association rule analysis, time-series analysis, behavioral analysis, log analysis, and so on within the scope of our study.

Fig. 3  A general structure of a machine learning based predictive model considering both the training and testing phase

Regression Analysis

In data science, one of the most common statistical approaches used for predictive modeling and data mining tasks is regression [ 38 ]. Regression analysis is a form of supervised machine learning that examines the relationship between a dependent variable (target) and independent variables (predictors) to predict a continuous-valued output [ 105 , 117 ]. The following equations (Eqs. 1, 2, and 3) [ 85 , 105 ] represent simple, multiple or multivariate, and polynomial regression respectively, where x represents the independent variable(s) and y is the predicted/target output mentioned above:

y = β₀ + β₁x  (1)
y = β₀ + β₁x₁ + β₂x₂ + ⋯ + βₙxₙ  (2)
y = β₀ + β₁x + β₂x² + ⋯ + βₙxⁿ  (3)

Regression analysis is typically conducted for one of two purposes: to predict the value of the dependent variable in the case of individuals for whom some knowledge relating to the explanatory variables is available, or to estimate the effect of some explanatory variable on the dependent variable, i.e., finding the relationship of causal influence between the variables. Linear regression cannot be used to fit non-linear data and may cause an underfitting problem. In that case, polynomial regression performs better, however, increases the model complexity. The regularization techniques such as Ridge, Lasso, Elastic-Net, etc. [ 85 , 105 ] can be used to optimize the linear regression model. Besides, support vector regression, decision tree regression, random forest regression techniques [ 85 , 105 ] can be used for building effective regression models depending on the problem type, e.g., non-linear tasks. Financial forecasting or prediction, cost estimation, trend analysis, marketing, time-series estimation, drug response modeling, etc. are some examples where the regression models can be used to solve real-world problems in the domain of data science and analytics.
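
As a brief illustrative sketch (synthetic data, not drawn from the references above), a simple linear regression of the form in Eq. 1 can be fitted in Python with scikit-learn as follows:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: y is approximately 3x + 5 plus noise
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * x[:, 0] + 5.0 + rng.normal(0, 1.0, size=100)

model = LinearRegression().fit(x, y)
print("estimated slope    :", model.coef_[0])    # expected to be close to 3.0
print("estimated intercept:", model.intercept_)  # expected to be close to 5.0
print("prediction at x = 4:", model.predict([[4.0]])[0])
```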

Classification Analysis

Classification is one of the most widely used and best-known data science processes. It is a form of supervised machine learning that refers to a predictive modeling problem in which a class label is predicted for a given example [ 38 ]. Spam identification, such as ‘spam’ and ‘not spam’ in email service providers, is an example of a classification problem. There are several forms of classification analysis available in the area, such as binary classification—the prediction of one of two classes; multi-class classification—the prediction of one of more than two classes; and multi-label classification—a generalization in which more than one class label may be assigned to each example [ 105 ].

Several popular classification techniques, such as k-nearest neighbors [ 5 ], support vector machines [ 55 ], naive Bayes [ 49 ], adaptive boosting [ 32 ], extreme gradient boosting [ 85 ], logistic regression [ 66 ], decision trees ID3 [ 92 ], C4.5 [ 93 ], and random forests [ 13 ] exist to solve classification problems. Tree-based classification techniques, e.g., random forest considering multiple decision trees, perform better than others in solving real-world problems in many cases due to their capability of producing logic rules [ 103 , 115 ]. Fig. 4 shows an example of a random forest structure considering multiple decision trees. In addition, BehavDT, recently proposed by Sarker et al. [ 109 ], and IntrudTree [ 106 ] can be used for building effective classification or prediction models for relevant tasks within the domain of data science and analytics.

Fig. 4  An example of a random forest structure considering multiple decision trees
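
As an illustrative sketch only (the well-known iris dataset stands in for an application dataset), a random forest of the kind shown in Fig. 4 can be trained and assessed with cross-validation as follows:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# An ensemble of decision trees, as illustrated in Fig. 4
forest = RandomForestClassifier(n_estimators=100, random_state=0)

# 5-fold cross-validated accuracy
scores = cross_val_score(forest, X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())
```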

Cluster Analysis

Clustering is a form of unsupervised machine learning technique and is well-known in many data science application areas for statistical data analysis [ 38 ]. Usually, clustering techniques search for the structures inside a dataset and, if the classification is not previously identified, classify homogeneous groups of cases. This means that data points are identical to each other within a cluster, and different from data points in another cluster. Overall, the purpose of cluster analysis is to sort various data points into groups (or clusters) that are homogeneous internally and heterogeneous externally [ 105 ]. To gain insight into how data is distributed in a given dataset or as a preprocessing phase for other algorithms, clustering is often used. Data clustering, for example, assists with customer shopping behavior, sales campaigns, and retention of consumers for retail businesses, anomaly detection, etc.

Many clustering algorithms with the ability to group data have been proposed in the machine learning and data science literature [98, 138, 141]. In our earlier paper, Sarker et al. [105], we summarized these from several perspectives, such as partitioning methods, density-based methods, hierarchical methods, and model-based methods. In the literature, the popular K-means [75], K-medoids [84], and CLARA [54] are known as partitioning methods; DBSCAN [30] and OPTICS [8] as density-based methods; and single linkage [122] and complete linkage [123] as hierarchical methods. In addition, grid-based clustering methods such as STING [134] and CLIQUE [2]; model-based clustering such as neural network learning [141], GMM [94], and SOM [18, 104]; and constraint-based methods such as COP K-means [131] and CMWK-Means [25] are used in the area. Recently, Sarker et al. [111] proposed BOTS, a hierarchical clustering method based on a bottom-up agglomerative technique for capturing users' similar behavioral characteristics over time. The key benefit of agglomerative hierarchical clustering is that the tree-structured hierarchy it creates is more informative than an unstructured set of flat clusters, which can support better decision-making in relevant data science applications.
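As an illustrative sketch (assuming scikit-learn, with synthetic data), the snippet below contrasts a partitioning method (K-means) with a density-based method (DBSCAN) on the same toy dataset.

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, DBSCAN

# Synthetic data with three natural groups
X, _ = make_blobs(n_samples=300, centers=3, random_state=7)

kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=7).fit_predict(X)
dbscan_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)

print("K-means clusters:", sorted(set(kmeans_labels)))
print("DBSCAN clusters:", sorted(set(dbscan_labels)))  # label -1 marks noise points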

Association Rule Analysis

Association rule learning is a rule-based, typically unsupervised machine learning method used to establish relationships among variables. It is a descriptive technique often applied to large datasets to discover interesting relationships or patterns. The technique's main strength is its comprehensiveness: it produces all associations that meet user-specified constraints, such as minimum support and confidence values [138].

Association rules allow a data scientist to identify trends, associations, and co-occurrences inside large data collections. In a supermarket, for example, associations reveal the buying behavior of consumers for different items, which helps to adjust marketing and sales plans. In healthcare, physicians may use association rules to diagnose patients more accurately: by comparing symptom associations in data from previous cases, doctors can assess the conditional likelihood of a given illness. Similarly, association rules are useful for consumer behavior analysis and prediction, customer market analysis, bioinformatics, weblog mining, recommendation systems, and more.

Several types of association rules have been proposed in the area, such as frequent-pattern based [4, 47, 73], logic-based [31], tree-based [39], fuzzy rules [126], and belief rules [148]. Rule learning techniques such as AIS [3], Apriori [4], Apriori-TID and Apriori-Hybrid [4], FP-Tree [39], Eclat [144], and RARM [24] exist to solve the relevant business problems. Among these, Apriori [4] is the most commonly used algorithm for discovering association rules from a given dataset [145]. The recent rule-learning technique ABC-RuleMiner, proposed in our earlier paper by Sarker et al. [113], can give significant results by generating non-redundant rules that support smart, preference-aware decision-making within data science applications.
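As an illustrative sketch (not ABC-RuleMiner), the example below mines Apriori-style rules from a handful of made-up transactions, assuming the third-party mlxtend library is installed; exact API details may vary by version.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Illustrative market-basket transactions
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit_transform(transactions), columns=te.columns_)

frequent = apriori(onehot, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])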

Time-Series Analysis and Forecasting

A time series is a sequence of data points indexed in time order, typically by date or timestamp [111]. Depending on the sampling frequency, a time series may be annual (e.g., a budget), quarterly (e.g., expenditure), monthly (e.g., air traffic), weekly (e.g., sales quantity), daily (e.g., weather), hourly (e.g., stock prices), minute-wise (e.g., inbound calls in a call center), or even second-wise (e.g., web traffic), depending on the domain.

A mathematical method for dealing with such time-series data, or the procedure of fitting a time series to a proper model, is termed time-series analysis. Many different forecasting algorithms and analysis methods can be applied to extract the relevant information. For instance, to forecast future patterns, the autoregressive (AR) model [130] learns the behavioral trends or patterns of past data. The moving average (MA) [40] is another simple and common form of smoothing used in time-series analysis and forecasting; it uses past forecast errors in a regression-like model to capture an averaged trend across the data. The autoregressive moving average (ARMA) [12, 120] combines these two approaches, where the autoregressive part extracts the momentum and pattern of the trend and the moving average part captures the noise effects. The most popular and frequently used time-series model is the autoregressive integrated moving average (ARIMA) model [12, 120]. The ARIMA model, a generalization of ARMA, is more flexible than other statistical models such as exponential smoothing or simple linear regression. In terms of data, the ARMA model can only be applied to stationary time series, while ARIMA also covers the non-stationary case. Similarly, the seasonal autoregressive integrated moving average (SARIMA), autoregressive fractionally integrated moving average (ARFIMA), and autoregressive moving average with exogenous inputs (ARMAX) models are also used in time-series modeling [120].
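As a hedged example (assuming statsmodels and pandas are available, and using a made-up monthly series), the sketch below fits an ARIMA(1, 1, 1) model and forecasts the next few points.

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Illustrative synthetic series: upward trend plus noise, monthly frequency
rng = np.random.default_rng(1)
values = np.cumsum(rng.normal(1.0, 0.5, size=60))
series = pd.Series(values,
                   index=pd.date_range("2018-01-01", periods=60, freq="MS"))

model = ARIMA(series, order=(1, 1, 1))  # (p, d, q): AR terms, differencing, MA terms
fitted = model.fit()
print(fitted.forecast(steps=3))  # forecast for the next three months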

In addition to these stochastic methods for time-series modeling and forecasting, machine learning and deep learning-based approaches can be used for effective time-series analysis and forecasting. For instance, in our earlier paper, Sarker et al. [111] present a bottom-up clustering-based time-series analysis to capture the mobile usage behavioral patterns of users. Figure 5 shows an example of producing aggregate time segments Seg_i from initial time slices TS_i based on similar behavioral characteristics, as used in our bottom-up clustering approach, where D represents the dominant behavior BH_i of the users mentioned above [111]. The authors in [118] used a long short-term memory (LSTM) model, a kind of recurrent neural network (RNN), to forecast time series and found that it outperformed traditional approaches such as the ARIMA model. Time-series analysis is commonly used in fields such as finance, manufacturing, business, social media, event data (e.g., clickstreams and system events), IoT and smartphone data, and generally in any applied science or engineering domain involving temporal measurements. Thus, it covers a wide range of application areas in data science.

Figure 5. An example of producing aggregate time segments from initial time slices based on similar behavioral characteristics.

Opinion Mining and Sentiment Analysis

Sentiment analysis, or opinion mining, is the computational study of the opinions, thoughts, emotions, assessments, and attitudes of people towards entities such as products, services, organizations, individuals, issues, events, topics, and their attributes [71]. Sentiment is commonly categorized as positive, negative, or neutral, alongside more specific feelings such as angry, happy, sad, or interested versus not interested. More refined sentiment categories can also be defined to evaluate people's feelings in particular situations, depending on the problem domain.

Although opinion mining and sentiment analysis are technically challenging, they are very useful in real-world practice. For instance, a business always aims to obtain public or customer opinion about its products and services in order to refine its business policy and make better decisions. Sentiment analysis can thus help a business understand the social perception of its brand, product, or service. In addition, potential customers want to know what existing consumers think of a service or product before they purchase it. Document level, sentence level, aspect level, and concept level are the possible levels of opinion mining in the area [45].

Several popular families of techniques are used in sentiment analysis tasks: lexicon-based methods (including dictionary-based and corpus-based approaches), machine learning (supervised and unsupervised), deep learning, and hybrid methods [70]. To systematically define, extract, measure, and analyze affective states and subjective knowledge, sentiment analysis incorporates statistics, natural language processing (NLP), machine learning, and deep learning methods. It is widely used in applications such as reviews and survey data, web and social media, and healthcare content, ranging from marketing and customer support to clinical practice. Sentiment analysis therefore has a major influence in many data science applications where public sentiment plays a role in real-world issues.
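As a minimal illustration of the lexicon-based family of methods mentioned above, the sketch below scores text against tiny, made-up positive and negative word lists; a real system would use a full sentiment lexicon or a trained model.

# Illustrative stand-ins for a real sentiment lexicon
POSITIVE = {"good", "great", "excellent", "happy", "love"}
NEGATIVE = {"bad", "poor", "terrible", "sad", "hate"}

def sentiment(text: str) -> str:
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("The service was great and the staff were excellent"))
print(sentiment("Terrible experience, I hate the new update"))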

Behavioral Data and Cohort Analysis

Behavioral analytics is a recent trend that typically reveals new insights into e-commerce sites, online gaming, mobile and smartphone applications, IoT user behavior, and many more areas [112]. Behavioral analysis aims to understand how and why consumers or users behave, allowing accurate predictions of how they are likely to behave in the future. For instance, it allows advertisers to make the best offers to the right client segments at the right time. Behavioral analytics uses the large quantities of raw user event information gathered during sessions in which people use apps, games, or websites, including traffic data such as navigation paths, clicks, social media interactions, purchase decisions, and marketing responsiveness. In our earlier papers, Sarker et al. [101, 111, 113], we discussed how to extract users' phone usage behavioral patterns from real-life phone log data for various purposes.

In real-world scenarios, behavioral analytics is often used in e-commerce, social media, call centers, billing systems, IoT systems, political campaigns, and other applications to find optimization opportunities that achieve particular outcomes. Cohort analysis is a branch of behavioral analytics that involves studying groups of people over time to see how their behavior changes. For instance, it takes data from a given source (e.g., an e-commerce website, web application, or online game) and separates it into related groups for analysis. Various machine learning techniques, such as behavioral data clustering [111], behavioral decision tree classification [109], and behavioral association rules [113], can be used in the area depending on the goal. In addition, the concept of RecencyMiner, proposed in our earlier paper Sarker et al. [108], which takes recent behavioral patterns into account, can be effective when analyzing behavioral data, since such data is not static and changes over time in the real world.
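As a rough sketch of cohort analysis (assuming pandas, with purely illustrative event records), the example below groups users by the month of their first recorded activity and counts how many are active in each later month.

import pandas as pd

# Illustrative user-event records
events = pd.DataFrame({
    "user": ["a", "a", "b", "b", "c", "c", "c"],
    "date": pd.to_datetime(["2021-01-05", "2021-02-10", "2021-01-20",
                            "2021-03-02", "2021-02-11", "2021-03-15",
                            "2021-04-01"]),
})

events["month"] = events["date"].dt.to_period("M")
events["cohort"] = events.groupby("user")["month"].transform("min")  # first-activity month

cohort_counts = (events.groupby(["cohort", "month"])["user"]
                 .nunique()
                 .unstack(fill_value=0))
print(cohort_counts)  # rows: signup cohort, columns: active month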

Anomaly Detection or Outlier Analysis

Anomaly detection, also known as outlier analysis, is a data mining step that detects data points, events, and/or observations that deviate from the regularities or normal behavior of a dataset. Anomalies are usually referred to as outliers, abnormalities, novelties, noise, inconsistencies, irregularities, or exceptions [63, 114]. Anomaly detection techniques may flag new situations or cases as deviant based on historical data by analyzing the data patterns. Identifying fraudulent or irregular transactions in finance, for instance, is an example of anomaly detection.

Anomaly detection is often used as a preprocessing step to remove anomalous or inconsistent values from real-world data collected from various sources, including user logs, devices, networks, and servers. Several machine learning techniques can be used for anomaly detection, such as k-nearest neighbors, isolation forests, and cluster analysis [105]. Excluding anomalous data from a dataset can also yield a statistically significant improvement in accuracy during supervised learning [101]. However, extracting appropriate features, identifying normal behaviors, managing imbalanced data distributions, handling variations in abnormal behavior, the sparse occurrence of abnormal events, and environmental variations can make anomaly detection challenging. Anomaly detection is applicable in a variety of domains such as cybersecurity analytics, intrusion detection, fraud detection, fault detection, health analytics, identifying irregularities, detecting ecosystem disturbances, and many more. It can therefore be considered a significant task for building effective, high-accuracy systems within the area of data science.
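As an illustrative sketch (assuming scikit-learn, with synthetic data), the example below applies an isolation forest, one of the techniques mentioned above, to flag deviating points.

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(3)
normal = rng.normal(0, 1, size=(200, 2))      # regular behavior
outliers = rng.uniform(6, 8, size=(5, 2))     # a few deviating points
X = np.vstack([normal, outliers])

detector = IsolationForest(contamination=0.03, random_state=3).fit(X)
labels = detector.predict(X)                  # 1 = normal, -1 = anomaly
print("anomalies found:", int((labels == -1).sum()))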

Factor Analysis

Factor analysis is a collection of techniques for describing the relationships or correlations between variables in terms of more fundamental entities known as factors [23]. It is usually used to organize variables into a small number of clusters based on their common variance, using mathematical or statistical procedures. The goals of factor analysis are to determine the number of fundamental influences underlying a set of variables, to calculate the degree to which each variable is associated with the factors, and to learn more about the nature of the factors by examining which factors contribute to output on which variables. The broad purpose of factor analysis is to summarize data so that relationships and patterns can be easily interpreted and understood [143].

Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) are the two most popular factor analysis techniques. EFA seeks to discover complex trends by analyzing the dataset and testing predictions, while CFA tries to validate hypotheses and uses path analysis diagrams to represent variables and factors [143]. Factor analysis is also an unsupervised machine learning approach used for dimensionality reduction. The most common methods for factor analysis are principal component analysis (PCA), principal axis factoring (PAF), and maximum likelihood (ML) [48]. Correlation analysis methods such as Pearson correlation and canonical correlation may also be useful in this field, as they quantify the statistical relationship, or association, between two continuous variables. Factor analysis is commonly used in finance, marketing, advertising, product management, psychology, and operations research, and can thus be considered another significant analytical method within the area of data science.
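As a hedged sketch (assuming scikit-learn, with a bundled dataset standing in for real survey or financial variables), the example below reduces a small set of correlated variables to two latent components using PCA and factor analysis.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, FactorAnalysis

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2).fit(X)
fa = FactorAnalysis(n_components=2, random_state=0).fit(X)

print("variance explained by 2 components:", pca.explained_variance_ratio_.sum())
print("factor loadings shape:", fa.components_.shape)  # (factors, variables)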

Log Analysis

Logs are commonly used in system management because they are often the only data available that record detailed system runtime activities or behaviors in production [44]. Log analysis can thus be considered the process of analyzing, interpreting, and understanding computer-generated records or messages, also known as logs. These can be device logs, server logs, system logs, network logs, event logs, audit trails, audit records, and so on. The process of creating such records is called data logging.

Logs are generated by a wide variety of programmable technologies, including networking devices, operating systems, software, and more. Phone call logs [88, 110], SMS logs [28], mobile app usage logs [124, 149], notification logs [77], game logs [82], context logs [16, 149], web logs [37], and smartphone life logs [95] are some examples of log data from smartphone devices. The main characteristic of such log data is that it captures users' actual behavioral activities with their devices. Other similar log data include search logs [50, 133], application logs [26], server logs [33], network logs [57], event logs [83], and network and security logs [142].

Several techniques, such as classification and tagging, correlation analysis, pattern recognition, anomaly detection, and machine learning modeling [105], can be used for effective log analysis. Log analysis can assist in compliance with security policies and industry regulations, and can also improve the user experience by supporting the troubleshooting of technical problems and identifying areas where efficiency can be improved. For instance, web servers use log files to record data about website visitors, and Windows event log analysis can help an investigator build a timeline from the logging information and the discovered artifacts. Overall, advanced analytics methods incorporating machine learning modeling can play a significant role in extracting insightful patterns from log data, which can then be used to build automated and smart applications; log analysis can thus be considered a key working area in data science.
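As a simple illustration (the log format and lines below are made up), the sketch parses log entries with a regular expression and counts events by severity, a typical first step before applying the analysis techniques above.

import re
from collections import Counter

LOG_PATTERN = re.compile(r"^(?P<time>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")

lines = [
    "2021-06-01T10:00:01 INFO user login succeeded",
    "2021-06-01T10:00:05 ERROR database connection timed out",
    "2021-06-01T10:00:09 WARN retrying connection",
    "2021-06-01T10:00:12 ERROR database connection timed out",
]

levels = Counter()
for line in lines:
    match = LOG_PATTERN.match(line)
    if match:
        levels[match.group("level")] += 1

print(levels)  # e.g. Counter({'ERROR': 2, 'INFO': 1, 'WARN': 1})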

Neural Networks and Deep Learning Analysis

Deep learning is a form of machine learning that uses artificial neural networks to create a computational architecture that learns from data by combining multiple processing layers, such as the input, hidden, and output layers [38]. The key benefit of deep learning over conventional machine learning methods is that it performs better in a variety of situations, particularly when learning from large datasets [114, 140].

The most common deep learning algorithms are the multi-layer perceptron (MLP) [85], the convolutional neural network (CNN or ConvNet) [67], and the long short-term memory recurrent neural network (LSTM-RNN) [34]. Figure 6 shows the structure of an artificial neural network with multiple processing layers. The backpropagation technique [38] is used to adjust the weight values internally while building the model. Convolutional neural networks (CNNs) [67] improve on the design of traditional artificial neural networks (ANNs) by including convolutional layers, pooling layers, and fully connected layers. They are commonly used in fields such as natural language processing, speech recognition, and image processing, and on other autocorrelated data, since they take advantage of the two-dimensional (2D) structure of the input data. AlexNet [60], Xception [21], Inception [125], Visual Geometry Group (VGG) [42], ResNet [43], and other advanced CNN-based deep learning models are also used in the field.

Figure 6. A structure of an artificial neural network with multiple processing layers.
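As a minimal sketch of such a multi-layer network (assuming scikit-learn rather than a dedicated deep learning framework such as TensorFlow or PyTorch), the example below trains a small multi-layer perceptron with backpropagation on the bundled digits dataset.

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Two hidden layers; weights are fitted internally via backpropagation
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
mlp.fit(X_train, y_train)
print("test accuracy:", mlp.score(X_test, y_test))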

In addition to CNNs, the recurrent neural network (RNN) architecture is another popular method used in deep learning. Long short-term memory (LSTM) is a widely used type of recurrent neural network. Unlike traditional feed-forward neural networks, LSTM has feedback connections. LSTM networks are therefore well suited to analyzing and learning from sequential data, for example classifying, processing, and making predictions based on time-series data. When the data is sequential in nature, such as time series or sentences, LSTM can be used, and it is widely applied in time-series analysis, natural language processing, speech recognition, and similar areas.

In addition to the most popular deep learning methods mentioned above, several other deep learning approaches [104] exist for various purposes. The self-organizing map (SOM) [58], for example, uses unsupervised learning to represent high-dimensional data as a 2D grid map, thereby reducing dimensionality. Another technique commonly used for dimensionality reduction and feature extraction in unsupervised learning tasks is the autoencoder (AE) [10]. Restricted Boltzmann machines (RBMs) can be used for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modeling [46]. A deep belief network (DBN) is usually made up of unsupervised networks, such as restricted Boltzmann machines or autoencoders, together with a backpropagation neural network (BPNN) [136]. A generative adversarial network (GAN) [35] is a deep learning architecture that can produce data with characteristics similar to the input data. Transfer learning, which is essentially the re-use of a pre-trained model on a new problem, is now common because it allows deep neural networks to be trained with relatively little data [137]. These deep learning methods can perform well, particularly when learning from large-scale datasets [105, 140]. In our previous article, Sarker et al. [104], we summarized the various artificial neural network (ANN) and deep learning (DL) models mentioned above, which can be used in a variety of data science and analytics tasks.

Real-World Application Domains

Almost every industry or organization is impacted by data, and thus "Data Science", including advanced analytics with machine learning modeling, can be used in business, marketing, finance, IoT systems, cybersecurity, urban management, health care, government policy, and virtually any industry where data is generated. In the following, we discuss ten of the most popular application areas based on data science and analytics.

  • Business or financial data science: In general, business data science can be considered the study of business or e-commerce data to obtain insights about a business that can typically lead to smart decision-making and high-quality actions [90]. Data scientists can develop algorithms or data-driven models that predict customer behavior and identify patterns and trends based on historical business data, which can help companies reduce costs, improve service delivery, and generate recommendations for better decision-making. Eventually, business automation, intelligence, and efficiency can be achieved through the data science process discussed earlier, where various advanced analytics methods and machine learning modeling based on the collected data are the keys. Many online retailers, such as Amazon [76], can improve inventory management, avoid out-of-stock situations, and optimize logistics and warehousing using predictive modeling based on machine learning techniques [105]. In finance, historical data allows financial institutions to make high-stakes business decisions; it is mostly used for risk management, fraud prevention, credit allocation, customer analytics, personalized services, algorithmic trading, and so on. Overall, data science methodologies can play a key role in the next generation of business and finance, particularly in terms of business automation, intelligence, and smart decision-making and systems.
  • Manufacturing or industrial data science: To compete in global production capability, quality, and cost, manufacturing industries have gone through many industrial revolutions [14]. The latest, the fourth industrial revolution, also known as Industry 4.0, is the emerging trend of automation and data exchange in manufacturing technology. Industrial data science, the study of industrial data to obtain insights that typically lead to optimized industrial applications, can thus play a vital role in this revolution. Manufacturing industries generate a large amount of data from various sources such as sensors, devices, networks, systems, and applications [6, 68]. The main categories of industrial data include large-scale device data, life-cycle production data, enterprise operation data, manufacturing value chain sources, and collaboration data from external sources [132]. The data needs to be processed, analyzed, and secured to help improve the system's efficiency, safety, and scalability. Data science modeling can thus be used to maximize production, reduce costs, and raise profits in manufacturing industries.
  • Medical or health data science: Healthcare is one of the most notable fields where data science is making major improvements. Health data science involves extracting actionable insights from sets of patient data, typically collected from electronic health records. To help organizations improve the quality of treatment, lower the cost of care, and improve the patient experience, data can be obtained and analyzed from several sources, e.g., electronic health records, billing claims, cost estimates, and patient satisfaction surveys. In practice, healthcare analytics using machine learning modeling can minimize medical costs, predict infectious outbreaks, prevent preventable diseases, and generally improve the quality of life [81, 119]. Across the global population, the average human lifespan is growing, presenting new challenges to today's methods of care delivery. Health data science modeling can therefore play a role in analyzing current and historical data to predict trends, improve services, and better monitor the spread of diseases. Eventually, it may lead to new approaches that improve patient care, clinical expertise, diagnosis, and management.
  • IoT data science: The Internet of Things (IoT) [9] is a revolutionary technical field that turns every electronic system into a smarter one and is therefore considered the big frontier that can enhance almost all activities in our lives. Machine learning has become a key technology for IoT applications because it uses expertise to identify patterns and generate models that help predict future behavior and events [112]. One of the IoT's main fields of application is the smart city, which uses technology to improve city services and citizens' living experiences. For example, using the relevant data, data science methods can be applied to traffic prediction in smart cities or to estimating citizens' total energy usage over a particular period. Deep learning-based models in data science can be built on large-scale IoT datasets [7, 104]. Overall, data science and analytics approaches can aid modeling in a variety of IoT and smart city services, including smart governance, smart homes, education, connectivity, transportation, business, agriculture, health care, industry, and many others.
  • Cybersecurity data science: Cybersecurity, or the practice of defending networks, systems, hardware, and data from digital attacks, is one of the most important fields of Industry 4.0 [114, 121]. Data science techniques, particularly machine learning, have become a crucial cybersecurity technology that continually learns to identify trends by analyzing data, better detecting malware in encrypted traffic, finding insider threats, predicting where bad neighborhoods are online, keeping people safe while surfing, or protecting information in the cloud by uncovering suspicious user activity [114]. For instance, machine learning and deep learning-based security modeling can be used to effectively detect various types of cyberattacks or anomalies [103, 106]. To generate security policy rules, association rule learning can play a significant role in building rule-based systems [102]. Deep learning-based security models can perform better when utilizing large-scale security datasets [140]. Data science modeling can thus enable cybersecurity professionals to be more proactive in preventing threats and to react in real time to active attacks, by extracting actionable insights from security datasets.
  • Behavioral data science: Behavioral data is information produced as a result of activities, most commonly commercial behavior, performed on a variety of Internet-connected devices, such as PCs, tablets, or smartphones [112]. Websites, mobile applications, marketing automation systems, call centers, help desks, and billing systems are all common sources of behavioral data. Behavioral data is not static; it changes over time [108]. Advanced analytics of such data, including machine learning modeling, can help in several areas: predicting future sales trends and product recommendations in e-commerce and retail; predicting usage trends, load, and user preferences for future releases in online gaming; determining how users use an application to predict future usage and preferences in application development; breaking users down into similar groups to gain a more focused understanding of their behavior in cohort analysis; and detecting compromised credentials and insider threats by locating anomalous behavior, or making suggestions, among others. Overall, behavioral data science modeling typically makes it possible to present the right offers to the right consumers at the right time on common platforms such as e-commerce sites, online games, web and mobile applications, and IoT. In a social context, analyzing human behavioral data using advanced analytics methods, and applying the insights extracted from social data, can power data-driven intelligent social services, which can be considered social data science.
  • Mobile data science: Today's smart mobile phones are considered "next-generation, multi-functional cell phones that facilitate data processing, as well as enhanced wireless connectivity" [146]. In our earlier paper [112], we showed that users' interest in "Mobile Phones" has grown more in recent years than interest in other platforms such as "Desktop Computer", "Laptop Computer", or "Tablet Computer". People use smartphones for a variety of activities, including e-mailing, instant messaging, online shopping, Internet surfing, entertainment, social media such as Facebook, LinkedIn, and Twitter, and various IoT services such as smart city, health, and transportation services. Intelligent apps are built on the insights extracted from the relevant datasets, according to app characteristics such as being action-oriented, adaptive in nature, suggestive and decision-oriented, data-driven, context-aware, and cross-platform [112]. As a result, mobile data science, which involves gathering a large amount of mobile data from various sources and analyzing it using machine learning techniques to discover useful insights or data-driven trends, can play an important role in the development of intelligent smartphone applications.
  • Multimedia data science: Over the last few years, a big data revolution in multimedia management systems has resulted from the rapid and widespread use of multimedia data, such as images, audio, video, and text, as well as the ease of access and availability of multimedia sources. Currently, multimedia sharing websites, such as Yahoo Flickr, iCloud, and YouTube, and social networks such as Facebook, Instagram, and Twitter, are considered valuable sources of multimedia big data [89]. People, particularly younger generations, spend a lot of time on the Internet and social networks to connect with others, exchange information, and create multimedia data, thanks to the advent of new technology and the advanced capabilities of smartphones and tablets. Multimedia analytics deals with the problem of effectively and efficiently manipulating, handling, mining, interpreting, and visualizing various forms of data to solve real-world problems. Text analysis, image and video processing, computer vision, audio and speech processing, and database management are among the solutions available for a range of applications including healthcare, education, entertainment, and mobile devices.
  • Smart cities or urban data science: Today, more than half of the world's population live in urban areas or cities [80], which are considered drivers or hubs of economic growth, wealth creation, well-being, and social activity [96, 116]. In addition to cities, "urban area" can refer to surrounding areas such as towns, conurbations, or suburbs. A large amount of data documenting the daily events, perceptions, thoughts, and emotions of citizens is therefore recorded. It can be loosely categorized into personal data (e.g., household, education, employment, health, immigration, crime), proprietary data (e.g., banking, retail, online platform data), government data (e.g., citywide crime statistics or data from government institutions), open and public data (e.g., data.gov, ordnance survey), and organic and crowdsourced data (e.g., user-generated web data, social media, Wikipedia) [29]. The field of urban data science typically focuses on providing more effective solutions from a data-driven perspective by extracting knowledge and actionable insights from such urban data. Advanced analytics of these data using machine learning techniques [105] can facilitate the efficient management of urban areas, including real-time management (e.g., traffic flow management), evidence-based planning decisions related to the longer-term strategic role of forecasting for urban planning (e.g., crime prevention, public safety, and security), and framing the future (e.g., political decision-making) [29]. Overall, it can contribute to government and public planning, as well as relevant sectors including retail, financial services, mobility, health, policing, and utilities within a data-rich urban environment, through data-driven smart decision-making and policies that lead to smart cities and an improved quality of human life.
  • Smart villages or rural data science: Rural areas, or the countryside, are the opposite of urban areas and include villages, hamlets, and agricultural areas. The field of rural data science typically focuses on making better decisions and providing more effective solutions, including protecting public safety, providing critical health services, supporting agriculture, and fostering economic development, from a data-driven perspective by extracting knowledge and actionable insights from the collected rural data. Advanced analytics of rural data, including machine learning modeling [105], can provide rural communities with new opportunities to build insight and capacity to meet current needs and prepare for the future. For instance, machine learning modeling [105] can help farmers improve their decisions and adopt sustainable agriculture, using the increasing amount of data captured by emerging technologies, e.g., the Internet of Things (IoT) and mobile technologies and devices [1, 51, 52]. Rural data science can thus play a very important role in the economic and social development of rural areas, through agriculture, business, self-employment, construction, banking, healthcare, governance, and other services, leading to smarter villages.

Overall, we can conclude that data science modeling can be used to help drive changes and improvements in almost every sector of real-world life where the relevant data is available to analyze. Gathering the right data and extracting useful knowledge or actionable insights from it for smart decision-making is the key to data science modeling in any application domain. Based on our discussion of the ten potential real-world application domains above, viewed through the lens of data-driven smart computing and decision-making, we can say that the prospects of data science and the role of data scientists are enormous for the future. Data scientists typically analyze information from multiple sources to better understand the data and the business problems, and develop machine learning-based analytical models, algorithms, data-driven tools, and solutions focused on advanced analytics, which can make today's computing processes smarter, more automated, and more intelligent.

Challenges and Research Directions

Our study of data science and analytics, particularly data science modeling in "Understanding data science modeling", advanced analytics methods and smart computing in "Advanced analytics methods and smart computing", and real-world application areas in "Real-world application domains", opens several research issues in the area of data-driven business solutions and eventual data products. Thus, in this section, we summarize and discuss the challenges faced, as well as the potential research opportunities and future directions for building data-driven products.

  • Understanding real-world business problems and the associated data, including their nature (e.g., forms, types, size, and labels), is the first challenge in data science modeling, discussed briefly in "Understanding data science modeling". This means identifying, specifying, representing, and quantifying the domain-specific business problems and data according to the requirements. For a data-driven, effective business solution, there must be a well-defined workflow before beginning the actual data analysis work. Furthermore, gathering business data is difficult because data sources can be numerous and dynamic. As a result, gathering different forms of real-world data, such as structured or unstructured data, related to a specific business issue with legal access, which varies from application to application, is challenging. Moreover, data annotation, typically the process of categorizing, tagging, or labeling raw data for the purpose of building data-driven models, is another challenging issue. Thus, the primary task is to conduct a more in-depth analysis of data collection and dynamic annotation methods. Understanding the business problem, as well as integrating and managing the raw data gathered for efficient analysis, may therefore be one of the most challenging aspects of working in data science and analytics.
  • The next challenge is extracting relevant and accurate information from the collected data mentioned above. The main focus of data scientists is typically to disclose, describe, represent, and capture data-driven intelligence for actionable insights. However, real-world data may contain many ambiguous values, missing values, outliers, and meaningless entries [101]. The quality and availability of the data strongly affect the advanced analytics methods, including machine and deep learning modeling, discussed in "Advanced analytics methods and smart computing". It is therefore important to understand the real-world business scenario and the associated data, to determine whether, how, and why they are insufficient, missing, or problematic, and then to extend or redevelop existing methods, such as large-scale hypothesis testing and learning under inconsistency and uncertainty, to address the complexities in the data and business problems. Developing new techniques to effectively pre-process the diverse data collected from multiple sources, according to their nature and characteristics, could thus be another challenging task.
  • Understanding and selecting the appropriate analytical methods to extract useful insights for smart decision-making for a particular business problem is the main issue in the area of data science. The emphasis of advanced analytics is on anticipating the use of data to detect patterns and determine what is likely to occur in the future. Basic analytics offers a general description of the data, while advanced analytics goes a step further by offering a deeper understanding of the data and supporting granular analysis. Thus, understanding advanced analytics methods, especially machine and deep learning-based modeling, is key. The traditional learning techniques mentioned in "Advanced analytics methods and smart computing" may not be directly applicable in many cases. For instance, in a rule-based system, the traditional association rule learning technique [4] may produce redundant rules from the data, which makes the decision-making process complex and ineffective [113]. A scientific understanding of the learning algorithms, their mathematical properties, and how robust or fragile the techniques are to input data is therefore needed. Consequently, gaining a deeper understanding of the strengths and drawbacks of existing machine and deep learning methods [38, 105] for solving a particular business problem, improving or optimizing the learning algorithms according to the data characteristics, or proposing new algorithms and techniques with higher accuracy, becomes a significant challenge for the next generation of data scientists.
  • Traditional data-driven models or systems typically use a large amount of business data to generate data-driven decisions. In several application fields, however, recent trends are often more interesting and useful for modeling and predicting the future than older ones. Examples include smartphone user behavior modeling, IoT services, stock market forecasting, health or transport services, job market analysis, and other related areas where time series and actual human interests or preferences evolve over time. Thus, rather than relying on traditional data analysis, the concept of RecencyMiner, i.e., extracting insight or knowledge from recent patterns, proposed in our earlier paper Sarker et al. [108], might be effective. Proposing new techniques that take recent data patterns into account, and consequently building recency-based data-driven models for solving real-world problems, is therefore another significant challenge in the area.
  • The most crucial task for a data-driven smart system is to create a framework that supports the data science modeling discussed in "Understanding data science modeling". Advanced analytical methods based on machine learning or deep learning techniques can be incorporated into such a system to make the framework capable of resolving the issues at hand. In addition, incorporating contextual information such as temporal, spatial, social, and environmental context [100] can be used to build an adaptive, context-aware, and dynamic model or framework, depending on the problem domain. A well-designed data-driven framework, together with experimental evaluation, is therefore a very important direction for effectively solving a business problem in a particular domain, as well as a big challenge for data scientists.
  • In several important application areas, such as autonomous cars, criminal justice, health care, recruitment, housing, human resource management, and public safety, decisions made by models or AI agents have a direct effect on human lives. As a result, there is growing concern about whether these decisions can be trusted to be right, reasonable, ethical, personalized, accurate, robust, and secure, particularly in the context of adversarial attacks [104]. If we can explain a result in a meaningful way, the model can be better trusted by the end user. For machine-learned models, new trust properties yield new trade-offs, such as privacy versus accuracy, robustness versus efficiency, and fairness versus robustness. Incorporating trustworthy AI, particularly in data-driven or machine learning modeling, could therefore be another challenging issue in the area.

In the above, we have summarized and discussed several challenges and the potential research opportunities and directions, within the scope of our study in the area of data science and advanced analytics. The data scientists in academia/industry and the researchers in the relevant area have the opportunity to contribute to each issue identified above and build effective data-driven models or systems, to make smart decisions in the corresponding business domains.

In this paper, we have presented a comprehensive view of data science, including the various types of advanced analytical methods that can be applied to enhance the intelligence and capabilities of an application. We have also visualized the current popularity of data science and machine learning-based advanced analytical modeling, and differentiated these from related terms used in the area, to position this paper. We then provided a thorough study of data science modeling and the various processing modules needed to extract actionable insights from data for a particular business problem and the eventual data product. Thus, in line with our goal, we have briefly discussed how different data modules can play a significant role in a data-driven business solution through the data science process. We have also summarized the various types of advanced analytical methods and outcomes, as well as the machine learning modeling, needed to solve the associated business problems. This study's key contribution is thus the explanation of different advanced analytical methods and their applicability in various real-world, data-driven application areas, including business, healthcare, cybersecurity, and urban and rural data science, viewed through the lens of data-driven smart computing and decision-making.

Finally, within the scope of our study, we have outlined and discussed the challenges we face, as well as possible research opportunities and future directions. The challenges identified provide promising research opportunities in the field that can be explored with effective solutions to improve data-driven models and systems. Overall, we conclude that our study of advanced analytical solutions based on data science and machine learning methods points in a positive direction and can serve as a reference guide for future research and real-world applications in the field of data science, for both academics and industry professionals.



Writing a Good Data Analysis Report: 7 Steps

As a data analyst, you feel most comfortable when you're alone with all the numbers and data. You can analyze them with confidence and reach the results you were asked to find. But this is not the end of the road for you. You still need to write a data analysis report explaining your findings to laypeople - your clients or coworkers.

That means you need to think about your target audience, that is, the people who'll be reading your report.

They don’t have nearly as much knowledge about data analysis as you do. So, your report needs to be straightforward and informative. The article below will help you learn how to do it. Let’s take a look at some practical tips you can apply to your data analysis report writing and the benefits of doing so.


Data Analysis Report Writing: 7 Steps

The process of writing a data analysis report is far from simple, but you can master it quickly with the right guidance and examples of similar reports.

This is why we've prepared a step-by-step guide that will cover everything you need to know about this process, as simply as possible. Let’s get to it.

Consider Your Audience

You are writing your report for a certain target audience, and you need to keep them in mind while writing. Depending on their level of expertise, you’ll need to adjust your report and ensure it speaks to them. So, before you go any further, ask yourself:

Who will be reading this report? How well do they understand the subject?

Let’s say you’re explaining the methodology you used to reach your conclusions and find the data in question. If the reader isn’t familiar with these tools and software, you’ll have to simplify it for them and provide additional explanations.

So, you won't be writing the same type of report for a coworker who's been on your team for years as you would for a client who's seeing data analysis for the first time. Based on this determining factor, you'll think about:

  • the language and vocabulary you're using
  • abbreviations and level of technicality
  • the depth you'll go into to explain something
  • the type of visuals you'll add

Your readers’ expertise dictates the tone of your report and you need to consider it before writing even a single word.

Draft Out the Sections

The next thing you need to do is create a draft of your data analysis report. This is just a skeleton of what your report will be once you finish. But, you need a starting point.

So, think about the sections you'll include and what each section is going to cover. Typically, your report should be divided into the following sections:

  • Introduction
  • Body (Data, Methods, Analysis, Results)
  • Conclusion

For each section, write down several short bullet points regarding the content to cover. Below, we'll discuss each section more elaborately.

Develop The Body

The body of your report is the most important section. You need to organize it into subsections and present all the information your readers will be interested in.

We suggest the following subsections.

Explain what data you used to conduct your analysis. Be specific and explain how you gathered the data, what your sample was, what tools and resources you’ve used, and how you’ve organized your data. This will give the reader a deeper understanding of your data sample and make your report more solid.

Also, explain why you chose the specific data for your sample. For instance, you may say: "The sample only includes data on customers acquired during 2021, at the peak of the pandemic."

Next, you need to explain what methods you've used to analyze the data. This simply means you need to explain why and how you chose specific methods. You also need to explain why these methods are the best fit for the goals you've set and the results you're trying to reach.

Back up your methodology section with background information on each method or tool used. Explain how these resources are typically used in data analysis.

After you've explained the data and methods you've used, this next section brings those two together. The analysis section shows how you've analyzed the specific data using the specific methods. 

This means you’ll show your calculations, charts, and analyses, step by step. Add descriptions and explain each of the steps. Try making it as simple as possible so that even the most inexperienced of your readers understand every word.

This final section of the body can be considered the most important part of your report. Most of your clients will skim the rest of the report to reach this section, because it answers the questions you've all raised. It shares the results that were reached and gives the reader new findings, facts, and evidence.

So, explain and describe the results using numbers. Then, add a written description of what each of the numbers stands for and what it means for the entire analysis. Summarize your results and finalize the report on a strong note. 

Write the Introduction

Yes, it may seem strange to write the introduction section at the end, but it’s the smartest way to do it. This section briefly explains what the report will cover. That’s why you should write it after you’ve finished writing the Body.

In your introduction, explain:

  • the question you've raised and answered with the analysis
  • context of the analysis and background information
  • a short outline of the report

Simply put, you’re telling your audience what to expect.

Add a Short Conclusion

Finally, the last section of your paper is a brief conclusion. It only repeats what you described in the Body, but only points out the most important details.

It should be less than a page long and use straightforward language to deliver the most important findings. It should also include a paragraph about the implications and importance of those findings for the client, customer, business, or company that hired you.

Include Data Visualization Elements

You have all the data and numbers in your mind and find it easy to understand what the data is saying. But, to a layman or someone less experienced than yourself, it can be quite a puzzle. All the information that your data analysis has found can create a mess in the head of your reader.

So, you should simplify it by using data visualization elements.

Firstly, consider the most common and useful data visualization elements you can use in your report, such as charts, graphs, and tables.

There are subcategories to each of these elements, and you should explore them all to decide what will do the best job for your specific case. For instance, you'll find different types of charts, including pie charts, bar charts, area charts, and spider charts.

For each data visualization element, add a brief description to tell the readers what information it contains. You can also add a title to each element and create a table of contents for visual elements only.
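As a simple illustration (assuming matplotlib is installed, and with made-up figures), the sketch below produces a labelled bar chart that could be embedded in the results section of a report.

import matplotlib.pyplot as plt

quarters = ["Q1", "Q2", "Q3", "Q4"]
retention = [62, 58, 71, 75]  # hypothetical retention rate (%) per quarter

fig, ax = plt.subplots()
ax.bar(quarters, retention, color="steelblue")
ax.set_title("Customer retention rate by quarter")
ax.set_ylabel("Retention rate (%)")
ax.set_ylim(0, 100)

fig.savefig("retention_by_quarter.png", dpi=150)  # embed this image in the report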

Proofread & Edit Before Submission

All the hard work you’ve invested in writing a good data analysis report might go to waste if you don’t edit and proofread. Proofreading and editing will help you eliminate potential mistakes, but also take another objective look at your report.

First, do the editing part. It includes:

  • reading the whole report objectively, like you're seeing it for the first time
  • leaving an open mind for changes
  • adding or removing information
  • rearranging sections
  • finding better words to say something

You should repeat the editing phase a couple of times until you're completely happy with the result. Once you're certain the content is all tidied up, you can move on to the proofreading stage. It includes:

  • finding and removing grammar and spelling mistakes
  • rethinking vocabulary choices
  • improving clarity
  • improving readability

You can use an online proofreading tool to make things faster. If you really want professional help, Grab My Essay is a great choice. Their professional writers can edit and rewrite your entire report, to make sure it’s impeccable before submission.

Whatever you choose to do, proofread yourself or get some help with it, make sure your report is well-organized and completely error-free.

Benefits of Writing Well-Structured Data Analysis Reports

Yes, writing a good data analysis report is a lot of hard work. But if you understand how it can help you in different parts of your professional journey, you'll be more motivated to invest the time and effort, and more willing to learn how to do it well.

Below are the main benefits a data analysis report brings to the table.

Improved Collaboration

When you're writing a data analysis report, you need to be aware that more than one end user is going to use it. Whether it's your employer, customer, or coworker, you need to make sure they're all on the same page. And when you write a data analysis report that is easy to understand and learn from, you're creating a bridge between all these people.

Simply put, all of them are given accurate data they can rely on, which removes the potential misunderstandings that can happen in communication. This improves overall collaboration and makes everyone more open and helpful.

Increased Efficiency

People who read your data analysis report need the information it contains for some reason. They might use it to do their part of the job, to make decisions, or to report further to someone else. Either way, the better your report, the more efficiently they can work. And if you rely on those people in turn, you'll benefit from this increased productivity as well.

Data tells a story about a business, project, or venture. It's able to show how well you've performed, what turned out to be a great move, and what needs to be reimagined. This means that a data analysis report provides valuable insight and measurable KPIs (key performance indicators) that you’re able to use to grow and develop. 

Clear Communication

Information is key regardless of the industry you're in or the type of business you're doing. Data analysis finds that information and proves its accuracy and importance. But, if those findings and the information itself aren't communicated clearly, it's like you haven't even found them.

This is why a data analysis report is crucial. It will present the information less technically and bring it closer to the readers.

Final Thoughts

As you can see, it takes some skill and a bit more practice to write a good data analysis report. But, all the effort you invest in writing it will be worth it once the results kick in. You’ll improve the communication between you and your clients, employers, or coworkers. People will be able to understand, rely on, and use the analysis you’ve conducted.

So don’t be afraid: start writing your first data analysis report. Just follow the 7 steps we’ve listed and use a tool such as ProWebScraper to help you with website data analysis. You’ll be surprised when you see the result of your hard work.

Jessica Fender

Jessica Fender is a business analyst and a blogger. She writes about business and data analysis, networking in this sector, and acquiring new skills. Her goal is to provide fresh and accurate information that readers can apply instantly.


Quantitative Research Essay Examples

A quantitative research essay analyzes numerical data in the form of trends, opinions, or efficiency results. This academic writing genre requires you to generalize the figures across a broad group of people and draw a relevant conclusion. The possible research methods include questionnaires, polls, and surveys, but the results are processed with computational techniques and statistics.

For instance, a paper on organized crime in Texas will calculate the number of offenses committed over a given period and compare the findings with the same period in the past.
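As a rough sketch of that kind of period-over-period comparison (the offense counts below are made up, not real statistics):

    # Hypothetical offense counts for two comparison periods.
    offenses_prior_year = 1480
    offenses_this_year = 1330

    change = offenses_this_year - offenses_prior_year
    pct_change = change / offenses_prior_year * 100

    print(f"Change in offenses: {change:+d} ({pct_change:+.1f}%)")
    # -> Change in offenses: -150 (-10.1%)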

Below we’ve gathered dozens of quantitative essay examples to help you brainstorm ideas. You are sure to find a few papers here that meet your needs.

57 Best Quantitative Research Essay Examples

Audit Report for the University of Alabama System

  • Subjects: Economics Financial Reporting

Decision Making: Starbucks Transformational Experience

  • Subjects: Business Case Study
  • Words: 2003

Business Problem Matrix and Research Question Hypotheses

  • Subjects: Sciences Statistics
  • Words: 2058

Asians Seeking U.S. Education

  • Subjects: Education Education Theories
  • Words: 3014

Facial Feedback Hypothesis

  • Subjects: Psychological Principles Psychology
  • Words: 2206

User Satisfaction and Service Quality in Academic Libraries: Use of LibQUAL+

  • Subjects: Education Education System
  • Words: 4019

Demography of Harbor Hills, Austin, TX

  • Subjects: Sociological Theories Sociology
  • Words: 2050

Predicting Unemployment Rates to Manage Inventory

  • Subjects: Business Management
  • Words: 2141

Fuel Consumption for Cars Made in the US and Japan

  • Subjects: Business Industry

Theoretical Stock Prices

  • Subjects: Economics Investment

Supply Chain Design: Honda Gulf

  • Words: 2999

Using Smartphones in Learning

  • Subjects: Tech & Engineering Technology in Education
  • Words: 6084

Action Research in Science Education

  • Subjects: Education Writing & Assignments
  • Words: 1199

Introduction to Nursing Research

  • Subjects: Health & Medicine Healthcare Research

Students’ Perception of a Mobile Application for College Course

  • Words: 1500

Carbon Fiber Reinforced Polymer Application

  • Subjects: Construction Design
  • Words: 1440

An Evaluation of the Suitability of ‘New Headway- Intermediate’ by Liz & John Soars

  • Subjects: Education Pedagogical Approaches
  • Words: 3456

Parenting Variables in Antenatal Education

  • Subjects: Family Planning Health & Medicine
  • Words: 1211

Sustained Organisational Learning Methods

  • Words: 1416

The Achievement of Millennium Development Goals in India

  • Subjects: International Relations Politics & Government

The EV Products in China

  • Words: 1146

Green Energy Brand Strategy: Chinese E-Car Consumer Behaviour

  • Subjects: Business Strategy
  • Words: 3378

Binomial Logistic Regression

  • Subjects: Math Sciences

Odds Ratio in Logistic Regression

Efficacy of Antibiotic Therapy and Appendectomy

Local Food Production in Malaysia

  • Words: 1625

The Indian Agriculture Sector

  • Subjects: Agriculture Sciences
  • Words: 1662

Beer Market Trends in the UK

  • Subjects: Business Marketing
  • Words: 1374

Waste Management in Australia

  • Subjects: Environment Recycling
  • Words: 1851

Health and Environment in Abu Dhabi

  • Subjects: Air Pollution Environment
  • Words: 3126

The Relations Between Media and School Violence

  • Subjects: Sociology Violence
  • Words: 2832

BlackBerry Management Perspectives

  • Words: 2831

Apple Inc. Equity Valuation

  • Words: 3729

Zara Fashions’ Supply Chain

  • Words: 6066

The Extent to Which FDI Inflows Have Influenced GDP Growth in India

  • Subjects: Economic Trends Economics
  • Words: 1780

Addicted 2 Football Business Plan

  • Subjects: Business Company Analysis
  • Words: 2514

Ashtead Group Plc Financial Accounting

  • Words: 5733

Reaching the Critical Mass in eMarketplaces

  • Subjects: Business E-Commerce
  • Words: 2204

Independent Samples t-test with SPSS

Economics Concepts: Alfred Marshall

  • Subjects: Economic Concepts Economics

Exploring Reliability and Validity

  • Subjects: Psychological Issues Psychology

Sustaining Australia’s Rate of Economic Growth

  • Subjects: Economic Systems & Principles Economics
  • Words: 1356

E-Cig Project and Price Customization

  • Subjects: Business Marketing Project

Public Relations and Customer Loyalty

  • Subjects: Branding Business
  • Words: 2118

Game-based Learning and Simulation in a K-12 School in the United Arab Emirates

  • Subjects: Education Pedagogy
  • Words: 3683

The Issue of Muslims’ Immigration to Australia

  • Subjects: Immigration Sociology
  • Words: 3492

China’s Energy and Environmental Implications

  • Subjects: Ecology Environment
  • Words: 3709

The Target Company

  • Words: 4066

The Effect of Social Media on Today’s Youth

  • Subjects: Entertainment & Media Social Media Issues
  • Words: 2165

Heineken Company in the US market

  • Words: 1275

International Communication in Saudi Arabia

  • Subjects: Communications Sociology
  • Words: 1390

The Algerian Wool Company

  • Words: 2570

Jewish Life in North America

  • Subjects: Sociological Issues Sociology
  • Words: 1788

Impact of Gambling on the Bahamian Economy

  • Subjects: Economics Influences on Political Economy
  • Words: 3871

International Marketing Plan for Tata Nano

  • Subjects: Business Financial Marketing
  • Words: 5299

Home Based and Community Based Services (HCBS)

  • Subjects: Health & Medicine Healthcare Institution

Case of Ski Pro Corporation

  • Subjects: Business Company Missions

8.5 Writing Process: Creating an Analytical Report

Learning Outcomes

By the end of this section, you will be able to:

  • Identify the elements of the rhetorical situation for your report.
  • Find and focus a topic to write about.
  • Gather and analyze information from appropriate sources.
  • Distinguish among different kinds of evidence.
  • Draft a thesis and create an organizational plan.
  • Compose a report that develops ideas and integrates evidence from sources.
  • Give and act on productive feedback to works in progress.

You might think that writing comes easily to experienced writers—that they draft stories and college papers all at once, sitting down at the computer and having sentences flow from their fingers like water from a faucet. In reality, most writers engage in a recursive process, pushing forward, stepping back, and repeating steps multiple times as their ideas develop and change. In broad strokes, the steps most writers go through are these:

  • Planning and Organization . You will have an easier time drafting if you devote time at the beginning to consider the rhetorical situation for your report, understand your assignment, gather ideas and information, draft a thesis statement, and create an organizational plan.
  • Drafting . When you have an idea of what you want to say and the order in which you want to say it, you’re ready to draft. As much as possible, keep going until you have a complete first draft of your report, resisting the urge to go back and rewrite. Save that for after you have completed a first draft.
  • Review . Now is the time to get feedback from others, whether from your instructor, your classmates, a tutor in the writing center, your roommate, someone in your family, or someone else you trust to read your writing critically and give you honest feedback.
  • Revising . With feedback on your draft, you are ready to revise. You may need to return to an earlier step and make large-scale revisions that involve planning, organizing, and rewriting, or you may need to work mostly on ensuring that your sentences are clear and correct.

Considering the Rhetorical Situation

Like other kinds of writing projects, a report starts with assessing the rhetorical situation —the circumstance in which a writer communicates with an audience of readers about a subject. As the writer of a report, you make choices based on the purpose of your writing, the audience who will read it, the genre of the report, and the expectations of the community and culture in which you are working. A graphic organizer like Table 8.1 can help you begin.

Summary of Assignment

Write an analytical report on a topic that interests you and that you want to know more about. The topic can be contemporary or historical, but it must be one that you can analyze and support with evidence from sources.

The following questions can help you think about a topic suitable for analysis:

  • Why or how did ________ happen?
  • What are the results or effects of ________?
  • Is ________ a problem? If so, why?
  • What are examples of ________ or reasons for ________?
  • How does ________ compare to or contrast with other issues, concerns, or things?

Consult and cite three to five reliable sources. The sources do not have to be scholarly for this assignment, but they must be credible, trustworthy, and unbiased. Possible sources include academic journals, newspapers, magazines, reputable websites, government publications or agency websites, and visual sources such as TED Talks. You may also use the results of an experiment or survey, and you may want to conduct interviews.

Consider whether visuals and media will enhance your report. Can you present data you collect visually? Would a map, photograph, chart, or other graphic provide interesting and relevant support? Would video or audio allow you to present evidence that you would otherwise need to describe in words?

Another Lens. To gain another analytic view on the topic of your report, consider different people affected by it. Say, for example, that you have decided to report on recent high school graduates and the effect of the COVID-19 pandemic on the final months of their senior year. If you are a recent high school graduate, you might naturally gravitate toward writing about yourself and your peers. But you might also consider the adults in the lives of recent high school graduates—for example, teachers, parents, or grandparents—and how they view the same period. Or you might consider the same topic from the perspective of a college admissions department looking at their incoming freshman class.

Quick Launch: Finding and Focusing a Topic

Coming up with a topic for a report can be daunting because you can report on nearly anything. The topic can easily get too broad, trapping you in the realm of generalizations. The trick is to find a topic that interests you and focus on an angle you can analyze in order to say something significant about it. You can use a graphic organizer to generate ideas, or you can use a concept map similar to the one featured in Writing Process: Thinking Critically About a “Text.”

Asking the Journalist’s Questions

One way to generate ideas about a topic is to ask the five W (and one H) questions, also called the journalist’s questions : Who? What? When? Where? Why? How? Try answering the following questions to explore a topic:

Who was or is involved in ________?

What happened/is happening with ________? What were/are the results of ________?

When did ________ happen? Is ________ happening now?

Where did ________ happen, or where is ________ happening?

Why did ________ happen, or why is ________ happening now?

How did ________ happen?

For example, imagine that you have decided to write your analytical report on the effect of the COVID-19 shutdown on high-school students by interviewing students on your college campus. Your questions and answers might look something like those in Table 8.2 :

Asking Focused Questions

Another way to find a topic is to ask focused questions about it. For example, you might ask the following questions about the effect of the 2020 pandemic shutdown on recent high school graduates:

  • How did the shutdown change students’ feelings about their senior year?
  • How did the shutdown affect their decisions about post-graduation plans, such as work or going to college?
  • How did the shutdown affect their academic performance in high school or in college?
  • How did/do they feel about continuing their education?
  • How did the shutdown affect their social relationships?

Any of these questions might be developed into a thesis for an analytical report. Table 8.3 shows more examples of broad topics and focusing questions.

Gathering Information

Because they are based on information and evidence, most analytical reports require you to do at least some research. Depending on your assignment, you may be able to find reliable information online, or you may need to do primary research by conducting an experiment, a survey, or interviews. For example, if you live among students in their late teens and early twenties, consider what they can tell you about their lives that you might be able to analyze. Returning to or graduating from high school, starting college, or returning to college in the midst of a global pandemic has provided them, for better or worse, with educational and social experiences that are shared widely by people their age and very different from the experiences older adults had at the same age.

Some report assignments will require you to do formal research, an activity that involves finding sources and evaluating them for reliability, reading them carefully, taking notes, and citing all words you quote and ideas you borrow. See Research Process: Accessing and Recording Information and Annotated Bibliography: Gathering, Evaluating, and Documenting Sources for detailed instruction on conducting research.

Whether you conduct in-depth research or not, keep track of the ideas that come to you and the information you learn. You can write or dictate notes using an app on your phone or computer, or you can jot notes in a journal if you prefer pen and paper. Then, when you are ready to begin organizing your report, you will have a record of your thoughts and information. Always track the sources of information you gather, whether from printed or digital material or from a person you interviewed, so that you can return to the sources if you need more information. And always credit the sources in your report.

Kinds of Evidence

Depending on your assignment and the topic of your report, certain kinds of evidence may be more effective than others. Other kinds of evidence may even be required. As a general rule, choose evidence that is rooted in verifiable facts and experience. In addition, select the evidence that best supports the topic and your approach to the topic, be sure the evidence meets your instructor’s requirements, and cite any evidence you use that comes from a source. The following list contains different kinds of frequently used evidence and an example of each.

Definition : An explanation of a key word, idea, or concept.

The U.S. Census Bureau refers to a “young adult” as a person between 18 and 34 years old.

Example : An illustration of an idea or concept.

The college experience in the fall of 2020 was starkly different from that of previous years. Students who lived in residence halls were assigned to small pods. On-campus dining services were limited. Classes were small and physically distanced or conducted online. Parties were banned.

Expert opinion : A statement by a professional in the field whose opinion is respected.

According to Louise Aronson, MD, geriatrician and author of Elderhood , people over the age of 65 are the happiest of any age group, reporting “less stress, depression, worry, and anger, and more enjoyment, happiness, and satisfaction” (255).

Fact : Information that can be proven correct or accurate.

According to data collected by the NCAA, the academic success of Division I college athletes between 2015 and 2019 was consistently high (Hosick).

Interview : An in-person, phone, or remote conversation that involves an interviewer posing questions to another person or people.

During our interview, I asked Betty about living without a cell phone during the pandemic. She said that before the pandemic, she hadn’t needed a cell phone in her daily activities, but she soon realized that she, and people like her, were increasingly at a disadvantage.

Quotation : The exact words of an author or a speaker.

In response to whether she thought she needed a cell phone, Betty said, “I got along just fine without a cell phone when I could go everywhere in person. The shift to needing a phone came suddenly, and I don’t have extra money in my budget to get one.”

Statistics : A numerical fact or item of data.

The Pew Research Center reported that approximately 25 percent of Hispanic Americans and 17 percent of Black Americans relied on smartphones for online access, compared with 12 percent of White people.

Survey : A structured interview in which respondents (the people who answer the survey questions) are all asked the same questions, either in person or through print or electronic means, and their answers tabulated and interpreted. Surveys discover attitudes, beliefs, or habits of the general public or segments of the population.

A survey of 3,000 mobile phone users in October 2020 showed that 54 percent of respondents used their phones for messaging, while 40 percent used their phones for calls (Steele).

Visuals: Graphs, figures, tables, photographs and other images, diagrams, charts, maps, videos, and audio recordings, among others.

Thesis and Organization

Drafting a Thesis

When you have a grasp of your topic, move on to the next phase: drafting a thesis. The thesis is the central idea that you will explore and support in your report; all paragraphs in your report should relate to it. In an essay-style analytical report, you will likely express this main idea in a thesis statement of one or two sentences toward the end of the introduction.

For example, if you found that the academic performance of student athletes was higher than that of non-athletes, you might write the following thesis statement:

Although a common stereotype is that college athletes barely pass their classes, an analysis of athletes’ academic performance indicates that athletes drop fewer classes, earn higher grades, and are more likely to be on track to graduate in four years when compared with their non-athlete peers.

The thesis statement often previews the organization of your writing. For example, in his report on the U.S. response to the COVID-19 pandemic in 2020, Trevor Garcia wrote the following thesis statement, which detailed the central idea of his report:

An examination of the U.S. response shows that a reduction of experts in key positions and programs, inaction that led to equipment shortages, and inconsistent policies were three major causes of the spread of the virus and the resulting deaths.

After you draft a thesis statement, ask these questions, and examine your thesis as you answer them. Revise your draft as needed.

  • Is it interesting? A thesis for a report should answer a question that is worth asking and piques curiosity.
  • Is it precise and specific? If you are interested in reducing pollution in a nearby lake, explain how to stop the zebra mussel infestation or reduce the frequent algae blooms.
  • Is it manageable? Try to split the difference between having too much information and not having enough.

Organizing Your Ideas

As a next step, organize the points you want to make in your report and the evidence to support them. Use an outline, a diagram, or another organizational tool, such as Table 8.4 .

Drafting an Analytical Report

With a tentative thesis, an organization plan, and evidence, you are ready to begin drafting. For this assignment, you will report information, analyze it, and draw conclusions about the cause of something, the effect of something, or the similarities and differences between two different things.

Introduction

Some students write the introduction first; others save it for last. Whenever you choose to write the introduction, use it to draw readers into your report. Make the topic of your report clear, and be concise and sincere. End the introduction with your thesis statement. Depending on your topic and the type of report, you can write an effective introduction in several ways. Opening a report with an overview is a tried-and-true strategy, as shown in the following example on the U.S. response to COVID-19 by Trevor Garcia. Notice how he opens the introduction with statistics and a comparison and follows it with a question that leads to the thesis statement (the final sentence of the paragraph).

With more than 83 million cases and 1.8 million deaths at the end of 2020, COVID-19 has turned the world upside down. By the end of 2020, the United States led the world in the number of cases, at more than 20 million infections and nearly 350,000 deaths. In comparison, the second-highest number of cases was in India, which at the end of 2020 had less than half the number of COVID-19 cases despite having a population four times greater than the U.S. (“COVID-19 Coronavirus Pandemic,” 2021). How did the United States come to have the world’s worst record in this pandemic? An examination of the U.S. response shows that a reduction of experts in key positions and programs, inaction that led to equipment shortages, and inconsistent policies were three major causes of the spread of the virus and the resulting deaths.

For a less formal report, you might want to open with a question, quotation, or brief story. The following example opens with an anecdote that leads to the thesis statement (the final sentence of the paragraph).

Betty stood outside the salon, wondering how to get in. It was June of 2020, and the door was locked. A sign posted on the door provided a phone number for her to call to be let in, but at 81, Betty had lived her life without a cell phone. Betty’s day-to-day life had been hard during the pandemic, but she had planned for this haircut and was looking forward to it; she had a mask on and hand sanitizer in her car. Now she couldn’t get in the door, and she was discouraged. In that moment, Betty realized how much Americans’ dependence on cell phones had grown in the months since the pandemic began. Betty and thousands of other senior citizens who could not afford cell phones or did not have the technological skills and support they needed were being left behind in a society that was increasingly reliant on technology.

Body Paragraphs: Point, Evidence, Analysis

Use the body paragraphs of your report to present evidence that supports your thesis. A reliable pattern to keep in mind for developing the body paragraphs of a report is point, evidence, and analysis:

  • The point is the central idea of the paragraph, usually given in a topic sentence stated in your own words at or toward the beginning of the paragraph. Each topic sentence should relate to the thesis.
  • The evidence you provide develops the paragraph and supports the point made in the topic sentence. Include details, examples, quotations, paraphrases, and summaries from sources if you conducted formal research. Synthesize the evidence you include by showing in your sentences the connections between sources.
  • The analysis comes at the end of the paragraph. In your own words, draw a conclusion about the evidence you have provided and how it relates to the topic sentence.

The paragraph below illustrates the point, evidence, and analysis pattern. Drawn from a report about concussions among football players, the paragraph opens with a topic sentence about the NCAA and NFL and their responses to studies about concussions. The paragraph is developed with evidence from three sources. It concludes with a statement about helmets and players’ safety.

The NCAA and NFL have taken steps forward and backward to respond to studies about the danger of concussions among players. Responding to the deaths of athletes, documented brain damage, lawsuits, and public outcry (Buckley et al., 2017), the NCAA instituted protocols to reduce potentially dangerous hits during football games and to diagnose traumatic head injuries more quickly and effectively. Still, it has allowed players to wear more than one style of helmet during a season, raising the risk of injury because of imperfect fit. At the professional level, the NFL developed a helmet-rating system in 2011 in an effort to reduce concussions, but it continued to allow players to wear helmets with a wide range of safety ratings. The NFL’s decision created an opportunity for researchers to look at the relationship between helmet safety ratings and concussions. Cocello et al. (2016) reported that players who wore helmets with a lower safety rating had more concussions than players who wore helmets with a higher safety rating, and they concluded that safer helmets are a key factor in reducing concussions.

Developing Paragraph Content

In the body paragraphs of your report, you will likely use examples, draw comparisons, show contrasts, or analyze causes and effects to develop your topic.

Paragraphs developed with examples are common in reports. The paragraph below, adapted from a report by student John Zwick on the mental health of soldiers deployed during wartime, draws examples from three sources.

Throughout the Vietnam War, military leaders claimed that the mental health of soldiers was stable and that men who suffered from combat fatigue, now known as PTSD, were getting the help they needed. For example, the New York Times (1966) quoted military leaders who claimed that mental fatigue among enlisted men had “virtually ceased to be a problem,” occurring at a rate far below that of World War II. Ayres (1969) reported that Brigadier General Spurgeon Neel, chief American medical officer in Vietnam, explained that soldiers experiencing combat fatigue were admitted to the psychiatric ward, sedated for up to 36 hours, and given a counseling session with a doctor who reassured them that the rest was well deserved and that they were ready to return to their units. Although experts outside the military saw profound damage to soldiers’ psyches when they returned home (Halloran, 1970), the military stayed the course, treating acute cases expediently and showing little concern for the cumulative effect of combat stress on individual soldiers.

When you analyze causes and effects , you explain the reasons that certain things happened and/or their results. The report by Trevor Garcia on the U.S. response to the COVID-19 pandemic in 2020 is an example: his report examines the reasons the United States failed to control the coronavirus. The paragraph below, adapted from another student’s report written for an environmental policy course, explains the effect of white settlers’ views of forest management on New England.

The early colonists’ European ideas about forest management dramatically changed the New England landscape. White settlers saw the New World as virgin, unused land, even though indigenous people had been drawing on its resources for generations by using fire subtly to improve hunting, employing construction techniques that left ancient trees intact, and farming small, efficient fields that left the surrounding landscape largely unaltered. White settlers’ desire to develop wood-built and wood-burning homesteads surrounded by large farm fields led to forestry practices and techniques that resulted in the removal of old-growth trees. These practices defined the way the forests look today.

Compare and contrast paragraphs are useful when you wish to examine similarities and differences. You can use both comparison and contrast in a single paragraph, or you can use one or the other. The paragraph below, adapted from a student report on the rise of populist politicians, compares the rhetorical styles of populist politicians Huey Long and Donald Trump.

A key similarity among populist politicians is their rejection of carefully crafted sound bites and erudite vocabulary typically associated with candidates for high office. Huey Long and Donald Trump are two examples. When he ran for president, Long captured attention through his wild gesticulations on almost every word, dramatically varying volume, and heavily accented, folksy expressions, such as “The only way to be able to feed the balance of the people is to make that man come back and bring back some of that grub that he ain’t got no business with!” In addition, Long’s down-home persona made him a credible voice to represent the common people against the country’s rich, and his buffoonish style allowed him to express his radical ideas without sounding anti-communist alarm bells. Similarly, Donald Trump chose to speak informally in his campaign appearances, but the persona he projected was that of a fast-talking, domineering salesman. His frequent use of personal anecdotes, rhetorical questions, brief asides, jokes, personal attacks, and false claims made his speeches disjointed, but they gave the feeling of a running conversation between him and his audience. For example, in a 2015 speech, Trump said, “They just built a hotel in Syria. Can you believe this? They built a hotel. When I have to build a hotel, I pay interest. They don’t have to pay interest, because they took the oil that, when we left Iraq, I said we should’ve taken” (“Our Country Needs” 2020). While very different in substance, Long and Trump adopted similar styles that positioned them as the antithesis of typical politicians and their worldviews.

Conclusion

The conclusion should draw the threads of your report together and make its significance clear to readers. You may wish to review the introduction, restate the thesis, recommend a course of action, point to the future, or use some combination of these. Whichever way you approach it, the conclusion should not head in a new direction. The following example is the conclusion from a student’s report on the effect of a book about environmental movements in the United States.

Since its publication in 1949, environmental activists of various movements have found wisdom and inspiration in Aldo Leopold’s A Sand County Almanac. These audiences included Leopold’s conservationist contemporaries, environmentalists of the 1960s and 1970s, and the environmental justice activists who rose in the 1980s and continue to make their voices heard today. These audiences have read the work differently: conservationists looked to the author as a leader, environmentalists applied his wisdom to their movement, and environmental justice advocates have pointed out the flaws in Leopold’s thinking. Even so, like those before them, environmental justice activists recognize the book’s value as a testament to taking the long view and eliminating biases that may cloud an objective assessment of humanity’s interdependent relationship with the environment.

Citing Sources

You must cite the sources of information and data included in your report. Citations must appear in both the text and a bibliography at the end of the report.

The sample paragraphs in the previous section include examples of in-text citation using APA documentation style. Trevor Garcia’s report on the U.S. response to COVID-19 in 2020 also uses APA documentation style for citations in the text of the report and the list of references at the end. Your instructor may require another documentation style, such as MLA or Chicago.

Peer Review: Getting Feedback from Readers

You will likely engage in peer review with other students in your class by sharing drafts and providing feedback to help spot strengths and weaknesses in your reports. For peer review within a class, your instructor may provide assignment-specific questions or a form for you to complete as you work together.

If you have a writing center on your campus, it is well worth your time to make an online or in-person appointment with a tutor. You’ll receive valuable feedback and improve your ability to review not only your report but your overall writing.

Another way to receive feedback on your report is to ask a friend or family member to read your draft. Provide a list of questions or a form such as the one in Table 8.5 for them to complete as they read.

Revising: Using Reviewers’ Responses to Revise Your Work

When you receive comments from readers, including your instructor, read each comment carefully to understand what is being asked. Try not to get defensive, even though this response is completely natural. Remember that readers are like coaches who want you to succeed. They are looking at your writing from outside your own head, and they can identify strengths and weaknesses that you may not have noticed. Keep track of the strengths and weaknesses your readers point out. Pay special attention to those that more than one reader identifies, and use this information to improve your report and later assignments.

As you analyze each response, be open to suggestions for improvement, and be willing to make significant revisions to improve your writing. Perhaps you need to revise your thesis statement to better reflect the content of your draft. Maybe you need to return to your sources to better understand a point you’re trying to make in order to develop a paragraph more fully. Perhaps you need to rethink the organization, move paragraphs around, and add transition sentences.

Below is an early draft of part of Trevor Garcia’s report with comments from a peer reviewer:

To truly understand what happened, it’s important first to look back to the years leading up to the pandemic. Epidemiologists and public health officials had long known that a global pandemic was possible. In 2016, the U.S. National Security Council (NSC) published a 69-page document with the intimidating title Playbook for Early Response to High-Consequence Emerging Infectious Disease Threats and Biological Incidents. The document’s two sections address responses to “emerging disease threats that start or are circulating in another country but not yet confirmed within U.S. territorial borders” and to “emerging disease threats within our nation’s borders.” On 13 January 2017, the joint Obama-Trump transition teams performed a pandemic preparedness exercise; however, the playbook was never adopted by the incoming administration.

Peer Review Comment: Do the words in quotation marks need to be a direct quotation? It seems like a paraphrase would work here.

Peer Review Comment: I’m getting lost in the details about the playbook. What’s the Obama-Trump transition team?

In February 2018, the administration began to cut funding for the Prevention and Public Health Fund at the Centers for Disease Control and Prevention; cuts to other health agencies continued throughout 2018, with funds diverted to unrelated projects such as housing for detained immigrant children.

Peer Review Comment: This paragraph has only one sentence, and it’s more like an example. It needs a topic sentence and more development.

Three months later, Luciana Borio, director of medical and biodefense preparedness at the NSC, spoke at a symposium marking the centennial of the 1918 influenza pandemic. “The threat of pandemic flu is the number one health security concern,” she said. “Are we ready to respond? I fear the answer is no.”

Peer Review Comment: This paragraph is very short and a lot like the previous paragraph in that it’s a single example. It needs a topic sentence. Maybe you can combine them?

Peer Review Comment: Be sure to cite the quotation.

Reading these comments and those of others, Trevor decided to combine the three short paragraphs into one paragraph focusing on the fact that the United States knew a pandemic was possible but was unprepared for it. He developed the paragraph, using the short paragraphs as evidence and connecting the sentences and evidence with transitional words and phrases. Finally, he added in-text citations in APA documentation style to credit his sources. The revised paragraph is below:

Epidemiologists and public health officials in the United States had long known that a global pandemic was possible. In 2016, the National Security Council (NSC) published Playbook for Early Response to High-Consequence Emerging Infectious Disease Threats and Biological Incidents, a 69-page document on responding to diseases spreading within and outside of the United States. On January 13, 2017, the joint transition teams of outgoing president Barack Obama and then president-elect Donald Trump performed a pandemic preparedness exercise based on the playbook; however, it was never adopted by the incoming administration (Goodman & Schulkin, 2020). A year later, in February 2018, the Trump administration began to cut funding for the Prevention and Public Health Fund at the Centers for Disease Control and Prevention, leaving key positions unfilled. Other individuals who were fired or resigned in 2018 were the homeland security adviser, whose portfolio included global pandemics; the director for medical and biodefense preparedness; and the top official in charge of a pandemic response. None of them were replaced, leaving the White House with no senior person who had experience in public health (Goodman & Schulkin, 2020). Experts voiced concerns, among them Luciana Borio, director of medical and biodefense preparedness at the NSC, who spoke at a symposium marking the centennial of the 1918 influenza pandemic in May 2018: “The threat of pandemic flu is the number one health security concern,” she said. “Are we ready to respond? I fear the answer is no” (Sun, 2018, final para.).

A final word on working with reviewers’ comments: as you consider your readers’ suggestions, remember, too, that you remain the author. You are free to disregard suggestions that you think will not improve your writing. If you choose to disregard comments from your instructor, consider submitting a note explaining your reasons with the final draft of your report.


The section “8.5 Writing Process: Creating an Analytical Report” above is drawn from Writing Guide with Handbook by Michelle Bachelor Robinson and Maria Jerskey, featuring Toby Fulwiler (OpenStax, Houston, Texas; published Dec 21, 2021), licensed under a Creative Commons Attribution License. Access for free at https://openstax.org/books/writing-guide/pages/1-unit-introduction. Section URL: https://openstax.org/books/writing-guide/pages/8-5-writing-process-creating-an-analytical-report


Essays on Data Analysis

1725 samples on this topic

On this site, we've put together a catalog of free paper samples regarding Data Analysis. The plan is to provide you with a sample similar to your Data Analysis essay topic so that you can take a closer look at it and get a better idea of what a great academic work should look like. We also recommend studying the Data Analysis writing practices showcased by competent authors and, eventually, crafting a high-quality paper of your own.

However, if developing Data Analysis papers entirely by yourself is not an option at this point, WowEssays.com essay writer service might still be able to help you out. For example, our experts can create a unique Data Analysis essay sample specifically for you. This model paper on Data Analysis will be written from scratch and tailored to your individual requirements, fairly priced, and delivered to you within the pre-set timeframe. Choose your writer and buy custom essay now!

Advanced Technology At Central Bank Case Studies Examples

Introduction

Free Tax Accounting Essay Example

Bloomberg Businessweek Assignment #1

Free Thesis Proposal About Supply Chain Performance In Pharmaceutical Industry In Pakistan

Example of Normalization Research Paper

Example of Network Design Proposal Research Proposal

I. Physical Network Design

A. Network Topology

Business Needs: In order to complete the needed full-time assignment efficiently and effectively, the University of Maryland University College will need a system, in addition to tools, to help the students attain their objectives. This entails new office space, a library, classrooms, and computer labs that will be able to achieve this capacity and optimize student learning potential.

Proposed Topology

Marketing/E-Commerce Questions & Answers Example

Defining E-Commerce

E-commerce is a form of conducting business over an electronic network, most commonly the internet. It is a process involving online interaction between the buyer and the seller, or between two entrepreneurs, characterized by the online exchange of money.

Factors leading to the rise in the number of consumers shopping online

Example Of Research Paper On Database Modeling And Normalization

Draw Topic & Writing Ideas From This Report on Company Background

Business Audit: Coca Cola

Executive Summary

Example Of Report On Barriers To Emiratization And The Role Of Policy Design And The Institutional Environment In Determining The Effectiveness Of Emiratization

Enter Name of Student

“Authentic Leadership And Mindfulness Development Through Action Learning” Article Review Example

Inspiring Essay About Network Security Assessment

Transitional Nursing for the Elderly Population Research Proposals Example

Nursing Research

Research Implementation Phase

As mentioned earlier in the paper, in order to answer the five main research questions formulated for this research paper, four different types of data collection methods will be utilized: secondary research of existing literature and primary research (survey-based questionnaires, interviews, and direct observation). The implementation phase will include a brief overview of how the project will be executed on the ground, the expenditures that will be needed, as well as how the process of data analysis will be conducted.

Project Schedule and Timeframe

Write By Example Of This Preliminary Field List And The Preliminary Table List: Programming Assignment

Write by Example of This Human Resource Management Essay

Newspaper Article 1: Employee Retention- How to Retain Employees

Introduction

The purpose of the paper Employee Retention: How to Retain Employees is to highlight several tactics that a human resource manager can employ to retain the employees in their organization. Organizations that utilize the outlined tactics benefit from reduced turnover costs as well as improved productivity. The paper aims at ensuring employers create a strong workforce by retaining their employees.

Free Assumptions And Limitations Research Paper Example

Technological advancements have been sweeping across different facets of the economy like a bushfire. The education sector has not been left behind in the adoption and application of technology to streamline its processes. Several universities across the globe have made a paradigm shift from traditional manual records management to digital records management (Alavi & Leidner, 1999). In this paper, we propose a comprehensive university database used for data storage, extraction, and processing.

Good Research Paper On Security Reporting

An Assignment Submitted by

Free Phenotypic Variation Among Plants Under Repeated Drought Across Diversity Gradient Thesis Proposal Sample

Summary Of Proposed Research

Inspiring Research Paper About Successful Adolescent And Adult

Influence of Nurture on the Process of Becoming a

Good Research Paper On Database System And Models

In order to get the commission for each employee, we first have to create a database and its tables.

    CREATE TABLE Employee (
        EmpNumber int not null,
        EmpFirstName varchar(100) not null,
        EmpLastName varchar(100) not null,
        CommissionRate decimal(3,2) not null,
        YrlySalary int not null,
        DepartmentID int,
        JobID int,
        PRIMARY KEY (EmpNumber),
        FOREIGN KEY (DepartmentID) REFERENCES Department(DepartmentID),
        FOREIGN KEY (JobID) REFERENCES Job(JobID)
    );

    CREATE TABLE Job (
        JobID int not null,
        JobDescription varchar(400) not null,
        PRIMARY KEY (JobID)
    );

Write By Example Of This Customer Satisfaction And Loyalty Literature Review

Why Some Business Areas Choose to Work in Isolation From IT: Research Proposal

Contents: Introduction; Review of Literature; Research Methodology; Rationale; Objectives; Research Question; Theories: Resource Based Theory; Research Design; Procedure; Data Collection; Data Analysis; Reliability; Ethical Issues; Discussion; Conclusion; References.

Draw Topic & Writing Ideas From This Dissertation Proposal On The Effect Of Soft Play On The Development Of Children In Preschool

A-Level Research Paper on Avoiding Plagiarism and Searching Peer-Reviewed Articles for Free Use

Exemplar Essay on Education to Write After

Data Analysis

Example Of Essay On Protecting Patient Data: Policy Statement

Policy Manual Introduction

A policy manual is a general guideline that details the policies and best practices that an organization in a particular industry should follow. This is the case with health care in general and protecting patient data in particular. Compared with manual systems, electronic or automated systems have been found to enhance best practices in the field of health care (Hamilton, Jacob, Koch & Quammen, 2004).

Importance to Organization

Association Of Physical Activity And Level Of Depression Among Youth In The United States Dissertation Proposal Examples

Free Essay on Leadership and Management

Good Magnitude of User Interaction With Different Types of Facebook Page Posts: Research Paper Example

Free Essay About Research Method And Evidence Based Methods

Database Tuning and Optimization Programming Assignments Example

Reduction in Workforce Essay Example

Workforce Reductions

Workforce reduction is considered a downsizing strategy that directly affects the workers and employees of the company. The strategy is used as a result of a slump in demand or to improve the operational efficiency of the company. The key reasons for workforce reduction include heightened competition, changing markets, and gaps in the employment laws. In addition, pressure to improve cash flow, stakeholder expectations, and overcapacity or poor management lead to massive layoffs to gain short-term profits and value (Rogovsky, 2005). Workforce reduction as a strategy has both positive and negative impacts for the company.

Positive Consequences of Reduction in Workforce

Draw Topic & Writing Ideas From This Essay On Type Of Clinic

Imaginary Clinic

The clinic is a general practice clinic. It provides person-centred, progressive as well as comprehensive whole-person health care to persons and families in their society. It encompasses the foundations of an efficient health care system (“Capterra” par.1).

Branches of the Clinic

Good Essay About Assumed Certainty: Pivot Tables And Multi-Attribute Decision Making

INTRODUCTION

Sample Research Paper On Network Assessment

IT331 Week 10

Good SQL Querry Programming Assignment Example

How Do We Use Excel in Statistics: A Sample Essay for Inspiration & Mimicking

Defining the Method or Instrument Used in the Study, Such as Questionnaires or Interviews, to Have Concrete Information: Question & Answer Sample

Questions for Exams

Question 33: Describe the economical content of the cumulative frequency if you analyze the series.

The economical content of cumulative frequencies, when examined in a series, is that they give the total sum of all the frequencies displayed in a table for the data analysis of samples. Moreover, when the series is accurate, one can tell the sequence of the data as it appears in the worksheet and derive conclusions from the primary variables.
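As a small, hypothetical illustration of how a cumulative frequency column is built up from a frequency series (the categories and counts below are invented), a pandas sketch might look like this:

    import pandas as pd

    # Invented frequency table, e.g. observations grouped into value bands.
    freq = pd.Series({"0-100": 12, "100-200": 25, "200-300": 9, "300-400": 4})

    table = pd.DataFrame({
        "frequency": freq,
        "cumulative_frequency": freq.cumsum(),           # running total
        "cumulative_share": freq.cumsum() / freq.sum(),  # share of all observations
    })
    print(table)  # the last cumulative value equals the total of all frequencies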

Question 38: Explain the economical and statistical content of the components of the following regression equation.

Free Clinical Application Of Biomechanics Essay Sample

Example of Research Paper on a Clinical Research Database and a Clinical Research Question Through the PICOT Model

Example of Research Proposal on Nursing: Critiquing a Quantitative/Qualitative Study

Nursing: Critiquing a Quantitative/Qualitative Study

A quantitative assessment of patient and nurse outcomes of bedside nursing report implementation

This quantitative research critique encompasses an evaluation of the research problem and purpose. An evaluation of the hypothesis and research questions will be undertaken too, along with an assessment of the literature review and theoretical framework. The entire methodology will be reviewed. Quantitative research studies, unlike qualitative ones, apply statistics in explaining a phenomenon. These features of the research design will be fully explored.

Study #1: Sand-Jecklyn, K., & Sherman, J. (2014). A quantitative assessment of patient and nurse outcomes of bedside nursing report implementation. Journal of Clinical Nursing, 23(19-20), 2854–2863.

Designing Role-Plays Games To Improve Students Learning Dissertation Methodology Examples

Methodology

Research Questions

Will the strategy of designing role-playing games in the history class be an effective strategy to improve the students’ performance? What will be the expected average improvement of the improvised strategy? How will the students perceive the new learning method? How will we ensure that the students did not improve their mean score through other factors? How will we validate the results for credibility?

Hypothesis HA: Group A will perform better than Group B, with SSL English students improving their mean grade at the end of the fall.

Hypothesis B: Group B will perform averagely but slightly lower than Group A.

Inspiring Essay About Human Resource Management

Phillips Furniture

Expertly Written Research Paper On The Purpose Of The Database. To Follow

Business Rules and Data Models.

Good Example Of Research Methodology And Data Analysis Essay

Forest School Effect on Risk Assessment: Example Literature Review by an Expert Writer to Follow

Free Programming Assignment on Introduction to Transact-SQL

Introduction to Transact-SQL (T-SQL)

Construct and execute INSERT SQL statements to add the sample data in the following tables to the Customer and Address tables in the HandsOnOne database:
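The assignment itself targets T-SQL, and the actual sample rows and column layouts are not reproduced here, so the following is only a self-contained sketch using Python's built-in sqlite3 module with assumed columns and made-up rows; it shows the general shape of constructing and executing INSERT statements, not the exact data the exercise expects.

    import sqlite3

    conn = sqlite3.connect("HandsOnOne.db")
    cur = conn.cursor()

    # Assumed, simplified table layouts for illustration only.
    cur.execute("""CREATE TABLE IF NOT EXISTS Customer (
        CustomerID INTEGER PRIMARY KEY,
        FirstName  TEXT NOT NULL,
        LastName   TEXT NOT NULL)""")
    cur.execute("""CREATE TABLE IF NOT EXISTS Address (
        AddressID  INTEGER PRIMARY KEY,
        CustomerID INTEGER REFERENCES Customer(CustomerID),
        Street     TEXT,
        City       TEXT)""")

    # INSERT statements with placeholders; the rows are invented sample data.
    cur.executemany(
        "INSERT INTO Customer (CustomerID, FirstName, LastName) VALUES (?, ?, ?)",
        [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")],
    )
    cur.executemany(
        "INSERT INTO Address (AddressID, CustomerID, Street, City) VALUES (?, ?, ?, ?)",
        [(1, 1, "12 Analytical Way", "London"), (2, 2, "7 Enigma Rd", "Manchester")],
    )

    conn.commit()
    conn.close()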

Business Rules and Data Models Research Papers Example

Business Rules and Data Models

Free Open Access Clinical Research Database For Animal Assisted Dementia Therapy: A Compilation Of Five Databases Research Paper: Top-Quality Sample To Follow

Marketing Strategies for Attracting New Clients in Hotel Businesses: A Sample Dissertation Proposal for Inspiration & Mimicking

Good Example of Essay on Corporate Social Responsibility and Ethics

Educational affiliation

Statement Of The Problem Thesis Proposal Sample

Implementing Cost Cutting in the Oil Industry

Evidence-Based Practice: Free Sample Essay To Follow

A1 Quantitative Article: Tan, C.H.S., Ishak, R.B., Lim, T.X.G., Marimuthusamy, P., Kaurss, K., & Leong, J.J. (2016). Illness Management and Recovery Program for Mental Health Problems: Reducing Symptoms and Increasing Social Functioning. Journal of Clinical

Human Resource Management: Capstone Project You Might Want To Emulate

Difference and Similarities among three Positions

Free Research Paper About Materials And Methods

Analysis of GMOs Effects on Human Health

Database And Website Essay Samples

Free Network Assessment Term Paper Sample

Good Essay on Organizational Stress

Ethical considerations

Draw Topic & Writing Ideas From This Essay On Natural Language Processing And EMRs

Platform Requirements for Bookstore Report Template for Faster Writing

• To have complete functionality and a properly designed maintenance system, transaction devices will be required.
• The bookstore will require a sales transaction device to keep track of book quantities and sales. For that, scanners and registers will be needed, along with computers that have a fast CPU, a mouse, and a keyboard.
• The maintenance crew will keep track of book quantities and records.


Research Ethics and Design

Writing About Your Data

After you have collected and analyzed all your data, you will normally write at least three sections about your primary research:

  • Methods : How did you collect your data?
  • Results (or Findings) : What did you find?
  • Discussion : What do those findings mean?

For example, describing who, when, and how you sent out and collected your 10-question survey would be your Methods section. Describing how many people responded in particular or different ways is the Results section. Interpreting the data and making a statement about what that data means is your Discussion section. Limitations of your study need to be explored as well, but these are either incorporated into one of these sections (usually the Discussion) or given their own section.

Methods

This section is crucial to your credibility and to readers’ understanding of your data. Clear descriptions will help your readers know why you did what you did and how you got your results. Here are a few points to keep in mind when writing the Methods section.

  • Who did you interview/survey/observe? Was it a specific group? “Random” people on campus? Why?
  • What did you ask, overall? Why? Avoid listing every question and instead just give a quick overview. (Direct your readers to your full instrument in an appendix instead.)
  • When and where did this occur? Did you send out a survey? How long was it online? If you did an interview, how did you set it up, where did you conduct it, and how long did it take? Similar questions apply for observations as well: where were you, when, and why?
  • How did you complete your research? If you did an interview, did you do this in-person, online, by phone? If you conducted a survey, did you do it via pen and paper or an online survey? What did you look for in your observation(s) and how did you take notes?
  • If you are using a theoretical framework to analyze your data, what is it? Why are you using it?

Do not see this list as a way to organize the section but instead as questions your Methods section should answer. You do not want this section to read like a checklist.

Note that while readers mostly want to know your findings and your interpretation of the data in the following two sections, the Methods section is just as important. The more clearly you describe your methods, the better other researchers can understand your data and potentially replicate your research.

Results (or Findings)

This will be where you describe your collected data (i.e. data that you have collected from your study that you have not “interpreted” yet). Like in your Methods section, you want to be clear and transparent.

  • Surveys. Avoid listing a question, then an answer, then a question, then an answer, etc. Using visuals where appropriate, report on (instead of list) the more significant parts of your survey. You should list your questions in an appendix, and you can list your full results in a table/visual there as well.
  • Interviews. Avoid listing questions and answers and having an almost dialogue form. Instead, report on the more significant parts of the interview and use quotations when necessary.
  • Observations. Describe what you saw. Again, like your interviews/surveys, avoid giving a “play-by-play” and discuss what you know are the more significant aspects.

In your Results section, you generally want to avoid “flowery” language and/or inserting too much opinion. Simply report your findings in as clear a way as possible.

Discussion

This final section is where you will give your own analysis of the data. Here is where you will make connections for the reader(s) about what your data “means.” The main difference between your Results section and the Discussion section is that this is, for all intents and purposes, your opinion (though that opinion is rooted heavily in your data). Whichever method you chose to collect your data, these suggestions will help organize your Discussion section and make it clear for your reader.

  • Clear Topic Sentence(s). As you have learned throughout the semester, clear topic sentences will help set up your paragraph(s) to be easily understandable.
  • Explicit Connections. In your paragraphs, make explicit connections between your claim(s) and evidence from your data. Where appropriate, you also want to make connections to prior research studies: do your data points support or diverge from prior studies? How? Why might this be?
  • Detailed Evidence. Don’t hesitate to remind your reader of the data collected or even to elaborate more on it. Remember, more details and discussion of data will help convince your reader about the significance of your claim.
  • Limitations. Some researchers put this in the Discussion section while others make an entirely new section. Either way, be upfront with all the limitations, shortcomings, etc. of your research. Be thorough in your thinking here: did you run out of time, have a small number of responses, or recognize a methodological flaw along the way? Being transparent and honest with your reader is most important.
  • Potential Future Research. Generally, either in the Discussion section or in a final, short Conclusion section, primary research projects make note of future potential projects based on the current one. If your results were unclear, then further research might be justified. If your results were clear, then perhaps that indicates that a narrower sample group should be investigated or a new or slightly different variable should be examined. There are many possible routes to take here, but you want to base it on what you did (and/or did not) find in your study and help future researchers dig further into your research topic.

This section usually reads more like a “traditional” essay you are used to writing than some of the other sections of an empirical project. From clear topic sentences to supporting evidence, the skills you have been learning throughout your writing career are easily applicable here. The major difference is that instead of solely citing other sources, you are the one providing the evidence. You are producing new knowledge and questions. Be proud!

  • Incorporating your Data. Authored by: Sarah Wilson & Trey Bagwell. Provided by: University of Mississippi. Project: WRIT 250 Committee OER Project. License: CC BY-SA: Attribution-ShareAlike


5 Steps to Write a Great Analytical Essay


Do you need to write an analytical essay for school? What sets this kind of essay apart from other types, and what must you include when you write your own analytical essay? In this guide, we break down the process of writing an analytical essay by explaining the key factors your essay needs to have, providing you with an outline to help you structure your essay, and analyzing a complete analytical essay example so you can see what a finished essay looks like.

What Is an Analytical Essay?

Before you begin writing an analytical essay, you must know what this type of essay is and what it includes. Analytical essays analyze something, often (but not always) a piece of writing or a film.

An analytical essay is more than just a synopsis of the issue, though; in this type of essay you need to go beyond surface-level analysis and look at what the key arguments/points of the issue are and why. If you’re writing an analytical essay about a piece of writing, you’ll look into how the text was written and why the author chose to write it that way. Instead of summarizing, an analytical essay typically takes a narrower focus and looks at areas such as major themes in the work, how the author constructed and supported their argument, how the work uses literary devices to enhance its messages, etc.

While you certainly want people to agree with what you’ve written, unlike with persuasive and argumentative essays, your main purpose when writing an analytical essay isn’t to try to convert readers to your side of the issue. Therefore, you won’t be using strong persuasive language like you would in those essay types. Rather, your goal is to have enough analysis and examples that the strength of your argument is clear to readers.

Besides typical essay components like an introduction and conclusion, a good analytical essay will include:

  • A thesis that states your main argument
  • Analysis that relates back to your thesis and supports it
  • Examples to support your analysis and allow a more in-depth look at the issue

In the rest of this article, we’ll explain how to include each of these in your analytical essay.

How to Structure Your Analytical Essay

Analytical essays are structured similarly to many other essays you’ve written, with an introduction (including a thesis), several body paragraphs, and a conclusion. Below is an outline you can follow when structuring your essay, and in the next section we go into more detail on how to write an analytical essay.

Introduction

Your introduction will begin with some sort of attention-grabbing sentence to get your audience interested, then you’ll give a few sentences setting up the topic so that readers have some context, and you’ll end with your thesis statement. Your introduction will include:

  • Brief background information explaining the issue/text
  • Your thesis

Body Paragraphs

Your analytical essay will typically have three or four body paragraphs, each covering a different point of analysis. Begin each body paragraph with a sentence that sets up the main point you’ll be discussing. Then you’ll give some analysis on that point, backing it up with evidence to support your claim. Continue analyzing and giving evidence for your analysis until you’re out of strong points for the topic. At the end of each body paragraph, you may choose to have a transition sentence that sets up what the next paragraph will be about, but this isn’t required. Body paragraphs will include:

  • Introductory sentence explaining what you’ll cover in the paragraph (sort of like a mini-thesis)
  • Analysis point
  • Evidence (either passages from the text or data/facts) that supports the analysis
  • (Repeat analysis and evidence until you run out of examples)

Conclusion

You won’t be making any new points in your conclusion; at this point you’re just reiterating key points you’ve already made and wrapping things up. Begin by rephrasing your thesis and summarizing the main points you made in the essay. Someone who reads just your conclusion should be able to come away with a basic idea of what your essay was about and how it was structured. After this, you may choose to offer some final concluding thoughts, potentially by connecting your essay topic to larger issues to show why it’s important. A conclusion will include:

  • Paraphrase of thesis
  • Summary of key points of analysis
  • Final concluding thought(s)


5 Steps for Writing an Analytical Essay

Follow these five tips to break down writing an analytical essay into manageable steps. By the end, you’ll have a fully-crafted analytical essay with both in-depth analysis and enough evidence to support your argument. All of these steps use the completed analytical essay in the next section as an example.

#1: Pick a Topic

You may have already had a topic assigned to you, and if that’s the case, you can skip this step. However, if you haven’t, or if the topic you’ve been assigned is broad enough that you still need to narrow it down, then you’ll need to decide on a topic for yourself. Choosing the right topic can mean the difference between an analytical essay that’s easy to research (and gets you a good grade) and one that takes hours just to find a few decent points to analyze.

Before you decide on an analytical essay topic, do a bit of research to make sure you have enough examples to support your analysis. If you choose a topic that’s too narrow, you’ll struggle to find enough to write about.

For example, say your teacher assigns you to write an analytical essay about the theme in John Steinbeck’s The Grapes of Wrath of exposing injustices against migrants. For it to be an analytical essay, you can’t just recount the injustices characters in the book faced; that’s only a summary and doesn’t include analysis. You need to choose a topic that allows you to analyze the theme. One of the best ways to explore a theme is to analyze how the author made his/her argument. One example here is that Steinbeck used literary devices in the intercalary chapters (short chapters that didn’t relate to the plot or contain the main characters of the book) to show what life was like for migrants as a whole during the Dust Bowl.

You could write about how Steinbeck used literary devices throughout the whole book, but, in the essay below, I chose to just focus on the intercalary chapters since they gave me enough examples. Having a narrower focus will nearly always result in a tighter and more convincing essay (and can make compiling examples less overwhelming).

#2: Write a Thesis Statement

Your thesis statement is the most important sentence of your essay; a reader should be able to read just your thesis and understand what the entire essay is about and what you’ll be analyzing. When you begin writing, remember that each sentence in your analytical essay should relate back to your thesis.

In the analytical essay example below, the thesis is the final sentence of the first paragraph (the traditional spot for it). The thesis is: “In The Grapes of Wrath’s intercalary chapters, John Steinbeck employs a variety of literary devices and stylistic choices to better expose the injustices committed against migrants in the 1930s.” So what will this essay analyze? How Steinbeck used literary devices in the intercalary chapters to show how rough migrants could have it. Crystal clear.

#3: Do Research to Find Your Main Points

This is where you determine the bulk of your analysis: the information that makes your essay an analytical essay. My preferred method is to list every idea I can think of, then research each one and use the three or four strongest for the essay. Weaker points may be those that don’t relate back to the thesis, that you don’t have much to say about, or that you can’t find good examples for. A good rule of thumb is to have one body paragraph per main point.

This essay has four main points, each of which analyzes a different literary device Steinbeck uses to better illustrate how difficult life was for migrants during the Dust Bowl. The four literary devices and their impact on the book are:

  • Lack of individual names in intercalary chapters to illustrate the scope of the problem
  • Parallels to the Bible to induce sympathy for the migrants
  • Non-showy, often grammatically-incorrect language so the migrants are more realistic and relatable to readers
  • Nature-related metaphors to affect the mood of the writing and reflect the plight of the migrants

#4: Find Excerpts or Evidence to Support Your Analysis

Now that you have your main points, you need to back them up. If you’re writing a paper about a text or film, use passages/clips from it as your main source of evidence. If you’re writing about something else, your evidence can come from a variety of sources, such as surveys, experiments, or quotes from knowledgeable sources. Any evidence that would work for a regular research paper works here.

In this example, I quoted multiple passages from The Grapes of Wrath  in each paragraph to support my argument. You should be able to back up every claim you make with evidence in order to have a strong essay.

#5: Put It All Together

Now it's time to begin writing your essay, if you haven’t already. Create an introductory paragraph that ends with the thesis, make a body paragraph for each of your main points, including both analysis and evidence to back up your claims, and wrap it all up with a conclusion that recaps your thesis and main points and potentially explains the big picture importance of the topic.


Analytical Essay Example + Analysis

So that you can see for yourself what a completed analytical essay looks like, here’s an essay I wrote back in my high school days. It’s followed by analysis of how I structured my essay, what its strengths are, and how it could be improved.

One way Steinbeck illustrates the connections all migrant people possessed and the struggles they faced is by refraining from using specific titles and names in his intercalary chapters. While The Grapes of Wrath focuses on the Joad family, the intercalary chapters show that all migrants share the same struggles and triumphs as the Joads. No individual names are used in these chapters; instead the people are referred to as part of a group. Steinbeck writes, “Frantic men pounded on the doors of the doctors; and the doctors were busy.  And sad men left word at country stores for the coroner to send a car,” (555). By using generic terms, Steinbeck shows how the migrants are all linked because they have gone through the same experiences. The grievances committed against one family were committed against thousands of other families; the abuse extends far beyond what the Joads experienced. The Grapes of Wrath frequently refers to the importance of coming together; how, when people connect with others their power and influence multiplies immensely. Throughout the novel, the goal of the migrants, the key to their triumph, has been to unite. While their plans are repeatedly frustrated by the government and police, Steinbeck’s intercalary chapters provide a way for the migrants to relate to one another because they have encountered the same experiences. Hundreds of thousands of migrants fled to the promised land of California, but Steinbeck was aware that numbers alone were impersonal and lacked the passion he desired to spread. Steinbeck created the intercalary chapters to show the massive numbers of people suffering, and he created the Joad family to evoke compassion from readers.  Because readers come to sympathize with the Joads, they become more sensitive to the struggles of migrants in general. However, John Steinbeck frequently made clear that the Joads were not an isolated incident; they were not unique. Their struggles and triumphs were part of something greater. Refraining from specific names in his intercalary chapters allows Steinbeck to show the vastness of the atrocities committed against migrants.

Steinbeck also creates significant parallels to the Bible in his intercalary chapters in order to enhance his writing and characters. By using simple sentences and stylized writing, Steinbeck evokes Biblical passages. The migrants despair, “No work till spring. No work,” (556).  Short, direct sentences help to better convey the desperateness of the migrants’ situation. Throughout his novel, John Steinbeck makes connections to the Bible through his characters and storyline. Jim Casy’s allusions to Christ and the cycle of drought and flooding are clear biblical references.  By choosing to relate The Grapes of Wrath to the Bible, Steinbeck’s characters become greater than themselves. Starving migrants become more than destitute vagrants; they are now the chosen people escaping to the promised land. When a forgotten man dies alone and unnoticed, it becomes a tragedy. Steinbeck writes, “If [the migrants] were shot at, they did not run, but splashed sullenly away; and if they were hit, they sank tiredly in the mud,” (556). Injustices committed against the migrants become greater because they are seen as children of God through Steinbeck’s choice of language. Referencing the Bible strengthens Steinbeck’s novel and purpose: to create understanding for the dispossessed.  It is easy for people to feel disdain for shabby vagabonds, but connecting them to such a fundamental aspect of Christianity induces sympathy from readers who might have otherwise disregarded the migrants as so many other people did.

The simple, uneducated dialogue Steinbeck employs also helps to create a more honest and meaningful representation of the migrants, and it makes the migrants more relatable to readers. Steinbeck chooses to accurately represent the language of the migrants in order to more clearly illustrate their lives and make them seem more like real people than just characters in a book. The migrants lament, “They ain’t gonna be no kinda work for three months,” (555). There are multiple grammatical errors in that single sentence, but it vividly conveys the despair the migrants felt better than a technically perfect sentence would. The Grapes of Wrath is intended to show the severe difficulties facing the migrants, so Steinbeck employs a clear, pragmatic style of writing. Steinbeck shows the harsh, truthful realities of the migrants’ lives, and he would be hypocritical if he chose to give the migrants a more refined voice and not portray them with all their shortcomings. The depiction of the migrants as imperfect through their language also makes them easier to relate to. Steinbeck’s primary audience was the middle class, the less affluent of society. Repeatedly in The Grapes of Wrath, the wealthy make it obvious that they scorn the plight of the migrants. The wealthy, not bad luck or natural disasters, were the prominent cause of the suffering of migrant families such as the Joads. Thus, Steinbeck turns to the less prosperous for support in his novel. When referring to the superior living conditions barnyard animals have, the migrants remark, “Them’s horses-we’re men,” (556). The perfect simplicity of this quote expresses the absurdness of the migrants’ situation better than any flowery expression could.

In The Grapes of Wrath, John Steinbeck uses metaphors, particularly about nature, in order to illustrate the mood and the overall plight of migrants. Throughout most of the book, the land is described as dusty, barren, and dead. Towards the end, however, floods come and the landscape begins to change. At the end of chapter twenty-nine, Steinbeck describes a hill after the floods, saying, “Tiny points of grass came through the earth, and in a few days the hills were pale green with the beginning year,” (556). This description offers a stark contrast to the earlier passages, which were filled with despair and destruction. Steinbeck’s tone from the beginning of the chapter changes drastically. Early in the chapter, Steinbeck had used heavy imagery in order to convey the destruction caused by the rain, “The streams and the little rivers edged up to the bank sides and worked at willows and tree roots, bent the willows deep in the current, cut out the roots of cottonwoods and brought down the trees,” (553). However, at the end of the chapter the rain has caused new life to grow in California. The new grass becomes a metaphor representing hope. When the migrants are at a loss over how they will survive the winter, the grass offers reassurance. The story of the migrants in the intercalary chapters parallels that of the Joads. At the end of the novel, the family is breaking apart and has been forced to flee their home. However, both the book and final intercalary chapter end on a hopeful note after so much suffering has occurred. The grass metaphor strengthens Steinbeck’s message because it offers a tangible example of hope. Through his language Steinbeck’s themes become apparent at the end of the novel. Steinbeck affirms that persistence, even when problems appear insurmountable, leads to success. These metaphors help to strengthen Steinbeck’s themes in The Grapes of Wrath because they provide a more memorable way to recall important messages.

John Steinbeck’s language choices help to intensify his writing in his intercalary chapters and allow him to more clearly show how difficult life for migrants could be. Refraining from using specific names and terms allows Steinbeck to show that many thousands of migrants suffered through the same wrongs. Imitating the style of the Bible strengthens Steinbeck’s characters and connects them to the Bible, perhaps the most famous book in history. When Steinbeck writes in the imperfect dialogue of the migrants, he creates a more accurate portrayal and makes the migrants easier to relate to for a less affluent audience. Metaphors, particularly relating to nature, strengthen the themes in The Grapes of Wrath by enhancing the mood Steinbeck wants readers to feel at different points in the book. Overall, the intercalary chapters that Steinbeck includes improve his novel by making it more memorable and reinforcing the themes Steinbeck embraces throughout the novel. Exemplary stylistic devices further persuade readers of John Steinbeck’s personal beliefs. Steinbeck wrote The Grapes of Wrath to bring to light cruelties against migrants, and by using literary devices effectively, he continuously reminds readers of his purpose. Steinbeck’s impressive language choices in his intercalary chapters advance the entire novel and help to create a classic work of literature that people still are able to relate to today. 

This essay sticks pretty closely to the standard analytical essay outline. It starts with an introduction, where I chose to use a quote to start off the essay. (This became my favorite way to start essays in high school because, if I wasn’t sure what to say, I could outsource the work and find a quote that related to what I’d be writing about.) The quote in this essay doesn’t relate to the themes I’m discussing quite as much as it could, but it’s still a slightly different way to start an essay and can intrigue readers. I then give a bit of background on The Grapes of Wrath and its themes before ending the intro paragraph with my thesis: that Steinbeck used literary devices in intercalary chapters to show how rough migrants had it.

Each of my four body paragraphs is formatted in roughly the same way: an intro sentence that explains what I’ll be discussing, analysis of that main point, and at least two quotes from the book as evidence.

My conclusion restates my thesis, summarizes each of the four points I discussed in my body paragraphs, and ends the essay by briefly discussing how Steinbeck’s writing helped introduce a world of readers to the injustices migrants experienced during the Dust Bowl.

What does this analytical essay example do well? For starters, it contains everything that a strong analytical essay should, and it makes that easy to find. The thesis clearly lays out what the essay will be about, the first sentence of each body paragraph introduces the topic it’ll cover, and the conclusion neatly recaps all the main points. Within each of the body paragraphs, there’s analysis along with multiple excerpts from the book in order to add legitimacy to my points.

Additionally, the essay does a good job of taking an in-depth look at the issue introduced in the thesis. Four ways Steinbeck used literary devices are discussed, and for each one, examples are given and analysis is provided so readers can understand why Steinbeck included those devices and how they helped shape how readers viewed migrants and their plight.

Where could this essay be improved? I believe the weakest body paragraph is the third one, the one that discusses how Steinbeck used plain, grammatically incorrect language to both accurately depict the migrants and make them more relatable to readers. The paragraph tries to touch on both of those reasons and ends up being somewhat unfocused as a result. It would have been better for it to focus on just one of those reasons (likely how it made the migrants more relatable) in order to be clearer and more effective. It’s a good example of how adding more ideas to an essay often doesn’t make it better if they don’t work with the rest of what you’re writing. This essay could also do more to explain the included excerpts and how they relate to the points being made. Sometimes they’re just dropped in the essay with the expectation that readers will make the connection between the example and the analysis. This is perhaps especially true in the second body paragraph, the one that discusses similarities to Biblical passages. Additional analysis of the quotes would have strengthened it.


Summary: How to Write an Analytical Essay

What is an analytical essay? An analytical essay analyzes a topic, often a text or film. It uses evidence to support its argument, such as excerpts from the piece of writing. All analytical papers include a thesis, analysis of the topic, and evidence to support that analysis.

When developing an analytical essay outline and writing your essay, follow these five steps:

  • #1: Pick a Topic
  • #2: Write a Thesis Statement
  • #3: Do Research to Find Your Main Points
  • #4: Find Excerpts or Evidence to Support Your Analysis
  • #5: Put It All Together

Reading analytical essay examples can also give you a better sense of how to structure your essay and what to include in it.

What's Next?

Learning about different writing styles in school? There are four main writing styles, and it's important to understand each of them. Learn about them in our guide to writing styles, complete with examples.

Writing a research paper for school but not sure what to write about? Our guide to research paper topics has over 100 topics in ten categories, so you can be sure to find the perfect topic for you.

Literary devices can enhance both your writing and your communication. Check out this list of 31 literary devices to learn more!

