One Pager Cheat Sheet
- Accurate data is critical for businesses that want to maximize efficiency and profits, so a range of data cleaning techniques can be used to prevent issues from arising.
- Data cleaning is the process of removing incorrect, corrupted, improperly formatted, duplicate, and incomplete data from collected datasets; it is necessary for a successful data analysis process.
- The data analyst must analyze the cleaned data to answer questions and spot patterns that may be used to develop the next hypothesis.
- The data cleaning process includes Data Preprocessing, Data Transformation, Data Validation, and Data Analysis to ensure accuracy and uncover insights.
- Data cleaning is important to ensure that datasets used for data analysis are free of irrelevant and incorrect information, maximizing their efficiency and effectiveness in order to avoid obtaining disappointing or misleading results.
- The data cleaning process removes irrelevant and redundant information, reducing the computational complexity of the analysis and increasing its accuracy and efficiency.
- Data wrangling is the process of combining data from multiple sources and cleaning it so that it can be easily accessed and analyzed; it is essential for delivering useful data to business analysts in a timely manner so they can make better decisions.
- Data wrangling is a time-consuming process that generally involves data discovery, structuring, cleaning, enriching, validating, and publishing in order to prepare data for analysis.
- Cleaning is an essential part of the data wrangling process: it removes inaccuracies and ensures data accuracy.
- Data wranglers need knowledge of statistical languages such as R or Python, as well as tools like Tabula, Talend, Parsehub, and Scrapy, for data wrangling, data preparation, and data cleansing.
- Data wrangling automates data flow and combines various data sources to exchange data quickly and increase usability, resulting in cost and time savings.
- In its technical sense, the benefits of data wrangling are not the speedy exchange of data or the ability to quickly exchange techniques with large amounts of data; rather, they are the ability to automatically schedule data flow activities and to combine information from different sources.
- By converting the different data formats into a common format, data cleaning ensures that a data analyst can accurately identify, for example, the name of the most-watched movie between 6:00 pm and 10:00 pm.
- The main takeaway from this lesson is that data cleaning and data wrangling can significantly reduce the amount of time spent on data analysis and help identify the most important information.
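
To make the cleaning steps above concrete, here is a minimal sketch in Python using only the standard library. It is not taken from the lesson: the records, the two timestamp formats, and the helper names (`parse_timestamp`, `clean`, `most_watched`) are all hypothetical and chosen for illustration. It drops incomplete and corrupted rows, removes duplicates, converts timestamps to a common format, and then finds the most-watched movie between 6:00 pm and 10:00 pm.

```python
from collections import Counter
from datetime import datetime, time

# Hypothetical viewing records merged from two sources that use different
# timestamp formats. All values are made up for illustration.
RAW_RECORDS = [
    {"movie": "Inception", "watched_at": "2023-05-01 18:30"},
    {"movie": "inception ", "watched_at": "01/05/2023 19:15"},  # other format
    {"movie": "Up", "watched_at": "2023-05-01 21:45"},
    {"movie": "Up", "watched_at": "2023-05-01 21:45"},   # exact duplicate
    {"movie": "Up", "watched_at": "2023-05-01 09:00"},   # outside the window
    {"movie": "", "watched_at": "2023-05-01 20:00"},     # incomplete: no title
    {"movie": "Heat", "watched_at": "not a time"},       # corrupted timestamp
]

# Assumed input formats: ISO-like, and day/month/year.
KNOWN_FORMATS = ("%Y-%m-%d %H:%M", "%d/%m/%Y %H:%M")


def parse_timestamp(text):
    """Convert a timestamp in any known format to a datetime, or None."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None


def clean(records):
    """Drop incomplete, corrupted, and duplicate rows; normalize the rest."""
    seen = set()
    cleaned = []
    for record in records:
        title = record.get("movie", "").strip().title()
        when = parse_timestamp(record.get("watched_at", ""))
        if not title or when is None:
            continue  # incomplete or corrupted row
        key = (title, when)
        if key in seen:
            continue  # duplicate row
        seen.add(key)
        cleaned.append({"movie": title, "watched_at": when})
    return cleaned


def most_watched(records, start=time(18, 0), end=time(22, 0)):
    """Return the most frequent title watched between start and end."""
    counts = Counter(
        r["movie"] for r in records if start <= r["watched_at"].time() <= end
    )
    return counts.most_common(1)[0][0] if counts else None


cleaned = clean(RAW_RECORDS)
print(most_watched(cleaned))
```

Without the normalization step, "Inception" and "inception " would be counted as different titles and the second timestamp format would not be comparable with the first, so the answer could be wrong or ambiguous.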


