There appears to be a correlation between educational attainment and salary, as individuals with higher education levels tend to have higher salaries.
Some individuals with similar educational backgrounds and occupations have notably different salaries, suggesting that factors other than education and occupation may influence salary disparities.
The spreadsheet’s usefulness depends on its intended purpose and context. It contains information about individuals, including salaries, locations, occupations, and education levels. On a scale of 1 to 10 for ease of understanding, it rates about a 6, as the data has clear column headers and values.
However, there is room for improvement in terms of clarity and organization, such as consistent formatting of the salary column and potentially adding more context.
Unfortunately, there is no documentation available to assist readers in understanding the dataset better. Providing a data dictionary explaining column meanings and abbreviations would be helpful.
Jadin’s Data Set (Employee Information)
My Overall Rating: 6.5/10
Structure: 7/10 – The overall structure of the dataset is good, but the headers are inconsistent and should be standardized. For example, the [first Name] header should be changed to [First Name], and [employed Since] should be changed to [Employed Since].
Standardization: 3/10 – This dataset has various inconsistencies and conflicting data points. Examples include missing data points, incorrect standardization, inconsistent formatting, and an overall lack of structure.
Readability: 8/10 – The dataset as a whole has many inconsistencies and would need cleaning and standardization before anyone could properly work in it. Despite that, the format and structure are organized so that, once it has been cleaned and formatted properly, upkeep would be simple to manage even with minimal experience handling datasets.
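The header fix suggested under Structure could be automated rather than done by hand. Below is a minimal sketch in Python, assuming the headers are available as plain strings. The function name normalize_header and the third header in the list are hypothetical, added only for illustration.

```python
def normalize_header(header: str) -> str:
    """Capitalize the first letter of each word and collapse extra spaces."""
    words = header.strip().split()
    # Only the first letter is changed, so the rest of each word keeps its case.
    return " ".join(word[:1].upper() + word[1:] for word in words)


# The first two headers are quoted in the review; "Last Name" is a
# hypothetical header that is already in the target format.
headers = ["first Name", "employed Since", "Last Name"]
normalized = [normalize_header(h) for h in headers]
print(normalized)
```

Headers that are already correct pass through unchanged, so the same function can be applied to every column at once.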
Selam’s Data Set (Sales)
A clean dataset is essential for accurate analysis, efficient work, effective reporting, data integrity, time savings, and better decision-making. It forms the foundation for reliable insights and informed actions.
The mistakes are a typo, a capitalization error, and an incorrect monetary value.
How uncleaned data affects the usefulness of the data set:
Uncleaned data can result in inaccuracies, inefficiency, and disruption of lookup operations.
It slows down data entry and cleaning processes, impacting data consistency and sorting/filtering.
Calculations and formulas are also adversely affected.
Overall, data integrity is compromised.
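To illustrate how a lookup can be disrupted, here is a small Python sketch. The price table and order entries are hypothetical and not taken from the sales data set. The second order differs from the table only in capitalization and a trailing space.

```python
# Hypothetical price table and order entries, invented for illustration.
prices = {"Widget": 4.50, "Gadget": 12.00}
orders = ["Widget", "widget ", "Gadget"]

# An exact-match lookup silently misses the inconsistently typed entry.
raw_matches = [prices.get(item) for item in orders]

# Normalizing both sides (trim spaces, lowercase) recovers the match.
lookup = {name.lower(): price for name, price in prices.items()}
clean_matches = [lookup.get(item.strip().lower()) for item in orders]

print(raw_matches)
print(clean_matches)
```

The missed match comes back as None, so any total or formula built on the raw lookup would either fail or quietly leave that row out.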
How we would clean the data set:
Inspect the data for errors, ensure consistent formatting, standardize data types, and perform any necessary transformations, including unit and currency conversions and text cleaning.
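A few of these steps can be sketched in Python using only the standard library. The rows, column names, and helper functions (clean_text, parse_amount) below are hypothetical. The sketch covers text cleaning and converting monetary strings to a numeric type, and it flags values that cannot be parsed instead of guessing at them.

```python
from decimal import Decimal, InvalidOperation


def clean_text(value: str) -> str:
    """Trim and collapse whitespace, then apply consistent title case."""
    return " ".join(value.split()).title()


def parse_amount(value: str):
    """Convert a string such as '$1,200.00' to Decimal; return None if invalid."""
    cleaned = value.strip().replace("$", "").replace(",", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None  # leave unparseable values for manual inspection


# Hypothetical rows, invented for illustration.
rows = [
    {"product": "  laptop ", "amount": "$1,200.00"},
    {"product": "MOUSE", "amount": "25"},
    {"product": "keyboard", "amount": "n/a"},
]

cleaned_rows = [
    {"product": clean_text(r["product"]), "amount": parse_amount(r["amount"])}
    for r in rows
]
for row in cleaned_rows:
    print(row)
```

Returning None for a bad amount keeps the inspection step explicit: those rows can be listed and corrected by hand rather than being silently turned into zero.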