Contact Us
01908 265111

Definitions



Data Quality

There are seven important dimensions to data quality: accuracy, availability, completeness, granularity, reliability, relevance and timeliness.

Data accuracy is a vital component of data quality because it underpins the accuracy of any analysis performed on that data. Human resource management systems promote data accuracy in a variety of ways, although it is ultimately not possible to prevent human error. A set of reports can be used to identify data that is likely to be inaccurate, using simple functions to find values that fall outside a predefined tolerance (e.g. identifying employees over the age of one hundred). The administrator of a human resource management system must consider how human errors may affect the quality of their data, and set up processes and procedures to mitigate this risk.
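The tolerance report described above could be sketched as follows. This is a minimal illustration, not how any particular HR system does it: the record layout, the function names and the sample employees are all assumptions made for the example, and only the age limit of one hundred comes from the text.

```python
from datetime import date

MAX_PLAUSIBLE_AGE = 100  # tolerance taken from the example in the text


def age_on(birth_date, today):
    """Return the age in whole years on the given day."""
    years = today.year - birth_date.year
    # Subtract a year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def flag_implausible_ages(employees, today, max_age=MAX_PLAUSIBLE_AGE):
    """Return the names of employees whose age is negative or above max_age."""
    flagged = []
    for emp in employees:
        age = age_on(emp["date_of_birth"], today)
        if age < 0 or age > max_age:
            flagged.append(emp["name"])
    return flagged


# Hypothetical records; the second birth year looks like a keying error.
employees = [
    {"name": "A. Example", "date_of_birth": date(1985, 6, 14)},
    {"name": "B. Sample", "date_of_birth": date(1885, 6, 14)},
]

report = flag_implausible_ages(employees, today=date(2024, 1, 1))
print(report)
```

A report like this only highlights records for a person to review; it cannot tell whether a flagged value is actually wrong.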

Data availability describes how easy it is to access data when needed. Complex analysis of large data sets can take a long time to complete and in some cases is performed by supercomputers. In such cases the data cannot be considered to have a high level of availability (or timeliness). Data availability is also part of data reliability; a disaster recovery plan will include, directly or indirectly, clauses relating to data availability.

Completeness feeds into other areas of data quality, including accuracy. In a data table, completeness may be enforced by requiring that every column is completed, so that there are no gaps when the table is viewed. Completeness also requires that sufficient data is collected for later uses of the information to be fulfilled. An example of incomplete data would be addresses where the postcode cannot be recorded or is not required; this later makes it impossible to send a letter, which defeats the purpose of collecting the address at all.
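A completeness check of the kind described above might look like the sketch below. The list of required fields, the function name and the sample addresses are assumptions made for the example; a real system would define its own required fields.

```python
# Hypothetical set of fields that must be filled in for an address to be usable.
REQUIRED_ADDRESS_FIELDS = ("line1", "town", "postcode")


def missing_fields(record, required=REQUIRED_ADDRESS_FIELDS):
    """Return the required fields that are absent or blank in the record."""
    return [f for f in required if not str(record.get(f) or "").strip()]


# Made-up sample data; the second address has no postcode.
addresses = [
    {"line1": "1 High Street", "town": "Exampleton", "postcode": "AB1 2CD"},
    {"line1": "2 Low Road", "town": "Sampleford", "postcode": ""},
]

for address in addresses:
    gaps = missing_fields(address)
    if gaps:
        print(address["line1"], "is missing:", ", ".join(gaps))
```

Treating blank strings as missing matters here, because a field that exists but is empty leaves the same gap as one that was never recorded.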

Granularity is the level of precision used to record the data. A low level of granularity may allow an address to be recorded as a single item of data; a high level of granularity would split the address into multiple fields including town, county and postcode. Capturing data at a high level of granularity may be more time consuming initially, but this is rewarded by a high degree of flexibility later on when analysing the data. If the type of analysis that will be required is not known in advance, high granularity guards against the risk of capturing too little detail.
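The difference between the two levels of granularity can be illustrated with a short sketch. The Address class, its field names and the sample data are assumptions made for the example. It shows that granular fields can always be joined into a single line, while an analysis such as counting employees per town is only straightforward when the town is held separately.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Address:
    """A high-granularity address: each part is held in its own field."""
    street: str
    town: str
    county: str
    postcode: str

    def as_single_line(self):
        """Collapse the fields into the low-granularity, single-item form."""
        return ", ".join([self.street, self.town, self.county, self.postcode])


# Made-up sample data.
addresses = [
    Address("1 High Street", "Exampleton", "Exampleshire", "AB1 2CD"),
    Address("2 Low Road", "Exampleton", "Exampleshire", "AB1 3EF"),
    Address("3 Mill Lane", "Sampleford", "Exampleshire", "AB2 4GH"),
]

# Easy with granular fields; error-prone if only the single line is stored.
employees_per_town = Counter(a.town for a in addresses)
print(employees_per_town)
print(addresses[0].as_single_line())
```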

Reliability is related to accuracy, availability and completeness. Unreliable data is low-quality data, either because it cannot be accessed when needed or because it is unlikely to be accurate. Data sources with a high level of reliability are more likely to be trusted across an organisation than unreliable ones.

The relevance of data can be related to its level of granularity. High levels of granularity may mean that irrelevant data is captured due to its precision. Exploring the relationship between granularity and relevance during data analysis can identify which data must be captured, even if this does not necessarily translate to which data fields must be present. Data relevance is also related to the timeliness of data; for example, a person reading the news will find today's news more relevant than yesterday's.

Timeliness relates to the availability and the nature of data. If data cannot be made available in time to be useful then it has a low level of timeliness. Users of data use it to inform their decisions and may make a bad decision if they do not have all the information available. Timeliness in the context of accuracy can refer to the age of the data itself. Address records in a system can be accurate when they are captured but become less so over time, so data must be rechecked at a suitable frequency to maintain its accuracy. The trade-off between the two is that constantly updating the data so that it is entirely accurate as of this moment increases accuracy but not timeliness, unless the person using the data is also able to interpret it at the same rate.
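The idea of rechecking data at a set frequency could be sketched as below. The one-year interval, the "last_verified" field and the sample records are assumptions made for the example; the text does not specify how often data should be rechecked.

```python
from datetime import date, timedelta

# Assumed recheck frequency; the right value depends on the type of data.
RECHECK_INTERVAL = timedelta(days=365)


def due_for_recheck(records, today, interval=RECHECK_INTERVAL):
    """Return the ids of records last verified longer ago than the interval."""
    return [r["id"] for r in records if today - r["last_verified"] > interval]


# Made-up address records with the date each was last confirmed.
records = [
    {"id": 1, "last_verified": date(2023, 11, 1)},
    {"id": 2, "last_verified": date(2021, 3, 15)},
]

stale = due_for_recheck(records, today=date(2024, 1, 1))
print(stale)
```

A shorter interval keeps records more accurate at the cost of more checking effort, which is the trade-off the paragraph above describes.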

