Fair data or FAIR data?
1. Fair data or FAIR data?
Hello folks! Let's talk about the other FAIR data.
2. A new FAIR data you say?
In the context of data ethics, fair data usually means unbiased data. However, the acronym FAIR is also essential for organizations that want to use the potential of their data for their own benefit and for others. The FAIR data principles, which represent efficient data management with a long-term view, stand for making data Findable, Accessible, Interoperable, and Reusable. The FAIR principles promote the use and reuse of data and enhance collaboration and innovation by making data ready for sharing within and across organizations. The FAIR principles were born in the scientific field for research data management but are now expanding to data-intensive fields such as life sciences, pharma, and many more.
3. Why FAIR?
With vast amounts of data that may be fragmented across departments, organizations may not readily have a clear view of their data sources and resources. Data is often siloed and stored in different formats and versions, with no transparency on availability or origin, making it hard to exchange ideas or collaborate on data projects. With the advent of advanced computational technologies like AI, having an overview of data potential and harnessing data resources becomes increasingly important, especially so that both humans and machines can find and use them efficiently. That's where the FAIR data principles guide organizations on efficient and long-term data management to reuse their precious data resources.
4. Findable
The Findable principle states that datasets should be well described with rich metadata and have a persistent link that leads you to that dataset without any broken link issues or 404 errors. A DOI, or digital object identifier, that you often see with scientific publications is one such persistent link. The metadata and the link will help humans and machines find the dataset efficiently.
5. Accessible
Accessible means datasets should be accessible through physical or digital infrastructures or repositories, either openly or through access controls and authentication.
6. Interoperable
Interoperable data means that datasets can be opened and used on different operating systems like Windows, macOS, iOS, or Linux. For example, when working with spreadsheet data, it's important to ensure that it is interoperable, that is, it can be easily shared and opened by everyone. One way to do this is by saving it in a .csv format, which is an open, plain-text format that can be opened on all systems. In contrast, Microsoft Excel's native formats are built around its own software and may not open correctly on all systems. FAIR principles require standards for such interoperability.
7. Reusable
The Reusable aspect of the FAIR principles is also their overall objective: make sure that datasets can be reused by both humans and machines with a clear set of usage rules.
8. Terms of reuse
The terms of reuse must be clear for the re-users of data, so the reuse documentation offers detailed guidelines on how the data may be used, for what purpose, and whom to credit. There are many ways to explicitly state how to use or not use data, using copyright or other data licenses.
9. Data licensing - Creative Commons
The most common data licenses that are both human and machine readable are the Creative Commons, or CC, licenses, which come in several variants.
10. Industry needs to go FAIR
While FAIR data is becoming more and more prominent for research data management and sharing in the scientific disciplines, it's also slowly finding its way into industry. Pharma companies like Novartis and Janssen, and technology consultancy firms like Deloitte and Accenture, have all started investing in implementing FAIR data principles. FAIR data standards, especially on interoperability, are evolving. Setting up the infrastructure and training for FAIR data principles may seem resource intensive initially, but the long-term benefits of being able to harness data pay off. It also requires an attitude change toward collaboration and data sharing. FAIR data can create immense value, especially in collaborative projects like finding alternative energy resources and new disease treatments, which need large datasets from different sources to work together for solutions that benefit society and strengthen its capacity to innovate.
11. Let's practice!
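Before the exercises, here is a minimal sketch in Python of what the principles covered in this lesson can look like for one small dataset. It saves tabular data as CSV text (Interoperable) and pairs it with a metadata record holding an identifier, a license, and an access note (Findable, Reusable, Accessible). The sample rows, the metadata field names, and the identifier are all made up for illustration; they are not part of any official FAIR schema, and the DOI shown is a placeholder rather than a real one.

```python
import csv
import io
import json

# Hypothetical sample data standing in for a spreadsheet.
rows = [
    {"sample_id": "S1", "temperature_c": 21.5},
    {"sample_id": "S2", "temperature_c": 19.0},
]


def to_csv(records):
    """Serialize a list of dicts to CSV text, an open plain-text format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def describe(title, identifier, license_id, access):
    """Build a metadata record; field names are illustrative, not a standard."""
    return {
        "title": title,
        "identifier": identifier,  # persistent link, e.g. a DOI
        "license": license_id,     # terms of reuse
        "access": access,          # open or controlled
        "format": "text/csv",
    }


csv_text = to_csv(rows)
metadata = describe(
    title="Example temperature readings",
    identifier="doi:10.1234/example",  # placeholder, not a real DOI
    license_id="CC-BY-4.0",            # a Creative Commons license identifier
    access="open",
)
metadata_json = json.dumps(metadata, indent=2)

print(csv_text)
print(metadata_json)
```

Keeping the metadata as JSON next to the CSV means both a person and a program can read the description, the link, and the usage terms without needing any particular spreadsheet software.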
Awesome, now let's go ahead to test your knowledge!