
12.1. Introduction
We are in a period where the decision-making process in healthcare, once knowledge-based, is shifting toward a data-driven perspective thanks to innovative technologies. The accumulation of massive amounts of data (big data) in the healthcare sector, its processing through analytical platforms, and the open sharing of this data (open data) have all contributed to this transformation, carrying us toward a new health paradigm within the knowledge society.
As communication technologies advance rapidly—becoming an integral part of daily life and transforming every aspect of it—it is important to understand their impact on healthcare. Considering its scale, scope, and complexity, this transformation—often referred to as the Fourth Industrial Revolution—has been a major theme in numerous national and international forums, including the 2016 World Economic Forum (WEF).
Although it is difficult to predict how this transformation will evolve, it is vital to conduct a comprehensive and inclusive evaluation involving all stakeholders in order to raise awareness and ensure proper management of the process.
Today, elements of the information society can be seen in almost every aspect of life. Almost everyone now carries a smartphone, most homes have a computer, and most companies have back-office units managing information technologies. However, information itself is not always visible. It took roughly half a century after computers entered human life for data to begin to accumulate in a way that gave it meaningful and distinctive value.
Currently, not only has the quantity of data increased, but access speed has also improved. This quantitative change has brought about a qualitative one. The collection of data in a meaningful and organized manner first occurred in astronomy and genetics. The term “big data” was initially used in these fields and has since been applied to almost every area of life. As the rapid sharing of large data sets has become more common, the concept of open data has gained increasing importance.
For example, Google has utilized search data for disease diagnosis and treatment research, while clinical trial results and side-effect profiles are increasingly shared and analyzed openly.
In this era of Big Data, investors, technology entrepreneurs, media, and consulting firms are focusing on open data to seize new opportunities.
The simplification and affordability of cloud hosting solutions have fundamentally shifted the economic balance in data processing. IDC predicts that digital records will reach 1.2 zettabytes (1 zettabyte = 10²¹ bytes) by the end of this year and will increase 44-fold over the next decade. The main driver of this growth is unstructured data, and the key need is to store and analyze both structured and unstructured data together using data mining processes.
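To put the 44-fold figure in perspective, it can be converted into an implied annual growth rate. The short calculation below is an illustration only: it assumes steady compound growth over exactly ten years, which the forecast itself does not state.

```python
# Illustrative arithmetic only: assumes smooth compound growth over 10 years.
growth_factor = 44.0   # "44-fold" from the text
years = 10             # "the next decade"

# Solve (1 + r) ** years == growth_factor for the annual rate r.
annual_rate = growth_factor ** (1 / years) - 1

# One zettabyte is 10**21 bytes, so 1.2 zettabytes is 1.2 * 10**21 bytes.
ZETTABYTE = 10 ** 21
start_bytes = 1.2 * ZETTABYTE

print(f"Implied annual growth: {annual_rate:.1%}")
print(f"Starting volume: {start_bytes:.2e} bytes")
```

Under that assumption, a 44-fold increase over a decade corresponds to data volumes growing by roughly 46% each year.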
The current economic value of open data is estimated at $3–5 trillion, with its projected value in the U.S. healthcare sector alone at $300–450 billion. Governments remain the largest source of open data. In the U.S., the Department of Health and Human Services (HHS), together with agencies under it such as the FDA, took a leading role 12 years ago by making vast datasets publicly available.
In the U.S., the results of applicable clinical trials must be reported publicly at clinicaltrials.gov.
In 2013, the Obama Administration announced its Open Data Policy by stating:
“Information is a valuable national asset whose value is multiplied when it is made easily accessible to the public.”
Today, the FDA alone shares more than 130 open datasets, and the Centers for Medicare & Medicaid Services (CMS) continuously publishes new data. For example, the health effects, composition, and manufacturers of almost every household product are shared publicly through the Household Products Database.
Unlike other regulated industries, healthcare must manage not only issues of security and privacy but also new product approvals and reimbursement processes.
In line with Turkey’s National e-Government Strategy and Action Plan (2016–2019), which included the action item “Dissemination of Open Data Platforms in the Public Sector,” awareness of open data in Turkey is also increasing. Collaboration between universities, industry, and government is crucial for the country’s transition toward a knowledge economy.
In the pharmaceutical industry, for example, large genomic databases created for cancer research must be continuously accessible to researchers. In Turkey, many types of health data, such as organ and tissue statistics, are shared publicly via organ.saglik.gov.tr.
Hospitals collect and store patient-level data in digital systems in order to provide effective, personalized medical services. They also use open databases to measure quality and to compare their performance with that of other hospitals.
12.2. Open Data Application Areas in Healthcare
- Clinical Research
- Preventive Medicine
- Hospital Data Analysis
- New Product Approvals
- Analysis of Avoidable Drug and Medical Equipment Use
- Fraud Detection (Health Insurance)
- Public Health Analytics and Risk Management
- Patient Relationship Management and Coaching
- Real-Time Patient Monitoring
12.3. Examples of Health Cost Management Consulting Using Big Data
United States
- Fraudulent and improper payments were identified in the $12 billion Medicaid program.
- 15–23% of suspicious claims were linked to specific types of services.
- After analysis of invoices from 20 major healthcare firms, suspicious claims dropped by 99% within six months (from $19.2 million to $140,000).
France – PRO BTP
- In the optical sector, 9% of claims were suspicious; in dental health, 14%.
- Estimated potential loss: €14 million over 21 months.
South Africa – Discovery Health
- With 2.6 million customers, the provider recovered $25 million from billing errors and insurance fraud.
- Predictive modeling reduced case resolution time by 99%.
South Korea – Allianz Life Insurance
- Achieved a $1.4 billion increase in profit and a 12% reduction in debt.
- Fraud detection became 50% faster, allowing faster processing of legitimate claims and improved customer satisfaction.
Chronic Disease Management
- Asthmapolis: GPS-based analysis of inhaler usage linked to environmental triggers such as pollen or dust.
- mHealthCoach: Patient monitoring and coaching for those at risk.
- Pittsburgh Health Data Alliance: Collaboration between the University of Pittsburgh, UPMC, and Carnegie Mellon University.
- Cliexa: Digital platform for managing chronic pain (rheumatoid arthritis, IBS, etc.).
12.4. Development of Scientific Research in Healthcare
Medical progress traditionally arises from clinical trials whose results are published in scientific journals, later synthesized and incorporated into medical practice.
The first official clinical trial dates back to 1747, on sailors with scurvy. The New England Journal of Medicine, established in 1812, remains one of the oldest active medical journals. Controlled trials became standard in the 1940s, and evidence-based medicine emerged as a scientific concept in the early 1990s.
Randomized controlled trials (RCTs) are considered the gold standard in generating new medical knowledge, though they are expensive and time-consuming. Today, according to the Institute of Medicine (IOM), the ability to generate new knowledge surpasses the capacity of healthcare providers to adopt it in practice.
With modern technologies, traditional evidence-based approaches now integrate data collection, storage, analysis, and decision support, forming the foundation of practice-based medicine.
12.5. Big Data in Healthcare
A 2012 report to the U.S. Congress defined big data as:
“Large volumes, velocity, and variety of data requiring advanced techniques and technologies for acquisition, storage, distribution, management, and analysis.”
The European Commission (2014) similarly defines it as:
“Data sets that are difficult to process using conventional tools, or large volumes of diverse data generated at high velocity.”
The three main parameters of big data are volume, variety, and velocity—sometimes extended with veracity (data quality). Healthcare data is vast, diverse, and rapidly produced, requiring advanced analytical approaches to ensure accuracy.
Sources include:
- Clinical records (physician notes, prescriptions, imaging, lab data)
- Administrative and insurance data
- IoT and sensor data (vital signs, wearable devices)
- Social media posts (LinkedIn, Instagram, Twitter, etc.)
- Research articles, news, and medical publications
Health data is produced continuously, creating enormous data repositories—ranging from personal health records and genomic data to biometric sensor outputs.
Technological advances such as virtualization and cloud computing facilitate large-scale data storage and analytics, enabling real-time insights and predictive healthcare systems.
12.6. Applications of Big Data and Open Data Analytics in Healthcare
Big data analytics can be categorized as follows:
- Evidence-based Clinical Rules: Integrated into electronic health record systems for identifying potential risks.
- Statistical Algorithms: Used to detect potential risks across patient populations.
- Machine Learning: Continuously learns from new data to improve predictions of diseases and adverse effects.
Machine learning allows researchers to discover correlations in massive data sets without predefined hypotheses. Because such findings are correlations produced by an algorithm, they usually serve as new hypotheses to be tested rather than as definitive conclusions.
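The first two categories in the list above (clinical rules and statistical algorithms) can be contrasted with a small sketch. The rule, the thresholds, the field names, and the sample records below are all hypothetical and chosen only for illustration; they are not clinical guidance and do not come from any particular electronic health record system.

```python
from statistics import mean, stdev

# Hypothetical records; every field name and value here is invented.
patients = [
    {"id": "p1", "age": 71, "systolic_bp": 172, "claims_per_month": 2},
    {"id": "p2", "age": 45, "systolic_bp": 118, "claims_per_month": 1},
    {"id": "p3", "age": 63, "systolic_bp": 135, "claims_per_month": 3},
    {"id": "p4", "age": 58, "systolic_bp": 127, "claims_per_month": 2},
    {"id": "p5", "age": 39, "systolic_bp": 121, "claims_per_month": 19},
    {"id": "p6", "age": 50, "systolic_bp": 130, "claims_per_month": 2},
    {"id": "p7", "age": 67, "systolic_bp": 141, "claims_per_month": 1},
    {"id": "p8", "age": 33, "systolic_bp": 115, "claims_per_month": 3},
]


def rule_flags(record, bp_limit=160, age_limit=65):
    """Rule-based check: a fixed, human-written condition (illustrative thresholds)."""
    return record["systolic_bp"] >= bp_limit and record["age"] >= age_limit


def statistical_outliers(records, field, z_limit=2.0):
    """Statistical check: flag values far from the population mean (z-score)."""
    values = [r[field] for r in records]
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # no spread in the data, so nothing stands out
    return [r["id"] for r in records if abs(r[field] - mu) / sigma > z_limit]


rule_hits = [p["id"] for p in patients if rule_flags(p)]
outlier_hits = statistical_outliers(patients, "claims_per_month")
print("Rule-based flags:", rule_hits)
print("Statistical outliers:", outlier_hits)
```

The rule encodes prior knowledge about one patient at a time, while the statistical check needs no predefined threshold for the measured quantity and instead compares each record with the rest of the population. A machine learning approach would go one step further and learn the decision criteria from labeled historical data.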
Applications include:
- Early diagnosis and personalized treatment
- Fraud detection in healthcare payments
- Predictive models for hospital stays, surgical outcomes, complications, and infections (e.g., MRSA, C. difficile)
- Public health analytics for outbreak detection and vaccine targeting
- Genomic analysis integrated into medical decision-making
- Remote monitoring using IoT devices
- Patient profiling and predictive care
By analyzing correlations and trends, big data analytics supports informed decision-making, higher care quality, and reduced costs.
12.7. Big Data Technologies
The conceptual framework of big data projects is similar to traditional health informatics, differing mainly in execution. Traditional analytics use single-computer BI tools, whereas big data utilizes distributed computing clusters.
Hadoop/MapReduce, an open-source distributed platform, is widely used in healthcare for large-scale analytics. It divides large datasets across multiple servers and merges the processed results.
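The split, process, and merge pattern described above can be sketched in a few lines of plain Python. This is a single-machine illustration of the MapReduce idea, not Hadoop itself and not its API; the record layout, the diagnosis codes, and the function names are hypothetical.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical visit records: (hospital, diagnosis code). Invented sample data.
records = [
    ("A", "J45"), ("A", "E11"), ("B", "J45"),
    ("B", "J45"), ("C", "E11"), ("C", "I10"),
]


def split(data, n_parts):
    """Divide the dataset into n_parts chunks, standing in for separate servers."""
    return [data[i::n_parts] for i in range(n_parts)]


def map_chunk(chunk):
    """Map step: each 'server' counts diagnosis codes in its own chunk."""
    counts = defaultdict(int)
    for _hospital, code in chunk:
        counts[code] += 1
    return dict(counts)


def merge(left, right):
    """Reduce step: combine two partial count tables into one."""
    merged = dict(left)
    for code, n in right.items():
        merged[code] = merged.get(code, 0) + n
    return merged


# Each chunk is processed independently, then the partial results are merged.
partials = [map_chunk(chunk) for chunk in split(records, 3)]
totals = reduce(merge, partials, {})
print(totals)
```

Because each chunk is counted independently, the map step could run on different machines in parallel; only the small partial tables need to be brought together in the merge step.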
NoSQL databases such as MongoDB and CouchDB are also popular. While powerful, these systems require complex configuration and management. Commercial platforms and vendors such as IBM BigInsights, Cloudera, Hortonworks, and AWS support these frameworks.
12.8. Big Data Methodology in Healthcare
A four-step framework is proposed:
- Conceptualization: Define the rationale in terms of the 4Vs (volume, velocity, variety, veracity).
- Proposal Development: Define the problem, its importance, and justify the use of big data analytics.
- Methodology: Identify data sources, select platforms and tools, prepare and process data for analysis.
- Testing & Evaluation: Validate models and present actionable insights to stakeholders.
12.9. Precision Medicine – Catalyst of Big and Open Data in Healthcare
In 2015, U.S. President Barack Obama launched the Precision Medicine Initiative with a $215 million investment to enable personalized treatments for diseases such as cancer and diabetes.
Precision Medicine emphasizes individual variability in genes, environment, and lifestyle—moving from “one-size-fits-all” to personalized prevention and treatment.
For example, Nobel Laureate Aziz Sancar and his team reported on June 14, 2017 that they had mapped the DNA damage caused by smoking, a milestone for drug development. As such maps are shared as open data, collaboration between academia and industry accelerates the development of new therapies.
Platforms like Cliexa (www.cliexa.com) exemplify the integration of mobile health technologies and social media into patient participation. Behavioral and genomic data, combined with electronic health records, allow participants to monitor and access their data.
Considering that diseases are shaped by genetic, environmental, and behavioral factors, studies integrating genomic, environmental, and sociodemographic data will define the future of medicine.
12.10. Conclusions
We are witnessing a paradigm shift in healthcare—from knowledge-based to data-driven decision-making. The accumulation, analysis, and open sharing of vast data volumes are leading us toward a new digital health paradigm.
However, challenges remain:
- Data fragmentation across systems
- Integration difficulties
- Security and privacy concerns
- Reluctance among stakeholders to share data due to competitive or ethical concerns
Balancing data security, integrity, and usability is essential.
Technological advances in data collection, storage, and processing have the potential to revolutionize health knowledge generation and application. Yet, transformation must go beyond infrastructure—it must encompass culture, values, leadership, and organizational change.
Concepts such as precision medicine, telemedicine, deep learning, open data, and big data—when integrated into the ecosystem of the Fourth Industrial Revolution—can help establish learning health systems. By adopting data-driven, actionable insights, countries like Turkey can build upon their healthcare reform successes and lead the next phase of global health innovation.
References
- Brian Dolan. How digital health tools figure into the White House’s Precision Medicine initiative.
- Collins FS, Varmus H. A New Initiative on Precision Medicine. N Engl J Med. 2015.
- Raghupathi, W., Raghupathi, V. Big Data Analytics in Healthcare: Promise and Potential. Health Information Science and Systems, 2014, 2:3.
- Marr, Bernard. How Big Data Is Transforming Medicine. Forbes-Tech, 2016.
- Salas-Vega, S., Haimann, A., Mossialos, E. Big Data and Health Care: Challenges and Opportunities for Coordinated Policy Development in the EU. Health Systems & Reform, 2015.
- Hansen MM, Miron-Shatz T, Lau AY, Paton C. Big Data in Science and Healthcare: A Review of Recent Literature and Perspectives. Yearb Med Inform. 2014.
- Braunstein, Mark L. Health Big Data and Analytics. In: Practitioner’s Guide to Health Informatics. Springer, 2015.
- Silay, Y.S. Open Data from the Perspective of Healthcare Industry. Open Data Turkey Conference, 2017. http://bigdata.rutgers.edu/yavuz-silay.html
Yavuz Selim Sılay, MD, MBA
- Chairman of the Board, ICG (İstanbul Consulting Group)
- drysilay@yahoo.com
- http://www.istanbulconsultinggroup.com