To prepare Season 2 of Challenge4Cancer, Epidemium decided to partner with STIM (wearestim.com) to explore the full potential of using data for cancer. How can the power of the crowd, open source, and open innovation help scientists from various disciplines generate original solutions and advance our current understanding of cancer?
Emerging methods of data analysis can create a better understanding of complex fields like medicine or the life sciences. But this requires phrasing challenges in a way that is comprehensible to doctors, patients, and data scientists alike. The C-K (Concept-Knowledge) method offers a solution.
Open source publications, communication technologies, and digital platforms contribute to broader access to knowledge and enable science to be even more open, allowing participants with expert knowledge and special abilities to solve significant questions.
Access to information, traditionally considered a strategic asset in any scientific research organization, is no longer limited to a few players able to invest in long and expensive R&D programs. It is now accessible to almost anyone who is willing to learn, experiment, and possibly make a scientific contribution. Increasing disclosure in scientific processes is driving the emergence of communities around scientific projects. Leveraging these open communities to solve various problems, and even to make scientific discoveries, is becoming increasingly popular.
As Sauermann and Franzoni (2015) pointed out, a growing amount of scientific research is done in an open manner. Examples can be found across different domains and disciplines.
For instance, the Polymath project, launched in 2009 by Tim Gowers, was conceived as a collaboration among mathematicians to solve important and difficult mathematical problems by coordinating many specialists, who communicate with each other to find the best route to the solution. More than 12 challenges have been launched under Polymath. The project proved that many minds could work together to solve difficult mathematical problems.
A joint project between Harvard, TopCoder, the Broad Institute, and the Crowd Innovation Lab was launched to organize a series of challenges on developing algorithms for faster DNA sequence alignment and on improving the analysis of gene expression data. These examples clearly demonstrate that open initiatives can deal with extremely complex problems.
Research activities in the age of open science can be found across different domains and disciplines
These projects are often referred to as crowd science or open science. They are characterized by open participation and by the sharing of data and problem-solving techniques with participants. Open science promoters often highlight the possibility to learn, collaborate with others, and test new theories.
Open science for cancer research
In recent years, life science and medicine have been facing major changes with the emergence of massive new sources of information, such as genomic identity or the patient's global environment. In parallel, new forms of treatment are becoming more available, like biotherapies that consider diseases such as cancer in their global environment, or personalized treatments based on the patient's genome information. Many areas, such as epidemiology, are undergoing major transformations that require new methods of data analysis. These disciplines are now using open collaborative settings to explore new ways to deal with these massive sources of data.
Epidemium, a collaborative initiative to explore new paths for cancer research, was launched in 2016. An inclusive and community-based open science program, Epidemium is a joint program of a pharmaceutical company, Roche, and an open and community laboratory, La Paillasse. The program uses data challenges, “Challenge4Cancer,” to approach the epidemiology of cancer in an open science framework.
Launched in 2016, the first Epidemium challenge was a blast: 678 people took part, forming a broad community of experts with competencies in data analysis, statistics, visualization, data mining, oncology, and epidemiology. In total, 15 different projects were developed over 6 months. The scientific and ethics committees evaluated these projects on the scientific validity of the results, originality, collaborative aspect, impact, and perspectives for patients of the proposed approaches, and verified that the explorations were ethically sound.
With knowledge sharing in mind, Challenge4Cancer's participants had to document their advances and results on a wiki page accessible to anyone. This transparency allowed for continuous discussion during the challenge and helped create a vibrant community.
Despite these achievements, some difficulties were underlined, relating to the novelty and validity of the results and to the identification of promising research directions. One of the critical points was the identification of research questions and challenges.
Given the importance of designing research directions, in 2017 Epidemium decided to launch a preliminary exploration to create a better understanding of the stakes and identify research questions to tackle.
Shaping new research directions with Concept-Knowledge method
Solving questions using new approaches is exciting. But it is crucial to solve the right questions. What is the right knowledge gap to analyze? How can we identify the gaps in cancer research that are relevant to tackle using data analysis? How can we ensure that the relevant data is collected?
To design research questions, one would normally analyze the existing knowledge gaps and try to formulate questions that are novel enough. In the case of Epidemium, the state of the art is quite broad, if only because it includes disciplines related both to cancer and to data analysis. Following the traditional literature review route would have been too costly and time-consuming. Moreover, since the challenge aims to develop entirely new connections between different disciplines, knowledge advances should be presented in a way that is concise and simple enough for non-experts to quickly understand what is going on.
In order to explore the possibilities related to data analysis and cancer research in a systematic way, to identify the framework of the current approaches, and to generate a set of innovative concepts, a framework based on design theory was applied.
This framework was based on a design tool derived from the Concept-Knowledge (C-K) theory of innovative design reasoning. Design theory was chosen since it allows for knowledge expandability that goes beyond pure combinatorial strategies and considers dynamic transformations, adaptations, hybridizations, discovery, invention, and renewal of objects. The C-K design framework is useful for understanding novelty since it not only separates the state of the art (available knowledge) from the exploration phase (concept development) but also defines how to use the existing knowledge to structure the unknown.
C-K Design Theory is based on two interdependent spaces. The Concepts space has a tree-based structure. This tree underlines the design paths for each idea and emphasizes its relation to other fields. The Knowledge space is represented by knowledge databases where different types of knowledge (with mention of its robustness and maturity) can be emphasized.
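As a rough illustration of the two spaces described above, the sketch below models the Concept space as a tree and the Knowledge space as a list of entries tagged with robustness and maturity. The class names, field names, and rating scales are assumptions made for this example and are not part of C-K theory or of the tool used by Epidemium; the labels are borrowed from the screening example discussed later in the article.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeItem:
    """One entry in the Knowledge space (K)."""
    statement: str
    robustness: str  # hypothetical scale, e.g. "low", "medium", "high"
    maturity: str    # hypothetical scale, e.g. "emerging", "established"


@dataclass
class Concept:
    """One node in the Concept space (C), which is tree-structured."""
    label: str
    knowledge: list = field(default_factory=list)  # K items this concept draws on
    children: list = field(default_factory=list)

    def refine(self, label, knowledge=()):
        """Add a child concept that specialises this one, and return it."""
        child = Concept(label, list(knowledge))
        self.children.append(child)
        return child

    def design_paths(self, prefix=()):
        """Yield every root-to-leaf path as a tuple of labels."""
        path = prefix + (self.label,)
        if not self.children:
            yield path
        for child in self.children:
            yield from child.design_paths(path)


# Knowledge space: illustrative entry with hypothetical ratings.
current_practice = KnowledgeItem(
    statement="Cancer screening is mostly performed by medical staff",
    robustness="high",
    maturity="established",
)

# Concept space: the known practice and two alternatives as sibling branches.
root = Concept("cancer screening")
root.refine("performed by medical staff", [current_practice])
root.refine("self-screening")
root.refine("performed by third parties")

for path in root.design_paths():
    print(" > ".join(path))
```

Keeping the knowledge entries separate from the tree mirrors the separation between what is known and what is being explored: a branch with no attached knowledge, such as "self-screening" here, points to knowledge that still has to be acquired.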
Mapping potential research directions for dealing with cancer using big data
Through workshops involving doctors, patients, and data scientists, STIM and Mines ParisTech used the C-K design framework to establish a common understanding of cancer and cancer treatment, as well as of the available data and the data analysis techniques that can be used.
This step was crucial to build a common vocabulary across experts from different domains, contextualize current approaches using the C-K framework and define the limits of current approaches.
Once this understanding was made explicit, alternatives were easier to identify by seeking external knowledge and mapping existing products. To imagine these alternatives, several workshops were organized with specialists in data analysis and cancer, complemented by a literature review and close work with the Epidemium team. In total, 25 experts participated in the workshops. They first shared their common vision of the field (i.e., cancer data is used by medical professionals, who collect this data and use it to better understand cancer; see Figure 3 for an extract of the map).
Alternatives were proposed at each level of the map. For example, non-experts can use the data, different actors (and not just medical professionals) can access the data, and the data can be used differently. Establishing the common understanding helped the experts to identify alternatives.
For instance, today cancer screening is mostly performed by medical staff. Alternatives were imagined that explore self-screening techniques or screening performed by third parties (these screening techniques should be non-invasive).
Moreover, screening should occur not just when the first symptoms appear but on a regular basis. People at risk should be identified (through genome analysis, age, sex, exposure to different risk factors) and they should benefit from frequent individual screening. In the future, continuous screening in real time should even be considered.
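The idea of proposing alternatives at each level of the map can be sketched as a small enumeration: start from the shared baseline and change one level at a time. This is only an illustration under stated assumptions. The level names, the option lists, and the function `one_step_alternatives` are hypothetical, and the workshops relied on expert discussion rather than on any such script.

```python
# Baseline vision shared by the experts, split into hypothetical "levels".
baseline = {
    "collected by": "medical professionals",
    "used by": "medical professionals",
    "purpose": "better understand cancer",
}

# Candidate values per level (illustrative, not the workshop results).
options = {
    "collected by": ["patients", "third parties"],
    "used by": ["non-experts", "patients"],
    "purpose": ["identify people at risk"],
}


def one_step_alternatives(baseline, options):
    """Yield copies of the baseline that differ from it in exactly one level."""
    for level, values in options.items():
        for value in values:
            if value != baseline[level]:
                yield {**baseline, level: value}


alternatives = list(one_step_alternatives(baseline, options))
for alt in alternatives:
    print(alt)
```

Changing a single level at a time keeps each alternative easy to compare with the baseline; combining changes across several levels would widen the exploration further.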
Figure 2: Colorectal cancer screening in Europe is often performed by medical staff
What about data? Depending on how the data is used, different kinds of information were relevant: data related to the patient's health status, to treatment efficiency and inefficiency, to the patient's behavior (nutrition, activity, work), and to the environment or other external factors that can affect a person; epigenome data; and data related to patient care services, to the country's economy, etc.
The alternatives explored and structured with the C-K framework made it possible to identify 45 exploration axes, such as: automatically assigning patients to different departments based on a type of cancer, socio, treatments; assessing treatment efficiency or failure ex post, including risk and environmental data; anticipating the efficiency of treatment and side effects according to the patient profile; and, for each organ, understanding which type of cancer can occur.
The first results were presented to the larger Epidemium community (around 100 people) for comments and suggestions. The results were validated with the scientific and ethics committees of the Epidemium community.
Figure 3: Extract of the map created
This collaborative work helped the community shape a variety of research directions and identify the knowledge needed to go further. The map is available to anyone who wishes to better understand the problems of cancer and its treatment and to extend the map or complete it with existing projects.
Dealing with emerging research directions in a transdisciplinary context
Creating interdependencies between previously unrelated fields or concepts can lead to unexpected ideas. Having to build an interconnected map of concepts from several rather independent fields brought the different experts of the Epidemium community closer together and extended the exploration space, creating a common understanding of cancer.
Using design-driven frameworks like Concept-Knowledge helped to understand and explore various alternatives that Epidemium can follow to build research directions and see how other initiatives are positioned, resulting in a visual benchmark of current research on data analysis initiatives for cancer.
This map helped to explore and generate potential hypotheses that are accessible to the community. The proposed map is not exhaustive and it is subject to constant changes and improvement. Nevertheless, it offers a comprehensive overview of a complex problem and provides a rich set of research directions.
This approach aimed for a systematic exploration of all the possible alternatives, thus trying to avoid the cognitive biases that limit participants' exploration capacity to solutions that are too obvious or already exist. Moreover, working with the existing knowledge fostered a better understanding of the current state of the art and helped organize the search for new knowledge. It increased the ability of designers to generate original concepts.
We believe that this approach has potential for developing more general models. It might be interesting to combine the design-driven strategy with visualization, text mining, or statistical approaches.
*This article was originally published in Paris Innovation Review on 13 October 2017.
Download our book “Stim decrypts: Innovation methodologies” (in French), where we use the C-K method as an analytical framework to decipher the most popular innovation methods: de Bono's Six Thinking Hats, Design Thinking, Lean Startup, ASIT, and the Blue Ocean Strategy.