Abstract
It is already known that the diet of the world’s population has a massive impact on climate change. However, how climate change affects the growing conditions of ingredients for different foods and beverages, and the emission rates due to, for example, production and logistics, is still not known. In this work, different datasets have been explored to study the feasibility of interlinking them to automatically generate alternatives for the selection and substitution of climate change-sensitive food items. A core question is which crops could serve as alternatives to the most-consumed crops in current diets in the Netherlands under a changing climate. The main crop attributes taken into account are nutritional composition and growing conditions. The growing conditions of the three most-consumed crops in the Netherlands have been linked manually to their nutritional composition data, and a corresponding knowledge graph has been created. This study shows that linking various data semantically is a promising route to generating alternatives automatically.
1 Introduction
It is already known that the diet of the world’s population has a massive impact on climate change [1, 2]. However, still too little attention is being paid to the impact of climate change on the growing conditions of ingredients for different foods and beverages and, similarly, to emission rates due to, for example, production and logistics. The provenance and climate change impact of various foods are often not clearly known or accessible, both for end consumers and for the other actors in the supply chain.
To give an example, many food options are nontrivial and interdependent in terms of sustainability. It may be unclear to consumers that the production of mineral water (due to the packaging materials used) may be more damaging to the climate than the production of rice, and further aspects (e.g. logistics and prices) also become relevant. As in all information-intensive environments, food producers and consumers continuously face complex decisions on which ingredients or products to choose, in which amounts, how to process them, and which alternatives to select for the products that are consumed regularly. To make these decisions, food producers, providers and consumers need access to data about the food items, for example their nutritional value, taste and sustainability characteristics, as well as the needed nutrients and logistics information. This information is still scattered, and the quality of the data varies. Meanwhile, data indicating the climate change impact of different foods and beverages exist (or can be collected), as do data on the supply chains. However, these data are often not easily available or discoverable, and they lack the explicit representation and connections between them that can be achieved with semantic technology (ontologies and knowledge graphs). Generally, ontologies are data models representing a set of concepts and their relationships in a domain. Knowledge graphs include domain-specific data points (instances of these concepts) and their specific data and object properties. Knowledge graphs are highly scalable and flexible data structures that allow us to develop a linked data model making explicit the linkages between crops, their nutritional contents and their growth temperature conditions.
In this work we aim to reach a clear understanding of current diets and of how to make them equivalently nutritious, sustainable and close to near-zero CO2 emission, as well as how to adapt them to climate change, taking the growing conditions into account. The main questions to be answered with this research are: “How can we interlink datasets so that an alternative to the currently consumed products can be (automatically) found by taking into account the nutritional composition, the growing conditions that will be affected by climate change, and sustainability information?” and “To what extent can that process be automated?”.
The objective of the research is to identify the relevant data and make them more accessible for discovery and for supporting (automatic) decision making in the food supply chain and for end consumers. Thus, the goal is to develop and generate knowledge graphs, benefiting from semantic technology, which helps to interlink scattered information using standardized concepts. By employing knowledge graphs for interlinking, together with the I-KNOW-FOO project approach presented in this work, one will be able to create a web-like large-scale data infrastructure and tools that let everyone (researchers, businesses, policy makers, manufacturers, consumers) easily explore the domain, and that assist in estimating the CO2 footprints of various foods and beverages and in adapting diets to climate change.
The remainder of this document is organized as follows. Section 2 explains the methodology of the work and the starting points with relation to the available data. In Sect. 3, the results are presented, Sect. 4 evaluates the approach, and Sect. 5 describes the conclusions and future work.
2 Data and Methods
For the purpose of the study, data on import, growing conditions and nutritional value of crops was required. The datasets were searched on the web, through data repositories, government websites, academic databases, and open data portals. Preferably, the data on growing conditions and nutritional values would be accessed through existing ontologies as this saves steps in data management and it implies that a shared vocabulary already exists. Alternatively, datasets and databases were converted to linked data and a knowledge graph was constructed manually. The following tables (Tables 1 and 2) include the ontologies and databases that have been collected on both crop growing conditions and food items, respectively. Data inquiry has been conducted based on the following keywords: Crop ontology, Plant ontology, Agriculture ontology, Growing conditions ontology, Nutritional profile ontology, Crop traits ontology, Crop characteristics ontology.
As the datasets are scattered, the research started by making an inventory of the available datasets, ontologies and knowledge graphs on food products and on the impact of climate change on the availability of food products. The datasets and databases screened were the SHARP Indicators Database, the Food Consumption Impact datasets (Optimeal-Blonk Sustainability Datasets), the RIVM Sustainability dataset, the Pizza dataset, the Eaternity Database, Data Explorer: Environmental Impacts of Food, the Dataset on potential environmental impacts of water deprivation and land use for food consumption in France and Tunisia, the Food environmental impact UK database (by Clark et al.), and the World Food LCA Database. Among those, a few are publicly available [16,17,18,19,20,21,22,23,24].
As the databases and datasets are from all over the world, the food products vary from one database to another and it is not straightforward to map or relate them. Additionally, the existing food databases provide information about impact of food consumption on sustainability and they do not have a direct link to changing climate conditions which is required to determine alternatives for the original products that are currently part of the diets.
We then defined a use case that focuses on the most-imported crops in the Netherlands to connect the consumption to the changing climate. For most-imported crops, their important nutritional values are determined and alternative crops are found (e.g., for the case when the Netherlands may run out of these crops in a changing climate over years). For this goal (to determine the most-imported crops) we use the FAOSTAT Database [25]. The crop information was manually interlinked to growing conditions. The most useful information was considered to be found in the ECOlogical CROP Database (ECOCROP) [8], as it contains information on the growing conditions of more than 3000 crops. However, unfortunately, it was not represented as linked data, so we had to uplift it to this format.
3 Results
The use case focuses on the most-imported crops in the Netherlands. Our aim was to determine these crops in order to evaluate their important nutritional characteristics and to find alternative crops for the Netherlands in case they become unavailable (for example, due to climate change). The top 10 commodities that were imported to the Netherlands within the last 5 years (2016–2020) were screened using the TRADE Datasets for Crops and livestock products in the FAOSTAT database. Moreover, commodities supplied to the Netherlands were assessed using Food Balance Datasets in terms of Domestic Food Supply Quantity (1000 t/yr) and Food Supply Quantity (kg/capita/year). These commodities are listed in descending order of their import quantity, import values and supply quantities (see Table 3).
We focused on three main commodities that are imported in high quantities and supplied to the Dutch population, and identified soybean, wheat and potato as the most imported and consumed food products. The next step was to find nutritionally similar alternative crops using the NEVO Dutch Food Composition Database. We have also searched for growing conditions of the original and alternative crops, and developed knowledge graphs to link these data and reuse parts of existing knowledge graphs. The alternatives are generated by intersecting (either manually or automatically) the result sets of queries on the knowledge graphs for food items that are nutritionally equivalent (or better) and crops that are more climate-resistant. In the following parts of the section, we describe the resulting ontologies and knowledge graphs and the querying in our approach.
3.1 I-KNOW-FOO Ontologies and Knowledge Graphs
To be able to answer basic queries for our problem setting, we have prepared the data as follows, applying manual and automated uplifting and extension to knowledge graphs and ontologies.
Manually Generated Alternatives.
To find alternatives to the three crops, we have focussed on climate resilience and on nutritionally comparable, nutrient-rich crops and food products. The latter have been screened in the NEVO database using knowledge rules provided by a dietary expert. These possible alternative crops have then been evaluated in terms of their resistance to temperature increases in a changing climate using crop growth temperatures from the ECOCROP database.
Generating an ECOCROP Ontology.
The ECOCROP database is transformed into a knowledge graph manually. First, the dataset has been cleaned. The measurementType ‘optimalGrowthTemperature’ has been subdivided into maxGrowthCelsiusTemperature and minGrowthCelsiusTemperature, both to distinguish between the two and to include the unit in the predicate. The triples consist of the occurenceID as subject, the measurementType as predicate and the measurementValue as object. These have been transformed using the Ontotext Refine tool and have been loaded into an RDF repository in RDF4J. Ontotext Refine is a software tool that supports the transformation of string data into knowledge graphs [26].
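The triple structure described above can be illustrated with a minimal Python sketch. It is not the Ontotext Refine mapping used in the project; the row layout, the `ecocrop:example-wheat` identifier and the minimum temperature are hypothetical placeholders, and only the predicate names and the wheat maximum of 23 ℃ come from the text.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str      # the occurenceID of the crop
    predicate: str    # the measurementType
    obj: float        # the measurementValue


# Hypothetical cleaned rows: (occurenceID, optimal minimum, optimal maximum).
# The identifier and the minimum of 15.0 are placeholders, not ECOCROP data.
rows = [("ecocrop:example-wheat", 15.0, 23.0)]


def row_to_triples(occurence_id: str, t_min: float, t_max: float) -> list[Triple]:
    """Split one optimal-temperature record into a min and a max triple."""
    return [
        Triple(occurence_id, "minGrowthCelsiusTemperature", t_min),
        Triple(occurence_id, "maxGrowthCelsiusTemperature", t_max),
    ]


triples = [t for row in rows for t in row_to_triples(*row)]
for t in triples:
    print(t.subject, t.predicate, t.obj)
```

Keeping the unit in the predicate name means a consumer of the graph does not need a separate unit lookup to interpret the value.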
ECOCROP Extension and Interlinking to FIO and FoodOn.
We have extended ECOCROP manually by adding triples linking some of the occurenceIDs in ECOCROP to the IDs of crops in FoodOn (including NCBITaxon [27]) and of food items in FIO (Food Item Ontology [15]), based on the RIVM NEVO IDs. In FoodOn, we have chosen instances of the organism class, because these represent the plants rather than the different foods that may originate from these plants. After all, it is the plants, not the foods, that are grown under particular (climate-changing or not) temperatures. The relations used for linking the concepts are skos:closeMatch and owl:sameAs.
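As a rough illustration of such link triples, the sketch below stores them as plain Python tuples. All identifiers are hypothetical placeholders, and the choice of which relation is used for which target is an assumption made for the example only; the text does not state it. In general, owl:sameAs asserts that two identifiers denote the same thing, whereas skos:closeMatch is a weaker mapping between sufficiently similar concepts.

```python
OWL_SAME_AS = "owl:sameAs"
SKOS_CLOSE_MATCH = "skos:closeMatch"


def link(ecocrop_id: str, target_id: str, exact: bool) -> tuple[str, str, str]:
    """Build one link triple; 'exact' selects owl:sameAs over skos:closeMatch."""
    predicate = OWL_SAME_AS if exact else SKOS_CLOSE_MATCH
    return (ecocrop_id, predicate, target_id)


# Hypothetical identifiers; the relation chosen per target is an assumption.
links = [
    link("ecocrop:example-wheat", "foodon:example-organism", exact=True),
    link("ecocrop:example-wheat", "fio:example-food-item", exact=False),
]


def linked_ids(graph: list[tuple[str, str, str]], ecocrop_id: str) -> set[str]:
    """Return every external ID reachable from an ECOCROP ID via a link relation."""
    relations = (OWL_SAME_AS, SKOS_CLOSE_MATCH)
    return {o for s, p, o in graph if s == ecocrop_id and p in relations}


print(sorted(linked_ids(links, "ecocrop:example-wheat")))
```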
The open access ECOCROP ontology and knowledge graphs created in our project are available at: https://git.wur.nl/FoodInformatics/i-know-foo.git.
3.2 Querying the Knowledge Graphs
Subsequently, we have loaded the triples in the triple repository, where the information can be queried using SPARQL. In the future, this could be done by an automated tool. The query that we have formulated searches for crops that are more resilient to a warmer climate and are therefore candidates to replace the current crop. So far in this exercise, we have only focused on the maximum growing temperature, which is one of the important factors in the effect of climate change on crop growth [28]. In our examples, the maximum optimal growing temperatures are 33 ℃ for soybean, 23 ℃ for wheat and 25 ℃ for potato. Combining this information with nutritional value information still leaves multiple, but often restricted, options for food alternatives with similar nutrition characteristics. For example, for potatoes, possible alternatives are beans white/brown dried, peas green dried, chestnuts raw, tapioca, cassava raw, taro raw, yam raw, tannia raw, beans black eyed dried, peas split yellow/green dried, tamarind, flour cassava. The alternatives are found by intersecting the different result sets, i.e., the climate-resilient crops from ECOCROP and the alternatives as defined by nutritionists.
To obtain the solutions, queries have been written to find alternatives based on growing temperature and these alternatives have been superimposed onto the nutritional results from NEVO. This identifies the alternatives that are more climate resilient as well as nutritional equivalent. For each crop, a SPARQL query can be written for finding alternatives when altering the optimal maximum growing temperature. For example, for wheat, the maximum optimal growing temperature is 23 ℃. The query will therefore be:
This query returned 1,790 crops with a maximum optimal growth temperature greater than 23 ℃. These alternatives were then superimposed onto the nutritional alternatives for wheat from NEVO, resulting in the four alternatives as listed in Table 4.
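The original query text is not reproduced here. As a hedged illustration of the two steps just described (temperature filter, then superimposition onto the nutritional alternatives), the following Python sketch works on a small in-memory dictionary. The crop names, all temperatures other than the wheat value of 23 ℃, the NEVO set and the `http://example.org/ecocrop#` namespace in the illustrative SPARQL string are assumptions and do not come from ECOCROP or NEVO.

```python
# Illustrative SPARQL only; the namespace is hypothetical and the string is
# not executed here. It shows the shape of a maximum-temperature filter.
ILLUSTRATIVE_SPARQL = """
PREFIX ex: <http://example.org/ecocrop#>
SELECT ?crop ?tmax WHERE {
  ?crop ex:maxGrowthCelsiusTemperature ?tmax .
  FILTER(?tmax > 23)
}
"""

# Hypothetical maximum optimal growth temperatures in degrees Celsius.
# Only the wheat value (23) is taken from the text.
max_temp = {"wheat": 23.0, "crop_a": 30.0, "crop_b": 21.0, "crop_c": 27.0}

# Hypothetical stand-in for the expert-derived NEVO alternatives for wheat.
nevo_alternatives = {"crop_a", "crop_b"}


def warmer_than(reference: str, temps: dict[str, float]) -> set[str]:
    """Crops whose maximum optimal temperature exceeds that of the reference."""
    threshold = temps[reference]
    return {c for c, t in temps.items() if c != reference and t > threshold}


resilient = warmer_than("wheat", max_temp)   # temperature result set
candidates = resilient & nevo_alternatives   # superimposition as set intersection
print(sorted(candidates))
```

Treating the superimposition as a set intersection keeps the two sources independent: either result set can be regenerated without touching the other.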
We have attempted to automate the intersecting (superimposing) of the different result sets (temperature-resistant crops, food items with equivalent or improved starch, pyridoxine, ascorbic acid and potassium levels), but unfortunately this did not succeed. The SPARQL query, given below, appeared to be too heavy due to the five filters that were required. With four filters (one removed) it was still possible to obtain results (on a regular desktop computer setup), but the processing time became unacceptably high as the query complexity (the number of filters included) increased:
We have also converted the query to a nested query with subqueries for each of the filters, with the aim to retrieve a relatively small part of the data per subquery and hence reduce the amount of data processed in the overarching main query, but that had no effect. Future research should focus on (further) query optimization.
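To make the five-filter logic concrete outside SPARQL, the sketch below evaluates each criterion as its own result set and intersects them, starting from the smallest set. It is only an illustration of the intended semantics; it makes no claim about the performance of the project's queries. All values are hypothetical placeholders rather than NEVO or ECOCROP data, and the choice of a strict comparison for temperature and a non-strict one for nutrients is an assumption based on the wording "equivalent or improved".

```python
from functools import reduce

# Hypothetical records; every number is a placeholder.
crops = {
    "reference": {"tmax": 23.0, "starch": 60.0, "pyridoxine": 0.3,
                  "ascorbic_acid": 0.0, "potassium": 400.0},
    "crop_a": {"tmax": 30.0, "starch": 65.0, "pyridoxine": 0.4,
               "ascorbic_acid": 1.0, "potassium": 450.0},
    "crop_b": {"tmax": 28.0, "starch": 40.0, "pyridoxine": 0.5,
               "ascorbic_acid": 2.0, "potassium": 500.0},
    "crop_c": {"tmax": 20.0, "starch": 70.0, "pyridoxine": 0.3,
               "ascorbic_acid": 0.0, "potassium": 400.0},
}

CRITERIA = ("tmax", "starch", "pyridoxine", "ascorbic_acid", "potassium")


def passing(criterion: str, reference: str, data: dict) -> set[str]:
    """Crops that beat (temperature) or at least match (nutrients) the reference."""
    ref_value = data[reference][criterion]
    strict = criterion == "tmax"
    return {
        name for name, values in data.items()
        if name != reference
        and (values[criterion] > ref_value if strict else values[criterion] >= ref_value)
    }


def intersect_all(reference: str, data: dict) -> set[str]:
    """Intersect the per-criterion result sets, smallest set first."""
    result_sets = sorted((passing(c, reference, data) for c in CRITERIA), key=len)
    return reduce(set.intersection, result_sets)


print(sorted(intersect_all("reference", crops)))
```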
What is more, the ECOCROP ontology should be extended with candidate alternatives for crops provided by the dietary expert. Presently, all crops/food items in the ontology are considered as alternatives (i.e., only based on higher growing temperature and equivalent or better nutritional values), rather than a specific set that is really suitable as alternative for the specific food item focused on, replacing the current food item in a meal or recipe at a specific moment of the day.
4 Evaluation
The following section describes the results of the evaluation of the approach. The interoperability was tested through a use-case scenario in which new data was linked to the knowledge graph and queried for results. The section contains a description of the use case scenario, the approach to linking new data, and the results from the query.
Suppose a certain area of cropland is being affected by an increase in average annual temperature, making it increasingly difficult for wheat to grow, as wheat requires that a certain maximum temperature is not exceeded throughout the year. A farmer may want to find other crop options to cultivate on the farmland in order to increase efficiency and climate resilience. However, besides searching for alternative crops that can withstand higher temperatures, the farmer is also concerned about the change in profits when switching to alternatives. When changing to a different crop, producer prices for yearly yield will also change. Therefore, if the farmer wants to identify climate-resilient crops and prioritize these results based on producer prices per tonne, a new dataset should be added to the knowledge graph.
In order to add pricing as a further prioritizing variable for the identified crops, a new dataset was also added to the knowledge graph as a part of evaluation. Note that the data for this use case is synthetic, for evaluation purposes, and it does not represent actual market data. All results and figures from this validation should not be interpreted as real crop pricing data. As data on producer pricing of crops is difficult to find due to frequent changes and a lack of accessibility, a synthetic database provides a viable alternative for a validation use case.
Synthetic data was generated to create a simulation for producer prices on the crops that have been identified as alternatives for wheat (see Table 5). Subsequently, the synthetic data was added to the repository, and the original SPARQL query for wheat was extended and run in the repository as follows.
After the query output was superimposed on the manually created NEVO alternatives, it resulted in the data as shown in Table 6.
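The prioritization step of this use case can be sketched as a simple ranking. The extended query itself is not reproduced here; the crop names and prices below are made-up placeholders (like the synthetic data in the evaluation, they do not represent market data), and ranking the highest producer price first is an assumption about what the farmer would prefer.

```python
# Hypothetical candidate crops after the temperature and nutrition steps.
candidates = {"crop_a", "crop_c"}

# Made-up producer prices per tonne; placeholders, not real market data.
price_per_tonne = {"crop_a": 210.0, "crop_b": 340.0, "crop_c": 275.0}


def rank_by_price(crops: set[str], prices: dict[str, float]) -> list[tuple[str, float]]:
    """Order candidate crops by producer price, highest first.

    Crops without a known price are left out rather than guessed.
    """
    priced = [(crop, prices[crop]) for crop in crops if crop in prices]
    return sorted(priced, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_price(candidates, price_per_tonne)
for crop, price in ranking:
    print(crop, price)
```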
5 Conclusions and Future Work
More sustainable food production, distribution and consumption options can be discovered by all stakeholders, eventually leading to near-zero CO2 emission diets and sustainable food production that will have a positive impact on climate change and will also be adaptive to it. Linking datasets and unlocking the information about crops and food products allows nutritionally similar alternatives to be found automatically in case of a changing climate. This research demonstrates that automation is possible. In this work, alternatives are generated manually for the three most-imported crops in the Netherlands to showcase the feasibility of automatic generation. The growing conditions of the crops are defined in the ECOCROP ontology, which we based on the open ECOCROP data, with the nutritional values available from FIO and based on NEVO. The linking between the NEVO database and the ECOCROP ontology is done through the NEVO codes (inserted in the ECOCROP ontology).
The findings demonstrate the effectiveness of linking structured datasets and ontologies to facilitate automated decision-making. By querying the knowledge graph, nutritionally similar alternatives can be identified to adapt to changing climate conditions. Importantly, the use of the developed knowledge graph is not limited to this study alone. It serves as a foundation for further development, inviting a multitude of stakeholders to contribute and integrate additional data sources. Furthermore, future enhancements can involve the integration of an advanced ontology into multiple infrastructures and data platforms, for example, for ontology-enabled food ingredient substitution [14], thereby increasing its utility and impact in the field of sustainable food production. However, extensive querying may run into computational performance bottlenecks on the hardware typically available to regular users.
Furthermore, a user interface might increase the usability for other stakeholders in the future, besides researchers. It has been demonstrated before that visual elements, including graphs and images, are more easily understood than text and numbers [29]. Earlier research shows techniques for visualizing SPARQL query outputs from GraphDB, with the goal of increasing the understandability of vast knowledge graphs and complex queries. Besides ontology visualization tools such as Nitelight and FedViz, other studies have constructed visualization tools and frameworks to increase understandability with end-users such as in studies on raising awareness of data sharing consent [30, 31]. In these cases, a framework for an application is created where the user communicates with the front end that links to GraphDB through several APIs and visualizes the resulting data for increased understandability. Similar user interfaces could be developed for the current knowledge graph when implemented in non-academic situations.
References
Stehfest, E., Bouwman, L., van Vuuren, D., den Elzen, M., Eickhout, B., Kabat, P.: Climate benefits of changing diet. Clim. Change 95(1), 83–102 (2009). https://doi.org/10.1007/S10584-008-9534-6
Neha, B., Hills, T., Sgroi, D.: Climate Change and Diet. No. 13426. Institute of Labor Economics (IZA) (2020)
Aubert, C., Buttigieg, P.L., Laporte, M.A., Devare, M., Arnaud E.: CGIAR Agronomy Ontology, http://purl.obolibrary.org/obo/agro.owl, licensed under CC BY 4.0 (2017)
Jonquet, C., et al.: AgroPortal: a vocabulary and ontology repository for agronomy. Comput. Electron. Agric. 144, 126–143 (2018). https://doi.org/10.1016/j.compag.2017.10.012
Matteis, L., et al.: Crop ontology: vocabulary for crop-related concepts. Proceedings of the First International Workshop on Semantics for Biodiversity. CEUR-WS.org (2013)
Darnala, B., Amardeilh, F., Roussey, C., Jonquet, C.: Crop planning and production process ontology (C3PO), a new model to assist diversified crop production. In: IFOW 2021-Integrated Food Ontology Workshop @ 12th International Conference on Biomedical Ontologies (ICBO) (2021). hal-lirmm.ccsd.cnrs.fr/lirmm-03389513
Riaño, M.A., Rodriguez, A.O.R., Velandia, J.B., García, P.A.G., Marín, C.E.M.: Design and application of an ontology to identify crop areas and improve land use. Acta Geophys. 71, 1409–1426 (2023). https://doi.org/10.1007/s11600-022-00808-5
Ecocrop: Ecocrop Database. FAO, Rome, Italy (2016)
Eftimov, T., Ispirova, G., Potočnik, D., Ogrinc, N., Seljak, B.K.: ISO-FOOD ontology: a formal representation of the knowledge within the domain of isotopes for food science. Food Chem. 277, 382–390 (2019). https://doi.org/10.1016/j.foodchem.2018.10.118
Cooper, L., et al.: The Planteome database: an integrated resource for reference ontologies, plant genomics and phenomics. Nucleic Acids Res. 46(D1), D1168–D1180 (2018). https://doi.org/10.1093/nar/gkx1152
Plant Ontology™ Consortium: The Plant Ontology™ consortium and plant ontologies. Comp. Func. Genom. 3(2), 137–142 (2002). https://doi.org/10.1002/cfg.154
Dooley, D.M., et al.: FoodOn: a harmonized food ontology to increase global food traceability, quality control and data integration. NPJ Sci. Food 2(1), 23 (2018). https://doi.org/10.1038/s41538-018-0032-6
Haussmann, S., et al.: FoodKG: a semantics-driven knowledge graph for food recommendation. In: The Semantic Web– ISWC 2019: 18th International Semantic Web Conference Proceedings, Part II 18, pp. 146–162. Springer International Publishing Auckland, New Zealand, (2019). https://doi.org/10.1007/978-3-030-30796-7_10
Ławrynowicz, A., Wróblewska, A., Adrian, W.T., Kulczyński, B., Gramza-Michałowska, A.: Food recipe ingredient substitution ontology design pattern. Sensors 22(3), 1095 (2022). https://doi.org/10.3390/s22031095
Food Item Ontology. https://git.wur.nl/FoodInformatics/foodontology.git
Mertens, E., Kaptijn, G., Kuijsten, A., van Zanten, H., Geleijnse, J. M., van ‘t Veer, P.: SHARP-Indicators Database towards a public database for environmental sustainability. Data Br. 27, 104617 (2019). https://doi.org/10.1016/j.dib.2019.104617
Blonk Sustainability | Databases. https://blonksustainability.nl/tag/Databases
RIVM Life Cycle Assessment (LCA) database. https://www.rivm.nl/life-cycle-assessment-lca
Cortesi, A., Pénicaud, C., Saint-Eve, A., Soler, L.G., Souchon, I.: Life cycle inventory and assessment data for quantifying the environmental impacts of a wide range of food products belonging to the same food category: a case study of 80 pizzas representatives of the French retail market. Data Br. 41, 107950 (2022). https://doi.org/10.1016/j.dib.2022.107950
Eaternity Database. https://eaternity.org/foodprint/database
World Food LCA Database. https://ourworldindata.org/explorers/
Sinfort, C., Perignon, M., Drogué, S., Amiot, M.J.: Dataset on potential environmental impacts of water deprivation and land use for food consumption in France and Tunisia. Data Br. 27, 104661 (2019). https://doi.org/10.1016/j.dib.2019.104661
Clark, M., et al.: Estimating the environmental impacts of 57,000 food products. Proc. Natl. Acad. Sci. 119(33), e2120584119 (2022). https://doi.org/10.1073/pnas.2120584119
Notarnicola, B., et al.: Life cycle inventory data for the Italian agri-food sector: background, sources and methodological aspects. Int. J. LCA., 1–16 (2022). https://doi.org/10.1007/s11367-021-02020-x
FAOSTAT. https://www.fao.org/faostat/en/#home
Ontotext Refine tool. https://www.ontotext.com/products/ontotext-refine/
Hatfield, J.L., et al.: Climate impacts on agriculture: implications for crop production. Agron. J. 103(2), 351–370 (2011). https://doi.org/10.2134/agronj2010.0303
Passera, S.: Enhancing contract usability and user experience through visualization-an experimental evaluation. In: 16th International conference on information visualization, pp. 376–382. IEEE (2012). https://doi.org/10.1109/IV.2012.69
Bless, C., et al.: Raising awareness of data sharing consent through knowledge graph visualization. In: Further with Knowledge Graphs, pp. 44–57. IOS Press (2021). https://doi.org/10.3233/SSW210034
Rasmusen, S.C., et al.: Raising consent awareness with gamification and knowledge graphs: an automotive use case. Int. J. Semantic Web Inf. Syst. (IJSWIS), 18(1), 1–21. Igi-global.com (2022). https://doi.org/10.4018/IJSWIS.300820
Acknowledgements
This work has been partially funded by WUR investment theme “Data Driven discoveries in a Changing Climate”. The authors would like to thank Edith Feskens, Sander de Leeuw, and Karin Borgonjen for their contributions to this study.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2024 The Author(s)
About this paper
Cite this paper
Simsek-Senel, G., Rijgersberg, H., Öztürk, B., Weits, J., Fensel, A. (2024). I-KNOW-FOO: Interlinking and Creating Knowledge Graphs for Near-Zero CO2 Emission Diets and Sustainable FOOd Production. In: Akerkar, R. (eds) AI, Data, and Digitalization. SAIDD 2023. Communications in Computer and Information Science, vol 1810. Springer, Cham. https://doi.org/10.1007/978-3-031-53770-7_7
DOI: https://doi.org/10.1007/978-3-031-53770-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53769-1
Online ISBN: 978-3-031-53770-7
eBook Packages: Computer Science, Computer Science (R0)