Welcome to Data Intelligence!
Data Intelligence Newsletter, July 2020  |  Oct. 14, 2020

DI Data and Facts

1 The Third Issue, Organized by Our Board Member Valentina Presutti and Mari Carmen Suárez de Figueroa Baonza, Published


Constructing and Cleaning Identity Graphs in the LOD Cloud

Citation: J. Raad, W. Beek, F. van Harmelen, J. Wielemaker, N. Pernelle & F. Saïs. Constructing and cleaning identity graphs in the LOD cloud. Data Intelligence 2(2020), 323–352. https://doi.org/10.1162/dint_a_00057

Highlight abstract: In previous work, the authors presented an identity graph containing over 500 million explicit and 35 billion implied owl:sameAs statements, along with a scalable approach for automatically calculating an error degree for each identity statement. In this paper, they generate subgraphs of the overall identity graph that correspond to certain error degrees. This work shows that even though the Semantic Web contains many erroneous owl:sameAs statements, it is still possible to use Semantic Web data while minimising the adverse effects of misusing owl:sameAs.

Lead Author: Frank van Harmelen is a Professor in Knowledge Representation & Reasoning in the Computer Science department (Faculty of Science) at the Vrije Universiteit Amsterdam, The Netherlands. He is a fellow of the European AI Society ECCAI. In 2014, he was admitted as a member of the Academia Europaea, and in 2015 he was admitted as a member of the Royal Netherlands Society of Sciences and Humanities (450 members across all sciences). Prof. van Harmelen is an Editor of Data Intelligence.


GeoLink Data Set: A Complex Alignment Benchmark from Real-world Ontology

Citation: L. Zhou, M. Cheatham, A. Krisnadhi & P. Hitzler. GeoLink data set: A complex alignment benchmark from real-world ontology. Data Intelligence 2(2020), 353–378. https://doi.org/10.1162/dint_a_00054

Highlight abstract: The authors propose a real-world data set from the GeoLink project as a potential complex ontology alignment benchmark. The data set consists of two ontologies, the GeoLink Base Ontology (GBO) and the GeoLink Modular Ontology (GMO), as well as a manually created reference alignment that was developed in consultation with domain experts from different institutions. The alignment includes 1:1, 1:n, and m:n equivalence and subsumption correspondences, and is available in both Expressive and Declarative Ontology Alignment Language (EDOAL) and rule syntax.

Lead Author: Pascal Hitzler is Professor and endowed Lloyd T. Smith Creativity in Engineering Chair at the Department of Computer Science at Kansas State University and Director of the Data Semantics (DaSe) Laboratory. His research record lists over 400 publications in such diverse areas as semantic Web, artificial intelligence, neural-symbolic integration, knowledge representation and reasoning, machine learning, denotational semantics and set-theoretic topology.


The Computer Science Ontology: A Comprehensive Automatically-Generated Taxonomy of Research Areas

Citation: A.A. Salatino, T. Thanapalasingam, A. Mannocci, A. Birukou, F. Osborne & E. Motta. The computer science ontology: A comprehensive automatically-generated taxonomy of research areas. Data Intelligence 2(2020), 379–416. https://doi.org/10.1162/dint_a_00055

Highlight abstract: In this paper, the authors introduce the Computer Science Ontology (CSO), a large-scale, automatically generated ontology of research areas, which includes about 14K topics and 162K semantic relationships. It was created by applying the Klink-2 algorithm to a very large data set of 16M scientific articles. CSO presents two main advantages over the alternatives: i) it includes a very large number of topics that do not appear in other classifications, and ii) it can be updated automatically by running Klink-2 on recent corpora of publications.

Lead Author: Francesco Osborne is a Research Fellow at the Knowledge Media Institute, The Open University, UK, where he leads the Scholarly Knowledge Mining team. Prof. Osborne is an Editor of Data Intelligence. He has authored more than 70 peer-reviewed publications in the fields of Information Extraction, Knowledge Graphs, Science of Science, Semantic Web, Research Analytics, and Semantic Publishing.


Refining Linked Data with Games with a Purpose

Citation: I. Celino, G. Re Calegari & A. Fiano. Refining linked data with games with a purpose. Data Intelligence 2(2020), 417–442. https://doi.org/10.1162/dint_a_00056

Highlight abstract: In this paper, the authors present an open source software framework for building Games with a Purpose for linked data refinement, i.e., Web applications that crowdsource partial ground truth by motivating user participation through fun incentives. The authors also detail the impact of this new resource by explaining the specific data linking “purposes” supported by the framework (creation, ranking and validation of links) and by defining the respective crowdsourcing tasks to achieve those goals.

Lead Author: Irene Celino is the Head of the Knowledge Technologies group at Cefriel, where she leads an R&D team and is Portfolio and Project Manager. With expertise in Semantic Web and Human Computation technologies, her research activities cover the application of such innovative technologies to the design and development of Web applications, search engines, recommendation systems and mobile games, especially in Smart City and transportation-related scenarios.


2 DI Citations in Web of Science

As of Jul. 20, 2020, the DI journal had received 31 citations according to Web of Science.


Lead Author | Article
Barend Mons† | FAIR Science for Social Machines: Let’s Share Metadata Knowlets in the Internet of FAIR Data and Services
Dean Allemang* | Sustainability in Data and Food
Juanzi Li* | XLORE2: Large-Scale Cross-Lingual Knowledge Graph Construction and Application
Jie Tang* | AMiner: Search and Mining of Academic Social Networks
Heng Ji | Joint Entity and Event Extraction with Generative Adversarial Imitation Learning
Xin Wayne Zhao | KB4Rec: A Data Set for Linking Knowledge Bases with Recommender Systems
Binyang Li | Identifying User Profile by Incorporating Self-Attention Mechanism Based on CSDN Data Set
Huawei Shen | User Profiling for CSDN: Keyphrase Extraction, User Tagging and User Growth Value Prediction
Guohui Xiao* | Virtual Knowledge Graphs: An Overview of Systems and Use Cases
Yanghua Xiao* | CN-DBpedia2: An Extraction and Verification Framework for Enriching Chinese Encyclopedia Knowledge Base
Vol. 1/Iss. 4
Danielle Descoteaux | Playing Well on the Data FAIRground: Initiatives and Infrastructure in Research Data Management
Oya Beyan | Distributed Analytics on Sensitive Medical Data: The Personal Health Train
Larry Lannom | FAIR Data and Services in Biodiversity Science and Geoscience
Shelley Stall | Growing the FAIR Community at the Intersection of the Geosciences and Pure and Applied Chemistry
Mirjam van Reisen | Towards the Tipping Point for FAIR Implementation
Paul Groth* | FAIR Data Reuse – the Path through Data Citation
Vol. 1/Iss. 3
Frank van Harmelen* | Constructing and Cleaning Identity Graphs in the LOD Cloud

Note: † indicates DI advisory board member and * indicates DI editor.


3 Most Read Articles


Lead Author | Article | Read times
Qi Zhang | Knowledge Graph Construction and Applications for Web Search and Beyond |
Barend Mons | FAIR Principles: Interpretations and Implementation Considerations |
Barend Mons | The FAIR Principles: First Generation Implementation Choices and Challenges |




4 Early Access Articles

Rapid academic communication benefits the whole scientific community. We make articles available online as soon as they are accepted and assigned a DOI.

The Semantic Data Dictionary – An Approach for Describing and Annotating Data


Citation: S.M. Rashid, J.P. McCusker, P. Pinheiro, M.P. Bax, H. Santos, J.A. Stingone, A.K. Das & D.L. McGuinness. The semantic data dictionary – an approach for describing and annotating data. Data Intelligence 2(2020), 443–486. https://doi.org/10.1162/dint_a_00058

Lead Author: Deborah L. McGuinness is the Tetherless World Senior Constellation Chair and Professor of Computer and Cognitive Science. She is also the founding director of the Web Science Research Center at Rensselaer Polytechnic Institute. Dr. McGuinness has been recognized as a fellow of the American Association for the Advancement of Science (AAAS) for contributions to the Semantic Web, knowledge representation, and reasoning environments, and as the recipient of the Robert Engelmore award from the Association for the Advancement of Artificial Intelligence (AAAI) for leadership in Semantic Web research and in bridging Artificial Intelligence (AI) and eScience, significant contributions to deployed AI applications, and extensive service to the AI community.


An RDF Data Set Quality Assessment Mechanism for Decentralized Systems


Citation: L. Huang, Z. Liu, F. Xu & J. Gu. An RDF data set quality assessment mechanism for decentralized systems. Data Intelligence 2(2020), 487–511. https://doi.org/10.1162/dint_a_00059


Lead Author: Jinguang Gu is currently a professor at the College of Computer Science and Technology, Wuhan University of Science and Technology (WUST), and vice dean of the Institute of Big Data Science and Engineering, WUST. His research interests include knowledge graphs, the semantic Web, and distributed and service computing.



KnowID: An Architecture for Efficient Knowledge-Driven Information and Data Access

Citation: P.R. Fillottrani & C.M. Keet. KnowID: An architecture for efficient knowledge-driven information and data access. Data Intelligence 2(2020). https://doi.org/10.1162/dint_a_00060

Lead Author: Pablo Rubén Fillottrani is a Professor with the Department of Computer Science and Engineering, Universidad Nacional del Sur, Bahía Blanca, Argentina, and an independent researcher at the Comisión de Investigaciones Científicas de la Provincia de Buenos Aires.




DI Papers Solicitation

1 Solicited papers

Possible Title | Lead Author | Affiliation
Blockchain Intelligence: A New Frontier for Data Intelligence and Smart Crowd Operations | Feiyue Wang* | Institute of Automation, Chinese Academy of Sciences, China
 | Huajun Chen* | Zhejiang University, China
Open Base | Haofen Wang* | Tongji University, China
Knowledge extraction or knowledge graph construction | Wei Hu* | Nanjing University, China
Data science education | Xiao Hu* | The University of Hong Kong, China
Industrial Applications of UFO and OntoUML | Tiago Prince Sales | Free University of Bozen-Bolzano, Italy
Conceptual modeling of legal relations | Cristine Griffo | Free University of Bozen-Bolzano, Italy
Recent advances in completeness reasoning | Werner Nutt | Free University of Bozen-Bolzano, Italy
The Open Data Challenge: An Analysis of 124,000 Data Availability Statements, and an Ironic Lesson about Data Management Plans | Chris Graf | Wiley, UK
Probabilistic Tractable Models in Mixed Discrete-Continuous Domains | Andreas Bueff | University of Edinburgh, School of Informatics, Alan Turing Institute

Note: * indicates DI editor.


2 Special issue

The special issue on Open Data, organized by DI editor Peter Wittenburg, George Strawn, and DI managing editor Fenghong Liu, is scheduled for publication early next year. A high-level review panel, including some of the field’s pioneers from Europe, the USA, and China, was recently established.


3 Reviewers this Month

We’d like to thank the following editors who reviewed papers for Data Intelligence this month.

Dr. Xiaowang Zhang, Tianjin University, China

Dr. Tianxing Wu, Southeast University, China

Dr. Huaiyu Wan, Beijing Jiaotong University, China

Dr. Peter Wittenburg, Max Planck Compute and Data Facility, Germany



DI News Highlights

1 Data Intelligence now indexed in INSPEC database

Data Intelligence has been accepted for indexing in INSPEC, a major indexing database of scientific and technical literature published by the Institution of Engineering and Technology (IET). INSPEC coverage is extensive in the fields of physics, computing, control, and engineering. INSPEC was started in 1967 as an outgrowth of the Science Abstracts service. For nearly 50 years, the IET has employed scientists to manually review items to be included in INSPEC, hand-indexing the literature using their own subject expertise and making judgment calls about which terms and classification codes should be applied. Thanks to this work, a significant thesaurus has been developed, which enables content to be indexed far more accurately.


2 Online EB meeting held on March 6

Data Intelligence held its online editorial board meeting on March 6, 2020. DI EIC Jim Hendler hosted the meeting. Profs. Ying Ding, Guilin Qi, Yan Zhao, and Gary Marchionini and 16 other editors attended the meeting. They were: Dr. Jie Bao, Dr. Huajun Chen, Dr. Wei Chen, Dr. Dongxiao Gu, Dr. Xiao Hu, Dr. Xiaojiu Le, Dr. Jiao Li, Dr. Xiwen Liu, Dr. Yongbin Liu, Dr. Jinhui Pang, Dr. Feiyue Wang, Dr. Haofen Wang, Dr. Huaiyu Wan, Dr. Jian Qin, Dr. Yejun Wu, Dr. Zheping Xu, and Dr. Xiaowang Zhang. The Managing Editor, Dr. Fenghong Liu, gave a 25-minute talk about the current overall development of the DI journal. The aims and scope of the journal; the roles of the EICs, editorial members, and the editorial office; and how to solicit quality papers were discussed. For more details, please see the meeting minutes Fenghong shared earlier. You can also send a request to Fenghong at liufh@mail.las.ac.cn.


3 Knowledge Graph Workshop at ACM KDD2020, Organized by DI Co-EICs Jim Hendler and Ying Ding, to Be Held on August 24

The International Workshop on Knowledge Graph: Mining Knowledge Graph for Deep Insights, organized by DI Co-EICs Jim Hendler and Ying Ding as well as Editor Jie Tang, will be held on August 24, 2020, in San Diego, California, USA. Website: https://suitclub.ischool.utexas.edu/IWKG_KDD2020/index.html#introduction


Data Intelligence EIC Jim Hendler and DI editor David Wild will deliver speeches at the workshop. Authors of high-quality submissions will be invited to submit substantially revised versions to Data Intelligence.


4 FAIR Special Issue and Articles Highlighted at IUPAC

The FAIR special issue and two of its articles were posted on the website of the International Union of Pure and Applied Chemistry (IUPAC) at https://iupac.org/fair-chemical-data-for-all/



Open Questions

1 Solicit papers for DI

Special issues play an important role in improving a journal’s reputation, especially those organized by authoritative, internationally established scholars, because they focus attention on a topic and promote DI internationally. DI successfully published a special issue on The FAIR Principles: First Generation Implementation Choices and Challenges, organized by our Advisor Barend Mons and his coworkers Dr. Erik Schultes and Dr. Annika Jacobsen. All editors are welcome to suggest topics for special issues, to recommend guest editors, and to volunteer to organize special issues themselves.

Some editors have suggested including Short Communications (Short Papers or Letters) in DI. This format is short, one or at most two pages, and is problem-driven, status quo-challenging, puzzle-exploring, paradigm-debating, and thought-provoking, pinpointing the grand challenges facing theoretical research and practice in the field. It differs clearly from literature reviews, which require extensive citations, and from perspectives, which provide systematic analysis. You are welcome to recommend scholars or experts to write Short Communications for DI. Many thanks to our Editor Dr. Guohui Xiao for coordinating the submission of a Short Communication paper for the online seminar organized by the KRDB Research Centre for Knowledge and Data at the Free University of Bozen-Bolzano, Italy (http://www.inf.unibz.it/krdb/sos-2020/).


2 Promote the impact for DI

The DI editorial office has registered DI on the scientific social network ResearchGate and the professional network LinkedIn to promote DI Early Access and published papers. The DI LinkedIn account has tried to connect with all DI editors, reviewers, authors, and others interested in the field, and you are welcome to forward DI messages to promote DI. DI LinkedIn account: Data Intelligence.


The DI editorial office has been translating DI published papers into Chinese and posting them on DI WeChat (1,300+ users now), and has cooperated with other popular WeChat accounts such as Open KG to promote these papers. We’d like to extend our thanks to our editor Dr. Jie Bao for his coordination with the Open KG WeChat account. The DI editorial office will notify editors whenever an issue is officially published, and editors are welcome to continue and streamline the promotion of DI papers through emails with easy links to the full text, or when delivering speeches at academic conferences. DI editor Dr. Guohui Xiao promoted Data Intelligence in his online lecture titled “Ontop: the Virtual Knowledge Graph System” on July 10, 2020. His paper titled Virtual Knowledge Graphs: An Overview of Systems and Use Cases ranks No. 1 among DI published papers by citation count. Thanks and Congrats, Guohui!