The Su Lab at the Scripps Research Institute is recruiting an individual to work on Data Science and Knowledge Engineering. The person in this position will work on the NSF-funded "Prototype Open Knowledge Graph" (Proto-OKN) project (www.nsf.gov/pubs/2023/nsf23571/nsf23571.htm), the goal of which is to build "an interconnected network of knowledge graphs supporting a very broad range of application domains". These application domains include areas as diverse as biomedical science, toxicology, criminal justice, commercial supply chains, agriculture, energy, healthcare, and more.
Our team focuses on building infrastructure to integrate knowledge graphs from disparate domains and teams. We will build an "interconnecting fabric" that allows for integrative queries across all these domains. This fabric will have three components:
* Knowledge Fabric: alignment of source knowledge graphs, query formats, and entity identifiers using ontologies and community standards
* Technical Fabric: cloud-based infrastructure to perform federated queries across a distributed knowledge graph
* Social Fabric: coordination and communication among stakeholders to explore shared technologies and use cases
This position will focus on the use of Wikidata (wikidata.org) as a foundational layer of our Knowledge Fabric. Wikidata serves at least two roles: as an interstitial graph that links many independent knowledge graphs, and as a unifying source of identifier mappings across many ontologies and domains.
Candidates must have coding expertise (Python preferred) and experience cleaning and transforming data. In addition, experience with any of the following is preferred:
* Wikidata
* Semantic web
* Ontologies
* Knowledge graphs
* Machine learning
Women and individuals from underrepresented groups are particularly encouraged to apply. Candidates who are local to San Diego or able to relocate there are preferred, but remote work within the US is also an option.
Email CV and cover letter