The job develops front-end and back-end software that scales and integrates scientific workflows and data in multi-institutional research projects, with a focus on harmonization and semantic enrichment of biomedical data in a cancer context. Uses in-depth knowledge of software development to facilitate solutions for data acquisition, ingestion, and integration from heterogeneous sources.
Designs, develops, and implements semantic data annotation, integration, and search technologies in the context of the Center for Cancer Data Harmonization and the PCDC cloud compute environment.
Supports data collection and analytical needs of research projects. Ensures project compliance with different policies, procedures, directives, and mandates.
Takes responsibility for the following non-laboratory duties: transcribing and coding data; researching and applying data tabulation standards; researching and applying technologies to support semantic query. Acquires higher-level skills and knowledge in the process.
Collaborates with consortium groups and other stakeholders to lead development and implementation of data standards for pediatric cancer types.
Assists on grants, when necessary, specifically to develop standard terminologies and data models to connect cancer data and create a robust cancer data ecosystem. This includes work on genomics, proteomics, canine cancer, and imaging data.
Works closely with the National Cancer Institute as necessary to develop and apply additional terminology and/or common data elements for pediatric cancer data.
Develops, tests, debugs, and maintains new and existing application software.
Works independently to define and document project requirements and provides overall technical guidance in design, architecture and implementation of software solutions.
Performs other related work as needed.
Minimum requirements include a college or university degree in a related field.
Minimum requirements include knowledge and skills developed through 5-7 years of work experience in a related job discipline.
MS in computer science or related field
Experience with programming in the healthcare industry, healthcare IT, data governance, or a related field (security, privacy, compliance).
Scientific background and interest, particularly in biomedicine
Familiarity with clinical/phenotypic data
Knowledge of semantic web technologies such as SPARQL, RDF, OWL
Knowledge of biomedical ontologies and resources (e.g., SNOMED CT, NCIt, caDSR).
Understanding of clinical trial and electronic health record data processes and data flow.
Knowledge of biomedical common data models (e.g., FHIR, BRIDG, SDTM, OMOP, PCORnet)
Excellent communication, time management / organization, troubleshooting, and analytical skills.
Experience with multiple programming languages, ideally including Python, Java, and R.
Knowledge of XML, JSON-LD, YAML, Avro, and other data serialization formats
Strong verbal, written, and interpersonal skills
Online - sam.am/SE