Gold Coast, Queensland, Australia
Full Time
Opportunity
Datarwe aims to develop the world’s most comprehensive acute care medical research data platform, enabling medical researchers to develop next-generation Artificial Intelligence (AI) clinical diagnostic tools and technologies. The Datarwe “Clinical Data Nexus” (CDN) solution emerged from the increasing need for hospital services to engage more rapidly in translational clinical research with external parties. The Nexus provides health system collaborators (internal and external) with efficient, secure and monitored access to de-identified (retrospective) clinical data, specifically for the development of new AI/machine learning clinical analytics.
Datarwe’s Nexus is currently focused on critical care data collected in electronic medical record (EMR) systems along with real-time IoT data collected from devices interacting with the patients, such as bedside monitors, ventilators, etc. Data are a combination of relational data from the EMR linked with high-frequency time series data from IoT devices. The dataset also includes natural language data, such as clinical notes and pathology reports, as well as some images.
CDN Platform Pipeline
To share data with medical researchers, the Clinical Data Nexus implements a progressive data pipeline:
- Data are ingested from data providers (hospitals)
- Data are de-identified and restructured for efficient access
- Enrichment processes add value to the data sets, for example through derived features, data cleansing and imputation of missing values
- Annotation and labels may be further added using human-in-the-loop labelling and/or application of automated labelling models, including ML and AI models
- Visualisation and modelling tools are made available
- Data may be securely distributed across different environments
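For candidates who find it easier to think in code, the stages above can be pictured as a chain of small functions. The Python below is only an illustrative sketch and not Datarwe's implementation: every function name, field name (`patient_id`, `heart_rate`), the salt and the threshold are hypothetical, and the hashing shown is a toy stand-in for a real de-identification process.

```python
import hashlib
from statistics import mean

# Hypothetical direct-identifier fields; real de-identification covers far more.
DIRECT_IDENTIFIERS = {"name", "date_of_birth", "address"}


def ingest(raw_records):
    """Stage 1: accept records from a data provider (here, a list of dicts)."""
    return [dict(r) for r in raw_records]


def deidentify(records, salt):
    """Stage 2: drop direct identifiers and replace the patient ID with a salted hash."""
    out = []
    for r in records:
        clean = {k: v for k, v in r.items() if k not in DIRECT_IDENTIFIERS}
        digest = hashlib.sha256((salt + str(r["patient_id"])).encode()).hexdigest()
        clean["patient_id"] = digest[:16]
        out.append(clean)
    return out


def enrich(records):
    """Stage 3: impute missing samples with the mean of observed ones, add a derived feature."""
    out = []
    for r in records:
        observed = [x for x in r["heart_rate"] if x is not None]
        fill = mean(observed) if observed else None
        series = [fill if x is None else x for x in r["heart_rate"]]
        peak = max(observed) if observed else None
        out.append({**r, "heart_rate": series, "heart_rate_max": peak})
    return out


def label(records, threshold=100):
    """Stage 4: rule-based automated label (a stand-in for an ML/AI labelling model)."""
    return [
        {**r, "high_heart_rate": r["heart_rate_max"] is not None
         and r["heart_rate_max"] > threshold}
        for r in records
    ]


def run_pipeline(raw_records, salt="example-salt"):
    """Compose the stages in the order listed above."""
    return label(enrich(deidentify(ingest(raw_records), salt)))


if __name__ == "__main__":
    sample = [{"patient_id": 1, "name": "Example", "heart_rate": [80, None, 120]}]
    print(run_pipeline(sample))
```

Keeping each stage as a separate function with plain inputs and outputs is one way to let stages be tested, monitored and rerun independently, which is the kind of design question this role would own.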
Although it is primarily our customers who develop applications from the data using data science, machine learning and AI technologies, our team must also understand the data science / AI / ML pipeline, because the platform is a key utility for those activities. We also incorporate data science, ML and AI into the data enrichment processes.
Role Description
Datarwe is seeking an enthusiastic and knowledgeable Machine Learning / DataOps Architect to join a small, dynamic technical team consisting of the CTO, data scientists, data engineers and assurance engineers. The role involves a hands-on contribution to the platform architecture and implementation, and planning the structure and processes for maintaining and expanding the Clinical Data Nexus platform over time. Key to success will be a proven ability to nurture and grow a modern agile delivery team.
The incumbent will also be expected to define and maintain processes for enriching the data, integrating models and providing access to tools for the use of the research collaboration teams on-platform.
Key Responsibilities
- Map out data collection, enrichment, modelling, and deployment architecture
- Manage scalable cloud-based data lake(s)
- Ensure continued reliable and correct operation of data pipeline for data enrichment processes
- Manage remote ingestion components for reliable data flow
- Develop and maintain a comprehensive security model
- Analyse performance of platform and ensure efficient and effective use of AWS compute and storage resources
- Implement effective CI/CD processes to scale data pipeline
- Provide architecture and implementation guidance to data engineers and data scientists, both internal to Datarwe as well as with partners
- Lead sprint planning and act as scrum master
- Conduct experiments in spikes to help guide design
- Lead contributions to sensible defaults, engineering patterns and principles
Requirements
We are seeking experience and appropriate coverage of key architectural areas that will impact the design of the platform. Whilst familiarity with components of the stack of our cloud partner, Amazon Web Services (AWS), would be highly regarded, equivalent experience in similar environments will not place candidates at a disadvantage.
Whilst not a prerequisite, progression towards attaining formal competencies in areas such as Solution Architect, Data Lake Architect or MLOps on AWS (or similar) would also be advantageous. Regardless of platform experience, some ML/AI exposure is highly desirable.
Candidates will be best placed to understand the key elements and influences of our technical stack if they can demonstrate experience across some of the areas below:
- AWS Stack (or similar)
  - S3, Athena, EC2, ECS
  - CDK, CloudFormation
  - Security: IAM, CloudWatch, CloudTrail
  - SageMaker, SageMaker Ground Truth
- DevOps
  - Git, CI/CD, CodePipeline, CodeDeploy, Test Automation, Test Driven Development
- Agile
  - Jira, Sprints, Kanban
- Data
  - RDS, SQL, SQL Server, CSV, JSON, Parquet
- Electronic Medical Records
  - HL7, FHIR, DICOM
  - HIPAA
- Code
  - Python, Lambda
- ML
  - Pre-processing, Training, Testing processes
  - Labelling and Annotation
  - Production deployment frameworks
- Containerisation: Docker, Kubernetes
Additional Information
To be eligible to apply you must have Australian or New Zealand citizenship or permanent residency status. Successful applicants will be required to complete a background check which includes a criminal history check prior to commencement of employment.
Datarwe is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, national origin, sexual orientation, gender identity, or disability.
Applications
For applications and/or further information about the role, please email info@datarwe.com
Applications should include a brief cover letter and CV detailing relevant experience.