- Location: London
- Last Published: Dec. 10, 2024
- Sector: Healthcare
- Function: Data Science
About us
Founded in 2018, Causaly accelerates how humans acquire knowledge and develop insights in Biomedicine. Our production-grade generative AI platform for research insights and knowledge automation enables thousands of scientists to discover evidence from millions of academic publications, clinical trials, regulatory documents, patents and other data sources... in minutes.
We work with some of the world's largest biopharma companies and institutions on use cases spanning Drug Discovery, Safety and Competitive Intelligence. You can read more about how we accelerate knowledge acquisition and improve decision making on the Causaly blog.
We are backed by top VCs including ICONIQ, Index Ventures, Pentech and Marathon.
Who we are looking for
We are looking for talented Data Engineers with a passion for DataOps and a demonstrable background in SQL and Python-based automation. You will join our Data & Semantic Technologies team, which is responsible for delivering the scalable and highly flexible data fabric that is the foundation of Causaly’s product suite. The team enables new product development and innovation in AI that create real business value. You will unlock the value of data for our customers by building and operating automated data pipelines, feeding our constantly growing data warehouse and knowledge graph, and evolving our data architectures.
We are a multi-disciplinary team working in a fast-paced and collaborative environment, and we value honest opinions and open debate. Do you have a problem-solving mindset and a hands-on attitude? Are you keen to design and build innovative solutions that leverage the value of data, passionate and creative in your work, happy to share ideas with your team and able to pick the right tool for the job? Then you should become part of our journey!
What you can expect to work on:
- Gather and understand data based on business requirements
- Regularly import and transform big data (millions of records) from various formats (e.g. CSV, SQL, JSON) to data stores like BigQuery and Neo4j
- Process data further using SQL and/or Python, e.g., to sanitise fields, aggregate records, combine with external data sources
- Work with other engineers on highly performant data pipelines and efficient data operations, adhering to the industry’s best practices and technologies for scalability, fault tolerance and reliability
- Export data in well-defined target formats and schemata, validate output and data quality, and produce corresponding reports and dashboards
- Manage and improve (legacy) data pipelines in the cloud, and enable other engineers to run them efficiently
- Innovate on our data warehouse architecture and usage
- Work directly with a multitude of technical, product and business stakeholders
- Mentor and guide junior members, shape our technology strategy and innovate on our data backbone
- Collaborate with the DevOps team to help manage our infrastructure