Software Developer – Data Engineering
We are looking to hire remote candidates!
At Digital Science we are looking for a Software Developer to contribute to our Dimensions product. As a Software and Data Engineer, you will help us take scientific publications and other documents and enrich them in a variety of ways: by adding new sources of data while identifying and merging duplicate documents across those sources, by linking different content types together, by normalising entities such as people and places to a canonical form, and by discovering and implementing new ways of adding insight to the raw data. You will be part of an experienced, highly skilled technical team with a clear vision of its technological and engineering goals, within the exciting setting of an international and agile company.
Your new role
- Extend and implement rule-based and text-mining-based machine learning tools to disambiguate data sources and find links between data types.
- Extract, transform and load data into a variety of data stores such as PostgreSQL and Google BigQuery.
- Analyse data and enrichment outputs to find areas for improvement.
- Build tools and create reports for performing quality assurance on our work.
- Gather and analyse new sources of data which may be valuable to add to Dimensions.
- Write well-crafted, well-tested, readable, maintainable Python code.
- Deal with large datasets (100M+ documents with over one billion links).
- Work with Amazon Web Services.
- Work with our Kubernetes-based processing pipeline.
- Telecommuting is OK
- No Agencies Please
Requirements
- Several years of professional experience with Python development
- Working with medium- to large-scale data processing
- Using, creating and working with SQL databases
- Using distributed version control systems (Git)
- Understanding of Agile methodologies
- Ability to work on intricate details without losing the big picture
- Self-learner, possessing inherent inquisitiveness
- Good problem solving and analytical skills
- Strong interpersonal, communication, and organizational skills
- Minimum of a Bachelor's degree in Computer Science or a related field, or equivalent
About the Company
With Dimensions, Digital Science launched an innovative research data and tool infrastructure, broadening the view of the research landscape after decades of focus on the publication/citation complex. Dimensions is a research tool that interlinks multiple data sets (grant applications, publications, clinical trials, patent applications, policy documents). Based on these data sets, and by using external services, it provides metrics such as attention and citation scores and makes it possible to perform complex analyses on the data. In total, Dimensions today contains more than 400 million documents with more than 4 billion connections between these records. Our customers are researchers, research organizations, publishers, and government and funder organizations from all around the world. Dimensions has offices in Germany, Romania, the US and the UK, serving clients globally. For more information, please visit dimensions.ai or try the free version of the Dimensions app at app.dimensions.ai.
- Contact: Kaine Nutton
- E-mail contact: firstname.lastname@example.org
- Web: https://www.digital-science.com/jobs/software-engineer-data-engineering/