The AI project development life cycle involves five distinct tasks: data engineering, modeling, deployment, business analysis, and AI infrastructure. No single individual has enough skills (or time) to carry out all of them, so teams include individuals who each focus on part of the cycle. Here is a visual representation of six technical roles and how they relate to these tasks.
I What tasks does a machine learning researcher carry out?
Machine learning researchers carry out data engineering and modeling tasks as shown in Figure 1. This includes:
- data engineering subtasks such as defining data requirements, collecting, labeling, inspecting, cleaning, augmenting, and moving data.
- modeling subtasks such as training machine learning models, defining evaluation metrics, searching hyperparameters, and reading research papers.
Although it’s not represented in Figure 1, some machine learning researchers focus on deployment (for instance, life-long learning, model memory, or optimization for edge deployment) and AI infrastructure (such as distributed training, scheduling, and experiment and resource management).
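To make the modeling subtasks listed above more concrete, here is a minimal sketch of training a model, defining an evaluation metric, and searching a hyperparameter. It is an illustration only, not a method described in this article: the toy dataset, the k-nearest-neighbors model, and every function name (`make_dataset`, `knn_predict`, `accuracy`, `grid_search`) are assumptions made for the example. It uses only the Python standard library so that it runs anywhere; in practice a researcher would more likely reach for packages such as scikit-learn or PyTorch.

```python
import random
from collections import Counter


def make_dataset(n_per_class=60, seed=0):
    """Generate a hypothetical 2-D, two-class toy dataset (not real data)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            point = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - point[0]) ** 2 + (item[0][1] - point[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, held_out, k):
    """Evaluation metric: fraction of held-out points classified correctly."""
    correct = sum(knn_predict(train, p, k) == y for p, y in held_out)
    return correct / len(held_out)


def grid_search(train, validation, candidate_ks):
    """Hyperparameter search: score each candidate k on the validation split."""
    scores = {k: accuracy(train, validation, k) for k in candidate_ks}
    best_k = max(scores, key=scores.get)
    return best_k, scores


data = make_dataset()
train, validation = data[:80], data[80:]  # simple hold-out split
best_k, scores = grid_search(train, validation, candidate_ks=(1, 3, 5, 7))
print("best k:", best_k)
print("validation accuracy per k:", scores)
```

The hold-out split keeps the hyperparameter choice honest: each candidate is scored on points the model never saw during training. Code like this is typical of the prototyping work described in this article and would usually be rewritten before production use.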
II What skills does a machine learning researcher need?
Machine learning researchers demonstrate outstanding scientific skills (see Figure 2). Communication skill requirements vary among teams. They mostly write prototyping code, as opposed to the production code written by engineers, and discard most of the code they write.
If you’re interested in comparing your skills to other machine learning researchers, we recommend taking the standardized machine learning, data science, mathematics, and algorithmic coding tests on Workera. If you’re a company hiring machine learning researchers, you can administer computerized tests to AI job applicants for free using Workera Test and connect with AI practitioners using Workera Connect.
III What tools does a machine learning researcher use?
Machine learning researchers in different companies use different tools, but some tools stand out. The following tools grouped by task are the most frequently used tools identified in our research.
- Modeling in Python using packages such as numpy, scikit-learn, pandas, matplotlib, TensorFlow, and PyTorch.
- Data engineering in Python and/or SQL or other domain-specific query languages.
- Collaboration and workflow using a version control system such as Git, Subversion, or Mercurial; a command line interface (CLI) such as a Unix shell; an integrated development environment (IDE) such as Jupyter Notebook or Sublime Text; and an issue tracking product such as JIRA.
- Research by following updates via channels such as Twitter, Reddit, and arXiv, and via conferences such as NeurIPS, ICLR, ICML, CVPR, and ACM-sponsored events.
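As an illustration of the "data engineering in Python and/or SQL" item above, the following sketch inspects and cleans a small labeled dataset using Python's built-in `sqlite3` module. The table name `samples`, its columns, and the rows are all hypothetical and invented for this example; a real pipeline would query whatever database or query language the team actually uses.

```python
import sqlite3

# In-memory database holding a hypothetical labeled text dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, text TEXT, label TEXT)")
rows = [
    (1, "great product", "positive"),
    (2, "terrible support", "negative"),
    (3, "great product", "positive"),  # duplicate of row 1
    (4, "okay overall", None),         # missing label
    (5, "  ", "negative"),             # blank text
]
conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", rows)

# Inspect: count examples per label, making missing labels visible.
label_counts = dict(
    conn.execute(
        "SELECT COALESCE(label, 'MISSING'), COUNT(*) FROM samples GROUP BY label"
    ).fetchall()
)
print("before cleaning:", label_counts)

# Clean: drop unlabeled or blank rows, then keep one copy of each duplicate.
conn.execute("DELETE FROM samples WHERE label IS NULL OR TRIM(text) = ''")
conn.execute(
    "DELETE FROM samples WHERE id NOT IN "
    "(SELECT MIN(id) FROM samples GROUP BY text, label)"
)

clean_rows = conn.execute("SELECT id, text, label FROM samples ORDER BY id").fetchall()
print("after cleaning:", clean_rows)
conn.close()
```

Inspecting label counts before deleting anything is a deliberate choice here: it shows how much data each cleaning rule removes, which matters when deciding whether to collect or label more.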
IV In what team structure does a machine learning researcher fit?
Building an AI team requires bringing together complementary individuals who can progressively carry out the tasks of the AI project development lifecycle. AI teams focus on data engineering and modeling from the beginning, because they need to validate the feasibility of an AI project or idea. As the project becomes more mature, the team starts focusing on deployment, business analysis, and AI infrastructure.
Machine learning researchers achieve their fullest potential in a research environment, supported by teams in charge of deployment, business analysis, and AI infrastructure. They combine well with data analysts, who focus on translating statistics into actionable business insights, and software engineers, who build the tools and infrastructure that increase the effectiveness of all tasks.
This article aims to clarify what a machine learning researcher is, what tasks they carry out, and what skills they need. If you’re an AI practitioner, we hope it helps you choose a career track. If you’re a hiring manager, we hope it helps you define your job requirements.
Companies may refer to this position as machine learning researcher, research scientist, research engineer, data scientist, or many other titles.
AI organizations are constantly evolving, so this article is a work in progress. We intend to revise it as our team learns more about new roles.