The Best Combination for a Data Science Team: A Good Mix of BA, Programmer, Data Analyst, Data Engineer, and Data Scientist

In the early days of a data initiative, a common and costly mistake is to believe that hiring a single, brilliant data scientist will solve all analytical problems. This article explores why a diverse data science team is essential for success.

Why No Single Role Defines a Data Science Team

In the early days of a data initiative, a common and costly mistake is to believe that hiring a single, brilliant data scientist will solve all analytical challenges. A leader might secure budget for one prestigious role, envisioning a lone expert who will unearth transformative insights, build predictive models, and deliver a competitive edge. The reality that unfolds is often one of frustration. That individual, no matter how skilled, quickly becomes mired in data access issues, struggles to translate business needs into technical specifications, and spends an inordinate amount of time writing production code instead of analysing patterns. The project stalls, the investment seems wasted, and the organisation concludes that "data science doesn't work here." This failure is rarely about the individual's capability; it is a fundamental failure of Applied Leadership in team design. It stems from not recognising that modern data work is not a solo pursuit but a relay race requiring specialised, interlocking skills.

The core principle for leaders is to architect teams, not just hire for roles. A high-functioning data science unit is a microcosm of a well-run organisation, requiring diverse competencies that cover the entire value chain from business problem to deployed solution. Thinking in terms of a "combination" or "mix" is the correct starting point. This shifts the leadership challenge from "find the unicorn" to "orchestrate the specialists." The practical team composition must balance deep technical expertise with business acumen, statistical rigour with engineering discipline, and exploratory analysis with production reliability. Your Decision-Making on team structure should be guided by the outcomes you need: Are you building customer-facing products, informing strategic planning, or optimising internal operations? Each goal implies a different weighting within the core mix of business analyst, data engineer, data analyst, and data scientist.

The Core Four: Defining the Essential Roles

Before designing interactions, a leader must clearly understand the distinct contributions of each primary role. Confusion here leads to role conflict, wasted effort, and misaligned expectations. The Business Analyst (BA) is the team's anchor to reality. They are responsible for problem framing, working with stakeholders to decompose a vague business need ("improve customer retention") into a specific, measurable question ("can we identify customers at high risk of churn within the next billing cycle based on usage patterns and support tickets?"). They own the requirements document, define what "success" looks like in business metrics, and will ultimately translate analytical outputs back into recommended actions. Crucially, they act as a shield for the technical team, absorbing organisational ambiguity and refining it into clear specifications.

The Data Engineer (DE) builds the foundation. Their domain is data infrastructure: pipelines, warehouses, and governance. They ensure that raw data from source systems (CRM, web logs, transactional databases) is reliably ingested, cleaned, transformed, and made accessible. A team without a dedicated DE will often see its analysts and scientists spending most of their time on data wrangling, a costly misallocation of expensive talent.

The Data Analyst (DA) operates on this prepared data to explain the past and present. They are masters of SQL, dashboards, and descriptive statistics. Their work answers questions like "What happened?", "Where is the problem?", and "How many are affected?" They produce reports, KPIs, and visualisations that power operational and tactical Decision-Making.

The Data Scientist (DS), in the most focused definition, predicts the future and prescribes action. Using statistical modelling, machine learning, and advanced programming, they build systems to classify, forecast, or optimise. Their work moves from insight to algorithm.

Where the Programmer Fits In

The mix named in the title explicitly includes the "programmer", a role often overlooked in theoretical frameworks. In practice, this role is critical for bridging the gap between prototype and product. A data scientist may build a highly accurate model in a Python notebook, but that model is useless if it cannot be integrated into a live application. The Programmer or ML Engineer specialises in software development best practices: writing scalable, maintainable, and tested code; containerising models with Docker; building APIs using FastAPI or Flask; and implementing monitoring and logging. They translate a scientific artifact into a robust engineering component. Without this skill, models languish on laptops, creating the infamous "pilot purgatory." Including a programmer in your core combination is a declaration that you intend to ship work that has real-world impact, a key tenet of Applied Leadership.
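To make the prototype-to-product gap concrete, here is a minimal Python sketch of the kind of wrapper a programmer might put around a notebook model: input validation, logging, and a predictable response shape. Everything in it is an assumption for illustration. The field names, the `score` stand-in (a hand-written formula, not a trained model), and the `handle_request` helper are hypothetical, and it uses only the standard library rather than FastAPI or Flask.

```python
import json
import logging

logger = logging.getLogger("churn_service")  # hypothetical service name

# Hypothetical input fields; a real model would define its own schema.
REQUIRED_FIELDS = ("sessions_last_30d", "open_support_tickets")


def score(features):
    """Stand-in for a trained model: returns a risk clamped to [0, 1]."""
    risk = (0.5
            - 0.02 * features["sessions_last_30d"]
            + 0.1 * features["open_support_tickets"])
    return max(0.0, min(1.0, risk))


def handle_request(body):
    """Validate a JSON request body, score it, and return (status, json_text)."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "body is not valid JSON"})
    if not isinstance(payload, dict):
        return 400, json.dumps({"error": "body must be a JSON object"})

    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        return 400, json.dumps({"error": "missing fields", "fields": missing})

    try:
        features = {name: float(payload[name]) for name in REQUIRED_FIELDS}
    except (TypeError, ValueError):
        return 400, json.dumps({"error": "fields must be numeric"})

    risk = score(features)
    logger.info("scored request: risk=%.3f", risk)  # basic monitoring hook
    return 200, json.dumps({"churn_risk": risk})


if __name__ == "__main__":
    print(handle_request('{"sessions_last_30d": 10, "open_support_tickets": 2}'))
    print(handle_request('{"sessions_last_30d": 10}'))
```

In a real deployment a function like this would sit behind a web framework and inside a container; the point of the sketch is only that validation, error handling, and logging are engineering work the notebook version usually lacks.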

Orchestrating the Handoff: From Problem to Production

With the roles defined, the leader's primary task becomes designing and managing the workflow between them. A siloed model where the BA throws requirements "over the wall" to the DE, who then passes data to the DA, and so on, is a recipe for delay and misalignment. The effective model is collaborative and iterative. A practical scenario illustrates this: A product manager (stakeholder) approaches the BA with a concern about user drop-off during a new onboarding flow. The BA schedules a discovery session involving the DA and DS. The DA immediately queries existing telemetry data to quantify the drop-off points. The DS brainstorms whether a predictive intervention (e.g., a prompt for help) is feasible. The DE assesses if the required event-level data is being captured reliably.

This initial cross-role huddle, championed by the leader, ensures everyone understands the context from day one. The work then proceeds in parallel streams with tight feedback loops. The DE begins improving the data pipeline for real-time event ingestion. The DA builds a dashboard to monitor the drop-off metrics. The BA works on defining the intervention logic with the product manager. The DS starts prototyping a simple classification model to predict which users will drop off. The programmer investigates how to deploy a lightweight model into the front-end application stack. Regular syncs ensure the DA's findings inform the DS's feature selection, and the programmer's constraints shape the DS's model complexity. This orchestration turns a group of specialists into a single, cohesive problem-solving organism.

Practical Ratios and Scaling for Applied Leadership

A critical Decision-Making point for leaders is determining the initial and scaling ratios of these roles. There is no universal formula, but practical heuristics exist based on the team's mission. For a team focused on business intelligence and reporting, the ratio might lean heavily towards Data Analysts and Business Analysts, with one Data Engineer supporting several analysts (e.g., 1 DE : 3 DA : 2 BA). For a product-centric team building machine-learning features, the balance shifts: you might need a Data Engineer and a Programmer for every Data Scientist to ensure robust pipelines and deployment (e.g., 1 DE : 1 Programmer : 1 DS : 0.5 BA). The "0.5 BA" indicates a BA shared across two such squads, as the product manager often absorbs much of the business context.
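As a rough illustration of how these ratios translate into headcount, the following Python sketch scales the two example ratios above to a given number of squads. The dictionary names and the `headcount` helper are hypothetical, and rounding fractional roles up to whole people is an assumption rather than part of the heuristics themselves.

```python
import math

# Example ratios from the text; the names are hypothetical.
BI_TEAM = {"DE": 1, "DA": 3, "BA": 2}
ML_PRODUCT_SQUAD = {"DE": 1, "Programmer": 1, "DS": 1, "BA": 0.5}


def headcount(ratio, units):
    """Scale a per-unit role ratio to whole people.

    Fractional roles (such as a BA shared between squads) are rounded up,
    which is an assumption made for this sketch.
    """
    return {role: math.ceil(weight * units) for role, weight in ratio.items()}


if __name__ == "__main__":
    print(headcount(BI_TEAM, 1))
    print(headcount(ML_PRODUCT_SQUAD, 2))
    print(headcount(ML_PRODUCT_SQUAD, 3))
```

With two product squads the sketch yields two each of DE, Programmer, and DS plus one shared BA, which matches the "0.5 BA" reading. With three squads the rounding assumption produces a second BA, a reminder that shared roles do not scale smoothly.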

Scaling the team introduces new challenges. The initial, tightly-knit "combination" can blur into confusion as more people are added. Applied Leadership here involves intentional specialisation and the creation of enabling platforms. You might evolve from a single cross-functional team into two: a "Data Platform Team" (mostly Data Engineers and Programmers) that builds and maintains shared infrastructure, tools, and services for the entire organisation; and an "Applied Data Science Team" (a mix of BA, DA, DS) that uses those platforms to solve specific business problems. This model prevents duplication of effort on engineering problems and allows the applied teams to move faster. The leader must constantly assess workflow bottlenecks—are the scientists waiting for data? Are models stuck in deployment?—and adjust the investment in each role accordingly.

Even with the right mix, teams can become dysfunctional without clear guardrails and leadership intervention. A frequent conflict arises between Data Analysts and Data Scientists over territory and tools. A DA, proficient in SQL and BI tools, might build a complex series of layered views that effectively becomes a predictive scorecard, venturing into DS work without the rigorous validation that work requires. Conversely, a DS might spend weeks building a sophisticated model for a problem that a simple segmented analysis by a DA could have solved nearly as well. The leader must clarify the decision boundary: prediction and prescription are the DS's domain; description, diagnosis, and measurement belong to the DA. Encourage collaboration by having them review each other's work; the DA's deep data familiarity can catch data leakage issues in the DS's model.

Another pitfall is the underutilisation of the Business Analyst, reducing them to a mere note-taker. This wastes a critical resource and decouples the technical work from business value. Empower the BA to own the "so what?" of every analysis. Mandate that no technical work begins without a BA-signed requirements brief that includes the success metric. Furthermore, protect the deep work cycles of your engineers and scientists. Constant ad-hoc requests from stakeholders will shatter their productivity. Implement an intake process filtered by the BA and DA; let the DA handle the ad-hoc queries using pre-built data products, freeing the DS and DE for project work. This operational discipline is a non-negotiable aspect of leading a technical team effectively.
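The intake rules described above can be sketched as a small routing function. This is one possible encoding, assuming a request carries three hypothetical attributes: whether it is ad hoc, whether a BA has signed the brief, and whether a success metric is stated. The `Request` class, the `route` function, and the destination labels are illustrative, not a prescribed process.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    """A stakeholder request entering the team's intake queue (hypothetical)."""
    title: str
    ad_hoc: bool
    success_metric: Optional[str] = None
    ba_signed: bool = False


def route(request):
    """Return who handles the request next under the intake rules above."""
    if request.ad_hoc:
        # Ad-hoc queries go to the DA, who answers from pre-built data products.
        return "DA"
    if request.ba_signed and request.success_metric:
        # Project work starts only with a signed brief and a success metric.
        return "project backlog"
    # Anything else goes back to the BA for framing and sign-off.
    return "BA"


if __name__ == "__main__":
    print(route(Request("Weekly signup count", ad_hoc=True)))
    print(route(Request("Churn model", ad_hoc=False)))
    print(route(Request("Churn model", ad_hoc=False,
                        success_metric="reduce monthly churn", ba_signed=True)))
```

The design choice worth noting is the default branch: a request that is neither ad hoc nor properly briefed never reaches the DS or DE directly, which is what protects their deep work cycles.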

Cultivating the Team Culture and Career Pathways

Finally, assembling the combination is only the start; the leader must foster a culture where these different disciplines respect and learn from one another. Create rituals that showcase diverse work: a "Data Deep Dive" where an analyst walks through a revealing dashboard; an "Engineering Review" where an engineer explains a new pipeline architecture; a "Research Share" where a scientist discusses a model's failures and successes. This builds mutual appreciation and breaks down jargon barriers. From a talent management perspective, you must also provide viable career pathways that don't force everyone into people management. Establish dual-track career ladders for individual contributors (IC) and management. A stellar Data Engineer can progress to Senior, Staff, and Principal levels, influencing technical strategy across the organisation, without ever having to manage a team.

This approach to team design is the essence of Applied Leadership in the domain of Data Science. It moves beyond chasing trends and focuses on constructing a reliable, human-powered system for generating value from data. It requires you to think like an architect, an orchestra conductor, and a coach simultaneously. Your Decision-Making shifts from "What cool technology should we use?" to "What combination of skills do we need to solve this business problem reliably and repeatedly?" By investing in the balanced mix of business analyst, programmer, data analyst, data engineer, and data scientist—and, more importantly, in the systems that connect their work—you build not just a team, but a sustainable competitive capability.