The 3 roles of the data team in the service of our clients
Lucky cart’s data team is made up of a dozen people, split evenly across three roles: data scientists, data engineers and data analysts. Its aim is to design, develop, operate and maintain the algorithmic factory of our solution.
This algorithmic factory constantly provides our API (Application Programming Interface) with all the elements (figures, parameters, rules) it needs to propose promotional solutions that are personalised, measured and optimised with regard to a business criterion.
Our mission and positioning within Lucky cart
These figures are by nature complex to determine because we work with a triple requirement:
- To scientifically measure the effect of our activations, without confusing correlation and causation (for example, by determining whether an increase in sales is linked to us or whether it would have happened anyway);
- To work at the individual level, not the segment level, and without socio-demographic data, which we do not collect;
- To be efficient while keeping a degree of interpretability for ourselves and our clients, in a ‘whitebox’ logic that combines the predictive power of ‘blackbox’ algorithms with this interpretability (this is in line with the concept of trusted AI).
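To illustrate the first requirement, here is a minimal sketch of one common way to separate an activation's effect from sales that would have happened anyway. It assumes shoppers are randomly split into a group that receives the activation and a control group that does not; this design, the function name `incremental_effect` and the basket values are illustrative assumptions, not a description of Lucky cart's actual method.

```python
from statistics import mean


def incremental_effect(treated_sales, control_sales):
    """Estimate the average incremental sales per shopper.

    Assumes random assignment to the two groups, so that the control group
    approximates what treated shoppers would have spent without the activation.
    """
    baseline = mean(control_sales)   # sales that would have happened anyway
    observed = mean(treated_sales)   # sales observed with the activation
    return {
        "baseline": baseline,
        "observed": observed,
        "uplift": observed - baseline,
    }


# Hypothetical basket values, for illustration only.
treated = [12.0, 15.0, 9.0, 14.0]
control = [10.0, 11.0, 9.0, 10.0]
result = incremental_effect(treated, control)
print(result)
```

Without the control group, the whole observed average could wrongly be credited to the activation; the difference between the two averages is the only part this sketch attributes to it.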
This mission requires us to work in constant interaction with the Product and Tech teams, as we develop the quantitative part of the solution, while distinguishing ourselves from them:
- By the intensive use of applied mathematics (this is where data science and AI are found) and economics in addition to traditional computer science;
- By adapting the agile method to the more exploratory nature of our work: in particular, we allow ourselves to investigate several avenues of research for the same need, because there is a fundamental uncertainty as to what will result, and we distinguish between what is usable (which responds to the need as code) and what is innovative (which responds to the need while also providing new properties).
Data but not only: keeping the big picture in business, scientific and technological terms
The intensive use of applied mathematics and economic theory undeniably covers what is often called data science or AI (machine learning), but it also covers, more generally, a number of engineering techniques in the broadest sense of the term, for example constrained optimisation or algebra applied to computer science.
This curiosity also applies to our technological tools and methods: platforms, frameworks, languages and database systems of course, but also the way of thinking about information and data. For example, we have borrowed elements of theory and practice from expert systems in our algorithms, which are also built on a data science stack as presented above.
Finally, this curiosity goes hand in hand with a necessary sense of pedagogy and critical thinking within the team, across Lucky cart and, of course, towards our clients. No grey area should remain and no commonly accepted idea should be considered true by default.
These three elements (broad scientific field, technological openness and pedagogy or critical sense) make our mission precisely an engineering mission: to conceive, develop and operate a technology by responding to the needs, and bending to the constraints, of our clients and users.
The scope of the missions, roles and skills of each team member is therefore much broader than simply data.
The 3 roles – Data scientists
Data scientists are responsible for the design, development, training and evaluation of mathematical models and machine learning algorithms. They have expertise in statistics and machine learning (including deep learning), and the perspective needed to analyse their results and propose areas for improvement. In order to implement robust predictive models that perform well on large volumes of data, data scientists must:
- Define the best metrics to optimise and evaluate the model;
- Determine (or adapt) the most promising models according to the type, shape and properties of the data;
- Create and test different explanatory variables, also known as feature engineering;
- Find the best hyperparameters to optimise model performance;
- Put the validated models into production within the company’s cloud platform, with the help of data engineers.
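The steps above (choosing a metric, fitting a model, searching for hyperparameters) can be sketched on a deliberately tiny example. The model here is a one-feature ridge regression without intercept, chosen only because it has a closed form; the function names, the candidate penalties and the data are all hypothetical and say nothing about the models actually used by the team.

```python
def fit_ridge_slope(xs, ys, penalty):
    """Closed-form slope of a one-feature ridge regression without intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)


def mean_squared_error(xs, ys, slope):
    """Evaluation metric: average squared prediction error."""
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_penalty(train, validation, candidates):
    """Return (penalty, slope, validation_error) with the lowest validation error."""
    best = None
    for penalty in candidates:
        slope = fit_ridge_slope(train[0], train[1], penalty)
        error = mean_squared_error(validation[0], validation[1], slope)
        if best is None or error < best[2]:
            best = (penalty, slope, error)
    return best


# Hypothetical data: (features, targets) for training and for validation.
train = ([1.0, 2.0, 3.0], [2.1, 3.9, 6.2])
validation = ([4.0, 5.0], [8.0, 10.0])

best_penalty, best_slope, best_error = select_penalty(
    train, validation, candidates=[0.0, 0.1, 1.0, 10.0]
)
print(best_penalty, best_slope, best_error)
```

The hyperparameter is selected on data the model was not fitted on, which is what keeps the chosen metric an honest estimate of performance rather than a measure of how well the model memorised its training set.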
These models have many applications, ranging from operations research (such as the constrained optimisation mentioned above) to fraud detection, via recommendation systems.
The data scientists are also responsible for keeping up with the state of the art, so that the company can offer functions at the cutting edge of scientific knowledge.
The 3 roles – Data analysts
The data analysts make the link between the quantitative subjects handled by the data team and the business subjects worked on by our Business Insight team.
The data processed automatically feeds our API by optimising its performance and personalising the shopper experience, but does not provide a business interpretation that can be used by humans. It is therefore up to the data analysts to make the data speak.
To do this, they must first gather the needs of other teams (sales, marketing, finance) via the Business Insight team and reformulate them in technical terms, in particular by defining performance or analysis indicators. They then extract, clean, explore and interpret the data, and present their conclusions to the same team.
They work on large volumes of complex data (hundreds of terabytes, i.e. several tens of billions of records), with a responsibility for the quality of the figures presented. The data analysts’ reports are used to facilitate decision-making, draw out business observations and define the most appropriate marketing strategies. Data analysts’ presentations therefore often have strong implications, so it is imperative that in addition to their technical skills, they have strong communication skills.
Data analysts also work in coordination with the data engineers, who are responsible for checking and cleaning the data. They are therefore the gatekeepers of the quality and consistency of the data ingested into the company’s databases. Data analysts can also participate in data science tasks, particularly feature engineering: their knowledge of data cleaning and processing enables features to be built in a clean and complete way, which subsequently facilitates the work of the data scientists.
The 3 roles – Data engineers
The data engineers are responsible for the operationalisation of our algorithmic factory in terms of:
- Source management, data quality and flow availability: the data is generally retrieved from various environments far from the data team (retailer sites or applications, product databases or geolocated data). Each source requires different processing in terms of standardisation/normalisation, cleaning, provision and documentation. This part can be done in close collaboration with the data analysts, who will point out areas for improvement in the flow;
- Access to the computing power needed to process the data and train the models. Making data available is one thing; making it efficiently searchable is another. To do this, data engineers can rely on cloud or in-house technologies. Hosting data so that calculations can be performed within a minute by any employee is a recurring topic. But some configurations are more complex, for example setting up powerful clusters so that a data scientist can train models in parallel from a notebook in a completely transparent way;
- Production of predictions and applications for our API and other Lucky cart teams: the data engineers are the data team’s interface with the developers and the product. They are the ones who develop what is needed to automate and industrialise the data applications. A graph that data analysts produce frequently could be made available automatically on a web interface. A powerful model developed by data scientists will probably have to produce predictions regularly and have an impact in production, which raises questions of dev/data interfacing and scaling.
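As an illustration of the first point above, where each source needs its own standardisation and cleaning, here is a minimal sketch that maps records from two sources to one common schema and sets aside the records it cannot process. The source names, field names (`sku`, `price`, `id`, `price_cents`) and the target schema are hypothetical and chosen only for the example.

```python
def normalise_retailer_site(record):
    """Map a hypothetical retailer-site record to the common schema."""
    return {
        "product_id": record["sku"].strip().upper(),
        # Assumes prices arrive as text with a decimal comma, e.g. "3,50".
        "price_eur": round(float(record["price"].replace(",", ".")), 2),
    }


def normalise_product_database(record):
    """Map a hypothetical product-database record to the common schema."""
    return {
        "product_id": str(record["id"]).strip().upper(),
        # Assumes prices arrive as integer cents.
        "price_eur": round(record["price_cents"] / 100, 2),
    }


# One normaliser per source: adding a source means adding one entry here.
NORMALISERS = {
    "retailer_site": normalise_retailer_site,
    "product_database": normalise_product_database,
}


def ingest(source, records):
    """Return (cleaned, rejected) records for one source."""
    normalise = NORMALISERS[source]
    cleaned, rejected = [], []
    for record in records:
        try:
            cleaned.append(normalise(record))
        except (KeyError, ValueError, TypeError, AttributeError):
            # Keep malformed records aside instead of silently dropping them.
            rejected.append(record)
    return cleaned, rejected


site_rows = [{"sku": " ab12 ", "price": "3,50"}, {"sku": "x"}]
cleaned, rejected = ingest("retailer_site", site_rows)
print(cleaned, rejected)
```

Keeping rejected records visible, rather than discarding them, is one simple way to let data analysts point out where a flow needs improvement.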
While the differences between the 3 roles, in both their descriptions and the skills required, are clear, a common culture and interaction between the roles are necessary.
First of all, we have built a common ambition and culture:
- The ambition to build the algorithmic factory mentioned above, in a context where this innovation allows Lucky cart to shape the habits and markets of a key industry under pressure in our economies: food and convenience goods distribution;
- The scientific and technological culture, curiosity and critical thinking that we have already discussed.
These two common factors allow us to work well together and ensure that each takes into account the objectives and constraints of the other:
- The data scientists by thinking about the scaling of the mathematical methods they develop from the design stage onwards in order to best prepare the work of the data engineers;
- The data engineers by creating the most suitable computing environment for the scientific developments of the data scientists and the business objectives of the data analysts;
- Data analysts and data scientists by sharing the same data schema and the same business vision so that the latter’s models are the natural extension of the former’s indicators and that these models can be naturally extended to any new indicator.
This common culture and the right balance between the roles also rest on the Chief Data Officer, who must ensure that such conditions are met, set an example (for example, CDOs with a data science background must pay particular attention to data engineers and data analysts) and constantly coach employees to ensure the team’s success and the development of each individual.
Beyond titles and trends, our experience leads us to believe that the construction, development and success of a data team are based on a clear understanding of the business objectives, the technological challenges and the scientific methods. They also depend on balancing the attention given to each of these three dimensions (business, technology, science) and on what that balance means for the talents of the team in their respective roles.