A Day in The Life of A Data Scientist 1

After entering the retail industry as a data scientist in mid-2019, I got questions from friends, and friends of friends, about the job and the industry. After all, the data scientist role is relatively new compared with the data analyst role. Many people have never heard of it, and some assume it is just another name for a stock analyst. To be fair, some people do apply data analysis to stock price forecasting.

This article is an overview. It tries to strip away the jargon and explain the daily life of a data scientist as simply as possible. I will cover more practical details and insights in the following articles.

Three Types Of Data Analyst Positions

People often refer to the discipline of data analysis as data science. Its three common positions are Data Engineer, Data Scientist, and Business Analyst. The following is a brief explanation of how the competencies of these three roles differ. The definitions here are based on the article “Data Engineer vs. Data Scientist vs. Business Analyst” from the Canadian data science publication Towards Data Science.

As shown in the figure below, the most critical competency of a data engineer is computer science, known as information engineering in Taiwan. Data engineers collect, clean, and prepare all the data so that the other two roles can access it easily.

Data scientists, the focal point of this series of articles, need more robust statistical knowledge and the ability to build machine learning models. In addition, data scientists usually prototype machine learning models and let machine learning engineers deploy them, as seen in “Data Scientist vs. Machine Learning Engineer Skills. Here’s the Difference.” from Towards Data Science.

Business analysts, also known as data analysts, focus more on connecting data with the business. Their core technical skill is using SQL to pull the required data from a database and then providing business analysis and insight. They usually do not need to build machine learning models.

Although these role definitions are detailed, not every company has such a complete structure with cleanly separated duties. That is the reality worldwide and, of course, in Taiwan.

Data Science Industry Situation In Taiwan

Although the data scientist role is new in Taiwan, the concept of data analysis, or data analytics, is not new at all. Every long-standing position in a company requires data and analysis, whether in marketing, sales, purchasing, or warehousing. With the emergence of fancy new terms, technologies, and applications, such as big data, data science, machine learning, and AI, the knowledge and skills required of data analysts have become too numerous and overwhelming to learn on the job, not to mention the challenge of mastering a programming language. As a result, data scientist has become an independent position, or even an independent department.

In terms of industry, the industries paying more attention to data analysis include retail, finance, technology, and advertising.

(1) Department Positioning

Since this position has not been around for long, companies still lack a standard definition of a data scientist. For example, a data scientist in company A might be in charge of software development, one in company B might be part of a market research department, and another in company C might sit under an IT department. Therefore, if you are looking for a job related to data analysis or data science, do not just look at the job title. It is better to look at each of the responsibilities and required competencies in detail. Likewise, interviewers will focus on practical project experience and competencies when evaluating a candidate’s ability.

The department I work for is an independent analysis department that assists other departments in making business decisions. It is a supporting department, like a think tank or an advisor. You could say that our service is data analysis and the departments we support are our clients. In this case, the primary value of data scientists lies in understanding clients’ needs and helping them achieve their business goals.

(2) Competency Requirements

Beyond departmental positioning, the skills required of a data scientist also vary from company to company. For example, some analysts can work with Excel alone, while others mainly work on data visualization and build understandable, interactive dashboards. Another type of analyst, closer to what I define as a data scientist, has to program and acquire some IT knowledge as well as statistics and algorithms. This type of analyst is a bit like IT staff, but not the same. While IT usually handles system planning, hardware and software maintenance, and feature expansion, the job of a data scientist is to uncover unknown business insights in a pile of data.

What Does “Business Insights” Mean? Let’s Make It Simple

Data Analysis And Data Aggregation

Before discussing business insights, let’s talk about the difference between data analysis, or data mining, and data aggregation. Both are forms of data processing, but they serve different purposes. Data aggregation summarizes events that have already happened, while data analysis supports decisions about the future, which means making predictions.

What is data aggregation? Suppose you have the consumption data of a clothing store and want to know which items have the highest monthly sales. For data aggregation, you have a distinct goal: “find the items with the highest sales.” All you need to do is calculate the sales of all items separately and find the highest one. Done.
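
The aggregation described above can be sketched in a few lines of Python. This is only an illustration: the records, item names, and prices are made up, and a real store would read them from a database rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical purchase records for one month: (item, units sold, unit price).
records = [
    ("down jacket", 3, 120.0),
    ("t-shirt", 10, 15.0),
    ("down jacket", 2, 120.0),
    ("jeans", 4, 45.0),
    ("t-shirt", 6, 15.0),
]


def sales_by_item(rows):
    """Sum the sales amount (units * price) for each item."""
    totals = defaultdict(float)
    for item, units, price in rows:
        totals[item] += units * price
    return dict(totals)


totals = sales_by_item(records)

# The aggregation goal: find the item with the highest sales.
best_item = max(totals, key=totals.get)
print(best_item, totals[best_item])
```

The goal is fixed in advance, so the work is purely mechanical: group, sum, and take the maximum.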

Then think about the following questions. Did this item become a best-seller of the month because of the promotion? What is the impact of the promotion event on this item? Is it appropriate to include this item in future promotions? These questions are much more challenging to answer. Not because there are more questions but because we need to consider more aspects.

First of all, we need to know the sales of this item outside the promotion period and compare them with the sales during the promotion period, to confirm whether the promotion actually raised the sales of this product.

Next, rerun the previous step, but calculate the impact of the promotion on all products. For example, suppose the promotion event can increase the sales of all products by 20% on average but can increase the sales of this product by 30%. In that case, we can have an initial assumption that this product is suitable for promotion.
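
As a rough sketch of the two steps above, the snippet below computes the sales uplift for each product and compares it with the average uplift. All figures are invented so that the result matches the 20% and 30% in the example, and taking a simple unweighted average across products is an assumption. A real analysis might weight products by sales volume.

```python
# Hypothetical average daily units sold per product,
# outside the promotion period and during it.
baseline = {"down jacket": 10.0, "t-shirt": 40.0, "jeans": 20.0}
promotion = {"down jacket": 13.0, "t-shirt": 46.0, "jeans": 23.0}


def uplift(before, during):
    """Relative change in sales: 0.30 means sales rose by 30%."""
    return (during - before) / before


# Step 1, repeated for every product: promotion sales vs. baseline sales.
uplifts = {item: uplift(baseline[item], promotion[item]) for item in baseline}

# Step 2: compare each product with the (unweighted) average uplift.
average_uplift = sum(uplifts.values()) / len(uplifts)
above_average = [item for item, u in uplifts.items() if u > average_uplift]

for item, u in uplifts.items():
    print(f"{item}: {u:.0%}")
print(f"average: {average_uplift:.0%}, above average: {above_average}")
```

With these numbers the down jacket is the only product whose uplift beats the average, which is what makes it an initial candidate for future promotions.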

It is an “initial” assumption because we still have to consider other possible factors, such as the product’s gross margin. For example, if the product’s gross margin is already relatively low, it will become even lower under a promotional discount. In that case, even if the sales volume looks good, the promotion may still not be profitable.
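
A small worked example shows how this can happen. The price, cost, discount, and unit counts below are all hypothetical, and the sketch assumes the retailer absorbs the whole discount.

```python
def gross_margin(price, unit_cost):
    """Gross margin as a fraction of the selling price."""
    return (price - unit_cost) / price


def gross_profit(units, price, unit_cost):
    """Total gross profit for a number of units sold at one price."""
    return units * (price - unit_cost)


# Hypothetical low-margin product: list price 100, unit cost 80.
LIST_PRICE = 100.0
UNIT_COST = 80.0
PROMO_PRICE = 90.0    # assumed 10% promotional discount

regular_units = 100   # units sold in a normal period
promo_units = 130     # assumed 30% uplift during the promotion

regular_profit = gross_profit(regular_units, LIST_PRICE, UNIT_COST)
promo_profit = gross_profit(promo_units, PROMO_PRICE, UNIT_COST)

print(f"gross margin: {gross_margin(LIST_PRICE, UNIT_COST):.1%} "
      f"-> {gross_margin(PROMO_PRICE, UNIT_COST):.1%}")
print(f"gross profit: {regular_profit:.0f} -> {promo_profit:.0f}")
```

With these made-up numbers, units sold rise by 30% while gross profit falls from 2,000 to 1,300, because the discount eats half of an already thin margin.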

Even after ruling out factors like gross margin, it’s better not to jump straight to the conclusion that “this product is very suitable for promotion.” For example, your boss may argue, “It’s just getting cold. The sales volume of down jackets will definitely grow! So why do we need a promotion?” You may therefore have to consider the season as well. While processing the data, the business question turns into, “Does the season or the promotion have a greater positive impact on down jacket sales?”
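
One naive way to start on that question is to split historical sales by both factors and compare the average change each one brings while the other is held fixed. The figures below are invented, and this is only a first-pass comparison under strong assumptions. It ignores trends, stock-outs, and everything else a proper statistical model would account for.

```python
# Hypothetical average daily down jacket sales,
# keyed by (season, promotion running?).
avg_daily_sales = {
    ("warm", False): 5.0,
    ("warm", True): 6.0,
    ("cold", False): 20.0,
    ("cold", True): 24.0,
}


def average_effect(sales, factor):
    """Average change in sales when one factor switches on, other held fixed."""
    if factor == "season":
        diffs = [sales[("cold", p)] - sales[("warm", p)] for p in (False, True)]
    elif factor == "promotion":
        diffs = [sales[(s, True)] - sales[(s, False)] for s in ("warm", "cold")]
    else:
        raise ValueError(f"unknown factor: {factor}")
    return sum(diffs) / len(diffs)


season_effect = average_effect(avg_daily_sales, "season")
promotion_effect = average_effect(avg_daily_sales, "promotion")
print(f"season: +{season_effect} units/day, promotion: +{promotion_effect} units/day")
```

In this made-up data the cold season adds far more daily sales than the promotion does, which would support the boss’s objection.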

Data aggregation is always a “must-do” in the actual data science workflow. But I think the value of a data scientist lies in whether you can go further, reach the level of data analysis or data mining, and lead the company to valuable business actions.

Take NIKE As An Example

Let’s take a look at a real-world example: NIKE. NIKE has many lines of sports goods, covering different sports, functions, and price points. Its analysts may ask the following questions based on customer data and purchase records:

  1. What kind of sports do customers often do?
  2. How often do customers buy? Do they buy a replacement when the product is worn out, or do they buy whenever a new model comes out?
  3. Do customers prefer to buy new seasonal products or out-of-season ones?
  4. Do customers buy from stores, or do they buy online?
  5. Do customers always wait until seasonal sales, or do they shop during non-sale periods?
  6. Do customers use Apple Watch Nike or install the Nike Run App? How are they doing physically and athletically? Are they going to buy higher-end products?

The above questions are close to what we data scientists work on. Indeed, the list could go on and on. Whether a question is worth researching depends on your business goals. For example, if NIKE were to launch basketball shoes endorsed by an NBA star, the company would like to predict: Who would be our customers? Would they purchase online or in-store? How much advertising budget will it take to make them desperate to buy? Do we need to give a discount, or is a giveaway more attractive?

Once the answers are clear, the marketing department can plan campaigns based on the research results. If the analysis is correct and the campaign works well, it generates revenue for the company. That is what “business insights” means.

Big Data Analysis

Big data is often characterized by five V’s. One of them is Variety, which means that, beyond total customer spending, we can use other kinds of data to distinguish and predict the customer behavior that drives business insights. That is why I think data scientists need to program and have IT knowledge. You cannot process millions or billions of records in Excel, not to mention carry out complex statistical analyses on them.

The tasks of a data scientist will vary depending on the company’s position and business. For example, as a brand retailer, NIKE may care about brand market share as well as sales. Meanwhile, PX Mart, which is more of a distributor, may care more about which products have higher turnover rates, how much inventory to prepare, and how to reduce the waste of fresh foods.

The financial industry also applies data analysis heavily. For example, banks analyze customers’ ability to repay loans and identify whether credit card purchases are fraudulent.

There’s A Lot To Play With In Retail

Jobs related to data analytics are more plentiful in three industries: retail, finance, and technology. On average, salaries are lowest in retail, higher in finance, and highest in technology. Since retail companies have lower gross profits than financial or technology companies, it is not surprising that their average salary is lower.

However, I think retail is the most interesting of these three industries, since it is more down-to-earth. We can analyze the impact of holidays or even the presidential election on sales, or we may find that snacks by the checkout counter always sell well. In the retail industry, you also have a better chance of walking into a store and observing directly how consumers shop: whether they look first at the product details or the price, and whether men and women shop differently. You can then use the purchase data to verify your observations, to see whether your assumption is correct and whether there is a chance to turn it into a marketing campaign that creates value for the company.

Social media buzz is also one of the exciting things about working in the retail industry. For example, we can observe whether there are potential consumers on discussion boards such as PTT or Dcard, and whether a curator can arouse netizens’ interest and bring in sales. In addition, LINE launched its Official Account 2.0 program in 2019, giving marketers and data scientists more data to analyze and more flexibility for testing and experimentation.

Although the salary in the retail industry is relatively low, it also means that companies are more willing to hire novices. After all, retail serves everyday needs and has a steady demand for data analytics talent. When experienced employees do not stay, the company will hire relatively junior applicants, including those without work experience or even without a relevant bachelor’s degree. Besides, you can have more flexibility within the retail industry. So for those who want to enter the field of data science, I think joining the retail industry is still a good choice.

Life Isn’t Always Smooth Sailing

“Wherever there is light, there is shadow.” Data analysis is interesting, but you will face some difficulties as well. For example, your boss may believe that data analysis is extremely powerful, yet question why it has produced no astonishing results after such a long time. Or your colleagues in the front-line sales department might wonder why you lack even the basic know-how of the industry.

It is hard to cover all of this in a few words, so see you in the next article.

Online Learning Resources

There are many high-quality online teaching platforms for data science, such as Hahow in Taiwan or Udemy in the U.S. Although the course content on Udemy is mainly in English, many Chinese-language courses are also coming out. Furthermore, there are programming teaching platforms where you can practice coding directly on the website, such as DataCamp. For more information, you are welcome to refer to “Learn Python and R On DataCamp. Start Your Data Science Career”.