Here are my predictions for “Data” in 2021.
But before you read any further, I hope that the disclaimers and footnotes below will allay your fears and sense of indignation at the hand-waving that follows. You have been warned. So without further ado, here goes.
I believe that in 2021,
- The hype[1] around Artificial Intelligence and Machine Learning will go down even further as many small and medium companies begin to seriously question the ROI of data science and ML initiatives.
- In the same vein, the hype around file-based data lakes will die completely and more teams will use an MPP data warehouse in the cloud as a data lake.
- Machine Learning will get commoditized even further[2].
- Data teams will shrink in size in most places.
- Data teams will shift their focus to core software engineering and data-ops practices and improving operational efficiencies.
- Analytics Engineer will become the hot new job by the end of 2021.
- You will hear the word ‘data-mesh’ a lot more from CDOs and CTOs, although the term will be misused by everyone.
- Lakehouse will fail to take off as an architecture pattern except among those using the Databricks platform.
- Databricks will focus more on SQL as the interface language for Delta Lake.
- AWS Redshift will lose ground to Snowflake and Google BigQuery, and its market share will shrink even further.
- Google will probably offer Looker as a GCP service.
- dbt will become as common as Apache Airflow in data toolchains across small to mid-size teams.
- Apache Airflow will see even greater adoption and will leave the competition far behind.
- Data Management[3] will become the hot new area of innovation and growth. The industry will perhaps come up with a better term for it. There will be more[4] tools and solutions launched in this space than in any other.
Edit - Jan 06, 2021: Discuss this on LinkedIn
- This list is purposely meant to be provocative and triggering. Take it lightly and do not get your knickers unduly twisted.
- I am not an expert and have no authority in predicting the future, and I will most likely get these wrong.
- I have several conscious and unconscious biases and most of these predictions reflect them in all their glory.
- My apologies in advance if you find this list offensive or in bad taste.
[1] This is not to say that these jobs will become irrelevant, but the blind hype around these roles will surely subside, and I think that is a good thing for the industry as a whole.
[2] Both BigQuery and Redshift allow you to execute ML models within the warehouse.
[3] Data Management includes data quality, metadata, lineage, discovery and governance.
[4] For a while, there will be more tools than you care for. But clear winners will emerge by the end of 2021.