The now and near future of data
A short and sweet trends forecast from a data lead who lives this stuff
By Fernando Escobar, Data Lead at Loka, Inc.
Hi, I’m Fer. And for the better part of the last decade, I’ve specialized in data analysis and data engineering for Fortune 500s and Series-A Startups. Here’s what I see coming our way in 2021.
When I think about how I’d define data, many things come to mind. It’s facts, it’s observations, it’s behaviors, all put together to be later referenced and analyzed.
When data is given a reason, a purpose, and starts answering questions, then it becomes information we can put to use.
Information can be descriptive (what happened?), and it can be predictive (what will happen?). As a professional in the data engineering and data analysis space, it’s effectively the building block for everything I do.
Based on what I’ve seen and experienced working with different startups, what their needs are, and what the cloud industry is bringing to the table, I’ve listed some trends and themes you can expect to see when it comes to these essential electric signals.
Machine learning is what market analysts have been buzzing about when forecasting data trends for the last five years or so.
Even though this term has been thrown around abundantly over the last couple of years by many startups as a sales pitch, we’re arriving at a point where the benefits are not just theoretical. There are a lot of practical applications. “Practical” is the word I’d like to center on here.
From vaccine development (discovering patterns humans would take much longer to find), to farming (finding the right balance of nutrients for different soil and weather conditions), to health (discovering patterns in lifestyle behaviors versus longevity), to transportation (hi, autonomous driving!), machine learning is being applied in some way to accelerate a product or service.
In these examples, machine learning plays a meaningful role. It’s not just a buzzword for empty promises or pipe dreams, but a feature baked into the core functionality.
For data engineering, the main focus for 2021 will be the push towards a serverless approach. By serverless, I mean no servers to maintain yourself, but rather servers maintained and scaled by the solution’s provider.
This means less time focusing on the underlying machine that hosts your solution, and more time actually developing it.
From the realm of AWS, two main serverless services, Aurora Serverless and Lambda, have received major backend performance upgrades. For instance, Aurora Serverless no longer has the “cold boot” issue, and Lambda has a higher timeout than before, with faster scaling for heavy workloads, just to name a few.
Serverless is a big focus for this year because it’s making it easier for data professionals to start implementing pipelines without deep DevOps knowledge. That was the main pain point: having servers to maintain, as well as infrastructure to keep updated, healthy, and fast at all times.
Serverless handles this automatically. And make no mistake, there are still servers running your code, your database, your REST API, your message brokers, and real-time data streams, but it’s not something you have to worry about.
It’s being managed by your cloud service provider, which takes that hassle out of your hands for a small premium. A small premium that allows you to build more efficiently, iterate faster, and deliver a better product overall, a product that can also adapt to more demanding scenarios.
In short, there are two main features of serverless that make it such an enticing (and logical) next step in cloud computing execution models: low to no maintenance and the ability to scale according to your needs.
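To make the "no servers to maintain" idea concrete, here is a minimal sketch of what one pipeline step could look like as an AWS Lambda function in Python. The handler(event, context) signature is the standard Lambda entry point; the event shape (a "records" list with "user" and "amount" fields) and the per-user totals are assumptions made up for this illustration, not something from a real project.

```python
import json


def handler(event, context):
    """Entry point that AWS Lambda calls once per invocation."""
    # Hypothetical event shape (an assumption for this sketch):
    # {"records": [{"user": "ana", "amount": 10}, ...]}
    records = event.get("records", [])

    # One small transformation step in a pipeline: total amount per user.
    totals = {}
    for record in records:
        user = record["user"]
        totals[user] = totals.get(user, 0) + record["amount"]

    # Return a JSON body, the shape commonly used behind an HTTP endpoint.
    return {"statusCode": 200, "body": json.dumps(totals, sort_keys=True)}
```

Notice there is nothing here about machines, ports, or scaling. You write the function, and the provider decides where and how many copies of it run. Locally you can try it with a plain call such as handler({"records": []}, None).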
Infrastructure as code
If you’ve heard about infrastructure as code and haven’t given it enough attention or resources, this year is the time to invest in it!
This practice keeps gaining adoption because of how much it simplifies the task of maintaining a whole tech stack.
Imagine having your entire stack written out in simple steps as code and being able to replicate it by simply placing that code elsewhere.
No need to individually provision and connect every item in your stack. This can all now be done from lines of code that you can easily carry from one environment to another.
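As a rough sketch of that idea, the snippet below describes a tiny stack as data and stamps out one copy per environment. It produces a CloudFormation-style JSON template using two real resource types (AWS::S3::Bucket and AWS::SQS::Queue), but the resource names, the naming scheme, and the build_stack and render helpers are all hypothetical. In practice you would more likely reach for a dedicated tool such as CloudFormation, Terraform, or the AWS CDK.

```python
import json


def build_stack(env):
    """Return a CloudFormation-style template for one environment."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Data pipeline stack ({env})",
        "Resources": {
            # Hypothetical bucket for raw pipeline data.
            "RawDataBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {"BucketName": f"example-raw-data-{env}"},
            },
            # Hypothetical queue that feeds the ingestion step.
            "IngestQueue": {
                "Type": "AWS::SQS::Queue",
                "Properties": {"QueueName": f"example-ingest-{env}"},
            },
        },
    }


def render(env):
    """Serialize the template so it can be stored in version control."""
    return json.dumps(build_stack(env), indent=2, sort_keys=True)
```

Calling render("dev") and render("prod") gives two templates with the same resources and different names, which is the whole point: the stack lives in one place, and replicating it is a function call rather than an afternoon of clicking through a console.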
Fernando Escobar is the Data Lead at Loka, Inc. “Fer” as he’s known, has extensive expertise in project governance, business development, and data analysis. Loka’s Fortune 500 and Series-A Startup clients best know him for providing excellent data management from collection to analysis, along with insightful recommendations for executives. Loka’s teammates know Fer for critical gaming and headphone reviews.