Loka, Inc Data Lead
Hi, I'm Fer. For the better part of the last decade I’ve specialized
in data analysis and data engineering for Fortune 500s and Series A startups.
Here’s what I see coming our way in 2021.
When I think about how I’d define data, many things come to mind. It’s facts,
it’s observations, it’s behaviors, all put together to be later referenced
and analyzed. When data is given a reason, a purpose, and starts answering
questions, then it becomes information.
Information can be descriptive (what happened?), and it can be predictive
(what will happen?). For me, as a professional in the data engineering and data
analysis space, data is effectively the building block for everything I do.
Based on what I’ve seen and experienced working with different startups,
what their needs are, and what the cloud industry is bringing to the table,
I've listed some trends and themes you can expect to see when it comes to
these essential electric signals.
Machine learning is what market analysts have been buzzing about when forecasting
data trends for the last five years or so. Even though many startups have thrown
the term around as a sales pitch over the last couple of years, we’re arriving at
a point where the benefits are not just theoretical.
There are a lot of practical applications. “Practical”
is the word I’d like to center on here. From vaccine development
(discovering patterns humans would take much longer to find), to farming
(finding the right balance of nutrients depending on the different soil and weather conditions),
to health (discovering patterns in life behaviors versus longevity),
to transportation (hi, autonomous driving!), machine learning is being applied
in some way to accelerate a product or service. In these examples, machine learning
plays a meaningful role. It’s not just a buzzword for empty promises or pipe dreams,
but a feature baked into the core functionality.
For data engineering, the main focus for 2021 will be the push towards a serverless
approach. By serverless, I mean no servers to maintain yourself, but rather servers
maintained and scaled by the solution’s provider. This means less time focusing
on the underlying machine that hosts your solution, and more time actually
developing it. In the realm of AWS, two main serverless services, Aurora Serverless
and Lambda, have received major backend performance upgrades. For instance, Aurora
Serverless no longer has the “cold boot” issue, and Lambda has a higher timeout than
before, with faster scaling for heavy workloads, just to name a few.
Serverless is a big focus for this year because it’s making it easier for
data professionals to start implementing pipelines without deep knowledge
of DevOps, which was the main pain point: having servers to maintain and infrastructure
to keep updated, healthy, and fast at all times.
Serverless handles this automatically. And make no mistake, there are still servers
running your code, your database, your REST API, your message brokers and real-time
data streams, but it’s not something you have to worry about. It’s being managed by
your cloud service provider, which takes that hassle off your hands for a small
premium. A small premium that allows you to build more efficiently, iterate faster,
and deliver a better product overall, a product that can also adapt to more demanding scenarios.
In short, there are two main features of serverless that make it such an enticing
(and logical) next step in cloud computing execution models: low to no maintenance
and the ability to scale according to your needs.
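To make the "no servers to maintain" idea a bit more concrete, here's a minimal sketch of what a single pipeline step might look like as a Lambda-style function in Python. The `handler(event, context)` signature follows AWS Lambda's Python convention, but the event shape (a `records` list with `user` and `amount` fields) and the aggregation itself are assumptions made up for illustration, not something taken from a real project.

```python
import json


def handler(event, context):
    """Entry point the platform would call on each invocation.

    There is no server setup here: provisioning and scaling are left
    to the provider, and the code only describes the work to be done.
    """
    # Hypothetical event shape (an assumption for this sketch):
    # {"records": [{"user": "ana", "amount": "10"}, ...]}
    records = event.get("records", [])

    # Sum the amounts per user, a stand-in for a real transformation step.
    totals = {}
    for record in records:
        user = record["user"]
        totals[user] = totals.get(user, 0.0) + float(record["amount"])

    # Return a JSON body, as an HTTP-style response would.
    return {"statusCode": 200, "body": json.dumps(totals, sort_keys=True)}


if __name__ == "__main__":
    # Local smoke test with made-up data; no cloud account needed.
    sample = {
        "records": [
            {"user": "ana", "amount": "10"},
            {"user": "ana", "amount": "2.5"},
            {"user": "bo", "amount": "4"},
        ]
    }
    print(handler(sample, None))
```

Because the function is just plain Python, you can run and test it locally first, then hand it to the provider to run and scale.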
Infrastructure as code
If you’ve heard about infrastructure as code and haven’t given it enough attention
or resources, this year is the time to invest in it! This practice keeps growing
and getting more adoption because of how much it simplifies the task of maintaining
a whole tech stack. Imagine having your entire stack written out in simple steps
as code and being able to replicate it by simply placing that code elsewhere.
No need to individually provision and/or connect every item in your stack. This can
all now be done from lines of code that you can easily carry from one environment