Tecton Helps Data Scientists Own Features, and the Model Lifecycle


This has been called the year of the feature store, with Databricks and Google among the most recent vendors announcing this technology to smooth the path for harnessing machine learning models in production. Twitter, Facebook, Comcast, Netflix, Pinterest and others also offer feature store platforms.

Not to be confused with Tekton, the open-source framework for creating CI/CD systems, the commercial enterprise feature store Tecton aims to standardize and automate the management of features in production machine learning (ML) applications.

Tecton.ai founders Mike Del Balso, Kevin Stumpf and Jeremy Hermann worked together at Uber as it created the Michelangelo machine learning platform.

Before Michelangelo, data scientists at Uber would create models, then pass them on to engineers who cobbled together open source tools to manage them, Del Balso said. The company had no standardized system for building reliable and reproducible pipelines for creating ML models. Models could be no larger than what fit on a data scientist’s desktop, and there was no centralized storage for training experiments and no way to compare them.

“That data management side of machine learning is really the unique thing that we built. And that’s what really inspired us to build Tecton, because we saw how useful that was at catalyzing this explosion of machine learning [that] enabled the company to go from zero to tens of thousands of models in production,” he said.

“We’re trying to bring that same change to the rest of the industry by bringing that same kind of data layer for machine learning, especially for real-time machine learning applications, to other organizations who are trying to figure this stuff out.”


Del Balso, who before his work at Uber helped build the machine learning system for Google’s ad division, notes that Tecton is focused on operational machine learning, which means applying data the company already has to decision-making in its products, rather than on more research-based or analytical uses of data.

“Data scientists often work locally, training models and building the pipelines of data that feed them. But taking that local model into at-scale production is an arduous, time-consuming process, subject to constraints that just aren’t present in the training environment. Furthermore, models trained offline have to be pushed online, and operate on the same type of data (called features) in order to give sensible results. But the tooling to standardize, govern and collaborate around ML data is still incredibly immature,” Martin Casado, general partner at the venture capital firm Andreessen Horowitz, wrote of its investment in Tecton. The company has raised $60 million to date.

Full ML Lifecycle

The technology is more than just a database of features, the variables or attributes (such as name, age or sex) used in machine learning models.

“Tecton allows for the data scientists to be empowered throughout that machine learning lifecycle, and allows them to both build the prototype. But then in the process, the data pipelines are automatically productionized,” Del Balso said. “So the engineering teams, they have a much easier job because there’s not a lot of cumbersome and error-prone rebuilding of different pipelines along the way. …There’s this kind of like prototyping transformation, the productionization, and there’s an element of monitoring and quality management along the way.”

The Tecton platform consists of:

  • Feature pipelines for transforming raw data into features or labels
  • A feature store for storing historical feature and label data
  • A feature server for serving the latest feature values in production
  • An SDK for retrieving training data and manipulating feature pipelines
  • A web UI for managing and tracking features, labels and data sets
  • A monitoring engine for detecting data quality or drift issues and alerting

It includes the transformation of features; storage, which consists of an online and an offline store for fast retrieval and slow retrieval; feature serving and then a governance layer, “to help ensure, ‘Hey, these features are only accessible to these teams,’ ‘Help me understand the lineage of different features,’ all the metadata and collaboration that’s needed in building these machine learning applications. And then a data quality and monitoring layer for features to understand the debugging processes that you have with data in your machine learning applications,” he said.
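As an illustration of what a data quality and monitoring layer might check, the sketch below compares a feature's values at training time with the values seen in production. It is a minimal example under stated assumptions: the function names, the sample values and the threshold of 3.0 are all hypothetical, and the standardized mean shift used here is a simple stand-in, not the method Tecton uses.

```python
from statistics import mean, stdev


def mean_shift(reference, live):
    """Distance between the live mean and the reference mean,
    measured in reference standard deviations."""
    ref_std = stdev(reference)
    if ref_std == 0:
        # Constant reference feature: any change at all counts as a shift.
        return 0.0 if mean(live) == mean(reference) else float("inf")
    return abs(mean(live) - mean(reference)) / ref_std


def has_drifted(reference, live, threshold=3.0):
    """Flag drift when the shift exceeds the (hypothetical) threshold."""
    return mean_shift(reference, live) > threshold


# Hypothetical feature values seen during training and in production.
training_values = [10.0, 12.0, 11.0, 9.0, 13.0]
serving_ok = [11.0, 10.0, 12.0]
serving_bad = [40.0, 42.0, 41.0]

drift_ok = has_drifted(training_values, serving_ok)
drift_bad = has_drifted(training_values, serving_bad)
```

A production monitor would more likely compare full distributions over a time window, but the alerting decision has the same shape: a reference sample, a live sample and a threshold.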


Features are defined as code for any Python environment using the Tecton SDK. The platform can pull existing features from external data sources, and it can also compute features on raw data using PySpark, Spark SQL or Python transformations on batch and streaming data.
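To make the idea of features defined as code more concrete, here is a plain-Python sketch in which a feature is declared as a name, an entity key and a transformation over raw events. This is not the Tecton SDK, whose classes and decorators are not shown in this article; `FeatureDefinition`, `transaction_count_7d` and the event fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, List


@dataclass(frozen=True)
class FeatureDefinition:
    """Hypothetical declarative description of a single feature."""
    name: str
    entity_key: str
    transform: Callable[[List[dict], datetime], Dict[str, int]]


def transaction_count_7d(events, as_of):
    """Count each user's transactions in the 7 days up to and including as_of."""
    window_start = as_of - timedelta(days=7)
    counts = {}
    for event in events:
        if window_start < event["timestamp"] <= as_of:
            counts[event["user_id"]] = counts.get(event["user_id"], 0) + 1
    return counts


user_txn_count_7d = FeatureDefinition(
    name="user_txn_count_7d",
    entity_key="user_id",
    transform=transaction_count_7d,
)

# Hypothetical raw events, standing in for a batch or streaming source.
raw_events = [
    {"user_id": "u1", "timestamp": datetime(2021, 6, 1)},
    {"user_id": "u1", "timestamp": datetime(2021, 6, 5)},
    {"user_id": "u2", "timestamp": datetime(2021, 6, 6)},
    {"user_id": "u1", "timestamp": datetime(2021, 5, 1)},  # outside the window
]

feature_values = user_txn_count_7d.transform(raw_events, datetime(2021, 6, 7))
```

Because the definition is ordinary code, it can be versioned and reviewed like any other source file, and the same transformation can be reused for both training and serving.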

The offline store contains historical feature values across time and is used to generate training data in batch. The offline feature store is configurable but defaults to Delta Lake. The online store uses AWS DynamoDB to provide the latest feature values for low-latency retrieval.
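The split between the two stores can be sketched with a toy in-memory class. The example below is only an assumption-laden illustration, not how Delta Lake or DynamoDB are used in Tecton: `TinyFeatureStore` and its methods are hypothetical, and timestamps are plain integers. It shows why training reads need the value as it was at a given time, while serving only needs the latest value.

```python
from bisect import bisect_right
from collections import defaultdict


class TinyFeatureStore:
    """Toy store: full history offline, latest value online."""

    def __init__(self):
        self._offline = defaultdict(list)  # key -> sorted [(timestamp, value)]
        self._online = {}                  # key -> (timestamp, value)

    def write(self, key, timestamp, value):
        history = self._offline[key]
        history.append((timestamp, value))
        history.sort(key=lambda row: row[0])
        latest = self._online.get(key)
        if latest is None or timestamp >= latest[0]:
            self._online[key] = (timestamp, value)

    def get_historical(self, key, as_of):
        """Value as it was known at `as_of` (point-in-time lookup for training)."""
        history = self._offline.get(key, [])
        timestamps = [ts for ts, _ in history]
        index = bisect_right(timestamps, as_of)
        return history[index - 1][1] if index else None

    def get_online(self, key):
        """Latest value only (low-latency lookup for serving)."""
        latest = self._online.get(key)
        return latest[1] if latest else None


store = TinyFeatureStore()
store.write("u1", 1, 3)
store.write("u1", 5, 7)
store.write("u1", 9, 4)

training_value = store.get_historical("u1", as_of=6)  # value written at time 5
serving_value = store.get_online("u1")                # value written at time 9
```

The point-in-time lookup prevents a training row from seeing feature values recorded after the moment it describes, which would otherwise leak future information into the model.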

You can specify configurations like the date in the past to backfill features to, the schedule for future jobs, a time to live and more.
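The configuration options mentioned here (a backfill start date, a job schedule and a time to live) could be modeled as in the following sketch. `MaterializationConfig`, `job_windows` and `is_expired` are hypothetical names, not Tecton parameters; the sketch only shows how such settings might translate into concrete job windows and expiry checks.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class MaterializationConfig:
    backfill_start: datetime  # earliest date to compute features for
    schedule: timedelta       # how often a job runs
    ttl: timedelta            # how long a feature value stays valid


def job_windows(config, now):
    """Return the (start, end) windows that jobs must cover up to `now`."""
    windows = []
    start = config.backfill_start
    while start + config.schedule <= now:
        windows.append((start, start + config.schedule))
        start += config.schedule
    return windows


def is_expired(config, feature_timestamp, now):
    """A value older than the TTL should no longer be served."""
    return now - feature_timestamp > config.ttl


config = MaterializationConfig(
    backfill_start=datetime(2021, 6, 1),
    schedule=timedelta(days=1),
    ttl=timedelta(days=3),
)
now = datetime(2021, 6, 4)
windows = job_windows(config, now)
```

With a daily schedule and a backfill start three days in the past, the sketch yields three completed windows; a value stamped five days ago would be treated as expired under the three-day TTL.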

Training datasets are delivered as pandas or Spark dataframes. Once you have your dataset, you can use your existing tools, such as XGBoost, TensorFlow or PyTorch, to train models.

Tecton lets data scientists use more of the data they already have in their models by bringing data sources together in real time and using that real-time data in their applications, Del Balso said.

Joining Feast

In April, the San Francisco-based company announced it was hiring Willem Pienaar, founder of the open source feature store Feast, and becoming a major contributor to the project. Feast was created while Pienaar led the data science team at Indonesian ride-hailing startup Gojek, in conjunction with Google. Feast recently released version 0.10.

“It’s just like something that allows people to get started really easily with feature stores. And we expect to have a lot of additional elements like compatibility between the Feast user experience and the Tecton user experience over time,” Del Balso said. “Today, they’re separate platforms; tomorrow, they may not be. Our goal is to make it really easy for there to be a bridge between them.”

Going forward, the company plans deeper integrations with the data warehouse ecosystem and support for clouds beyond Amazon Web Services. It plans first-class integrations with Snowflake and Redshift this year. It wants to help users generate better features for their models, find the data most relevant to their decision-making, and figure out how to piece together ML infrastructure into an architecture that makes sense for their use case, he said. It also wants to offer users templates for building fraud, recommendation and prediction applications, “and have all of the data flows be pre-built for that organization, so they just plug us into their data, this is a pretty big thing that we are spending a lot of time on,” Del Balso said.



Source: InApps.net
