Jacob Banks

1605321206

What does a Curriculum Engineer do? Featuring Yulia Genkina!

Ever wonder what a Curriculum Engineer does? Yulia Genkina has got you covered! Catch this AMA (previously live streamed on the MongoDB Twitch channel) where Yulia answers questions like “What even IS curriculum engineering?!”, “How do you know if curriculum engineering is a good fit for you?”, and much more. You’ll even get to learn about Yulia’s journey to becoming a Curriculum Engineer at MongoDB (which is a super interesting story)!

#mongodb #programming #developer #web-development

Vern Greenholt

1598245080

Feature Engineering: What is Feature Engineering?

According to a survey in Forbes, data scientists spend 80% of their time on data preparation. This shows the importance of feature engineering in data science. Here are some valuable quotes about Feature Engineering and its importance:

Coming up with features is difficult, time-consuming, requires expert knowledge. ‘Applied machine learning’ is basically feature engineering — Prof. Andrew Ng.

The features you use influence more than everything else the result. No algorithm alone, to my knowledge, can supplement the information gain given by correct feature engineering — Luca Massaron

What is Feature Engineering?

Feature engineering is the process of transforming raw data into features that better represent the underlying problem to the predictive models, resulting in improved model accuracy on unseen data.

Basically, all machine learning algorithms use some input data to create outputs. This input data comprises features, which are usually in the form of structured columns. Algorithms require features with specific characteristics to work properly.

Having and engineering good features allows us to represent the underlying structure of the data as accurately as possible and therefore to build a better model. Features can be engineered by decomposing or splitting existing features, by bringing in external data sources, or by aggregating or combining features to create new ones.
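As a rough illustration of those three moves, here is a minimal sketch in plain Python. The purchase records, the column names, and the `engineer` helper are all made up for this example; a real project would more likely use a dataframe library.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw records: one row per purchase.
raw = [
    {"customer": "a", "timestamp": "2020-08-22 09:15:00", "price": 10.0, "quantity": 3},
    {"customer": "a", "timestamp": "2020-08-23 18:40:00", "price": 4.0, "quantity": 1},
    {"customer": "b", "timestamp": "2020-08-24 12:05:00", "price": 7.5, "quantity": 2},
]


def engineer(rows):
    """Return new rows with engineered features added to the raw columns."""
    # Aggregating: total spend per customer across all of that customer's rows.
    totals = defaultdict(float)
    for r in rows:
        totals[r["customer"]] += r["price"] * r["quantity"]

    out = []
    for r in rows:
        ts = datetime.strptime(r["timestamp"], "%Y-%m-%d %H:%M:%S")
        out.append({
            **r,
            # Decomposing: split one timestamp into simpler parts.
            "hour": ts.hour,
            "is_weekend": ts.weekday() >= 5,  # Saturday is 5, Sunday is 6
            # Combining: two raw columns merged into one new feature.
            "amount": r["price"] * r["quantity"],
            "customer_total": totals[r["customer"]],
        })
    return out


features = engineer(raw)
```

Each output row keeps its raw columns and gains `hour`, `is_weekend`, `amount`, and `customer_total`, so a model can see time-of-day and spending patterns that the raw timestamp and price columns only imply.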

#data-science #feature-engineering #feature-selection #data-analysis

Feature Engineering & Feature Selection

Warning: There is no magical formula or Holy Grail here, though a door to a new world might open for you.


📈Python for finance series

  1. Identifying Outliers
  2. Identifying Outliers — Part Two
  3. Identifying Outliers — Part Three
  4. Stylized Facts
  5. Feature Engineering & Feature Selection
  6. Data Transformation

Following up on the previous posts in this series, this time we are going to explore a real Technical Analysis (TA) technique in the financial market. For a very long time, I have been fascinated by the inner logic of a TA method called Volume Spread Analysis (VSA). I have found no articles that apply modern machine learning to this time-tested, long-lasting technique. Here I am trying to throw out a minnow to catch a whale. If I can make some noise in this field, it will have been worth the time I spent on this article.

This was especially so after I read David H. Weis’s Trades About to Happen, in which he wrote:

“Instead of analyzing an array of indicators or algorithms, you should be able to listen to what any market says about itself.”¹

The point is to listen closely to the market. Just as it may not be possible to predict the future, it is also hard to ignore things that are about to happen. The key is to capture what is about to happen and follow the flow.

But how can we perceive things that are about to happen? A statement made long ago by Richard Wyckoff gives some clues:

“Successful tape reading [chart reading] is a study of Force. It requires ability to judge which side has the greatest pulling power and one must have the courage to go with that side. There are critical points which occur in each swing just as in the life of a business or of an individual. At these junctures it seems as though a feather’s weight on either side would determine the immediate trend. Any one who can spot these points has much to win and little to lose.”²

#feature-engineering #feature-selection #trading #python #machine-learning

Tyshawn Braun

1599563880

Ensemble Feature Selection in Machine Learning by OptimalFlow

Feature selection is a crucial part of the machine learning workflow. How well the features are selected directly affects the model’s performance. Data scientists usually face two pain points:

  • Which feature selection algorithm is better?
  • How many columns from the input dataset need to be kept?

So I wrote a handy Python library called **_OptimalFlow_**, with an ensemble feature selection module called autoFS, to simplify this process.

OptimalFlow is an Omni-ensemble Automated Machine Learning toolkit based on the Pipeline Cluster Traversal Experiment (PCTE) approach. It helps data scientists build optimal models easily and automate the machine learning workflow with simple code.

Why use OptimalFlow? You can read the story that introduces it: An Omni-ensemble Automated Machine Learning — OptimalFlow.

The autoFS module runs popular feature selection algorithms (selectors), such as kBest and RFE, as an ensemble and keeps the features chosen by the majority of them as the most important features. The details of the autoFS module and its default selectors are shown below:

[Image: details of the autoFS module and its default selectors]

You can read the OptimalFlow documentation for details about the _autoFS_ module. OptimalFlow also provides modules for feature preprocessing, model selection, model assessment, and Pipeline Cluster Traversal Experiments (PCTE) automated machine learning.
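The majority-vote idea behind an ensemble selector can be sketched in plain Python. This is not OptimalFlow's API or implementation: the three toy selectors, the `ensemble_select` function, and the sample data are assumptions made purely for illustration.

```python
from collections import Counter
from statistics import mean, pvariance


def top_k(scores, k):
    """Names of the k highest-scoring features."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def by_variance(X, y, k):
    """Toy selector: prefer features with high variance (ignores y)."""
    return top_k({name: pvariance(col) for name, col in X.items()}, k)


def by_correlation(X, y, k):
    """Toy selector: prefer features with high |Pearson correlation| to y."""
    def abs_corr(col):
        mx, my = mean(col), mean(y)
        cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
        sx = sum((a - mx) ** 2 for a in col) ** 0.5
        sy = sum((b - my) ** 2 for b in y) ** 0.5
        return abs(cov / (sx * sy)) if sx and sy else 0.0
    return top_k({name: abs_corr(col) for name, col in X.items()}, k)


def by_class_mean_gap(X, y, k):
    """Toy selector: prefer features whose class means differ the most."""
    def gap(col):
        pos = [a for a, b in zip(col, y) if b == 1]
        neg = [a for a, b in zip(col, y) if b == 0]
        return abs(mean(pos) - mean(neg))
    return top_k({name: gap(col) for name, col in X.items()}, k)


def ensemble_select(X, y, selectors, k):
    """Keep the features picked by more than half of the selectors."""
    votes = Counter()
    for select in selectors:
        votes.update(select(X, y, k))
    needed = len(selectors) / 2
    return sorted(name for name, n in votes.items() if n > needed)


# Hypothetical data: three feature columns and a binary target.
X = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [5, 5, 5, 5, 5, 6],
    "f3": [9, 1, 8, 2, 7, 3],
}
y = [0, 0, 0, 1, 1, 1]

selected = ensemble_select(X, y, [by_variance, by_correlation, by_class_mean_gap], k=2)
```

Here each selector nominates its top two features and a feature survives only if at least two of the three selectors agree on it. That addresses both pain points at once: no single algorithm has to be trusted, and the number of columns kept falls out of the vote.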

#data-science #machine-learning #data-engineering #feature-selection #feature-engineering

Willie Beier

1592209566

Why Microservices Suck for Machine Learning… and how a Feature Store makes it better!

In this article, I’ll describe why microservice-oriented architectures suck for machine learning. I’ll then lay out how companies like Airbnb and Uber used a feature store like StreamSQL to manage it.
Feature stores allow you to define your ML features declaratively and use the same definitions across training and serving. They enable teams to share, re-use, and discover these features across teams and models. They also manage feature versioning and monitoring. Their ultimate goal is to allow ML teams to focus on building models rather than data pipelines.
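
To make "define once, use in training and serving" concrete, here is a toy in-memory sketch. It is not StreamSQL's API or any real feature store's: the `FeatureStore` class, its methods, and the feature names are hypothetical, and versioning and monitoring are left out.

```python
from typing import Callable, Dict, List


class FeatureStore:
    """Toy in-memory registry: each feature is defined once, by name."""

    def __init__(self) -> None:
        self._features: Dict[str, Callable[[dict], float]] = {}

    def feature(self, name: str):
        """Decorator that registers a feature definition under a name."""
        def register(fn):
            if name in self._features:
                raise ValueError(f"feature {name!r} already defined")
            self._features[name] = fn
            return fn
        return register

    def vector(self, names: List[str], entity: dict) -> List[float]:
        """Compute the requested features for one entity (serving path)."""
        return [self._features[n](entity) for n in names]

    def training_set(self, names: List[str], entities: List[dict]) -> List[List[float]]:
        """Compute the same features for many entities (training path)."""
        return [self.vector(names, e) for e in entities]


store = FeatureStore()


@store.feature("avg_order_value")
def avg_order_value(user: dict) -> float:
    orders = user["orders"]
    return sum(orders) / len(orders) if orders else 0.0


@store.feature("order_count")
def order_count(user: dict) -> float:
    return float(len(user["orders"]))


MODEL_FEATURES = ["avg_order_value", "order_count"]

# Training and serving both go through the same registered definitions.
history = [{"orders": [10.0, 30.0]}, {"orders": []}]
train_rows = store.training_set(MODEL_FEATURES, history)
live_row = store.vector(MODEL_FEATURES, {"orders": [5.0]})
```

Because both paths call the same registered functions, the training rows and the live row cannot drift apart the way they can when separate services reimplement the same feature logic.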

#feature-engineering #data-science #machine-learning #data-engineering #feature-store

Zakary Goyette

1601389320

Demystifying Feature Engineering and Selection for Driver-Based Forecasting

Welcome to the second part of my three-blog series on creating a robust driver-based forecasting engine. The first part gave a brief introduction to time series analysis and gave readers the tools needed to make sense of time series datasets and clean them up. We will now look at the next step in our analysis.

While working with one of the leading data analytics teams in India, I have realized that two key elements lead to actionable insights for our clients: **Feature Engineering** and Feature Selection. Feature engineering refers to the process of creating new variables from existing ones that capture hidden business insights. Feature selection involves making the right choices about which variables to use in our forecasting models. Both skills are a combination of art and science and need some practice to perfect.

In this article, we will explore the different types of features that are commonly engineered during forecasting projects and the rationale for using them. We will also look at a comprehensive set of methods that we can use to select the best features, and a handy method to combine all these methods. To dig deeper into feature analysis, one can refer to the book “Feature Engineering and Selection: A Practical Approach for Predictive Models” by Max Kuhn and Kjell Johnson.
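
As one hedged example of "creating new variables from existing ones" in a forecasting setting, the sketch below builds lag and rolling-mean features in plain Python. The article excerpt does not say which features it uses, so the `lag` and `rolling_mean` helpers, the sales figures, and the price driver are assumptions chosen for illustration.

```python
def lag(series, k):
    """Value from k steps earlier; None where there is no history yet."""
    return ([None] * k + list(series))[:len(series)]


def rolling_mean(series, window):
    """Mean of the previous `window` values, excluding the current one."""
    out = []
    for i in range(len(series)):
        past = series[max(0, i - window):i]
        out.append(sum(past) / window if len(past) == window else None)
    return out


# Hypothetical weekly data: the target (sales) and one driver (price).
sales = [100, 120, 90, 110, 130]
price = [9.0, 8.5, 9.5, 9.0, 8.0]

rows = [
    {"sales": s, "sales_lag_1": l1, "sales_roll_3": r3, "price_lag_1": p1}
    for s, l1, r3, p1 in zip(sales, lag(sales, 1), rolling_mean(sales, 3), lag(price, 1))
]
```

Both helpers only look backwards, so every feature in a row is something that would already be known when that week's forecast is made. Rows that still contain `None` have too little history and would normally be dropped before training.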

#feature-selection #time-series-analysis #ai #forecasting #feature-engineering