Use TransmogrifAI to jumpstart Salesforce machine learning

Salesforce released TransmogrifAI, a machine learning library written in Scala that runs on top of Apache Spark. It can potentially be deployed on any cloud platform, such as Heroku with PostgreSQL. What is involved in TransmogrifAI?

  • Language: Scala
  • Underlying engine: Apache Spark data processing engine
  • Deployment platform: A standalone local machine or cloud platform like Heroku

Let us explore these new players a bit more and see whether they align with our need to build robust machine learning models. The entry barrier to using the TransmogrifAI library is likely to be the new tech stack that a typical Salesforce developer needs to skill up on.

Learning Scala pays well

It is tempting to start or restart your career with Scala because it pays well. The snapshot below, from a survey done by Stack Overflow, shows the languages associated with the highest-paid participants.

[Image: Stack Overflow survey snapshot, languages associated with top salaries]

Apache Spark is a framework of choice

Apache Spark is one of the top 5 frameworks that respondents wanted to continue working with. The following snapshot is from the same Stack Overflow survey.

[Image: Stack Overflow survey snapshot, frameworks respondents want to continue working with]

PostgreSQL is a well-loved database

PostgreSQL, which is supported on cloud platforms such as Heroku, is second in the list. The following snapshot is from the same Stack Overflow survey.

[Image: Stack Overflow survey snapshot, databases]

Heroku platform is popular

Though not in the top 5, Heroku was the 12th most popular platform among respondents. The following snapshot is from the same Stack Overflow survey.

[Image: Stack Overflow survey snapshot, platforms]

Making a case for Scala

If you are a fan of the Python language, you may be slightly disappointed that the Salesforce team chose Scala as the language for its new machine learning framework, TransmogrifAI.

Why Scala?

  • Unlike Python, Scala is a compiled language. Scala source compiles to Java bytecode, so the resulting code runs on a Java virtual machine (JVM).
  • Code written in Scala typically executes much faster than pure Python code
  • Apache Spark, the data analytics engine, is itself written in Scala
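
As a small illustration of the first point, here is a minimal sketch of Scala code calling a standard Java library class directly, which works because both compile to JVM bytecode. The object and method names (JvmInteropSketch, daysBetween) are made up for this example and are not part of TransmogrifAI.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Hypothetical example object: shows Scala using plain Java classes on the JVM.
object JvmInteropSketch {

  // Statically typed: the compiler checks argument and return types before the code runs.
  def daysBetween(start: LocalDate, end: LocalDate): Long =
    ChronoUnit.DAYS.between(start, end) // java.time is a Java API, used here unchanged

  def main(args: Array[String]): Unit = {
    println(daysBetween(LocalDate.of(2018, 1, 1), LocalDate.of(2018, 12, 31)))
    // The JVM version this Scala program is running on.
    println(System.getProperty("java.version"))
  }
}
```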

Starting with Apache Spark

Apache Hadoop is an open-source implementation of the MapReduce programming model. Apache Spark builds on the Hadoop ecosystem for distributed processing of large datasets, and it performs better than Hadoop MapReduce on in-memory computations. Spark introduces a data structure called the Resilient Distributed Dataset (RDD) that enables these in-memory computations. Spark can use Apache Hadoop YARN for cluster management, and it also ships with its own standalone cluster manager.
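
To make the MapReduce idea concrete, here is a minimal word-count sketch that uses only the Scala standard library. Ordinary in-memory collections stand in for a distributed dataset, so this is not Spark or Hadoop code; the names WordCountSketch, mapPhase and reducePhase are assumptions made for illustration.

```scala
// Hypothetical example: the MapReduce pattern on plain Scala collections.
object WordCountSketch {

  // Map phase: turn one line of text into (word, 1) pairs.
  def mapPhase(line: String): Seq[(String, Int)] =
    line.toLowerCase.split("\\s+").toSeq.filter(_.nonEmpty).map(word => (word, 1))

  // Shuffle and reduce phase: group the pairs by word and sum the counts per word.
  def reducePhase(pairs: Seq[(String, Int)]): Map[String, Int] =
    pairs.groupBy(_._1).map { case (word, group) => word -> group.map(_._2).sum }

  def wordCount(lines: Seq[String]): Map[String, Int] =
    reducePhase(lines.flatMap(mapPhase))

  def main(args: Array[String]): Unit = {
    val lines = Seq("spark runs on scala", "scala runs on the jvm")
    wordCount(lines).toSeq.sortBy(_._1).foreach { case (word, count) =>
      println(s"$word: $count")
    }
  }
}
```

In a real cluster, the map phase would run on the machines holding each slice of the data, and only the grouped pairs would move across the network for the reduce phase.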

Distributed machine learning

Here are a couple of reasons why we need distributed machine learning, i.e., code that runs not on a single machine but across multiple machines:

  1. Ability to handle real-time data: Take a self-driving car. A lot of sensor data comes in, and the on-board computer has to process it and act in real time, for example by stopping the car if it spots a child crossing the road.
  2. ML activities that need to be completed fast: For example, how soon can we complete the training process? With distributed computing and RDDs, we can complete it faster.
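
The second point can be sketched as follows. Many training steps boil down to aggregations (sums, counts, gradients) that can be computed per partition and then combined. This sketch computes a mean that way, with Scala Futures on one machine standing in for cluster workers. It is an illustration under those assumptions rather than Spark code, and the names PartitionedMeanSketch, partial and combine are hypothetical.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical example: split the data, summarise each part in parallel, combine the results.
object PartitionedMeanSketch {

  // Each "worker" summarises its own partition as (sum, count).
  def partial(partition: Seq[Double]): (Double, Int) =
    (partition.sum, partition.size)

  // Merge two partial summaries; only these small tuples need to be exchanged.
  def combine(a: (Double, Int), b: (Double, Int)): (Double, Int) =
    (a._1 + b._1, a._2 + b._2)

  def mean(data: Seq[Double], partitions: Int): Double = {
    require(data.nonEmpty && partitions > 0, "need data and at least one partition")
    val chunkSize = math.max(1, math.ceil(data.size.toDouble / partitions).toInt)
    // Start one asynchronous task per partition.
    val work = data.grouped(chunkSize).toList.map(chunk => Future(partial(chunk)))
    val partials = Await.result(Future.sequence(work), 30.seconds)
    val (sum, count) = partials.reduce(combine)
    sum / count
  }

  def main(args: Array[String]): Unit =
    println(mean(Seq(1.0, 2.0, 3.0, 4.0, 5.0, 6.0), partitions = 3))
}
```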

Need for Spark when the machine learning use case is simple

Spark can take advantage of multiple processor cores in a single machine too; that is, it is possible to set up Spark on a standalone machine. In that case, Spark uses the cores of this single-node machine much as it would use a cluster of machines.
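
A minimal sketch of such a single-machine setup is shown below. It assumes the Apache Spark libraries (spark-sql) are on the classpath, and the object name LocalSparkSketch and the method sumOfSquares are made up for this example. The master setting "local[*]" asks Spark to run inside the current process with one worker thread per available core.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example: run Spark on a single multi-core machine, no cluster required.
object LocalSparkSketch {

  // Assumes n >= 1, since reduce fails on an empty dataset.
  def sumOfSquares(n: Int): Long = {
    val spark = SparkSession.builder()
      .appName("local-spark-sketch")
      .master("local[*]") // in-process, one worker thread per core
      .getOrCreate()
    try {
      spark.sparkContext
        .parallelize(1 to n)    // distribute the numbers as an RDD
        .map(x => x.toLong * x) // square each element, partition by partition
        .reduce(_ + _)          // combine the partial sums
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit =
    println(sumOfSquares(100))
}
```

The same code can later be pointed at a real cluster by changing only the master setting, which is what makes local mode a convenient starting point for simple use cases.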

Not just another library for Machine Learning

There are quite a few machine learning libraries on the market already, such as Apache Spark MLlib. TransmogrifAI makes it easier to use Salesforce data (and data types) in machine learning. The Salesforce team claims that a lot of simplification and abstraction has been done, not just in the pre-processing of data and the dynamic selection of models, but throughout the programming approach as well.

Need help with Heroku Machine Learning?

Call us at 855-Mirketa or write to us at info (at) to get a FREE consultation on how to get started with Heroku Machine Learning.

Kabilan Giridharan

Over 20 years of experience in leading product engineering and quality software delivery for business-critical enterprise applications. Expertise in agile digital transformation and business process re-engineering.