AI and ML are often used interchangeably, but they are, in fact, not the same. Take a deep dive to learn more about these tech buzzwords!
Artificial Intelligence (AI) and Machine Learning (ML) are some of the hottest buzzwords in tech these days, and maybe you want in. But first, let’s take a moment to distinguish the two because, while they may often be used interchangeably, they are, in fact, not the same. AI is the concept of machines being able to carry out tasks in ways that humans would consider “smart.” ML is an application of AI, based around the idea of giving machines data and letting them learn for themselves. The concepts are not new: Arthur Samuel popularized the term “machine learning” in 1959, and the idea of AI is older still. But with the advent of the Internet, a seemingly limitless source of data, and ever-increasing processing power, AI and ML have rocketed forward into everyday life. Still-emerging computation systems such as quantum computing could, if achieved at scale, be the turning point in creating the global, self-learning-type AI systems that have become popular in science fiction.
However, while dreams of Jarvis might be dancing in your head, get used to the idea of Speak and Spell instead. For now.
From a business standpoint, every vertical, industry, and small endeavor can benefit from ML. But the very first thing you (as a company) must figure out is, “What question do I want answered?” While humans can infer information based on experience and “gut feelings,” machines need to be guided and fed data constantly to begin and continue the process of learning. Having a clear understanding of what question you want answered is paramount, and every question requires a new model, so start simple. Questions that have a clear answer are easier and require less data before they begin producing meaningful output. Some questions are more ambiguous or involve a large number of variables, and they require vast amounts of data and data manipulation. Some questions are impossible to answer and don’t make good ML candidates at all.
Second, and nearly as important, is data. Data, data, data! As a technology engineer working in AI/ML, I cannot stress enough how important it is to have the right kinds and the right amounts of data. Nothing can be done in the realm of ML without data. Without it, your budding AI has no chance of learning anything. Fortunately, most companies have been storing data digitally for years, and with the prevalence of IoT and mobile apps, companies can collect a constant stream of relevant data to be used to answer questions or make predictions.
AI and ML model human procedures, and while humans with experience can start to make predictions, those predictions still have accuracy issues. With people, predictions usually become more accurate as they gain experience and can look at more contributing factors. That accuracy, however, likely won’t reach into the 90 percent range even in the best scenarios. Where AI and ML can eventually perform better is through the ability to take on a much larger history and set of contributing factors. In the end, the adjustments can be partially automated, but they often still require some human input. The hope is that your AI or ML system will create predictive behavior around your business models that will eclipse even the most accurate humans.
Sample Questions and Results
As an example, here are some types of questions, samples of data, and a prognosis of how your ML experiment might turn out:
- “How can I tell if a credit card transaction is fraudulent?” This is a good type of question if you have a collection of hundreds of thousands of credit card transactions and have already identified several thousand of them as fraudulent. With this data split into training and test sets, your ML experiment can isolate the properties that distinguish fraudulent transactions and give reliable predictions of whether new transactions are legitimate or fraudulent.
- “What is the likelihood that patient ‘x’ will get diabetes?” This type of question could be a good one if you have hundreds of millions of records of patients of all ages, sizes, backgrounds, general health, and family history, and, of course, millions of records where patients already have diabetes. This falls into that “answerable but ambiguous” type of question because there are simply so many variables involved and a nearly infinite combination of those variables. Questions like these ARE being asked of machines as we speak, and this type of neural network deep learning is taking place, but the vast amount of data and machine power required makes ambiguous questions like these poor candidates for your first ML experiment.
- “What is the meaning of life?” or “What are all my competitors developing right now?” are not good candidates for an AI experiment because there cannot be a clear answer with ANY amount of data, or the data is in some way out of your reach.
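To make the fraud example a little more concrete, here is a minimal sketch in Python of how a labeled transaction history might be split into training and test sets while keeping the share of fraudulent rows similar in both. The data is synthetic, and the names `stratified_split`, `amount`, and `is_fraud` are my own illustrative assumptions rather than anything tied to a particular platform.

```python
import random
from collections import Counter

def stratified_split(rows, label_key="is_fraud", test_fraction=0.2, seed=0):
    """Split labeled rows into train/test sets, label by label,
    so both sets keep a similar share of each label."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)

    train, test = [], []
    for label_rows in by_label.values():
        shuffled = label_rows[:]          # copy so the input is not mutated
        rng.shuffle(shuffled)
        n_test = max(1, round(len(shuffled) * test_fraction))
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test

# Synthetic stand-in for a labeled history (assumption):
# 1,000 transactions, the first 30 already identified as fraudulent.
transactions = [{"amount": 20.0 + i, "is_fraud": i < 30} for i in range(1000)]

train, test = stratified_split(transactions)
print("train:", Counter(r["is_fraud"] for r in train))
print("test: ", Counter(r["is_fraud"] for r in test))
```

The point of splitting per label is that fraud is rare: a plain random split of a small dataset could easily leave the test set with almost no fraudulent rows, which would make any accuracy figure misleading.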
From a general business perspective, things such as trends in your own sales and pricing, whether customers are happy or unhappy, whether employees are doing a good job, when a piece of equipment might need repair, and what days and times to advertise are all great questions that ML can help predict.
Flow and Process of Building an ML Experiment
After you have a question and the data to teach your machine, it’s time to get down to business. At a high level, the (very) basic flow for building your ML experiment looks like this:
- Know what question(s) you want answered – unambiguous and answerable through data
- Make sure you have enough of the right kinds of data – large datasets with test data representing your problem
- Pick your ML platform and language (Azure MLS, Amazon ML, TensorFlow, Apache Spark MLlib, Oryx 2, Hadoop, Python, R, etc.)
- Import your data – move your raw data into your learning environment
- Apply your filters and select your features – programmatic process of removing data points not pertinent to the experiment and ordering the ones that are
- Apply data transforms – programmatic process of teaching the system how to handle inconsistent values and filling in blank/empty values
- Apply your learning models – programmatic application of algorithms designed to simulate the process of thinking, sorting, comparison, and analysis of the data
- Train and test your model – programmatic process of splitting the data along two paths, keeping one path (the test set) pure while the other path (the training set) learns along your guidelines. Reviewing the results and tweaking the learning models helps teach the experiment what answers are right and wrong
- Score your train and test data – a programmatic review of Train and Test output weighted with a confidence of how likely each row of data answers the question
- Operationalize your model into a service ready for new data – programmatic process of creating an input and output pipeline for the experiment where new data can be added to the model through other means such as an app or website, and move through the filter, transform, model, train/test, and scoring steps above with a confidence rating given at the end
- Wash/rinse/repeat – the process of continually feeding new data to the system and manually reviewing and tweaking the models to refine the confidence levels
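The middle of that flow (select features, transform, apply a model, train and test, score, and expose a scoring function for new data) can be sketched in a few dozen lines of plain Python. This is only an illustration under stated assumptions: the data is synthetic, the feature names `amount`, `hour`, and `store_id` are hypothetical, and I am using a small hand-written logistic regression as the learning model purely to keep the example self-contained. In a real experiment you would lean on one of the platforms listed above.

```python
import math
import random

FEATURES = ["amount", "hour"]  # selected features; "store_id" is filtered out

def make_rows(n=400, seed=1):
    """Generate synthetic labeled rows (assumption); None marks a blank amount."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        fraud = rng.random() < 0.3
        rows.append({
            "amount": None if i % 40 == 0 else rng.gauss(900 if fraud else 100, 50),
            "hour": rng.gauss(3 if fraud else 14, 2),
            "store_id": i,
            "label": int(fraud),
        })
    return rows

def fit_transform_stats(rows):
    """Learn each feature's mean and standard deviation from the training rows."""
    stats = {}
    for name in FEATURES:
        values = [r[name] for r in rows if r[name] is not None]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        stats[name] = (mean, math.sqrt(var) or 1.0)
    return stats

def transform(row, stats):
    """Fill blanks with the training mean, then standardize each feature."""
    out = []
    for name in FEATURES:
        mean, std = stats[name]
        value = mean if row[name] is None else row[name]
        out.append((value - mean) / std)
    return out

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(xs, ys, epochs=300, lr=0.5):
    """Fit logistic regression weights with batch gradient descent."""
    w, b = [0.0] * len(FEATURES), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(xs) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(xs)
    return w, b

# Import the data and split it: 80% to learn from, 20% held back untouched.
rows = make_rows()
random.Random(0).shuffle(rows)
split = int(len(rows) * 0.8)
train_rows, test_rows = rows[:split], rows[split:]

# Transform and train using the training rows only.
stats = fit_transform_stats(train_rows)
w, b = train([transform(r, stats) for r in train_rows],
             [r["label"] for r in train_rows])

def score_new(record):
    """Return the model's confidence (0 to 1) that a record is fraudulent."""
    x = transform(record, stats)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Score the held-back test rows, then a brand-new record.
correct = sum((score_new(r) >= 0.5) == bool(r["label"]) for r in test_rows)
accuracy = correct / len(test_rows)
new_confidence = score_new({"amount": 950.0, "hour": 2.0})
print(f"test accuracy: {accuracy:.2f}")
print(f"confidence that the new record is fraud: {new_confidence:.2f}")
```

Note that the fill-in values and scaling are learned from the training rows only and then reused for the test rows and for new records. That keeps the test path “pure,” and it means `score_new` treats incoming data exactly the way the model was taught to expect it.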
There are several cloud platforms, programming languages, learning algorithms, and test and train patterns at your disposal. Your best choice depends greatly on your question and your data. Fortunately, companies like Vectorform and its team of inventors specialize in helping companies formalize their AI/ML endeavors and create an execution and delivery plan that will give them the edge in making business decisions.
While no system will ever really be able to capture all the nuance needed for total real-world analysis, an AI or ML system can have new outside factors added in through its programming environment, and that gives these systems the ability to analyze and comprehend more than any one person or group of people can.
When an AI is formally operationalized and you give it a user agent and a voice through Natural Language Processing (NLP) technology like Amazon’s Alexa, Microsoft’s Cortana, or Google Home, it begins to become that more “human” version of AI that most of us are familiar with in pop culture. Look for my next post on operationalizing your ML experiment and defining a user agent, coming soon.