An AI architect who is on a mission to inspire the next generation of innovators. Author of “Artificial Intelligence: Unorthodox Lessons” at amzn.to/3jgoKBN

The intersection of Databricks, Python, and Docker

Photo by Clay Banks on Unsplash

When you start working with Databricks, you will reach a point where you decide to code outside of Databricks and remotely connect to its computation power, a.k.a. a Databricks cluster. Why? Mostly because one of the main features of Databricks is its Spark job management, which can make your life easy. Using this service, you can submit a series of Spark jobs against a large-scale dataset and get your results back in a matter of seconds, thanks to its Spark engines. …


Artificial Intelligence, Future

A deeper look into the Big Ideas 2021 created by the ARK Invest

Photo by British Library on Unsplash

I was talking with a friend of mine who is experienced in the stock market. He was analyzing the Big Ideas 2021 deck created by ARK Invest's team, in which they presented their thoughts on emerging technologies and their future opportunities. My friend asked me to review the Deep Learning section and share my opinion with him. After reviewing the deck, I first admired the quality of the content. However, I found parts I was not aligned with as well as parts I found insightful. …


Fairness and Bias

AI ethical challenges: prejudices and biases, economic inequality, and global warming

Photo by Gyan Shahane on Unsplash

Before you start reading this article, I want you to lower your guard. We are all aware of the power of artificial intelligence. I have studied and worked in this field for the past 15 years. So, I obviously do not want to devalue AI in this article; rather, I simply want to advocate for building AI solutions more responsibly.

If you care about economic inequality, you should use huge ML models such as GPT-3 with 175 billion parameters with caution. When a model has that many parameters, you need high computation power to benefit from it. These models…


Life Lessons

The way I cope with the uncertainties and madness in 2020

Photo by Fernando @cferdo on Unsplash

We are in the final days of 2020, a year full of uncertainty and madness. The pandemic created lots of confusion for me, as well as for everyone else. I was in the early days of building my own business when the pandemic hit the world. I had left my previous job for something more exciting at the end of 2019. It was time for me to focus on building a new business, if the pandemic would let me.

I am an engineer by education but an entrepreneur by heart. I took a doctorate from the University of British…


AI Consulting Series

Whether you need or offer an AI consulting service, you will benefit from this playbook

Photo by Robert Tudor on Unsplash

In recent years, I have offered artificial intelligence (AI) consulting services to several companies. Most of them suffered from a lack of AI knowledge. This lack of knowledge may create opportunities for experts to provide AI consulting services; however, it also creates many challenges in defining project requirements and deliverables. I went through most of these challenges, and I want to share my experience dealing with them. If you want to offer, or need, an AI consulting service, I highly recommend reading this article.

Business Understanding — Step 0

In this step, you must determine the objectives and requirements of the project. You must identify what…


NLP-in-Production Series

Gensim or spaCy? It doesn’t matter if you don’t know the fundamentals of Word2Vec models.

Photo by Matt Briney on Unsplash

We encode text data such as words or sentences into high-dimensional vectors for various machine learning tasks. In the past, we mostly used encoders such as one-hot, term-frequency, or TF-IDF. However, these techniques did not capture the semantic and syntactic information of words. Word2Vec models allow us to encode words in more meaningful ways, and much more! In this article, I want to describe the Word2Vec model and its beautiful applications. Plus, I explain why Word2Vec models are simple yet revolutionary.
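To make the idea concrete, here is a minimal sketch of why dense word vectors are more meaningful than one-hot encodings: semantically related words end up pointing in similar directions, which cosine similarity can measure. The 3-d vectors below are made-up toy values for illustration; a real Word2Vec model learns vectors of 100–300 dimensions from a corpus.

```python
from math import sqrt

# Toy 3-d "word vectors" (hypothetical values for illustration only;
# real Word2Vec models learn these from large corpora).
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: close to 1.0 for similar directions, near 0.0 for unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

# Semantically close words have close vectors:
print(cosine(vectors["king"], vectors["queen"]) > cosine(vectors["king"], vectors["apple"]))  # True
```

With one-hot encodings, every pair of distinct words has cosine similarity 0, so no relation between "king" and "queen" could ever be recovered; that is exactly the gap Word2Vec closes.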

I decided to write this article after receiving much positive feedback on my previous article titled How…


NLP-in-Production Series

Along with some insights to prevent common mistakes in computing sentence embedding

Photo by Katya Austin on Unsplash

We often need to encode text data, including words, sentences, or documents, into high-dimensional vectors. Sentence embedding is an important step in various NLP tasks such as sentiment analysis and extractive summarization. A flexible sentence embedding library is needed to prototype fast and to tune for various contexts.

In the past, we mostly used encoders such as one-hot, term-frequency, or TF-IDF (a.k.a. normalized term-frequency). However, these techniques did not capture the semantic and syntactic information of words. Recent advancements allow us to encode sentences or words in more meaningful forms. The word2vec technique and the BERT language…
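A common, simple baseline for sentence embedding is to average the word vectors of the tokens in the sentence. The sketch below uses made-up 3-d vectors for illustration (a real pipeline would load them from a trained model); it also shows one of the common mistakes the title alludes to: out-of-vocabulary tokens must be handled explicitly, otherwise a sentence of unknown words silently breaks the computation.

```python
# Hypothetical tiny word vectors for illustration only.
word_vectors = {
    "the":   [0.1, 0.1, 0.1],
    "movie": [0.7, 0.2, 0.1],
    "was":   [0.1, 0.2, 0.1],
    "great": [0.1, 0.8, 0.2],
}

def sentence_embedding(sentence, vectors):
    """Baseline sentence embedding: the mean of the known word vectors.
    Unknown (out-of-vocabulary) tokens are skipped; an all-unknown
    sentence returns None instead of dividing by zero."""
    tokens = [t for t in sentence.lower().split() if t in vectors]
    if not tokens:
        return None
    dim = len(next(iter(vectors.values())))
    summed = [0.0] * dim
    for t in tokens:
        for i, x in enumerate(vectors[t]):
            summed[i] += x
    return [x / len(tokens) for x in summed]

emb = sentence_embedding("The movie was great", word_vectors)
```

Averaging ignores word order, which is why more recent models such as BERT, which produce context-dependent vectors, often work better; but the mean-of-vectors baseline is fast, easy to tune, and surprisingly competitive for tasks like sentiment analysis.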


NLP-in-Production Series

Along with an introduction to the newly released word similarity API named Owl

Photo by Harley-Davidson on Unsplash

The recent developments in natural language processing, or NLP, introduce language models that must be used cautiously in production. For example, the spaCy large English model, en_core_web_lg, contains more than 600 thousand 300-d vectors. Or, the pre-trained word2vec-google-news-300 model contains 3 million 300-d vectors for words and phrases. When you want to calculate a metric across these high-dimensional vectors, the solution may easily run short of computation power.

In this article, I want to share how I sped up the spaCy library for use in the word similarity API. spaCy does not provide an efficient most-similar method, contrary to…
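The article's exact optimization is not shown in this excerpt, but one standard technique for a fast most-similar query is to normalize every vocabulary vector once, up front, so each query reduces to plain dot products plus a sort, with no per-query norm computations. A minimal sketch, using a hypothetical toy vocabulary:

```python
from math import sqrt

def normalize(v):
    n = sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

class MostSimilar:
    """Pre-normalizes all vocabulary vectors once, so each query is a
    single dot-product scan followed by a sort (cosine similarity on
    unit vectors is just the dot product)."""

    def __init__(self, vocab):  # vocab: dict mapping word -> vector
        self.words = list(vocab)
        self.matrix = [normalize(vocab[w]) for w in self.words]

    def query(self, vector, topn=3):
        q = normalize(vector)
        scores = [sum(a * b for a, b in zip(row, q)) for row in self.matrix]
        ranked = sorted(zip(self.words, scores), key=lambda p: -p[1])
        return ranked[:topn]

# Hypothetical 2-d toy vocabulary for illustration.
vocab = {"cat": [1.0, 0.1], "dog": [0.9, 0.2], "car": [0.1, 1.0]}
index = MostSimilar(vocab)
```

In production, the per-row Python loop would be replaced by a single NumPy matrix-vector product over the stacked, pre-normalized vectors (e.g. spaCy exposes them as `nlp.vocab.vectors.data`), which is where most of the speedup comes from.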


NLP-in-Production Series

Owl API is a powerful word similarity service using advanced text clustering techniques and various word2vec models

Photo by Zdeněk Macháček on Unsplash

In natural language processing, or NLP, most tasks are built on extracting the semantics of words, sentences, or documents. For example, we build extractive summarization by extracting the semantics of sentences and clustering them based on their significance in a document. Or, we build topic modeling solutions by extracting the word groups that best characterize a set of documents.
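As a concrete illustration of the extractive-summarization idea, here is one simple heuristic (a sketch under my own assumptions, not necessarily the technique behind the Owl API): embed each sentence, then select the sentence whose embedding lies closest to the centroid of all sentence embeddings, i.e. the most "central" sentence of the document. The sentences and embeddings below are placeholders.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def pick_central_sentence(sentences, embeddings):
    """Extractive-summary heuristic: return the sentence whose embedding
    is closest (squared Euclidean distance) to the centroid of all
    sentence embeddings."""
    c = centroid(embeddings)

    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, c))

    return min(zip(sentences, embeddings), key=lambda p: dist(p[1]))[0]

# Placeholder sentences with made-up 2-d embeddings for illustration.
sentences = ["Opening remark.", "Core claim of the document.", "Closing remark."]
embeddings = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
```

Real systems extend this by clustering the embeddings into several groups and taking one representative sentence per cluster, so the summary covers more than one theme.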

The first element of language with a meaning is the word. So, you can guess how important it is to correctly extract the semantic relations of words in NLP tasks. One of the most powerful tools to extract word semantics…
