Will it be Amazon? Or will Google take the cake? Let’s find out

For enterprises in the big data domain, it is imperative to have data warehouses that are agile, scalable, and cost-effective. As modern businesses increasingly look to big data to improve every area, from customer support to production pace, analytical data warehouses have become critical to most business needs.

While the world of data analytics is still blooming, the big fish have already established their hold on the market with their own data warehouses. Industry giants Amazon and Google, companies at the core of the big data boom, offer their…


Exploring the uber cool tool that helps build data apps

Photo by Isaac Smith on Unsplash

A user-friendly tool that helps deploy any machine learning model and any Python project with ease by turning data scripts into shareable apps in minutes? Yep, it is true. And it’s here!

Video on Getting Started with Streamlit

Getting started with Streamlit | Build your first Data/ML application by Anuj Syal

Decoding Streamlit

Created by Adrien Treuille, Thiago Teixeira, and Amanda Kelly, Streamlit is a free, open-source Python library that lets you build beautiful, custom web apps for machine learning and data science without worrying about the front end. Developed with data scientists and ML engineers in mind, this tool allows them to…


To The Cloud and Beyond! I got you, fam!

Tanner Boriack on Unsplash

Choosing the right GCP database depends on many factors, including your workload and the architecture involved. Today, I’m going to give you an overview of popular Google Cloud database services, including key considerations when assessing and choosing a service.

Know Thy Database

Google Cloud Platform (GCP) was built to provide an array of computing resources, database services being one of them. Capable of handling modern data with efficiency, flexibility, and strong performance, GCP is a hosted platform for data distributed across geographic regions.

When choosing a Google database service, one should consider a lot of things…


Exploring the ‘data elite’ company and what solutions they have to offer

Carlos Muza on Unsplash

Breaking (Data)brick By Brick!

Founded in 2013 by the real OGs… the creators of Apache Spark, Delta Lake, and MLflow, Databricks is a single platform for all your data needs. It is a data and AI software company whose Unified Data Analytics Platform (UDAP) is built on a modern Lakehouse architecture in the cloud.

At present, Databricks is one of the fastest-growing data services on AWS and Azure. Headquartered in San Francisco with offices around the world, it serves over 5,000 customers and over 450 partners. …


Bye-bye Pandas, hello dask!

Photo by Brian Kostiuk on Unsplash

For data scientists, big data is an ever-increasing pool of information, and the robust systems needed to comfortably handle its input and processing are always a work in progress. To deal with the large inflow of data, we can either buy faster servers, which adds to the costs, or work smarter and use libraries like Dask for parallel computing.

Before I go over Dask as a solution for parallel computing, let us first understand what this type of computing means in the big data world. By the very definition, parallel computing is a type of computation where many calculations or…
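To make the idea concrete, here is a minimal sketch of the split, compute, and combine pattern behind parallel computing. It uses only Python's standard library rather than Dask, and the helper names (chunked, parallel_sum_of_squares) and the chunk size are my own illustrative assumptions. Because it runs on threads, it shows the structure of the work rather than a real speed-up for CPU-bound Python code.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def sum_of_squares(chunk):
    """The per-chunk calculation that each worker performs independently."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, chunk_size=250, workers=4):
    """Hypothetical helper: compute partial sums per chunk, then combine them."""
    chunks = chunked(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map hands each chunk to a worker thread and keeps input order.
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)


numbers = list(range(1000))
total = parallel_sum_of_squares(numbers)
print(total)
```

Each chunk is processed without knowing about the others, and only the small partial results are combined at the end. That independence is what lets the work be spread across threads, processes, or machines.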


In my previous blog, I introduced Ansible as a tool for IT automation that eliminates repetitive tasks so teams can focus on more strategic work. As promised, in this part, I will elaborate on the deployment of Ansible. However, before we dig into how Ansible is the go-to multi-utility automation tool, let us rewind to what it is all about and why it is so important in automation.

Ansible allows you to write the configuration files in YAML in a certain format, and they work cohesively to start a server, build a network, deploy the application, add configuration files, and restart…


It’s a fast, seamless, and state-of-the-art library for natural language processing

Photo by Nathan Dumlao on Unsplash

As Natural Language Processing (NLP) becomes a staple for building modern AI-enabled products, open-source libraries prove a boon for their architects, as they cut down on development time and allow greater flexibility and seamless integration. spaCy is one such library for advanced NLP in Python. Today, we will explore spaCy, its features, and how you can get started with the free library to build NLP products.

What is spaCy?

A free, open-source library, spaCy is suited for those working with a lot of text. It is designed for production use and allows you to build applications that…


It’s not an emoji, it’s NLP for everyone

Photo by James Lee on Unsplash

Hugging Face: no, I am not referring to one of our favorite emojis for expressing thankfulness, love, or appreciation. In the world of data science, Hugging Face is a startup in the Natural Language Processing (NLP) domain, offering its library of models for use by some of the A-listers including Apple and Bing.

For those wondering why the focus of today’s blog is on a startup, let me first take you through what Hugging Face is all about and why it matters for fellow data scientists.

What is Hugging Face?

Hugging Face, a company that first built a chat app for bored teens, provides…


Image classification with PyTorch using a Convolutional Neural Network

Photo by Alina Grubnyak on Unsplash

Exploring the deep world of machine learning and artificial intelligence, today I will introduce my fellow AI enthusiasts to PyTorch. Primarily developed by Facebook’s AI Research Lab, PyTorch is an open-source machine learning library that accelerates the path from research prototyping to production deployment.

The library consists of Python programs that facilitate building deep learning projects. PyTorch is easy to read and understand, is flexible, and allows deep learning models to be expressed in idiomatic Python, making it a go-to tool for those looking to develop apps that leverage computer vision and natural language processing.

How to get started with PyTorch


Why write in geek, when you can describe in simple English?

Artificial intelligence, a creation of the human mind, is now progressing rapidly and helping humankind create in turn. One of the latest feats in the field of AI is GPT-3, or Generative Pre-trained Transformer 3, by OpenAI. The third language prediction model in the GPT series, GPT-3 has the potential to revolutionize industries, from publishing to coding. Here’s how.

What is GPT-3?

GPT-3 is a deep learning algorithm that produces human-like text. Similar to other language models, this third-generation language prediction model in…

Anuj Syal

Data Engineering | Full Stack Engineering | https://www.linkedin.com/in/anuj-syal-727736101/
