Techno Blender
Digitally Yours.
Browsing Tag

Chawla

Pandas Isn’t Enough. Learn These 25 Pandas to SQL Translations To Upgrade Your Data Analysis Game | by Avi Chawla | Dec, 2022

25 common SQL queries and their corresponding methods in Pandas. (Photo by James Yarema on Unsplash.) This is my 50th article on Medium. Thank you so much for reading and appreciating my work 😊! It’s been an absolutely rewarding journey. If you like reading my articles here on Medium, I am sure you will love this as well: The Daily Dose of Data Science. What is this? It’s a data-science-oriented publication that I run on Substack. What will you get from this? Here I present elegant and handy tips and tricks around…

Introducing PivotUI: Never Use Pandas To GroupBy and Pivot Your Data Again | by Avi Chawla | Nov, 2022

Simplifying data analysis for everyone. (Photo by William Felker on Unsplash.) Pivoting and grouping operations are fundamental to almost every tabular data analysis process. The pivot_table() and groupby() methods are among the most commonly used methods in Pandas. Used primarily for understanding categorical data, grouping lets you compute statistics for individual groups in the data. (Figure: Representation of Grouping. Image by Author.) Pivot tables, on the other hand, allow you to cross-tabulate your data for fine-grained…
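As a rough illustration of the two operations named in this teaser, the sketch below applies groupby() and pivot_table() to a tiny table. The column names and values are invented for the example and do not come from the article.

```python
import pandas as pd

# Hypothetical sales data, invented purely for illustration.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "revenue": [100, 150, 200, 50],
})

# Grouping: compute one statistic per group (total revenue per region).
revenue_by_region = sales.groupby("region")["revenue"].sum()

# Pivoting: cross-tabulate regions (rows) against products (columns).
revenue_pivot = sales.pivot_table(
    index="region", columns="product", values="revenue", aggfunc="sum"
)

print(revenue_by_region)
print(revenue_pivot)
```

Grouping collapses the table to one value per region, while the pivot keeps both categorical columns visible as the two axes of the result.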

Never Worry About Optimization. Process GBs of Tabular Data 25x Faster With No-Code Pandas | by Avi Chawla | Nov, 2022

No more run-time and memory optimization, let’s get straight to work. (Photo by freestocks on Unsplash.) Pandas makes analyzing tabular datasets an absolute breeze. The sleek API design offers a wide range of functionality that covers almost every tabular data use case. However, it’s only when someone transitions towards scale that they experience the profound limitations of Pandas. I have talked about this before in an earlier blog post. In short, almost all limitations of Pandas arise from its single-core…

Introducing Reloading: Never Re-Run Your Python Code Again To Print More Details | by Avi Chawla | Nov, 2022

Modify your code during run-time and save hours of work time. (Photo by Brad Neathery on Unsplash.) While running Python scripts, I have often found myself in situations where I forgot to print all the necessary details to track the pipeline’s progress. This is typically observed when training machine learning models. More often than not, folks (including me) forget to: add necessary logging details; print essential training details/metrics such as accuracy, error, precision, etc.; save the model after every k epochs; and many…

Two Killer Jupyter Hacks That Are Guaranteed To Save You Hours Of Work Time | by Avi Chawla | Nov, 2022

The Moment You Start Using Them. (Photo by Brad Neathery on Unsplash.) Jupyter Notebooks, because of their simple, streamlined, beginner-friendly, and sleek design, are almost indispensable to any Python-oriented task today. Thinking retrospectively, I cannot even imagine my life without an Interactive Python (IPython) tool like Jupyter. (Figure: Jupyter. Image created by Author.) Essentially, the most significant advantage of IPython tools is that they reduce the friction of re-running scripts by keeping objects in memory as long as the kernel…

The No-Code Pandas Alternative That Data Scientists Have Been Waiting For | by Avi Chawla | Nov, 2022

A step towards simplifying data analysis for all. (Photo by Robert Anasch on Unsplash.) Storytelling is critical to the workflow of all data science projects. In this regard, drawing valuable insights from data is a fundamental skill every organization looks for in a data scientist. Thankfully, over the past few years, developers across the globe have contributed profoundly towards developing reliable and sophisticated tools that make a data scientist’s job relatively easier. The most popular open-source tools for…

Introducing Pandarallel: Never Use The Apply Method In Pandas Again | by Avi Chawla | Oct, 2022

Why I stopped using apply() in Pandas and why you should too. (Photo by Alain Pham on Unsplash.) The Pandas library, with its intuitive, elegant, and beginner-friendly API, serves as one of the best tabular data-wrangling libraries in Python. Almost every data scientist working with tabular datasets today resorts to Pandas for all sorts of data science tasks. While the API offers a sleek design and a wide range of functionalities, there are numerous limitations that make Pandas inapplicable (or inefficient) in a handful of…

A Step-by-Step Guide To Detecting Topics In An Audio File | by Avi Chawla | Oct, 2022

Topic Detection Made Easy. (Photo by Jason Leung on Unsplash.) Contents: Introduction · Introduction to Topic Detection · Detecting Topics from an Audio File · Insights · Conclusion. Topic Detection (also known as Topic Modeling) is a technique to identify the broad topics in a given piece of information. Topic Modeling is sometimes confused with summarization in natural language processing. However, they are different. With summarization, the objective is to generate an interpretable and readable textual summary of the information at…

The Only 30 Methods You Should Master To Become A Pandas Pro | by Avi Chawla | Oct, 2022

After using Pandas for over three years, here are the 30 methods I have used almost all the time. (Photo by Glenn Carstens-Peters on Unsplash.) Pandas is undoubtedly one of the best libraries ever built in Python for tabular data-wrangling and processing tasks. Because it is open-source, numerous developers from different parts of the world have contributed to its development and brought it to where it is today, supporting hundreds of methods for various tasks. However, if you are a newbie and trying to get a firm hold on the Pandas…

20 Newbie Mistakes that Even Skilled Python Programmers Make | by Avi Chawla | Oct, 2022

A collection of common mistakes that you should avoid while coding in Python. (Photo by Andrea De Santis on Unsplash.) The best thing about programming (not just in Python but in any programming language) is that, typically, there are multiple ways to implement the same solution. (Figure: Using different approaches to reach the same output. Image by Author.) Some ways are, of course, better than others, for reasons such as: less memory usage, better run-time efficiency, fewer lines of code, ease of understanding, simpler logic, etc. In this…