Techno Blender
Digitally Yours.

Google Cloud vs. Fly.io as Heroku Alternatives | by Kay Jan Wong | Dec, 2022

Comparison of free tier Docker deployments

Photo by Poodar Chu on Unsplash

With the end of Heroku's free tier era, I was scrambling to find an alternative to host my web application. My previous article on deploying Docker using Heroku certainly did not age well.

There are many Heroku alternatives for various workflows, but in my case, it was specifically for the deployment of web applications using Docker images. Therefore, I needed a service that could build Docker images and host Docker containers, which Google Cloud…

7 Useful Pandas Display Options You Need to Know | by Andy McDonald | Dec, 2022

Levelling Up Your Pandas Skills One Step at a Time

The pandas library provides many different options. Image generated by the author using DALL-E 2.

Pandas is a powerful Python library commonly used within data science. It allows you to load and manipulate datasets from a variety of sources and is often one of the first libraries you come across in your data science journey.

When working with pandas, the default options will be suitable for the majority of people. However, there may be occasions when you want to change the…

Topic Modeling — Intro and Implementation | by Farzad Mahmoodinobar | Dec, 2022

A brief tutorial for topic modeling using NLTK

Piles of books, by DALL.E 2

Businesses interact with their customers to better understand them and also to improve their products and services. This interaction can take the form of emails, textual social media posts (e.g. Twitter), customer reviews (e.g. Amazon), etc. It would be inefficient and cost-prohibitive to have human representatives look through all of these forms of textual communications and then route the communications to the relevant teams to review, take action…

Python Pipreqs — How to Create requirements.txt File Like a Sane Person | by Dario Radečić | Dec, 2022

Want to include only the libraries you use in requirements.txt? Try pipreqs, a Python module for creating leaner requirements files.

Article thumbnail (image by author)

Every Python project should have a requirements.txt file. It stores the information of all libraries needed for a project to run, and is essential when deploying Python projects. This is traditionally done via the pip freeze command, which outputs all libraries installed in a virtual environment.

But what if you want only the ones used in the project? That’s…

Python Documentation Testing, doctest, Marcin Kozak

PYTHON PROGRAMMING

doctest allows for documentation, unit and integration testing, and test-driven development

doctest allows for keeping up-to-date code documentation. Photo by Ishaq Robin on Unsplash

Code testing does not have to be difficult. What’s more, testing makes coding easier and faster, and even, at least for some developers, more pleasurable. For testing to be pleasurable, however, the testing framework we use needs to be user-friendly.

Python offers several testing frameworks, currently three of the most popular…

How to Run SQL Queries On Your Pandas DataFrames With Python | by Zoumana Keita | Dec, 2022

Run SQL queries in your Python Pandas DataFrame

Image by Caspar Camille Rubin on Unsplash

Pandas is being increasingly used by Data Scientists and Data Analysts for data analysis purposes, and it has the advantage of being part of the wider Python universe, making it accessible to many people. SQL, on the other hand, is known for its performance and for being human-readable and easy to understand, even for non-technical people.

What if we could find a way to combine the benefits of both Pandas and SQL statements? Here is where…

How to Test PySpark ETL Data Pipeline | by Edwin Tan | Dec, 2022

Validate big data pipeline with Great Expectations

Photo by Erlend Ekseth on Unsplash

"Garbage in, garbage out" is a common expression used to emphasize the importance of data quality for tasks such as machine learning, data analytics and business intelligence. With an increasing amount of data being created and stored, building high-quality data pipelines has never been more challenging.

PySpark is a commonly used tool to build ETL pipelines for large datasets. A common question that arises while building a data pipeline is “How…

How to control colors with DAX Expressions in Power BI | by Salvatore Cagliari | Dec, 2022

We have been able to add rules for coloring visuals for a long time now. But how can we use DAX expressions to control these colors and try to follow IBCS rules?

Photo by Firmbee.com on Unsplash

A few weeks ago, I published an article about Information Design.

As a recap:

IBCS is a company that created a set of rules for Information design.
IBCS helps us use its rules to improve our reporting.
I condensed the original 8 SUCCESS rules into three easy-to-use rules:

1. What is your message?
2. Use a consistent notation.
3. Remove unnecessary…

Camera Radial Distortion Compensation with Gradient Descent | by Sébastien Gilbert | Dec, 2022

How to characterize the radial distortion of a camera-lens pair, based on a simple model

Photo by Charl Folscher on Unsplash

Consumer-grade cameras and lenses are cheap and ubiquitous. Unfortunately, unlike their industrial counterparts, they were not designed to be used as tools for precise measurements in computer vision applications.

Among the various types of distortion, the most visible one affecting a low-grade camera and lens is radial distortion. Radial distortion is a nonlinearity between the viewing angle of an…

PDF Parsing Dashboard with Plotly Dash | by Benjamin McCloskey | Dec, 2022

An introduction to how to read and display PDF files in your next dashboard.

PDF Parser (Image from Author)

I have recently taken an interest in using PDF files for my Natural Language Processing (NLP) projects, and you may be wondering why. PDF documents contain tons of information that can be extracted and used to create various types of machine learning models as well as for finding patterns in different data. The problem? PDF files are tricky to work with in Python. Furthermore, as I began creating a dashboard for a…