Techno Blender
Digitally Yours.
Browsing Tag

Statistical

Understanding Statistical Data Types | by Rohan Vij | Jul, 2022

“Data is the new oil”, but just as several types of oil exist, so do several types of data. (Photo by Markus Spiske on Unsplash.) “Data is the new oil” is a phrase coined in 2006 that took the world by storm. Want some shocking facts? Over 90% of all data in the world was created in the last two years. If you burned all the data generated in a day onto CDs, that stack could reach the moon twice. Data is big and valuable, so knowing how to operate on it is crucial. To do so, it is essential to learn about the different types…

Every statistical test to check feature dependence | by Karun Thankachan | Jul, 2022

Correlation and hypothesis tests for different datatypes and assumptions. This post covers the statistical tests to detect dependence between pairs of features, irrespective of their datatype, and which test to use based on the properties of your data. (Photo by Nicholas Cappello on Unsplash.) You would often find yourself using statistical tests during exploratory data analysis. In a supervised setting, it could be to see if there is a dependence between feature and target variables, so as to decide if the dataset can be…
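As a rough illustration of the kind of dependence check this excerpt refers to, the following minimal sketch computes a Pearson correlation between a numeric feature and a numeric target and estimates a p-value with a permutation test. It is not taken from the original post: the function names `pearson_r` and `permutation_p_value`, the toy data, and the choice of a permutation test are all assumptions made for illustration, and other tests may suit other datatypes better.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for H0: x and y are independent.

    Shuffling y breaks any pairing with x, so the shuffled correlations
    approximate the distribution of r under the null hypothesis.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical toy data: the target rises roughly linearly with the feature.
feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
target = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]

r = pearson_r(feature, target)
p = permutation_p_value(feature, target)
print(f"r = {r:.3f}, permutation p-value = {p:.4f}")
```

A small p-value here suggests the pairing between feature and target is unlikely to be due to chance alone. Pearson's r only captures linear dependence between numeric variables, so categorical features would need a different statistic.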

Total Interpretation of Basic Statistical R Commands | by Md Sohel Mahmood | Jul, 2022

Statistics in R Series. (Image from Unsplash.) Introduction: R is a very powerful programming language for statistical analyses. There are several programming platforms for statistical analysis, but R has gained real traction among data scientists and data analysts because of its inherent capability to perform statistical tasks better than others and deliver the visualizations more aesthetically. In this article, I’m going to demonstrate the interpretation of very basic statistical commands executed in R. Anyone can perform the…

Data Scientists Need to Know Just One Statistical Test | by Samuele Mazzanti | Jun, 2022

After you read this, you will be able to test any possible statistical hypothesis. With a unique algorithm. As of today, Wikipedia counts a total of 104 statistical tests. As a consequence, data scientists may feel overwhelmed and ask themselves: “Should I know all of them? And how will I know when to use one over the other?” I am here to reassure you: as a data professional, there is only one test that you need to know. Not because 1 test is important and the other 103 are negligible. But because: All the statistical tests…

CO2 emissions dataset in USA: a statistical analysis, using Python | by Piero Paialunga | Jun, 2022

Extracting information from a CO2 emissions dataset. (Photo by Marek Piwnicki on Unsplash.) Disclaimer: This notebook has not been written by a climate scientist! Everything is analyzed exclusively from a data scientist’s point of view. All the statistical analyses are meant to be used as tools for a time series analysis of any kind. Let’s start by stating the obvious: The job of a data scientist is to extract insights. The complexity of the tool that you are using is not really relevant. What is much more important is the fact that…

Stat Stories: Multivariate transformation for statistical distributions | by Rahul Bhadani | Jun, 2022

A Precursor to Normalizing Flows. (Picture taken by the author in San Bernardino, California.) In a previous episode of Stat Stories, I discussed variable transformation for a univariate continuous distribution. Such variable transformation is essential for generating new and complex distributions from a simpler one. However, the discussion was limited to a single variable. In this article, we will discuss the transformation of a bivariate distribution. Understanding the mechanism of multivariate transformation is the first step…

Stat Stories: Common Families of Statistical Distributions (Part 2) | by Rahul Bhadani | Jun, 2022

Tools to create models for your data. (The University of Arizona Main Library. Picture taken by the author.) In part 1 of “Common Families of Statistical Distributions”, we saw families of discrete distributions that help in modeling events such as the arrival of photons, population size estimation, acceptance sampling, etc. In part 2 of the families of distributions, we will look at continuous statistical distributions, where the random variable X can take any value in ℝ. Continuous distributions can be used to model physical…

The Cost of Making Statistical Errors | by Aayush Malik

STATISTICS. A Guide to Cost Implications of Making Statistical Errors for Data Scientists. When you were a child, you may have read the story “The Boy Who Cried Wolf”. This is the story of a shepherd who used to raise false alarms about seeing a wolf and call people for help when in fact there was no wolf. He did it repeatedly for his amusement, but when there actually was a wolf and he cried for help, nobody came because the villagers thought that he was lying again. This is a popular story read to children, especially in…

Stat Stories: Common Families of Statistical Distributions (Part 1) | by Rahul Bhadani | May, 2022

Tools to create models for your data. (ENR2 Building, which houses the Mathematics and Statistics & Data Science program, The University of Arizona. Photo taken by the author.) Data scientists, statisticians, computer engineers, and data analysts are dealing with a deluge of data that is obtained from a variety of sources, through a number of physical processes, and that encompasses a wide variety of domains including transportation, photonics, bioinformatics, and astronomy. Statisticians and Data Scientists spend a…

Statistical physics rejects theory of ‘two Ukraines’

A map of Ukraine, with green and red regions marking pro-West and pro-Russian, but the purple outlined regions are more relevant to the war. Credit: Massimiliano Zanin and Johann H. Martínez. When reading news and analyses of the Russian invasion of Ukraine, researchers in Spain perceived many conflicting messages being transmitted. The most notable one is the theory of "two Ukraines" or the existence of ideologically pro-West…