Frederic Marthoz

92 Followers


Published in

Towards Data Science

·Pinned

Advanced K-Means: Controlling Groups Sizes and Selecting Features

A few useful tweaks for K-Means

When using K-means, we can be faced with two issues:

  • We end up with clusters of very different sizes, some containing thousands of observations and others with just a few
  • Our dataset has too many variables and the K-Means algorithm struggles to identify an optimal set of clusters

Constrained K-Means: controlling group size

The algorithm…

Clustering

6 min read



Published in

Nerd For Tech

·Nov 9

Speed Up Your Data Analysis With ChatGPT Plus — A Quick Start Guide

If you use ChatGPT-4 (the paid version, at roughly US$20/month), did you know that you can upload a dataset and ask the AI to perform Exploratory Data Analysis and draw some graphs for you? Before we start, we need to make sure we activate the ADA (Advanced Data Analysis) feature in…

ChatGPT

5 min read



Aug 28

How to read and build a mortality table

Mortality tables serve as a foundation for computing vital metrics like life expectancy and the median age of death. Proficiency in utilising mortality tables holds significant value across diverse business sectors. Data scientists working in financial planning, risk assessment, and insurance-related endeavors can leverage this knowledge to determine life insurance…

8 min read



Published in

Towards Data Science

·May 30, 2022

A Quick-Start Guide to A/B Testing

A step-by-step approach to finding valuable insights

Introduction

So you’ve been tasked to set up a marketing A/B test and don’t have a lot of time to figure things out. Here is a quick start guide on how to do it when the main metric to improve is a proportion: click-through rate, conversion rate, open rate, reply rate…

1. Define your null hypothesis and choose your significance level …

A B Testing

13 min read
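
The teaser above is truncated, so the article's own procedure is not visible here. As a rough illustration of testing a proportion metric such as a conversion rate, the sketch below runs a standard pooled two-proportion z-test. The function name `two_proportion_z_test` and the sample counts are hypothetical and are not taken from the article.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Pooled two-proportion z-test (illustrative sketch, not the article's code).

    Null hypothesis: both variants share the same underlying proportion.
    Returns the z statistic and the two-sided p-value.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 100/1000 conversions for A, 150/1000 for B
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
alpha = 0.05  # significance level chosen before running the test
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {p_value < alpha}")
```

The significance level is fixed before looking at the data, matching step 1 of the teaser; the null hypothesis is rejected only when the p-value falls below it.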



Published in

Towards Data Science

·Jan 3, 2022

When comparing rates, beware of confounding effects

Often, we work with datasets containing variables representing rates, which we use to make comparisons between groups or as factors in our models. For those comparisons to be meaningful or more accurate, we need to correct for possible confounding effects in those rates, and use adjusted rates instead. What is…

Standardization

5 min read



Published in

Towards Data Science

·Aug 5, 2021

Predicting bankruptcy: The Contingent Claim Model

An alternative approach used among others by Moody’s to rate companies

Bankruptcy prediction has been a very active field of research for many years. Important papers include Edward Altman’s 1968 “Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy”, which gave birth to his famous Z-score, still used today, and James Ohlson’s 1980 “Financial Ratios and the Probabilistic Prediction of…

Finance

7 min read



Published in

Geek Culture

·Aug 3, 2021

The elegant maths behind the RSA Encryption

RSA was named after Rivest, Shamir and Adleman, from the Massachusetts Institute of Technology (MIT). It is an exponential cryptosystem…

Cryptography

11 min read



Published in

Analytics Vidhya

·Jul 11, 2021

Feature selection for K-means

When we are dealing with high-dimensional datasets, we can run into issues with clustering methods. Feature selection is a well-known technique in supervised learning, but it is much less common in unsupervised methods such as clustering. …

Variable Selection

3 min read



Jul 8, 2021

Constrained K-means, controlling group sizes

Previously, we have seen a very simple implementation of K-means and a method to choose the number of clusters. In some instances, we also want to avoid having empty clusters, or clusters of very different sizes. Here comes constrained optimisation to the rescue. The algorithm is based on a paper…

Constrained K Means

3 min read



Published in

Nerd For Tech

·Jul 7, 2021

K-Means: Choosing the right number of clusters

There are two popular methods:

  • The elbow method
  • The silhouette method

We’ll focus on the silhouette method in this article. The silhouette method proceeds as follows: For every single data point i in the dataset we calculate: a(i) = the average distance from that point to…

Kmeans

2 min read
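
The teaser above cuts off before the definitions are complete. As a minimal sketch, the code below uses the textbook silhouette definition, which the article may or may not follow exactly: a(i) is the mean distance from point i to the other points in its own cluster, b(i) is the smallest mean distance from i to the points of any other cluster, and s(i) = (b(i) - a(i)) / max(a(i), b(i)). The function name `silhouette_scores` and the toy data are hypothetical.

```python
import math


def silhouette_scores(points, labels):
    """Silhouette value s(i) for every point, using the textbook definition.

    Illustrative sketch only; not taken from the article.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a(i): mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b(i): smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Toy data: two well-separated pairs of points
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_scores(points, [0, 0, 1, 1])
bad = silhouette_scores(points, [0, 1, 0, 1])
print(sum(good) / len(good), sum(bad) / len(bad))
```

A mean silhouette close to 1 suggests compact, well-separated clusters, while a negative mean suggests that points sit closer to another cluster than to their own. To compare candidate numbers of clusters, the mean can be computed for each candidate and the highest value preferred.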



MSc in Mathematics and Statistics, Data Analytics

Following
  • Cory Doctorow
  • Dobromir Dikov, FCCA, FMVA
  • Tony Yiu
  • Roman Paolucci
  • Keith McNulty
